Rate Limiting with Redis

May 21, 2023

Designing a rate limiter means answering two related questions: which algorithm fits your traffic, and where the counters live. The right answers depend on how much traffic you’re handling and whether multiple services need to share the same state.

Naive approach

Start with a fixed window algorithm and an in-memory map from IP address to request count in the current window. For most low-traffic applications this is fine and you can stop here.

It has two known issues. The first is edge bursts: with a rate of 100 req/minute, a client can make 100 requests at 11:59:59 and another 100 at 12:00:01, getting 200 requests through in two seconds. The second is memory growth: without periodic pruning, the map keeps accumulating clients that haven’t been seen in a long time.
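As a rough illustration of the naive approach and both of its issues, here is a minimal in-memory sketch. The names FixedWindow, allowAt, and pruneAt are hypothetical, and the clock is passed in as Unix milliseconds only so the boundary behavior is easy to reproduce.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// FixedWindow counts requests per client in fixed, clock-aligned windows.
type FixedWindow struct {
	mu       sync.Mutex
	limit    int
	windowMs int64
	counters map[string]*bucket
}

type bucket struct {
	window int64 // index of the window this count belongs to
	count  int
}

func NewFixedWindow(limit int, window time.Duration) *FixedWindow {
	return &FixedWindow{
		limit:    limit,
		windowMs: window.Milliseconds(),
		counters: make(map[string]*bucket),
	}
}

// Allow records a request from key at the current time.
func (f *FixedWindow) Allow(key string) bool {
	return f.allowAt(key, time.Now().UnixMilli())
}

// allowAt takes the clock as a parameter so the behavior can be tested.
func (f *FixedWindow) allowAt(key string, nowMs int64) bool {
	f.mu.Lock()
	defer f.mu.Unlock()

	w := nowMs / f.windowMs
	b, ok := f.counters[key]
	if !ok || b.window != w {
		// First request from this client, or a new window has started.
		b = &bucket{window: w}
		f.counters[key] = b
	}
	if b.count >= f.limit {
		return false
	}
	b.count++
	return true
}

// pruneAt deletes clients whose most recent window has already ended.
// Calling it periodically (for example from a time.Ticker) bounds memory.
func (f *FixedWindow) pruneAt(nowMs int64) {
	f.mu.Lock()
	defer f.mu.Unlock()

	w := nowMs / f.windowMs
	for key, b := range f.counters {
		if b.window < w {
			delete(f.counters, key)
		}
	}
}

func main() {
	fw := NewFixedWindow(100, time.Minute)
	allowed := 0
	// 150 requests one second before the window boundary, 150 one second after.
	for _, t := range []int64{59_000, 61_000} {
		for i := 0; i < 150; i++ {
			if fw.allowAt("203.0.113.7", t) {
				allowed++
			}
		}
	}
	fmt.Println("allowed within two seconds:", allowed) // 200
}
```

The demo reproduces the edge burst: the limit is 100 per minute, yet 200 requests get through within two seconds because they straddle a window boundary.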

Smoothing the curve

The sliding window algorithm tracks request timestamps within a rolling window. As each new request comes in, entries that fall outside the window are dropped and the new one is added.

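This smooths out bursts at the cost of more bookkeeping: the limiter stores one timestamp per request in the window instead of a single counter, and every decision has to prune expired entries first. For high-traffic services where bursts are the actual concern, it’s worth the extra work.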
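The sliding window described above can be sketched in memory as follows. This is one possible reading, not a reference implementation: SlidingWindow and allowAt are hypothetical names, and the sketch assumes that only admitted requests are recorded.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// SlidingWindow keeps, per client, the timestamps of the requests it
// admitted during the last window.
type SlidingWindow struct {
	mu       sync.Mutex
	limit    int
	windowMs int64
	log      map[string][]int64 // admitted-request timestamps, oldest first
}

func NewSlidingWindow(limit int, window time.Duration) *SlidingWindow {
	return &SlidingWindow{
		limit:    limit,
		windowMs: window.Milliseconds(),
		log:      make(map[string][]int64),
	}
}

// Allow records a request from key at the current time.
func (s *SlidingWindow) Allow(key string) bool {
	return s.allowAt(key, time.Now().UnixMilli())
}

// allowAt takes the clock as a parameter so the behavior can be tested.
func (s *SlidingWindow) allowAt(key string, nowMs int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	// Drop timestamps that have slid out of the window (cutoff, nowMs].
	cutoff := nowMs - s.windowMs
	entries := s.log[key]
	i := 0
	for i < len(entries) && entries[i] <= cutoff {
		i++
	}
	entries = entries[i:]

	allowed := len(entries) < s.limit
	if allowed {
		entries = append(entries, nowMs)
	}
	if len(entries) == 0 {
		delete(s.log, key) // nothing left to remember for this client
	} else {
		s.log[key] = entries
	}
	return allowed
}

func main() {
	sw := NewSlidingWindow(100, time.Minute)
	for _, t := range []int64{59_000, 61_000, 119_001} {
		allowed := 0
		for i := 0; i < 150; i++ {
			if sw.allowAt("203.0.113.7", t) {
				allowed++
			}
		}
		fmt.Printf("t=%dms allowed=%d\n", t, allowed)
	}
}
```

With a limit of 100 per minute, the second batch two seconds after the first is rejected entirely, and capacity only returns once the first batch has aged out of the window.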

Offloading to Redis

Once the limiter outgrows a single process, the next step is to move both the counters and the decision logic out of the application. Redis can handle both: it stores the per-client state and runs the limiter logic as a Lua script that executes atomically against that state.

Here’s an implementation using Go:

package ratelimit

import (
  "context"
  "fmt"
  "time"

  "github.com/redis/go-redis/v9"
)

type Limiter struct {
  client *redis.Client
  burst  int
  window time.Duration
}

// limiterScript implements the sliding window atomically inside Redis.
// KEYS[1] = per-client sorted set
// ARGV    = now (ms), window start (ms), limit, window length (ms)
const limiterScript = `
    local key = KEYS[1]
    local now = tonumber(ARGV[1])
    local window_start = tonumber(ARGV[2])
    local limit = tonumber(ARGV[3])
    local window_ms = tonumber(ARGV[4])

    -- Remove entries that have slid out of the window
    redis.call('ZREMRANGEBYSCORE', key, 0, window_start)

    -- Count requests still inside the window
    local current = redis.call('ZCOUNT', key, window_start, now)

    if current >= limit then
      return 0
    end

    -- Score is the timestamp; the member gets a sequence suffix so that
    -- requests arriving in the same millisecond stay distinct.
    -- Note: key .. ':seq' is not passed in KEYS, and Redis Cluster expects
    -- every key a script touches to be declared there.
    local seq_key = key .. ':seq'
    local request_id = now .. ':' .. redis.call('INCR', seq_key)
    redis.call('ZADD', key, now, request_id)

    -- PEXPIRE takes integer milliseconds, so sub-second windows work too
    redis.call('PEXPIRE', key, window_ms)
    redis.call('PEXPIRE', seq_key, window_ms)
    return 1
  `

// NewLimiter creates a new Redis-backed sliding window rate limiter
// requestsPerSecond: allowed requests per second (e.g., 10 req/s); must be > 0
// burst: maximum requests allowed in the window
// The window length is burst / requestsPerSecond seconds.
func NewLimiter(redisAddr string, requestsPerSecond float64, burst int) *Limiter {
  // Scale to nanoseconds before converting to time.Duration. Converting the
  // quotient first truncates fractional seconds (0.5s would become 0).
  window := time.Duration(float64(burst) / requestsPerSecond * float64(time.Second))

  return &Limiter{
    client: redis.NewClient(&redis.Options{
      Addr: redisAddr,
    }),
    burst:  burst,
    window: window,
  }
}

// Allow checks if a request from the given identifier should be allowed
func (l *Limiter) Allow(identifier string) (bool, error) {
  ctx := context.Background()
  now := time.Now().UnixMilli()
  windowStart := now - l.window.Milliseconds() // Milliseconds() already returns int64

  result, err := l.client.Eval(ctx, limiterScript,
    []string{identifier},
    now,
    windowStart,
    l.burst,
    l.window.Milliseconds(),
  ).Int()

  if err != nil {
    return false, fmt.Errorf("rate limiter error: %w", err)
  }

  return result == 1, nil
}

The window duration is derived from the rate and burst parameters: window = burst / requestsPerSecond. With requestsPerSecond=10 and burst=50, that gives a five-second sliding window that allows up to 50 requests within it.
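The arithmetic can be checked in isolation. The helper below, windowFor, is a hypothetical name used only for this illustration. It scales to nanoseconds before converting to time.Duration so that fractional windows are not truncated.

```go
package main

import (
	"fmt"
	"time"
)

// windowFor returns how long it takes to spend burst requests at a steady
// rate of requestsPerSecond.
func windowFor(requestsPerSecond float64, burst int) time.Duration {
	seconds := float64(burst) / requestsPerSecond
	// Scale to nanoseconds before converting so fractions of a second survive.
	return time.Duration(seconds * float64(time.Second))
}

func main() {
	fmt.Println(windowFor(10, 50)) // 5s
	fmt.Println(windowFor(10, 5))  // 500ms
	fmt.Println(windowFor(0.5, 1)) // 2s
}
```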

Every key written by the script gets an explicit TTL, which keeps Redis memory bounded even when traffic shifts across millions of unique identifiers.

This setup keeps the application servers stateless, so you can add or remove them without worrying about which one a client last hit. The limiter decision happens in a single Redis round trip, which is usually fast enough to disappear into the noise of the rest of the request.
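To show where the limiter call might sit in a request path, here is a small net/http middleware sketch. Everything in it beyond the Allow signature is an assumption: the Allower interface, the RateLimit function, the "ratelimit:" key prefix, using RemoteAddr as the client identifier, and rejecting requests when the limiter returns an error. The allowN type is a stand-in so the example runs without Redis.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// Allower is the one method the middleware needs. A type with the same
// Allow signature as the Redis-backed Limiter satisfies it.
type Allower interface {
	Allow(identifier string) (bool, error)
}

// RateLimit wraps next and rejects requests whose client is over its limit.
func RateLimit(l Allower, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Assumption: RemoteAddr is the real client. Behind a proxy the
		// identifier would come from a trusted forwarding header instead.
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr
		}
		ok, err := l.Allow("ratelimit:" + ip)
		switch {
		case err != nil:
			// Assumption: fail closed when the limiter is unavailable.
			http.Error(w, "rate limiter unavailable", http.StatusServiceUnavailable)
		case !ok:
			http.Error(w, "too many requests", http.StatusTooManyRequests)
		default:
			next.ServeHTTP(w, r)
		}
	})
}

// allowN is a stand-in limiter for the demo: it admits the first n calls.
type allowN struct{ remaining int }

func (a *allowN) Allow(string) (bool, error) {
	if a.remaining == 0 {
		return false, nil
	}
	a.remaining--
	return true, nil
}

// statusFor sends one request through h and reports the response status.
func statusFor(h http.Handler) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	h.ServeHTTP(rec, req)
	return rec.Code
}

// demoHandler wraps a trivial handler with a limiter that admits two requests.
func demoHandler() http.Handler {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return RateLimit(&allowN{remaining: 2}, ok)
}

func main() {
	h := demoHandler()
	for i := 0; i < 3; i++ {
		fmt.Println(statusFor(h)) // 200, 200, then 429
	}
}
```

Whether to reject or admit requests when Redis is unreachable is a policy decision. This sketch rejects, but admitting is equally defensible when availability matters more than strict enforcement.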