Rate Limiting with Redis and Node.js: Under the Hood
Last updated: February 10th 2025
Introduction
Distributed Denial of Service (DDoS) attacks disrupt online service availability by overwhelming target resources with malicious traffic from multiple sources. Unlike single-source Denial of Service (DoS), DDoS uses botnets, networks of compromised devices commanded by attackers, to amplify the attack volume and complicate mitigation.
Botnets, often composed of computers, IoT devices, and servers infected with malware, are instructed to flood a target server, network, or application with requests. This orchestrated deluge of traffic surpasses the target's capacity to handle legitimate requests, exhausting resources like bandwidth, processing power, and memory. Consequently, legitimate users experience slow loading times, service errors, and complete denial of access.
DDoS attacks are categorized by their attack vectors:
Volumetric attacks saturate network bandwidth using sheer data volume, exemplified by UDP floods, ICMP (ping) floods, and DNS amplification.
Protocol attacks exploit weaknesses in network protocols, consuming server resources through techniques like SYN floods, Smurf attacks, and fragmented packet attacks.
Application-layer attacks, also known as Layer 7 attacks, target specific application features, mimicking legitimate user behavior to exhaust server resources at the application level; examples include HTTP floods and slowloris attacks.
The impact of successful DDoS attacks can range from minor service disruptions to catastrophic outages. Businesses suffer financial losses due to downtime, lost transactions, and reputational damage.
Customer trust erodes with unavailability.
Critical infrastructure, including hospitals, financial institutions, and government services, faces severe operational disruptions, potentially impacting public safety and essential services.
Defense against DDoS necessitates a layered approach.
- Network infrastructure enhancements include increasing bandwidth capacity and implementing robust firewalls and intrusion detection systems to filter malicious traffic.
- Content Delivery Networks (CDNs) distribute content globally, absorbing attack traffic across a wider network.
- Specialized DDoS mitigation services offer advanced traffic analysis, anomaly detection, and traffic scrubbing techniques to identify and block malicious requests before they reach the target infrastructure.

Ongoing monitoring, threat intelligence, and incident response plans are crucial for proactive defense and rapid mitigation when attacks occur. DDoS remains a significant and evolving cyber threat, requiring continuous vigilance and adaptive security strategies.
Rate limiting is a crucial technique for protecting your APIs and applications from abuse, ensuring fair use, and keeping the system stable and available. It controls the number of requests a user or client can make within a given time frame.
Without rate limiting, your services can be overwhelmed by malicious attacks (like the DDoS attacks described above) or simply by a surge in legitimate traffic that you didn't see coming, leading to performance degradation or even service outages.
Redis, with its speed and atomic operations, is an excellent choice for implementing robust and efficient rate limiting.
Let's dive into how we can achieve this with Redis and Node.js, exploring the details under the hood.
Core Concepts
- Identifying the Client: A crucial first step in implementing rate limiting is establishing a reliable method for distinguishing between individual clients or users who are making requests to your system. This identification can be accomplished through various means, such as utilizing the client's IP address, assigning unique user IDs upon registration or login, employing API keys for authenticated access, or leveraging any other unique identifier that is appropriate and secure within the context of your specific application and its architecture. The chosen method should be consistent and dependable to ensure accurate tracking and enforcement of rate limits.
- Tracking Requests: Once a client has been identified, it's essential to track the number of requests they make within a designated time frame. Redis, with its exceptional speed and in-memory data storage capabilities, is ideally suited for this task. It allows for efficient incrementing of request counters associated with each identified client, enabling real-time monitoring of request volume and facilitating the enforcement of rate limits without introducing significant performance overhead.
- Defining the Rate Limit: Before implementing rate limiting, you must clearly define the rate limit policy that will govern access to your resources. This policy specifies the maximum number of requests a client is permitted to make within a given time interval. Examples of such policies include:
  - "100 requests per minute per user," which limits each user to 100 requests every minute.
  - "500 requests per hour per IP address," which restricts each IP address to 500 requests per hour.

  The specific policy should be tailored to the needs of your application and consider factors like server capacity, resource availability, and the desired level of service.
- Time Window: Rate limiting is inherently time-based, meaning that the request count is tracked and the limit is enforced over a specific duration known as the time window. This time window can be configured to various granularities, such as seconds, minutes, or hours, depending on the desired level of control and the nature of the application. Choosing the appropriate time window is essential for balancing responsiveness and preventing abuse.
- Action on Limit Exceeded: When a client surpasses the predefined rate limit, it becomes necessary to determine the appropriate action to take. Several options exist, each with its own implications:
  - Reject the request (HTTP 429 Too Many Requests): This is the most common and generally recommended approach. When a client exceeds the limit, their request is rejected, and they receive an HTTP 429 "Too Many Requests" status code, informing them that they have exceeded their quota. This approach is clear, direct, and provides immediate feedback to the client.
  - Delay the request: Instead of immediately rejecting the request, you could briefly pause processing before allowing it to proceed. This can be useful in situations where temporary bursts of traffic are expected, but it should be implemented carefully to avoid creating a poor user experience.
  - Throttle the request: Throttling involves reducing the priority or resources allocated to the request. This might involve slowing down the processing of the request or limiting the amount of data returned. Throttling can be used to gracefully handle excess traffic without completely denying access, but it can also lead to inconsistent performance.
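To make these concepts concrete before bringing in Redis, here is a minimal single-process sketch of the fixed-window idea in plain JavaScript. The `FixedWindowLimiter` class and its parameter names are hypothetical, not from any library, and its state lives inside one Node.js process — exactly the limitation that a shared Redis store will solve later.

```javascript
// Minimal single-process fixed-window rate limiter (illustration only).
// State is kept in an in-memory Map, so it cannot be shared across
// multiple application instances behind a load balancer.
class FixedWindowLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.windows = new Map(); // clientId -> { count, windowStart }
  }

  // Returns true if the request is allowed, false if the limit is exceeded.
  // `now` is injectable to make the time window easy to reason about.
  allow(clientId, now = Date.now()) {
    const entry = this.windows.get(clientId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request, or the previous window expired: start a new window.
      this.windows.set(clientId, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.maxRequests;
  }
}
```

A limiter configured as `new FixedWindowLimiter(100, 60_000)` mirrors the "100 requests per minute per user" policy above.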
What is Redis (in case you don't know)
Redis's story began in 2009 with Salvatore Sanfilippo, known as Antirez, an Italian developer working on a real-time web analytics startup. Frustrated by the limitations of traditional databases in handling rapid data processing, he sought a faster, more flexible solution. This led to the creation of Redis, initially named RedisDB, conceived as a high-performance key-value cache.
Open-sourced under a BSD license, Redis quickly gained traction within the developer community due to its exceptional speed and versatility. Early adopters were drawn to its in-memory nature and rich set of data structures beyond simple key-value pairs, including lists, sets, and sorted sets.
Over the years, a vibrant open-source community grew around Redis, contributing to its evolution and feature set. Key milestones include the introduction of persistence options to complement in-memory storage, clustering for scalability, and advanced features like Lua scripting and Streams for real-time data processing.
Redis Commands for Rate Limiting
We primarily rely on three Redis commands when building a rate limiter:
- `INCR <key>` (Increment): This atomic command increments the integer value stored at `<key>` by 1. If the key doesn't exist, it's initialized to 0 before being incremented. Atomicity is critical here; even with concurrent requests, `INCR` ensures that the count is incremented correctly without race conditions.
- `EXPIRE <key> <seconds>` (Set Expiration): This command sets a timeout (in seconds) on the `<key>`. After the specified time, the key is automatically deleted. This is essential for implementing the time window for our rate limit. When the time window expires, the request count effectively resets.
- `TTL <key>` (Time To Live): This command returns the remaining time to live in seconds for a key that has a timeout. If the key does not have a timeout, it returns `-1`. If the key does not exist, it returns `-2`. We can use `TTL` to check if the time window for a client's rate limit is still active or has expired.
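Assuming a Redis server running locally, you can watch these three commands interact in a short `redis-cli` session (the key name `rate_limit:demo` is just an example, and the `TTL` value shown depends on how many seconds have elapsed since `EXPIRE` was set):

```
127.0.0.1:6379> INCR rate_limit:demo
(integer) 1
127.0.0.1:6379> EXPIRE rate_limit:demo 60
(integer) 1
127.0.0.1:6379> INCR rate_limit:demo
(integer) 2
127.0.0.1:6379> TTL rate_limit:demo
(integer) 57
```

After 60 seconds the key disappears, and the next `INCR` starts counting from 1 again.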
Node.js Implementation with ioredis
Let's create a Node.js example using a popular Redis client library, `ioredis`. First, ensure you have `ioredis` installed:

$ pnpm add ioredis
Here's a code example for a basic rate limiter middleware in Express.js:
const Redis = require('ioredis');
const redis = new Redis(); // Connect to Redis (default connection)
const RATE_LIMIT_WINDOW_SECONDS = 60; // 1 minute window
const MAX_REQUESTS_PER_WINDOW = 100; // 100 requests per minute
async function rateLimitMiddleware(req, res, next) {
const clientIdentifier = req.ip; // Or use user ID, API key, etc.
const redisKey = `rate_limit:${clientIdentifier}`;
try {
const requestCount = await redis.incr(redisKey);
if (requestCount === 1) {
// First request in the window, set expiration
await redis.expire(redisKey, RATE_LIMIT_WINDOW_SECONDS);
} else if (requestCount > MAX_REQUESTS_PER_WINDOW) {
// Rate limit exceeded
const ttl = await redis.ttl(redisKey); // Get remaining time
res.setHeader('Retry-After', ttl); // Inform client when to retry
return res.status(429).send({
error: 'Too many requests',
message: `Rate limit exceeded. Please try again in ${ttl} seconds.`,
});
}
// Request within limit, proceed
next();
} catch (error) {
console.error("Redis rate limiting error:", error);
return res.status(500).send({ error: 'Internal server error' }); // Handle Redis errors
}
}
// Example Express.js application
const express = require('express');
const app = express();
app.use(rateLimitMiddleware); // Apply rate limiting middleware to all routes
app.get('/', (req, res) => {
res.send('Hello, World! (Rate Limited)');
});
app.listen(3000, () => {
console.log('Server listening on port 3000');
});
Code Explanation (Step-by-Step):
- `const Redis = require('ioredis');` and `const redis = new Redis();`: Imports the `ioredis` library and creates a Redis client instance. This establishes a connection to your Redis server (assuming it's running on the default host and port).
- `RATE_LIMIT_WINDOW_SECONDS` and `MAX_REQUESTS_PER_WINDOW`: These constants define our rate limit policy: 100 requests are allowed within a 60-second (1-minute) window. You can adjust these values as needed for your application.
- `rateLimitMiddleware(req, res, next)`: This is our Express.js middleware function that will intercept incoming requests and apply the rate-limiting logic.
- `const clientIdentifier = req.ip;`: This line extracts the client's IP address (`req.ip`) as the identifier. Important: for production applications, consider using a more robust identifier like a user ID (if authenticated) or an API key, especially if you are behind a proxy (you might need to inspect headers like `X-Forwarded-For` to get the client's real IP). Using IP alone can be problematic if clients are behind a shared NAT.
- `` const redisKey = `rate_limit:${clientIdentifier}`; ``: Constructs a unique Redis key (using a template literal) to store the request count for each client. Prefixing it with `rate_limit:` helps with organization and potential later cleanup or monitoring.
- `const requestCount = await redis.incr(redisKey);`: This is the core rate-limiting operation. It uses the `redis.incr(redisKey)` command to atomically increment the counter associated with the client. `await` is used because `ioredis` uses Promises for asynchronous Redis operations.
- `if (requestCount === 1)`: This condition checks if it's the first request in the current time window. If `requestCount` is 1, it means the key didn't exist before and `INCR` initialized it to 1. In this case, we need to set the expiration for the key to define the time window.
- `await redis.expire(redisKey, RATE_LIMIT_WINDOW_SECONDS);`: If it's the first request, this line sets the expiration on `redisKey` to `RATE_LIMIT_WINDOW_SECONDS` (60 seconds in our example). This starts the time window for rate limiting for this client.
- `else if (requestCount > MAX_REQUESTS_PER_WINDOW)`: If `requestCount` is greater than `MAX_REQUESTS_PER_WINDOW` (100), it means the client has exceeded the rate limit within the current window.
- `const ttl = await redis.ttl(redisKey);`: Before rejecting the request, we retrieve the remaining time-to-live (TTL) of the `redisKey` using `redis.ttl(redisKey)`. This tells us how many seconds are left until the rate limit window resets.
- `res.setHeader('Retry-After', ttl);`: We set the `Retry-After` header in the HTTP response. This is a standard HTTP header that informs the client how many seconds to wait before making another request. Well-behaved clients and proxies may respect this header and automatically retry later.
- `return res.status(429).send({...});`: We send a `429 Too Many Requests` status code, which is the standard HTTP status code for rate limiting. We also send a JSON response with error details and a user-friendly message, including the `ttl` value so the client knows when to retry.
- `next();`: If the request count is within the limit, `next()` is called, allowing the request to proceed to the next middleware or route handler in your Express.js application.
- `catch (error)`: This error handling block catches any potential errors during Redis operations (e.g., connection issues). In a production environment, you should have more robust error handling, logging, and potential fallback mechanisms.
- `app.use(rateLimitMiddleware);`: This line applies the `rateLimitMiddleware` to all routes in the Express.js application. You can apply it to specific routes if you want rate limiting only on certain endpoints.
Under the Hood - Deep Dive
Atomicity and Concurrency:
The magic of this rate limiter lies in Redis's atomic `INCR` command. Imagine multiple concurrent requests arriving from the same client almost simultaneously. Without atomicity, there's a risk of race conditions where the count could be incremented incorrectly. Redis guarantees that `INCR` is atomic: even if multiple requests try to increment the same key concurrently, Redis processes them sequentially, ensuring that the count is always accurate. This is crucial for the correctness of your rate limiter under high load.
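One subtle gap remains, though: `INCR` and `EXPIRE` in the middleware are two separate commands, so if the process crashes (or the connection drops) between them, the counter key is left without a TTL and never resets. A common fix is to combine both steps in a single Lua script, which Redis executes atomically. Here is a sketch using the same `ioredis` client as above; `incrementWithWindow` is a hypothetical helper name, not a library function.

```javascript
// Sketch: combine INCR and EXPIRE into one atomic step with a Lua script.
// Redis runs the whole script atomically, so the key can never be left
// without a TTL if the Node.js process dies between the two commands.
const INCR_WITH_EXPIRE = `
  local count = redis.call('INCR', KEYS[1])
  if count == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
  end
  return count
`;

async function incrementWithWindow(redis, key, windowSeconds) {
  // ioredis eval signature: eval(script, numKeys, key1..keyN, arg1..argN)
  return redis.eval(INCR_WITH_EXPIRE, 1, key, windowSeconds);
}
```

With this helper, the `if (requestCount === 1) { await redis.expire(...) }` pair in the middleware collapses into a single `await incrementWithWindow(redis, redisKey, RATE_LIMIT_WINDOW_SECONDS)` call.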
Time Window Management:
The `EXPIRE` command is how we manage the time window. When the first request within a window arrives, we increment the counter and set an expiration on the key. Redis automatically handles the expiration in the background: once the expiration time is reached, Redis deletes the key.
When a new request arrives after the key has expired, `redis.incr(redisKey)` will behave as if the key didn't exist. It will:
- Create the key.
- Initialize it to 0.
- Increment it to 1.
- Return the new value (1).
This effectively resets the counter for the client and starts a new time window when the next request comes after the previous window has expired.
Scalability:
Redis is designed for scalability. You can easily scale your Redis deployment horizontally by using Redis Cluster if you need to handle a massive number of clients or very high request rates. The rate limiting logic itself is stateless, meaning you can easily run multiple instances of your Node.js application behind a load balancer, all sharing the same Redis instance for rate limiting state.
Retry-After Header:
Setting the `Retry-After` header is important for client-side behavior and good API design. It signals to the client (and any intermediaries like proxies or CDNs) that they have been rate-limited and when they should attempt to retry their request. This helps prevent clients from blindly retrying too quickly and further overloading the system.
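On the client side, honoring the header starts with parsing it. The sketch below handles only the delta-seconds form that our middleware emits (`Retry-After` may also carry an HTTP-date, which this hypothetical helper deliberately ignores and maps to a fallback):

```javascript
// Hypothetical client-side helper: turn a Retry-After header value into
// a wait time in seconds. Handles only the delta-seconds form; anything
// unparseable falls back to a conservative default.
function parseRetryAfter(headerValue, fallbackSeconds = 1) {
  const seconds = Number(headerValue);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds : fallbackSeconds;
}
```

A client receiving a 429 would wait `parseRetryAfter(response.headers['retry-after'])` seconds before retrying, instead of hammering the server in a tight loop.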
Customization and Flexibility:
This example is a basic implementation. You can easily customize it further:
- Different Rate Limits per Endpoint: You can apply different rate limits to different API endpoints based on their resource intensity or criticality. You would adjust the `redisKey` to be endpoint-specific (e.g., `rate_limit:user:${clientIdentifier}:/api/heavy-endpoint`).
- Varying Time Windows: You can have different time windows (seconds, minutes, hours, days) for different rate limits.
- Sliding Window Rate Limiting: For more sophisticated rate limiting, you could implement a "sliding window" approach. This is more complex but provides a smoother rate limiting experience. (This basic example is a "fixed window" approach.)
- Whitelist/Blacklist: You can add logic to whitelist or blacklist certain clients from rate limiting.
- Granular Client Identification: Use more sophisticated identifiers like API keys or user IDs instead of just IP addresses for more accurate and flexible rate limiting.
- Different Actions on Limit Exceeded: Instead of just rejecting requests, you could implement other actions like delaying requests or serving a cached fallback response in certain scenarios.
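To illustrate the sliding-window idea from the list above, here is a single-process sketch that keeps a log of request timestamps per client. In Redis this is typically done with a sorted set (`ZADD` to record a request, `ZREMRANGEBYSCORE` to drop entries older than the window, `ZCARD` to count what remains); the function names below are hypothetical.

```javascript
// Single-process sliding-window log (illustration of the algorithm only).
// In Redis, the per-client timestamp log would live in a sorted set.
function createSlidingWindowLimiter(maxRequests, windowMs) {
  const logs = new Map(); // clientId -> array of request timestamps

  return function allow(clientId, now = Date.now()) {
    const log = logs.get(clientId) ?? [];
    // Drop timestamps that have slid out of the window.
    const fresh = log.filter((t) => now - t < windowMs);
    if (fresh.length >= maxRequests) {
      logs.set(clientId, fresh);
      return false;
    }
    fresh.push(now);
    logs.set(clientId, fresh);
    return true;
  };
}
```

Unlike the fixed window, a burst straddling a window boundary can never get double the allowed rate; the trade-off is storing one entry per request instead of a single counter.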
Error Handling and Resilience:
Robust error handling is crucial. You should gracefully handle cases where the connection to Redis is lost or Redis operations fail. Consider implementing retry mechanisms or fallback strategies in case of Redis unavailability. Monitoring Redis connection health and rate limiter performance is also essential in production environments.
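One pattern worth deciding on explicitly is whether to "fail open" (let traffic through when the limiter itself is broken) or "fail closed" (reject it). The example middleware above fails closed with a 500. A small wrapper, with hypothetical names, might invert that policy:

```javascript
// Wrap any async limit check so that failures of the limiter itself
// "fail open": if the check throws (e.g., Redis is down), log the error
// and let the request through rather than turning a Redis outage into
// a full API outage. Failing open vs. closed is a policy decision.
function failOpen(checkLimit, { logger = console } = {}) {
  return async function check(clientId) {
    try {
      return await checkLimit(clientId); // true = request allowed
    } catch (error) {
      logger.error('Rate limiter unavailable, failing open:', error.message);
      return true;
    }
  };
}
```

Failing open is the usual choice for public APIs where availability matters more than strict quota enforcement; failing closed fits abuse-sensitive endpoints like login or password reset.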
In Conclusion
By understanding these underlying details, you can confidently implement a highly effective and efficient rate-limiting solution using Redis and Node.js to protect your applications and provide a better user experience.
This article was written by Ahmad Adel. Ahmad is a freelance writer and a backend developer.