Consistent Hashing Explained Simply

Why naive hashing breaks when you scale, how the hash ring solves it, virtual nodes, and real-world usage

Consistent hashing is one of the most important concepts in distributed systems — and one of the most frequently tested in system design interviews. It appears in Redis Cluster, Cassandra, Amazon DynamoDB, Akamai CDN, and many other systems you'll use or design.

This article builds from first principles: why naive modulo hashing breaks when you add or remove servers, how the hash ring solves the problem, and how virtual nodes handle uneven distribution.

Interview tip: When you say "we'll shard by user_id" or "we'll use a Redis cluster" — the interviewer may ask "how?" Consistent hashing is the answer.

1 The Problem with Naive Modulo Hashing

Suppose you have 3 cache servers (A, B, C) and want to distribute keys evenly. The simplest approach:

# Naive hashing: server = hash(key) % N
# N = number of servers

def get_server(key, servers):
    return servers[hash(key) % len(servers)]

# Works great with N=3:
hash("user_123") % 3 = 1  → Server B
hash("user_456") % 3 = 0  → Server A
hash("user_789") % 3 = 2  → Server C

What happens when you add a 4th server?

# N changes from 3 to 4
# Now: hash(key) % 4

hash("user_123") % 3 = 1  → was Server B
hash("user_123") % 4 = 3  → now Server D   ← MOVED

hash("user_456") % 3 = 0  → was Server A
hash("user_456") % 4 = 0  → still Server A ← OK

hash("user_789") % 3 = 2  → was Server C
hash("user_789") % 4 = 1  → now Server B   ← MOVED

# For N=3 → N=4: approximately 75% of all keys get remapped to different servers
# In a cache: 75% of requests suddenly miss (cache invalidated) → massive DB load spike
# In a database cluster: 75% of data must migrate to a new shard → huge I/O storm

# This is why naive modulo hashing doesn't work for distributed systems:
# Adding or removing ONE server remaps ~(N-1)/N of all keys
The goal of consistent hashing: When a server is added or removed, only K/N keys are remapped (where K = total keys, N = total servers). For 1 server added to a 10-server cluster: only ~10% of keys move. Naive hashing remaps ~91% of keys.

2 The Hash Ring — How Consistent Hashing Works

Consistent hashing maps both servers and keys onto a circular ring (0 to 2³²-1 for 32-bit hashes, or 0 to 2¹²⁸-1 for MD5).

Step 1: Place servers on the ring

Hash ring (0 to 2³²): imagine it as a clock face, 0 at top, going clockwise hash("Server_A") = 85 → position 85 on ring hash("Server_B") = 190 → position 190 hash("Server_C") = 310 → position 310 Ring (visualised as positions 0-360): 0 | ↑ 300 ←──┼── → A(85) | B(190) | C(310)

Step 2: Place keys on the ring

hash("user_123") = 50 → falls between Server_C(310) and Server_A(85) hash("user_456") = 120 → falls between Server_A(85) and Server_B(190) hash("user_789") = 250 → falls between Server_B(190) and Server_C(310) Rule: a key belongs to the FIRST server clockwise from its position on the ring user_123 (pos 50) → walk clockwise → first server = A(85) ✓ user_456 (pos 120) → walk clockwise → first server = B(190) ✓ user_789 (pos 250) → walk clockwise → first server = C(310) ✓

Step 3: Adding a new server

Add Server_D at position 130 New ring positions: A(85), B(190), C(310), D(130) Now: user_123 (pos 50) → first clockwise = A(85) ← SAME as before ✓ user_456 (pos 120) → first clockwise = D(130) ← MOVED from B to D user_789 (pos 250) → first clockwise = C(310) ← SAME as before ✓ Only 1 out of 3 keys moved (33%), not 75% like naive hashing! In general: adding 1 server to N moves 1/(N+1) of keys ≈ 1/N
class ConsistentHashRing:
    def __init__(self, servers=None, replicas=150):
        self.replicas = replicas          # virtual nodes per server
        self.ring = {}                    # position → server_name
        self.sorted_keys = []            # sorted list of positions
        for server in (servers or []):
            self.add_server(server)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_server(self, server: str):
        for i in range(self.replicas):
            key = f"{server}:{i}"        # virtual node: "Server_A:0", "Server_A:1", etc.
            position = self._hash(key)
            self.ring[position] = server
            bisect.insort(self.sorted_keys, position)

    def remove_server(self, server: str):
        for i in range(self.replicas):
            key = f"{server}:{i}"
            position = self._hash(key)
            del self.ring[position]
            self.sorted_keys.remove(position)

    def get_server(self, key: str) -> str:
        if not self.ring:
            return None
        position = self._hash(key)
        # Find the first position in sorted_keys >= position (clockwise)
        idx = bisect.bisect_left(self.sorted_keys, position)
        if idx == len(self.sorted_keys):
            idx = 0   # wrap around (circular ring)
        return self.ring[self.sorted_keys[idx]]

# Usage:
ring = ConsistentHashRing(["Server_A", "Server_B", "Server_C"], replicas=150)
ring.get_server("user_123")  # → "Server_A"
ring.get_server("user_456")  # → "Server_B"
ring.add_server("Server_D")
ring.get_server("user_456")  # → may now be "Server_D" (only moved if D is closer)

3 Virtual Nodes — Solving Uneven Distribution

With only 3 real server positions on the ring, distribution can be very uneven by chance:

3 servers, positions: A(50), B(200), C(250) Arc lengths (each arc = percentage of keys that server gets): A → B: 150 units → 41.7% of keys for Server A ← overloaded B → C: 50 units → 13.9% of keys for Server B ← underloaded C → A: 160 units → 44.4% of keys for Server C ← overloaded Real servers have wildly different loads — defeats the purpose of sharding!

Solution: Virtual Nodes (VNodes)

Each physical server is assigned multiple positions on the ring (virtual nodes). With 150 virtual nodes per server, each server ends up with approximately 1/N of the ring, regardless of where real servers happen to hash.

With 3 physical servers and 150 virtual nodes each = 450 total positions on ring Each virtual node = one position: "Server_A:0", "Server_A:1", ..., "Server_A:149" "Server_B:0", ..., "Server_B:149" "Server_C:0", ..., "Server_C:149" With 450 evenly spread positions: - Server_A gets ≈ 33.3% of ring (150/450) - Server_B gets ≈ 33.3% of ring - Server_C gets ≈ 33.3% of ring When Server_D is added (150 more virtual nodes = 600 total): - Each server now gets ≈ 25% of ring - Keys remapped: ≈ 25% (1/4 of all keys) — matches theoretical 1/N minimum

Heterogeneous Servers (Different Capacities)

# What if Server_A has 2× the RAM of Server_B?
# Assign 2× as many virtual nodes to Server_A:

ring = ConsistentHashRing()
ring.add_server("Server_A", weight=2)   # 300 virtual nodes (2 × 150)
ring.add_server("Server_B", weight=1)   # 150 virtual nodes
ring.add_server("Server_C", weight=1)   # 150 virtual nodes

# Result:
# Server_A gets 300/600 = 50% of keys (2× the data → needs 2× the memory ✓)
# Server_B gets 150/600 = 25%
# Server_C gets 150/600 = 25%

# This is exactly how Cassandra's vnodes work:
# Each node's token count is proportional to its disk/memory capacity

4 Replication with Consistent Hashing

Most distributed systems don't store data on just one server — they replicate it for fault tolerance. With consistent hashing, replication is elegant:

# Replication factor RF = 3 (store data on 3 servers)
# Rule: store the key on the server it hashes to, PLUS the next (RF-1) clockwise servers

def get_servers_for_key(ring, key, rf=3):
    """Returns list of RF servers where this key should be stored."""
    primary_server = ring.get_server(key)
    servers = [primary_server]

    # Walk clockwise to find next RF-1 unique servers
    position = ring._hash(key)
    idx = bisect.bisect_left(ring.sorted_keys, position)

    seen = {primary_server}
    while len(servers) < rf:
        idx = (idx + 1) % len(ring.sorted_keys)
        server = ring.ring[ring.sorted_keys[idx]]
        if server not in seen:         # skip virtual nodes of same physical server
            servers.append(server)
            seen.add(server)

    return servers   # e.g., ["Server_A", "Server_C", "Server_B"]

# Example:
# key "user_123" → primary: Server_A → replicas: Server_C, Server_B
# Written to all three. Reads can come from any replica.
# If Server_A fails: read from Server_C or Server_B.
# When Server_A recovers: hinted handoff or anti-entropy (gossip) repairs it.
This is exactly how Cassandra works. The "replication factor" determines how many clockwise nodes store each partition. The coordinator node writes to all RF nodes in parallel.

5 Real-World Usage

SystemHow It Uses Consistent HashingKey Detail
Amazon DynamoDB Partition key → hash → ring → node Original inspiration (Dynamo paper 2007). Uses virtual nodes for load balancing.
Apache Cassandra Partition key hashed to token (64-bit int), mapped to ring Each node owns a range of tokens. VNodes (256 per node default). Replication walks clockwise.
Redis Cluster Uses "hash slots" (16,384 slots) — similar concept but fixed slots, not a ring CRC16(key) % 16384 = slot. Slots assigned to nodes. Remapping = move slot assignment.
Akamai CDN Route request to nearest edge server using consistent hashing Adding a new PoP only steals cache from adjacent PoPs on the ring.
HAProxy / nginx Consistent hashing for sticky load balancing (same client → same upstream) balance uri consistent in HAProxy. Used for upstream cache servers.
Memcached clients Client-side consistent hashing to pick which Memcached server holds each key libmemcached, Spymemcached implement consistent hashing natively.

Redis Cluster: Hash Slots vs Hash Ring

# Redis uses a variant: 16,384 fixed hash slots (not a continuous ring)
# CRC16(key) % 16384 = slot number (0–16383)
# Slots are assigned to master nodes (not dynamically, but by configuration)

# 3-node cluster example:
# Master A: slots 0–5460       (~33%)
# Master B: slots 5461–10922   (~33%)
# Master C: slots 10923–16383  (~33%)

# Resharding: to add Master D, move some slots from A, B, C to D
# Only data in the moved slots needs to migrate (keys in other slots stay put)

# Key difference from pure consistent hashing:
# Redis Cluster: slot ranges assigned manually (predictable but less flexible)
# Cassandra: token ranges assigned dynamically (more automatic, with VNodes)

6 When to Mention Consistent Hashing in Interviews

Bring up consistent hashing whenever you're:

  • Designing a distributed cache (Redis Cluster, Memcached cluster) — "I'll use consistent hashing to route requests to the right cache server so that adding a new cache node only invalidates ~1/N of keys."
  • Sharding a database — "Consistent hashing means adding a new shard only requires migrating ~1/N of data, not a full reshuffle."
  • Building a load balancer with sticky sessions — "Consistent hashing on user_id ensures the same user always routes to the same app server (maintaining in-memory session state)."
  • Designing a CDN or distributed storage — "We use consistent hashing to assign content to edge nodes so that adding a new PoP doesn't cause a global cache invalidation storm."

Quick Interview Answer Template

"To distribute [data/requests] across N [servers/shards], I'll use consistent hashing.
 Each server is mapped to multiple positions on a hash ring using virtual nodes.
 When a key comes in, we hash it to a ring position and find the first clockwise server.

 The benefit over naive modulo hashing: when we add or remove a server,
 only ~1/N of keys are remapped instead of most of them — so there's no cache
 invalidation storm or mass data migration when we scale."

Limitations of Consistent Hashing

  • Hot keys: A popular key always routes to the same server regardless of the ring. Consistent hashing distributes keys, not traffic. If 90% of reads go to one key, one server gets 90% of load — consistent hashing doesn't help. Solution: replicate hot keys to multiple servers.
  • Data skew: Even with VNodes, random hashing might not be perfectly uniform for small N servers. Monitor per-shard sizes and rebalance manually if needed.
  • Operational complexity: When a server fails, data temporarily becomes inaccessible (or routes to the next server if using replication). You need rebalancing scripts and monitoring.

7 Naive Hashing vs Consistent Hashing vs Range Sharding

ApproachKeys Remapped on ScaleHot Spot RiskComplexityUsed By
Modulo hashing (N-1)/N ≈ 90% Low Very low Single-node caches, toy systems
Consistent hashing ~1/N ≈ 10% Low (with VNodes) Medium Cassandra, DynamoDB, Redis Cluster, CDNs
Range sharding Minimal (explicit) High (sequential keys) Low HBase, Bigtable, DynamoDB (sort key ranges)
Directory sharding Zero (lookup table) Low High (lookup is a SPOF) Custom sharding in legacy systems
Rule of thumb: Use consistent hashing when you need to scale the cluster elastically without downtime and without remapping most of your data. Use range sharding when you need efficient range queries (e.g., "give me all users with IDs 1000–2000").

What to Study Next