Consistent Hashing

Prerequisite

Hashing and the need for consistent hashing

Introduction

The standard hashing algorithm is inefficient while processing network queries. This traditional technique presupposes that we have a set number of servers and that all mapping locations are known in advance. This scenario is particularly troublesome in distributed systems when several users request multiple servers. If some servers fail, then mapping the work to new servers involves a huge and intensive calculation, which is wasteful and reduces the service's throughput while increasing the application's latency.

Consistent Hashing

Consistent Hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table by assigning them a position on an abstract circle, or hash ring. This allows servers and things to scale without affecting the overall system. So the hash table is independent of the number of servers available to minimize key relocation when scale changes occur.

Assumptions for understanding the concept

We are considering 4 servers Server 0, Server 1, Server 2, and Server 3 with the hash value of 0, 90, 180, and 270 respectively.
We will be using 360 to find the position of the server in the ring. In the real world, it can be any number.

Explanation

Consider a circle or a ring, which will have 360 deg. Now we can place the servers based on their hash value modulo 360. So,

Server 0 will be placed at hash(server 0) % 360 = 0 % 360 = 0
Server 1 will be placed at hash(server 1) % 360 = 90 % 360 = 90
Server 2 will be placed at hash(server 2) % 360 = 180 % 360 = 180
Server 3 will be placed at hash(server 3) % 360 = 270 % 360 = 270

Now the ring will look like this,

consistent hashing

Scenario 1 -> A Request coming in

Now considering a request R1 is coming, now we need to redirect this request to one of the available servers. The hash of the request R1 is 40. Visually we can see 40 is in between server 0 and server 1. (Technically we need to apply binary search). The next available server to 40 (hash of R1) is Server 1. So we can redirect this request to server 1.

Now considering a request R2 is coming, now we need to redirect this request to one of the available servers. The hash of the request R2 is 60. Visually we can see 60 is in between server 0 and server 1. (Technically we need to apply binary search). The next available server to 60 (hash of R2) is Server 1. So we can redirect this request to server 1.

consistent hashing

Scenario 2 -> Adding a new Server

Now we are going to add a new server S4. Assume the hash value of this server S4 is 50, it will be placed in between S0 and S1.

consistent hashing

But we also need to check on one thing, (ie), in Scenario 1 we discussed about R1 (with the hash value of 40) is been placed in Server S1. Since we are going to introduce S4, the Request R1 (hash value of 40) needs to be placed in Server S4 - based on binary insertion.

consistent hashing

Server 4 will need to query the next available server, whether any request less than the hash value is present. If it's present we need to move the data related to the request.

consistent hashing

So this is how it goes,

S4 will query S1, Do you have any requests with my range?
If there are any requests they would be shared from S1 to S4.

After data transfer,

consistent hashing

Scenario 3 -> Deleting a server.

Currently, the hash ring with servers will look like this,

consistent hashing

Server 1 is to be deleted and the contents of the server will be moved to the next available server; in our case it's S2.

consistent hashing

Now we can see the request details of R2 (with the hash value of 60) are present in the S1, So this value is to be moved to S2.

After deletion,

consistent hashing

Final Thoughts

This is how consistent hashing works at a basic level. In this blog post, we haven't considered the scenarios of a server crash. That's a whole different topic.

Subscribe to my newsletter.

Consistent Hashing

Hash methodology, independent of the number of servers or objects in a distributed hash table

Table of contents

Prerequisite

Introduction

Consistent Hashing

Assumptions for understanding the concept

Explanation

Scenario 1 -> A Request coming in

Scenario 2 -> Adding a new Server

Scenario 3 -> Deleting a server.

Final Thoughts

Consistent Hashing

Hash methodology, independent of the number of servers or objects in a distributed hash table

Table of contents

Prerequisite

Introduction

Consistent Hashing

Assumptions for understanding the concept

Explanation

Scenario 1 -> A Request coming in

Scenario 2 -> Adding a new Server

Scenario 3 -> Deleting a server.

Final Thoughts

Did you find this article valuable?