
An interesting observation I had after talking to 31+ developers:
Most know about Active-Passive DB replication.
But just 15% knew about Active-Active and Multi-Active.
Let’s change that today.
[1] Active-Passive
Yes, it’s the first replication setup most developers come across.
And most of them stay with it forever.
Why?
It’s simple and easy to understand.
A standby node is always ready to go in case the active node goes down.
Data is continuously replicated to the passive node.
It’s great but has a few downsides:
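If you replicate asynchronously, the passive node can lag behind the active one, so a failover may lose the most recent writes.
If you go for synchronous replication, every write waits for the passive node, so you end up sacrificing write availability if the passive node goes down.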
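Here’s a rough sketch of that trade-off in Python. It’s a toy in-memory model, not how any particular database does it: the Node and ActivePassive classes, the backlog list, and ship_backlog are names made up for illustration.

```python
class Node:
    """Toy in-memory node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True

    def apply(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value


class ActivePassive:
    def __init__(self, synchronous):
        self.active = Node("active")
        self.passive = Node("passive")
        self.synchronous = synchronous
        self.backlog = []  # changes not yet shipped to the passive node

    def write(self, key, value):
        if self.synchronous:
            # Sync: the write succeeds only if the passive node accepts it too,
            # so a passive outage makes the write fail.
            self.passive.apply(key, value)
            self.active.apply(key, value)
        else:
            # Async: acknowledge right away and replicate later,
            # so the passive node may be behind.
            self.active.apply(key, value)
            self.backlog.append((key, value))

    def ship_backlog(self):
        """Replay pending changes on the passive node (async mode)."""
        while self.backlog:
            key, value = self.backlog[0]
            self.passive.apply(key, value)
            self.backlog.pop(0)
```

In async mode, anything still sitting in the backlog when the active node dies is what you lose on failover.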
[2] Active-Active
Scaling reads is relatively easy. The trouble starts when you want to scale the writes!
Why?
Because with reads, you just replicate the data to multiple nodes.
The Primary node handles the writes and the Replicas handle the reads.
Sure, there might be replication lag but you can deal with that.
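The read/write split can be sketched as a tiny router. This is a minimal illustration, assuming a single primary and simple round-robin over replicas; ReadWriteRouter and the node names are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin is an assumption here; real setups may weigh by load or lag.
        self._replicas = itertools.cycle(replicas)

    def route(self, operation):
        if operation == "write":
            return self.primary
        return next(self._replicas)


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
```

Adding more replicas gives you more read capacity, but every write still lands on the one primary.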
But scaling writes is a different game.
To scale writes, you allow multiple nodes to handle the write requests.
This is also known as an Active-Active setup.
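But this arrangement can result in conflicts when two nodes change the same data at the same time, so you need some means of conflict resolution.
For example, last write wins (keep the version with the latest timestamp) or vector clocks (detect which writes were concurrent so they can be merged).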
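To make those two ideas concrete, here’s a minimal sketch. It assumes each version is a (timestamp, value) tuple for last write wins, and that a vector clock is a plain dict of node name to counter. The function names are made up for this example.

```python
def last_write_wins(a, b):
    """Each version is a (timestamp, value) tuple; keep the newer one."""
    return a if a[0] >= b[0] else b


def compare_clocks(vc_a, vc_b):
    """Compare two vector clocks (dicts of node name -> counter)."""
    nodes = set(vc_a) | set(vc_b)
    a_ahead = any(vc_a.get(n, 0) > vc_b.get(n, 0) for n in nodes)
    b_ahead = any(vc_b.get(n, 0) > vc_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither write saw the other: a real conflict
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"
```

Note the difference: last write wins silently drops the older version, while the vector clock only tells you that two writes were concurrent. What to do with them is still up to you.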
[3] Multi-Active
What if databases also became democratic?
That’s the multi-active setup.
All replicas can handle read as well as write requests. But everything works on the basis of consensus.
For example, if there are 3 replicas, a change is committed once a majority (2 of 3) acknowledge it.
So a minority of nodes can fail without compromising availability.
The cluster stops accepting changes only when it can no longer reach a majority.
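The majority rule itself is just arithmetic. Here’s a small sketch of it, not a real consensus protocol (those also handle leader election, logs, and retries). quorum_size and try_commit are hypothetical names.

```python
def quorum_size(replica_count):
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def try_commit(acks):
    """acks: one boolean per replica, True if it acknowledged the change."""
    return sum(acks) >= quorum_size(len(acks))


# 3 replicas, one is down: still a majority, so the change commits.
one_down = try_commit([True, True, False])

# 3 replicas, two are down: no majority, so the change is rejected.
two_down = try_commit([True, False, False])
```

With 3 replicas you can lose 1 node, with 5 you can lose 2.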
So - which setup have you used the most often?
And would you like to add more details to any one of them?









