Member-only story
Replication & Partitioning in Distributed Systems
Part II of Series: Data-Intensive Applications
To build modern Data-Intensive applications, it’s almost a mandatory requirement for these applications to be distributed. And in every distributed system, data replication and partitioning play vital roles in its design.
Replication means keeping a copy of a dataset across multiple machines, connected via a network usually.
There are many reasons why you’d want to do this:
- Keeping data geographically close to your users to minimise latency.
- Increase availability and fault-tolerance in the case of certain nodes failure.
- Increase throughput and performance by allowing multiple nodes to spread the load and deal with many more requests as a whole.
Partitioning on the other hand means splitting your data across different nodes. The main reason for partitioning is scalability. When we have big datasets that cannot physically be in the same machine, or it’s just inefficient to do so, we would usually partition it.
A simple use case for partitioning is splitting data geographically, where only users of a certain region need data for their operations, for example an order’s system where Asian orders are only needed in Asia versus European orders only needed in that other region.
Replication and partitioning are not mutually exclusive. In fact, they are usually combined to achieve better performance, where copies or smaller parts of a big dataset is stored and replicated across different nodes. This means that portions of the data are stored across multiple places, which in turn increases fault-tolerance.
Replication Strategies
If the data you are storing doesn’t ever change, you don’t really need replication strategies, you just store it in different places and forget about it. But as you can imagine, that’s rarely the case.
The main challenge for replication is dealing with keeping data up-to-date when changes do happen. For this, there are many popular strategies:
Single Leader
As you can imagine by the name, this strategy consists of having a single leader and multiple followers handling the replication. Each node will store a replica of the data, and one of the nodes is elected the…