How I cracked the system design interview with simple core concepts

If you’ve never designed a system before, this is the place to start

Kiran Suvarna

If you’re preparing for a system design interview, or you’re a software engineer with a full-time 9-to-5 job who wants to start a side hustle or a weekend project, or you already have one that’s getting traction and want to scale it, fasten your seatbelt and read on.

Honestly, designing a system is very challenging; at big companies it takes years. But it all starts with a single step. As Will Smith famously said,

You don’t set out to build a wall. You don’t say I’m going to build the biggest, baddest, greatest wall that’s ever been built. You don’t start there. You say I’m gonna lay this brick as perfectly as a brick can be laid, and you do that every single day, and soon you have a wall.

Single server setup

Let’s start with something simple where your web application, cache, and database are running on a single server.

Single server setup

Separate Database

Once your user base grows, a single server is no longer enough. We can add a second server dedicated to the database. Separating the web tier from the data tier allows us to scale them independently.

Separate database setup

Now comes the question: which database should you use? A traditional relational database or a non-relational one? It’s a hotly debated topic on the internet, but there is no definitive answer; it depends on what kind of data you are storing. Both store data, but in different ways.

A relational database is a good fit if:

  • Your application needs complex transactions and data integrity
  • You need to ensure ACID compliance (see the transaction sketch after these lists)
  • You don’t anticipate a lot of feature updates

A non-relational database is a good fit if:

  • Your application needs a flexible schema, or no predefined schema at all
  • Your application needs constant feature additions
  • You are not concerned about 100% data consistency
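
To make the transactions point concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The accounts table, the transfer function, and the insufficient-funds rule are all made up for illustration; the point is that the two balance updates either both apply or both roll back.

```python
import sqlite3

# Hypothetical accounts table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts: either both updates apply or neither does."""
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute("SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if balance < 0:  # integrity rule: no negative balances
                raise ValueError("insufficient funds")
    except ValueError:
        pass  # the rollback has already restored both rows

transfer(conn, 1, 2, 30)   # succeeds: balances become 70 and 80
transfer(conn, 1, 2, 999)  # fails: rolled back, balances stay 70 and 80
```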

Scaling the Web Servers

When demand for your application skyrockets, you’ll realize the importance of the accessibility, availability, and capacity of your application. You want to scale the system, and you have two options: vertical scaling (scale-up) and horizontal scaling (scale-out). The difference between the two lies in how computing resources are added to your system.

Vertical scaling increases the computational power of your existing instances. It’s like replacing your old car with a Ferrari because you want more horsepower.

There is a hard limit to vertical scaling: it is not practical to add unlimited CPU and RAM to a single server, and if that server goes down, the entire application goes down with it.

Horizontal scaling adds capacity by adding more instances to the system and distributing the processing and memory across them. To continue the analogy: you don’t ditch your old car for a Ferrari; you keep the old car and add a Ferrari to your garage.

Load Balancing

To handle the increased load, we’ve added more servers, and we need a load balancer to distribute requests among them. A load balancer keeps any single server from getting overloaded and slowing down or dropping requests.

Load balancer setup

This is how a load balancer works:

  • A client (a browser or another application) makes a request and attempts to connect to a server.
  • The load balancer accepts the request and forwards it to one of the servers in the server group, according to a predefined algorithm.
  • The chosen server accepts the connection and responds to the client through the load balancer.

One important thing to notice here is that clients interact with the system using public IPs, while the load balancer and the servers communicate over private IPs for increased security.
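
As a concrete example of one such predefined algorithm, here is a minimal round-robin sketch in Python: the balancer simply hands each incoming request to the next server in rotation. The private IP addresses are invented for illustration.

```python
import itertools

class RoundRobinBalancer:
    """Distributes incoming requests across a pool of servers in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)  # endless round-robin rotation

    def route(self, request):
        server = next(self._cycle)  # pick the next server in the rotation
        return f"{server} handles {request}"

# Only the balancer talks to the servers, over private IPs.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
for i in range(5):
    print(balancer.route(f"request-{i}"))
```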

Database Replication (Master-Slave)

Assume your application has grown large and your server crashes for some reason. After a few debugging sessions, you discover that the problem is caused by heavy read/write load on your database server. In such cases, a master-slave database design is recommended. This design distributes the load across multiple nodes and enables your application to scale out to meet your growing user base.

Database replication (Master-Slave)

A master database supports only write and update operations, while slave databases receive copies of the master’s data and support only read operations.

If one of the slave databases crashes, incoming read requests are temporarily forwarded to a healthy slave database or to the master. If the master database goes down, a slave database is promoted to be the new master, and incoming write requests are forwarded to it.
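
A minimal sketch of that routing logic might look like the following; the host names and health flags are hypothetical, and a real system would add health checks, replication-lag handling, and automated failover.

```python
import random

class ReplicatedDatabase:
    """Routes writes to the master and reads to a healthy slave (sketch only)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves  # list of {"host": ..., "healthy": ...} dicts

    def write(self, query):
        return f"{self.master} executes write: {query}"

    def read(self, query):
        healthy = [s for s in self.slaves if s["healthy"]]
        # If every slave is down, fall back to reading from the master.
        host = random.choice(healthy)["host"] if healthy else self.master
        return f"{host} executes read: {query}"

    def promote_slave(self):
        """If the master goes down, promote a slave to be the new master."""
        self.master = self.slaves.pop(0)["host"]

db = ReplicatedDatabase("master-db", [{"host": "slave-1", "healthy": True},
                                      {"host": "slave-2", "healthy": True}])
print(db.write("INSERT INTO users ..."))
print(db.read("SELECT * FROM users"))
```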

Cache data as much as you can

A large-scale application experiences high traffic, which can gradually degrade performance; caching data addresses this. Caching can be done at different layers: client-side, web application, database, and DNS. Here we are interested in the database layer, but the purpose is the same at every layer.

A cache is a high-speed, temporary storage area that stores frequently accessed data. It can hold previously retrieved or expensively computed responses so that future requests for the same data can be served faster than from the primary database. The key advantages of caching are that it reduces database workload, lowers costs, and improves overall application performance.

There are several caching strategies, and each has a distinct influence on your system and its performance. A caching strategy defines the interaction between the data source and the caching system, as well as how your data is retrieved. Consider how your data will be accessed so you can choose the technique that fits best. Some of the more popular strategies are listed below:

  • Cache aside
  • Read through
  • Write Through
  • Write back
  • Write Around

The diagram below illustrates a simple read-through caching strategy. In an upcoming post, I’ll go through all of these caching strategies in detail.

Read-through caching strategy
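
Until then, here is a minimal Python sketch of the read-through idea: on a cache miss, the cache itself loads the value from the database and keeps a copy for future requests. The in-memory dict stands in for a real cache such as Redis, and the database lookup is invented.

```python
class ReadThroughCache:
    """On a miss, the cache itself fetches from the database and stores the result."""

    def __init__(self, load_from_db):
        self._store = {}                 # stand-in for Redis/Memcached
        self._load_from_db = load_from_db

    def get(self, key):
        if key in self._store:
            return self._store[key]      # cache hit: the database is never touched
        value = self._load_from_db(key)  # cache miss: read through to the database
        self._store[key] = value         # keep a copy for the next request
        return value

# Hypothetical "database" standing in for a real query.
database = {"user:1": {"name": "Alice"}, "user:2": {"name": "Bob"}}
cache = ReadThroughCache(lambda key: database[key])

print(cache.get("user:1"))  # miss: loaded from the database
print(cache.get("user:1"))  # hit: served straight from the cache
```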

Use Content Delivery Network (CDN) for static assets

Your application may be hosted in a single location, but when users from other countries access it, the content, such as images, videos, CSS, and JavaScript files, must travel around the world, which can add significant latency. We can use a CDN to address this issue.

A CDN is a network of servers distributed across the globe. It delivers content to users more quickly and efficiently based on their geographic location.

Support Multiple Geo-Diverse Data Centers

You can lower latency and further improve your application’s overall performance by forwarding each user’s request to the nearest data center. Moreover, if a disaster such as a flood or an earthquake strikes, only the data center closest to it is affected; the other data centers continue to hold the data for emergency access.

The geo-diverse strategy reduces the distance that your data must travel and, as a result, the time it takes for the data to reach its destination.

Multiple Geo-Diverse Data centers

The figure above shows how geo-diverse data centers work, with two data centers in different locations. Users are geo-routed to the closest one.
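
As a toy sketch of that routing decision, assuming we already know the user’s region (real deployments usually rely on GeoDNS or anycast, and the data center names here are invented):

```python
# Hypothetical region-to-data-center mapping.
NEAREST_DATA_CENTER = {
    "us": "dc-us-east",
    "eu": "dc-eu-west",
    "asia": "dc-ap-south",
}

def geo_route(user_region, default="dc-us-east"):
    """Send the user to the data center closest to their region."""
    return NEAREST_DATA_CENTER.get(user_region, default)

print(geo_route("eu"))     # dc-eu-west
print(geo_route("other"))  # unknown region: falls back to the default
```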

Use of Message Queues

You can decouple your application into small, independent building blocks, which makes them easier to design, deploy, and manage. Message queues let these distributed components communicate effectively.

The core concept of a message queue is simple: input services, known as producers or publishers, create messages and publish them to the queue. Other services or servers, known as consumers or subscribers, connect to the queue and carry out the tasks specified by the messages. The figure below shows the basic architecture of a message queue.

Source: https://www.cloudamqp.com/blog/what-is-message-queuing.html

One common use case for message queues is to act as a middleman between microservices.

Another use case: users upload data to a website through an application. The website processes the data, generates an Excel file, and emails a copy to the user. Processing the data, generating the file, and sending the email take several seconds, so this work is a good candidate for a message queue.
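
Here is a minimal single-process sketch of that producer/consumer flow, using Python’s built-in queue module as a stand-in for a real broker such as RabbitMQ or SQS. In production, the consumer would run as a separate worker service rather than a thread.

```python
import queue
import threading

task_queue = queue.Queue()  # stand-in for a real message broker

def producer():
    """The web tier publishes the slow work instead of doing it inline."""
    for user in ["alice@example.com", "bob@example.com"]:
        task_queue.put({"action": "export_and_email", "to": user})
    task_queue.put(None)  # sentinel: no more work

def consumer():
    """A worker consumes messages and does the slow work asynchronously."""
    while True:
        message = task_queue.get()
        if message is None:
            break
        print(f"building spreadsheet and emailing {message['to']}")

threading.Thread(target=producer).start()
consumer()
```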

Database Scaling

Our database becomes more and more overburdened as the amount of data grows daily, so the data layer needs to be scaled too. Database scaling can be divided into two categories: vertical scaling and horizontal scaling.

Vertical Scaling (Scaling-up):

Assume you have a database server with 5 GB of RAM that has reached its limit. To manage the additional data, you purchase a costly server with 2 TB of RAM. Your server is now capable of handling massive volumes of data.

Vertical scaling, however, has several drawbacks:

  • Powerful servers for vertical scaling are costly.
  • Hardware imposes hard limits on how far a single machine can grow.
  • It leaves you with a single point of failure.

Horizontal Scaling (Sharding):

The below figure shows the difference between vertical and horizontal scaling.

Source: https://www.section.io/blog/scaling-horizontally-vs-vertically/

Horizontal scaling, as seen in the figure, expands the database by adding more servers. It splits the data and distributes it across multiple servers, known as shards. Each shard is a separate database with the same schema, but the actual data on each shard is unique to that shard.

Consider the figure below. Users are stored in a database that is divided into shards according to user IDs. A hash function routes each incoming request to the appropriate shard; here we use user_id % 4 across four shards. If the hash function yields 0, data is stored on and retrieved from shard 0; if it yields 1, shard 1 is used, and so on, as sketched in code below the figure.

Database shards
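
That hash-based routing fits in a few lines of Python; here the in-memory dicts stand in for four separate shard databases.

```python
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one shard

def shard_for(user_id):
    """The hash function from the example: user_id % 4 picks the shard."""
    return user_id % NUM_SHARDS

def save_user(user_id, data):
    shards[shard_for(user_id)][user_id] = data

def load_user(user_id):
    return shards[shard_for(user_id)].get(user_id)

save_user(7, {"name": "Alice"})   # 7 % 4 == 3, stored on shard 3
save_user(12, {"name": "Bob"})    # 12 % 4 == 0, stored on shard 0
print(load_user(7), load_user(12))
```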

Sharding is a fantastic approach, but it introduces new challenges of its own:

  • It is difficult to carry out join operations across database shards.
  • Some shards can become overloaded faster than others because of uneven data distribution. This can be addressed by changing the shard function and redistributing data across the shards.

Stateless over Stateful

An application can be stateful or stateless, depending on how, and for how long, it stores the state of its interaction with users.

A stateful application uses a database like any other application, but it also stores state, such as user authentication and past requests, on the server itself, which makes it faster. It tracks whether each user is connected or disconnected, remembers their previous requests, and uses all of this information as context when processing subsequent requests.

Here’s a simple example: a stateful application is like visiting your local bank, where you’ve been a customer for many years. Everyone at the bank knows who you are, so you don’t need to present your ID before making a transaction. The bank also keeps your previous transactions on file. You simply walk up, ask about your balance, and tell them what transactions you want to make. This approach is quick and efficient, as long as the bank can handle the number of clients trying to transact at the same time.

Now imagine a queue of 1,000 people waiting to make a transaction at that same branch, the only one in town. There will be a long wait, since the bank cannot handle this many requests at once. You can’t just switch to a different bank, because they won’t know who you are and won’t have access to your account history. The fact that you must keep visiting the same branch, even when it is overcrowded and unable to complete your transactions, is a scalability problem.

We can address this issue with a stateless application. A stateless application keeps no data about past transactions on its server. Statelessness does not mean the absence of state; it means the state is stored somewhere other than the server, moved out to the clients and/or a database. Stateless applications require clients to carry their own state and context for later transactions. As a result, stateless servers and their clients communicate more independently, which makes the system far easier to scale.
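
The contrast is easy to show in code. In the stateful sketch below, the session lives in one server’s memory, so the user is tied to that server; in the stateless version, the server keeps nothing and the session lives in a shared store (a stand-in for something like Redis) that any server in the fleet can reach.

```python
# Stateful: the session lives in this server's memory, so the same user
# must always hit the same server (the "same bank branch" problem).
local_sessions = {}

def handle_request_stateful(user_id, request):
    session = local_sessions.setdefault(user_id, {"history": []})
    session["history"].append(request)
    return f"served {request!r} with context {session['history']}"

# Stateless: the server keeps nothing between requests; session state
# lives in a shared store that every server in the fleet can reach.
shared_store = {}  # stand-in for an external store such as Redis

def handle_request_stateless(user_id, request):
    session = shared_store.get(user_id, {"history": []})
    session["history"].append(request)
    shared_store[user_id] = session  # write the state back out
    return f"served {request!r} with context {session['history']}"

print(handle_request_stateless(1, "check balance"))
print(handle_request_stateless(1, "transfer $30"))  # any server could serve this
```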

Logging and Monitoring your system

Logging records the events that happen in your application, and error monitoring helps you identify issues and tells you whether your application is running at all. The two work together like bread and butter.
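
As a small illustration, here is a minimal setup using Python’s standard logging module: routine events are logged at INFO level, and failures are recorded with full tracebacks, which is exactly the kind of record an error-monitoring tool alerts on. The handler function is made up.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("my-app")

def handle_request(user_id):
    log.info("request received for user %s", user_id)
    try:
        if user_id < 0:
            raise ValueError("invalid user id")
        log.info("request for user %s succeeded", user_id)
    except ValueError:
        # log.exception records the message plus the full traceback.
        log.exception("request for user %s failed", user_id)

handle_request(42)
handle_request(-1)
```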

We’ve discussed each concept only briefly; dig into each one in depth to fine-tune your system. Now, pat yourself on the back for learning the basic concepts and scalable components of system design.
