Master System Design Interviews with the Top 7 Most Asked Questions & Expert Answers

Maximize Your Preparation: Get the Expert Answers to the Top 7 Most Crucial System Design Questions

Arslan Ahmad
Level Up Coding

--

System design interviews are a crucial part of tech interviews. They allow interviewers to assess a candidate’s ability to design scalable and complex systems, which are the backbone of any successful tech company.

This guide covers the essential topics you need to prepare for and excel in system design interviews. Let’s start with the most common system design questions.

The Most Common System Design Interview Questions

Here are the 7 most frequently asked system design interview questions, listed roughly in order of how often they come up:

1. Design Facebook Messenger

Design an instant messaging service like Facebook Messenger where users can send text messages to each other through web and mobile interfaces.

Functional Requirements:

  1. Messenger should support one-on-one conversations between users.
  2. Messenger should keep track of the online/offline statuses of its users.
  3. Messenger should support the persistent storage of chat history.

Non-functional Requirements:

  1. Users should have a real-time chatting experience with minimum latency.
  2. Our system should be highly consistent; users should see the same chat history on all their devices.
  3. Messenger’s high availability is desirable; we can tolerate lower availability in the interest of consistency.

High-level solution

At a high level, we will need a chat server that will be the central piece orchestrating all the communications between users. For example, when a user wants to send a message to another user, they will connect to the chat server and send the message to the server; the server then passes that message to the other user and also stores it in the database.

The detailed workflow would look like this:

  1. User-A sends a message to User-B through the chat server.
  2. The server receives the message and sends an acknowledgment to User-A.
  3. The server stores the message in its database and sends the message to User-B.
  4. User-B receives the message and sends the acknowledgment to the server.
  5. The server notifies User-A that the message has been delivered successfully to User-B.
Request flow for sending a message
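
The five steps above can be sketched in a few lines of Python. This is only a minimal, in-memory illustration: the `ChatServer` class, its fields, and the event strings are hypothetical names chosen for this example, and the recipient's acknowledgment is simulated as immediate rather than arriving over a network.

```python
from dataclasses import dataclass, field


@dataclass
class ChatServer:
    """In-memory stand-in for the chat server and its database (hypothetical)."""
    history: list = field(default_factory=list)   # persisted messages
    inboxes: dict = field(default_factory=dict)   # user -> delivered messages
    events: list = field(default_factory=list)    # log of what the server did

    def send(self, sender: str, recipient: str, text: str) -> None:
        # Step 2: acknowledge receipt to the sender.
        self.events.append(f"ack to {sender}")
        # Step 3: store the message, then forward it to the recipient.
        self.history.append((sender, recipient, text))
        self.inboxes.setdefault(recipient, []).append((sender, text))
        self.events.append(f"delivered to {recipient}")
        # Step 4: the recipient acknowledges (simulated here as immediate).
        self.events.append(f"ack from {recipient}")
        # Step 5: tell the sender that delivery succeeded.
        self.events.append(f"delivery receipt to {sender}")


server = ChatServer()
server.send("User-A", "User-B", "hello")  # Step 1
```

After the call, `server.events` holds the four server-side actions in the same order as steps 2 to 5, and `server.history` contains the stored message.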

Learn more on Designing Facebook Messenger or see the following video.

2. Design Instagram

Design a photo-sharing service like Instagram, where users can upload photos to share them with other users.

Functional Requirements

  1. Users should be able to upload/download/view photos.
  2. Users can perform searches based on photo/video titles.
  3. Users can follow other users.
  4. The system should generate and display a user’s News Feed consisting of top photos from all the people the user follows.

Non-functional Requirements

  1. Our service needs to be highly available.
  2. The acceptable latency of the system is 200ms for News Feed generation.
  3. Consistency can take a hit in the interest of availability; if a user doesn’t see a photo for a while, that should be fine.
  4. The system should be highly reliable; any uploaded photo or video should never be lost.

High-level solution

At a high level, we need to support two scenarios: one to upload photos and the other to view/search photos. Our service would need some object storage servers to store photos and some database servers to store metadata about the photos.
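
To make the split between object storage and metadata concrete, here is a rough sketch in which two Python dictionaries stand in for the two kinds of servers. The function names (`upload_photo`, `search_by_title`, `view_photo`) and the metadata fields are assumptions made for this illustration, not part of any real API.

```python
import uuid

object_store = {}   # photo_id -> raw bytes (stand-in for object storage)
metadata_db = {}    # photo_id -> metadata record (stand-in for the database)


def upload_photo(user_id: str, title: str, data: bytes) -> str:
    """Upload path: bytes go to object storage, metadata goes to the database."""
    photo_id = uuid.uuid4().hex
    object_store[photo_id] = data
    metadata_db[photo_id] = {"user_id": user_id, "title": title, "size": len(data)}
    return photo_id


def search_by_title(term: str) -> list:
    """Search path: only the metadata store is consulted."""
    term = term.lower()
    return [pid for pid, meta in metadata_db.items() if term in meta["title"].lower()]


def view_photo(photo_id: str) -> bytes:
    """View path: fetch the raw bytes from object storage."""
    return object_store[photo_id]


photo_id = upload_photo("user-1", "Sunset at the beach", b"\x89PNG")
```

Keeping the two stores separate means a title search never has to touch the (much larger) photo data.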

Learn more on Designing Instagram or see the following video.

3. Design Facebook’s Newsfeed

Design Facebook’s Newsfeed, which would contain posts, photos, videos, and status updates from all the people and pages a user follows.

Functional requirements:

  1. Newsfeed will be generated based on the posts from the people, pages, and groups that a user follows.
  2. A user may have many friends and follow a large number of pages/groups.
  3. Feeds may contain images, videos, or just text.
  4. Our service should support appending new posts as they arrive to the newsfeed for all active users.

Non-functional requirements:

  1. Our system should be able to generate any user’s newsfeed in real-time — maximum latency seen by the end user would be 2s.
  2. A post shouldn’t take more than 5s to make it to a user’s feed assuming a new newsfeed request comes in.

High-level solution

At a high level, we will need the following components in our Newsfeed service:

  1. Web servers: To maintain a connection with the user. This connection will be used to transfer data between the user and the server.
  2. Application server: To execute the workflows of storing new posts in the database servers. We will also need some application servers to retrieve and to push the newsfeed to the end user.
  3. Metadata database and cache: To store the metadata about Users, Pages, and Groups.
  4. Posts database and cache: To store metadata about posts and their contents.
  5. Video and photo storage, and cache: Blob storage, to store all the media included in the posts.
  6. Newsfeed generation service: To gather and rank all the relevant posts for a user to generate newsfeed and store in the cache. This service will also receive live updates and will add these newer feed items to any user’s timeline.
  7. Feed notification service: To notify the user that there are newer items available for their newsfeed.

Following is the high-level architecture diagram of our system. User B and C are following User A.

Facebook Newsfeed Architecture
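
As a rough illustration of what the newsfeed generation service does, the sketch below gathers posts from everyone a user follows and ranks them. Ranking purely by recency is an assumption made to keep the example short, and the names `follows`, `posts_by_author`, `add_post`, and `generate_feed` are hypothetical.

```python
import heapq
from collections import defaultdict

follows = defaultdict(set)            # follower -> set of followed users/pages
posts_by_author = defaultdict(list)   # author -> list of (timestamp, text)


def add_post(author, timestamp, text):
    posts_by_author[author].append((timestamp, text))


def generate_feed(user, limit=10):
    """Gather posts from everyone the user follows and keep the newest ones."""
    candidates = (
        (ts, author, text)
        for author in follows[user]
        for ts, text in posts_by_author[author]
    )
    # Ranking by recency only; a real ranker would use more signals.
    return heapq.nlargest(limit, candidates)


# User B and User C follow User A, as in the diagram.
follows["User-B"].add("User-A")
follows["User-C"].add("User-A")
add_post("User-A", 1, "first post")
add_post("User-A", 2, "second post")
```

With this data, `generate_feed("User-B")` returns User A's two posts, newest first. In the architecture above, the result would be stored in the cache rather than recomputed on every request.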

Learn more on Designing Facebook’s Newsfeed or see the following video.

4. Design Twitter Search

Twitter is one of the largest social networking services, where users can share photos, news, and text-based messages. Let’s design a service that can store and search user tweets.

Requirements

  • Let’s assume Twitter has 1.5 billion total users with 800 million daily active users.
  • On average Twitter gets 400 million tweets every day.
  • The average size of a tweet is 300 bytes.
  • Let’s assume there will be 500M searches every day.
  • The search query will consist of multiple words combined with AND/OR.
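
These assumptions are enough for a quick back-of-the-envelope estimate. The short calculation below uses only the numbers listed above; the variable names are just for illustration, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
TWEETS_PER_DAY = 400_000_000
TWEET_SIZE_BYTES = 300
SEARCHES_PER_DAY = 500_000_000
SECONDS_PER_DAY = 24 * 60 * 60

# New tweet data written per day, in GB.
daily_storage_gb = TWEETS_PER_DAY * TWEET_SIZE_BYTES / 1e9

# Average write bandwidth for tweet text, in MB per second.
storage_per_second_mb = TWEETS_PER_DAY * TWEET_SIZE_BYTES / SECONDS_PER_DAY / 1e6

# Average search load, in queries per second.
searches_per_second = SEARCHES_PER_DAY / SECONDS_PER_DAY

print(f"{daily_storage_gb:.0f} GB/day, "
      f"{storage_per_second_mb:.2f} MB/s, "
      f"{searches_per_second:.0f} searches/s")
```

That works out to about 120 GB of new tweet text per day, roughly 1.39 MB/s of writes, and on the order of 5,800 searches per second on average. Peak load would be higher than these averages.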

At a high level, we need to store all the tweets in a database and also build an index that keeps track of which word appears in which tweet. This index will help us quickly find the tweets that users are searching for.

High-level design for Twitter search
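
The word-to-tweet index described above is commonly called an inverted index. Here is a minimal single-machine sketch of one, assuming whitespace tokenization and lowercase matching; the names `index`, `add_tweet`, and `search` are hypothetical, and a real deployment would shard the index across many servers.

```python
from collections import defaultdict

index = defaultdict(set)   # word -> set of tweet IDs containing it
tweets = {}                # tweet ID -> text


def add_tweet(tweet_id, text):
    """Store the tweet and record each of its words in the index."""
    tweets[tweet_id] = text
    for word in text.lower().split():
        index[word].add(tweet_id)


def search(words, mode="AND"):
    """Return IDs of tweets containing all (AND) or any (OR) of the words."""
    postings = [index.get(w.lower(), set()) for w in words]
    if not postings:
        return set()
    if mode == "AND":
        return set.intersection(*postings)
    return set.union(*postings)


add_tweet(1, "system design interview")
add_tweet(2, "design patterns")
add_tweet(3, "coding interview")
```

Here `search(["design", "interview"])` matches only tweet 1, while `search(["design", "interview"], mode="OR")` matches all three.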

Learn more on Designing Twitter Search or see the following video.

5. Design Dropbox or Google Drive

Design a file hosting service like Dropbox or Google Drive. Cloud file storage enables users to store their data on remote servers. Usually, these servers are maintained by cloud storage providers and made available to users over a network (typically through the Internet). Users pay for their cloud data storage on a monthly basis.

Requirements

  1. Users should be able to upload and download their files/photos from any device.
  2. Users should be able to share files or folders with other users.
  3. Our service should support automatic synchronization between devices, i.e., after updating a file on one device, it should get synchronized on all devices.
  4. The system should support storing large files up to a GB.
  5. ACID-ity is required. Atomicity, Consistency, Isolation, and Durability of all file operations should be guaranteed.
  6. Our system should support offline editing. Users should be able to add/delete/modify files while offline, and as soon as they come online, all their changes should be synced to the remote servers and other online devices.

Extended Requirements

  • The system should support snapshotting of the data, so that users can go back to any version of the files.

High-level solution

The user will specify a folder as the workspace on their device. Any file/photo/folder placed in this folder will be uploaded to the cloud, and whenever a file is modified or deleted, it will be reflected in the same way in the cloud storage. The user can specify similar workspaces on all their devices and any modification done on one device will be propagated to all other devices to have the same view of the workspace everywhere.

At a high level, we need to store files and their metadata information like File Name, File Size, Directory, etc., and who this file is shared with. So, we need some servers that can help the clients to upload/download files to Cloud Storage and some servers that can facilitate updating metadata about files and users. We also need some mechanism to notify all clients whenever an update happens so they can synchronize their files.

As shown in the diagram below, Block servers will work with the clients to upload/download files from cloud storage and Metadata servers will keep metadata of files updated in a SQL or NoSQL database. Synchronization servers will handle the workflow of notifying all clients about different changes for synchronization.

High-level design for Dropbox
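
One way a client and the Block servers could avoid re-uploading a whole file after a small edit is to split files into fixed-size blocks and compare block hashes. The article does not spell out this mechanism, so treat the sketch below as an assumption; the block size and the function names are illustrative only.

```python
import hashlib

BLOCK_SIZE = 4  # tiny for illustration; real systems use much larger blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and hash each one."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes) -> list:
    """Return indexes of blocks in `new` that differ from `old` or are new."""
    old_h, new_h = block_hashes(old), block_hashes(new)
    return [
        i for i, h in enumerate(new_h)
        if i >= len(old_h) or old_h[i] != h
    ]


to_upload = changed_blocks(b"aaaabbbbcccc", b"aaaaXbbbcccc")
```

In this example only the second block (index 1) differs, so only that block would need to be uploaded and synchronized to other devices. Fixed-size blocks handle in-place edits well but not insertions that shift later content.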

Learn more on Designing Dropbox or see the following video.

6. Designing Yelp or Nearby Friends or Proximity Server

Design a Yelp-like service, where users can search for nearby places such as restaurants, theaters, or shopping malls, and can also add/view reviews of places.

Functional Requirements:

  1. Users should be able to add/delete/update Places.
  2. Given their location (longitude/latitude), users should be able to find all nearby places within a given radius.
  3. Users should be able to add feedback/review about a place. The feedback can have pictures, text, and a rating.

Non-functional Requirements:

  1. Users should have a real-time search experience with minimum latency.
  2. Our service should support a heavy search load. There will be a lot of search requests compared to adding a new place.

High-level solution

At a high level, we need to store and index each dataset described above (places, reviews, etc.). For users to query this massive database, the indexing should be read efficient, since while searching for nearby places, users expect to see the results in real-time.

Given that the location of a place doesn’t change that often, we don’t need to worry about frequent updates of the data. In contrast, if we intended to build a service where objects do change their location frequently, e.g., people or taxis, then we might come up with a very different design.
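
A simple read-efficient index for mostly static places is a fixed grid: each place is stored under the cell that contains it, and a search only scans the cells overlapping the query radius. The sketch below is one possible approach, not the only one (quadtrees are another common choice). The cell size, the 111 km-per-degree approximation, and all names are assumptions, and the poles and the antimeridian are ignored.

```python
import math
from collections import defaultdict

CELL_DEG = 0.1          # grid cell size in degrees (illustrative choice)
KM_PER_DEG = 111.0      # rough length of one degree of latitude
EARTH_RADIUS_KM = 6371.0

grid = defaultdict(list)  # (lat_cell, lon_cell) -> [(name, lat, lon)]


def cell_of(lat, lon):
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def add_place(name, lat, lon):
    grid[cell_of(lat, lon)].append((name, lat, lon))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(lat, lon, radius_km):
    """Scan only the grid cells overlapping the radius, then filter exactly."""
    dlat = radius_km / KM_PER_DEG
    dlon = radius_km / (KM_PER_DEG * math.cos(math.radians(lat)))
    lat_lo, lon_lo = cell_of(lat - dlat, lon - dlon)
    lat_hi, lon_hi = cell_of(lat + dlat, lon + dlon)
    results = []
    for i in range(lat_lo, lat_hi + 1):
        for j in range(lon_lo, lon_hi + 1):
            for name, plat, plon in grid.get((i, j), []):
                if haversine_km(lat, lon, plat, plon) <= radius_km:
                    results.append(name)
    return sorted(results)


add_place("Cafe", 37.7750, -122.4190)
add_place("Museum", 37.8020, -122.4480)
add_place("Airport", 37.6213, -122.3790)
```

A query such as `nearby(37.7749, -122.4194, 5)` touches only a handful of cells instead of every place in the dataset. Because places rarely move, the grid seldom needs updating, which is what makes this kind of index attractive here.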

Learn more on Designing Yelp or Nearby Friends or see the following video.

7. Design a Web Crawler

Design a Web Crawler that will systematically browse and download the World Wide Web. Web crawlers are also known as web spiders, robots, worms, walkers, and bots.

Requirements

Let’s assume we need to crawl all of the web.

Scalability: Our service needs to be scalable such that it can crawl the entire Web and can be used to fetch hundreds of millions of Web documents.

Extensibility: Our service should be designed in a modular way with the expectation that new functionality will be added to it. There could be newer document types that need to be downloaded and processed in the future.

High-level solution

A bare minimum crawler needs at least these components:

1. URL frontier: To store the list of URLs to download and also prioritize which URLs should be crawled first.
2. HTML Fetcher: To retrieve a web page from the server.
3. Extractor: To extract links from HTML documents.
4. Duplicate Eliminator: To make sure the same content is not extracted twice unintentionally.
5. Datastore: To store retrieved pages, URLs, and other metadata.
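
The five components can be wired together in a short sketch. To keep it runnable without network access, the fetcher reads from a hypothetical in-memory `FAKE_WEB` dictionary instead of making HTTP requests. The frontier is a plain FIFO queue with no prioritization, and duplicates are detected by URL only, which is a simplification of content-level duplicate elimination.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "http://a.test/": '<a href="http://b.test/">B</a> <a href="http://c.test/">C</a>',
    "http://b.test/": '<a href="http://a.test/">A</a> <a href="http://c.test/">C</a>',
    "http://c.test/": "<p>no links here</p>",
}


class LinkExtractor(HTMLParser):
    """Extractor: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """HTML fetcher: stand-in that reads from FAKE_WEB instead of HTTP."""
    return FAKE_WEB.get(url)


def crawl(seed):
    frontier = deque([seed])   # URL frontier (FIFO, no prioritization)
    seen = {seed}              # duplicate eliminator (by URL only)
    datastore = {}             # datastore: url -> html
    while frontier:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        datastore[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return datastore


pages = crawl("http://a.test/")
```

Starting from the seed, the crawler visits each of the three pages exactly once, even though the pages link back to each other.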

Learn more on Designing a Web Crawler or see the following video.

Fundamental Concepts of System Design

Here are some fundamental concepts you should be familiar with:

  • Scalability: The capability of a system to accommodate growing amounts of load or traffic. There are two main approaches to scalability: horizontal scaling (adding more machines to the system) and vertical scaling (increasing resources on a single machine).
  • Fault tolerance: The ability of a system to continue functioning even when one or more components fail. Techniques such as redundancy and load balancing can enhance a system’s fault tolerance.
  • Load balancing: The distribution of workloads across multiple machines to maximize resource utilization and prevent any single machine from being overwhelmed.
  • API Gateway: A server that acts as a single point of entry for a set of microservices, API servers, or backend servers. It receives client requests, forwards them to the appropriate microservice, and then returns the server’s response to the client. An API gateway is focused on routing requests to the appropriate microservice, while a load balancer is focused on distributing requests evenly across a group of backend servers.
  • Caching: Storing frequently accessed data in a fast storage layer to reduce the burden on the underlying data store and enhance system performance.
  • Availability: The readiness of a system to respond to requests in a timely manner. This is closely tied to fault tolerance and is usually measured as a percentage of time the system is operational.
  • Consistency: The extent to which all nodes in a distributed system see the same data at the same time. Consistency can be divided into different levels, including strong consistency, eventual consistency, and weak consistency.
  • Latency: The time it takes for a request to be processed and a response to be returned. Latency is a crucial aspect of system design as it defines the response time of a system.
  • Throughput: The number of requests a system can handle per unit of time. Throughput is closely tied to scalability and is frequently used as a metric for a system’s performance.
  • Partition Tolerance: The ability of a system to continue functioning even when network partitions occur. In a distributed system, partitions cannot be ruled out, so when one occurs the designer must choose between consistency and availability, depending on which is more important for the particular use case.
  • CAP Theorem: This theorem states that it’s impossible for a distributed system to simultaneously provide all three guarantees: Consistency, Availability, and Partition Tolerance.
  • ACID Properties: A set of properties that ensure that database transactions are processed reliably. The acronym stands for Atomicity, Consistency, Isolation, and Durability.
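
Since availability is usually quoted as a percentage, it helps to translate that percentage into allowed downtime. The helper below is a small illustrative calculation; the function name is made up for this example, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a system may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% available -> "
          f"{downtime_hours_per_year(target):.2f} hours of downtime per year")
```

For example, 99.9% availability ("three nines") allows about 8.76 hours of downtime per year, while 99.99% allows under one hour.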

How to answer a system design question in an interview

Here is a 7-step process for answering any system design question:

7-step process to answer any system design question

Step 1: Requirements clarification

Step 2: Back-of-the-envelope estimation

Step 3: System interface definition

Step 4: Defining the data model

Step 5: High-level design

Step 6: Detailed design

Step 7: Identifying and resolving bottlenecks

See more details on each of these steps here.

More System Design Interview Questions

Here are a few more system design questions asked at top tech companies, including FAANG (Facebook, Apple, Amazon, Netflix, and Google).

  1. Design a video streaming service like YouTube or Netflix
  2. Design an API Rate Limiter
  3. Design Twitter
  4. Design a Web Crawler
  5. Design a URL Shortening service like TinyURL
  6. Design Ticketmaster
  7. Design Parking Lot
  8. Design An Online Shopping System like Amazon

Where to go from here?

➡ Practice these questions to distinguish yourself from others!

➡ Check Grokking System Design Fundamentals for a list of common system design concepts.

➡ Learn more about these questions in “Grokking the System Design Interview” and “Grokking the Advanced System Design Interview.”

➡ For Object Oriented Design questions, take a look at “Grokking the Object Oriented Design Interview.”

➡ Follow me on LinkedIn for tips on system design and coding interviews.

Thanks for reading

Founder www.designgurus.io | Formerly a software engineer @ Facebook, Microsoft, Hulu, Formulatrix | Entrepreneur, Software Engineer, Writer.