Everything you need to know about Caching — System Design

Animesh Gaitonde · Published in Level Up Coding · Apr 6, 2020 · 7 min read


Introduction, use cases, strategies, and policies to cache data

Data Centre

Introduction

Have you ever noticed that when you browse a website on a slow internet connection, the text loads before any high-quality images? And that on subsequent visits to the same website, the page renders much faster? A brand-new website also takes more time to load than frequently visited sites like Facebook or Amazon. Do you know why this happens? The answer is Caching.

Instagram page on a slow internet connection

The picture above shows how my Instagram page looks on a slow internet connection. As you can see, the text is displayed, but the images are missing because the page is still rendering.

It’s important to deliver the best experience to users to improve retention and engagement. In today’s competitive world, businesses suffer when the user experience is poor. Imagine you are watching your favourite TV series on a video streaming website, but the video keeps buffering. Would you stick around and keep your subscription to such a website?

Caching works on the principle of ‘locality of reference’. The cache acts as a local store for data to speed up lookup and retrieval. The primary goal of a Cache is to reduce read latency and increase the throughput of an application. Let’s look at a real-world analogy in the next section.

Real-World Analogy of a Cache

Let’s say you cook dinner every day. You need different ingredients, vegetables, spices, etc. for the preparation. But do you visit the supermarket to buy them every day? That would be way too cumbersome and time-consuming. So you check your kitchen or refrigerator first, in case you have already stocked up on groceries, and avoid a tiresome visit to the supermarket.

Refrigerator behaves as a Cache for vegetables

Here, your refrigerator acts as a Cache, or a local store, for your vegetables. The biggest benefit of using a Cache is the time it saves: you can prepare your food quickly.

How does a Cache work?

A backend application generally stores its data in a database. When a client requests some data, the application queries the database, fetches the data, and returns it to the client. The database server runs as a separate process and may run on a different machine from the application server.

Application Server fetching data from DB

Reading data from the database is time-consuming, since it requires a network call and an I/O operation to fetch the data from the file system. If the data is stored in a Cache, the read is blazing fast. When clients request the same data repeatedly, it makes more sense to fetch it from the Cache than from the database.

For example, if a tweet goes viral, all the clients will try to fetch the data for that same tweet. Since Twitter has millions of users, using a cache saves millions of calls to the database.

Furthermore, a Cache also reduces the load on the database. If the data is found in the Cache, a database call is avoided, reducing pressure on the database. In simple terms, you can think of a Cache as a hash table that stores key-value pairs.

The following diagram illustrates the process of reading data from the Cache:

Process of reading from Cache
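
To make this flow concrete, here is a minimal read-path sketch in Python (often called the cache-aside pattern). The plain dicts and the key name are stand-ins for a real cache (such as Redis) and a real database:

    database = {"tweet:42": "a viral tweet"}   # stand-in for the real database
    cache = {}                                 # stand-in for the cache (e.g. Redis)

    def read(key):
        if key in cache:               # cache hit: no database call needed
            return cache[key]
        value = database[key]          # cache miss: query the database
        cache[key] = value             # store it so the next read is served locally
        return value

    print(read("tweet:42"))   # miss -> fetched from the database, then cached
    print(read("tweet:42"))   # hit  -> served straight from the cache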

Core Concepts in Caching

TTL (Time to Live)

There is a limit to the amount of data that can be stored in the cache, so it’s essential to evict entries that are no longer needed by the application server. A TTL specifies how long an entry may live in the Cache before it expires and is removed.

In the case of Netflix, the server caches the most frequently watched shows. It doesn’t need to keep shows whose viewership has declined over time.

For instance, it makes more sense to cache a TV show like Money Heist (popular at the time of this writing) than a movie like Indiana Jones.
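
As a rough sketch, a TTL can be implemented by storing an expiry timestamp next to each value and treating expired entries as misses. The constant and key names below are purely illustrative:

    import time

    TTL_SECONDS = 60          # illustrative value: each entry lives for one minute
    cache = {}                # key -> (value, expiry timestamp)

    def put(key, value):
        cache[key] = (value, time.time() + TTL_SECONDS)

    def get(key):
        entry = cache.get(key)
        if entry is None:
            return None                    # never cached
        value, expires_at = entry
        if time.time() >= expires_at:      # TTL has elapsed: treat it as a miss
            del cache[key]
            return None
        return value

    put("show:money-heist", "episode metadata")
    print(get("show:money-heist"))         # still fresh, so it is returned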

Eviction Policy

The Cache may become full at some point, depending on how the application accesses data. Hence, we need a strategy to remove data from the cache and replace it with data that is more likely to be accessed in the future.

There are multiple Cache eviction policies, such as LRU (Least Recently Used), LFU (Least Frequently Used), and MRU (Most Recently Used). These policies evict data from the Cache using pre-defined logic. We will discuss each of them below.

LRU (Least Recently Used)

This policy removes the entry that was used least recently. When the Cache becomes full, the least recently used entry is evicted to make room for the new one.

You can imagine Facebook storing celebrities’ photos in a Cache. The access pattern of followers is such that they are mostly interested in recent photos. When the Cache becomes full, it evicts the photo that was accessed least recently.
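
Here is a minimal LRU sketch in Python, built on OrderedDict; production caches use more elaborate implementations, but the idea is the same:

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()       # keys kept in access order, oldest first

        def get(self, key):
            if key not in self.entries:
                return None
            self.entries.move_to_end(key)      # mark as most recently used
            return self.entries[key]

        def put(self, key, value):
            if key in self.entries:
                self.entries.move_to_end(key)
            self.entries[key] = value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict the least recently used entry

    cache = LRUCache(capacity=2)
    cache.put("photo:1", "img-1")
    cache.put("photo:2", "img-2")
    cache.get("photo:1")             # photo:1 becomes the most recently used
    cache.put("photo:3", "img-3")    # photo:2, the least recently used, is evicted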

LFU (Least Frequently Used)

LFU keeps track of the frequency, i.e. the number of times a data item is accessed. When the Cache size crosses a given threshold, the entry with the lowest access frequency is evicted.

When you type a word while texting, your phone suggests several words that you can select instead of typing the whole word out. Internally, your phone’s software maintains a cache of the words you have typed along with their frequencies.

Phone’s software recommending words to complete

The Cache evicts the word with the lowest frequency. In case of a tie between multiple words, the least recently used word is evicted. In the phone example above, if you start using the words "feature", "features", "feather", etc., it will eventually stop suggesting the word "feat" (as that word gets evicted from the cache).
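
A simplified LFU sketch along the same lines; it only tracks access counts and skips the recency tie-break described above:

    class LFUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.values = {}    # word -> suggestion data
            self.counts = {}    # word -> how often it was accessed

        def get(self, key):
            if key not in self.values:
                return None
            self.counts[key] += 1               # bump the frequency on every access
            return self.values[key]

        def put(self, key, value):
            if key not in self.values and len(self.values) >= self.capacity:
                victim = min(self.counts, key=self.counts.get)   # lowest frequency
                del self.values[victim]
                del self.counts[victim]
            self.values[key] = value
            self.counts[key] = self.counts.get(key, 0) + 1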

MRU (Most Recently Used)

In MRU, the most recently used entry is removed first, and older entries are given preference to persist in the Cache. This strategy is used when the access pattern is such that a user is unlikely to view the most recent entry again. Let’s take a look at an example.

Tinder Left/Right Swipe uses the MRU policy

Dating apps like Tinder generally cache all the potential matches of a user. When a user swipes left or right on a profile, the app shouldn’t recommend the same profile again; doing so would result in a poor user experience.

It’s therefore necessary to evict the entries that were seen most recently: the application must remove the cache entry of a profile as soon as it is swiped left or right.
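
Since MRU is essentially LRU with the eviction end flipped, a toy sketch differs from the LRU one above only in which entry gets removed:

    from collections import OrderedDict

    class MRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.entries = OrderedDict()       # most recently used key is kept last

        def get(self, key):
            if key not in self.entries:
                return None
            self.entries.move_to_end(key)      # mark as most recently used
            return self.entries[key]

        def put(self, key, value):
            if key not in self.entries and len(self.entries) >= self.capacity:
                self.entries.popitem(last=True)    # evict the most recently used entry
            self.entries[key] = value
            self.entries.move_to_end(key)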

Types of Cache

Write Through Cache

As the name suggests, the data is first written to the Cache and then to the database. This ensures consistency between the data in the Cache and the database: every read from the Cache sees the most recent write.

Write Through Cache

However, the downside of this approach is that the application’s write latency increases, so it is not suitable for a write-heavy system. It is useful for applications that frequently re-read data once it is persisted in the database. Write latency takes a hit, but that is compensated by lower read latency and consistency.
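
A tiny write-through sketch; as before, plain dicts stand in for the cache and the database, and the key and value are made up:

    database = {}   # stand-in for the system of record
    cache = {}      # stand-in for the cache

    def write_through(key, value):
        cache[key] = value       # 1. write to the cache ...
        database[key] = value    # 2. ... and synchronously to the database
        # the write is acknowledged only after both succeed, so reads from
        # the cache always reflect the most recent write

    def read(key):
        if key in cache:
            return cache[key]
        return database.get(key)

    write_through("user:1", {"name": "Alice"})
    print(read("user:1"))        # served from the cache, consistent with the DB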

Write Back Cache

As seen above, a Write Through Cache is not suitable for write-heavy systems, as the write latency can spike. An alternative approach is to write the data to the Cache first and mark it as modified (to be updated in the DB later).

Write Back Cache

An async job regularly reads all the modified entries in the Cache and updates their corresponding values in the database. This approach impacts neither the read nor the write latency. The only disadvantage is the lag introduced while the Cache and the DB are syncing: since the database is the source of truth, any application reading directly from the DB may see stale entries.

Websites such as YouTube use a Write Back Cache to store the view count of a video. Updating the database for every single view of a viral video would be expensive; writing the data to the cache and syncing it to the DB later is a better solution. Using a Write Back Cache keeps both read and write latency low.
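
A rough write-back sketch: writes land only in the cache and are marked dirty, and a periodic flush (the async job) pushes them to the database. The helper names are illustrative:

    database = {}    # stand-in for the system of record
    cache = {}       # key -> value
    dirty = set()    # keys modified in the cache but not yet persisted

    def write_back(key, value):
        cache[key] = value       # fast path: only the cache is touched on a write
        dirty.add(key)           # remember that the database is now behind

    def flush():
        # called periodically by an async job / background thread
        for key in list(dirty):
            database[key] = cache[key]   # sync the modified entry to the DB
            dirty.discard(key)

    write_back("video:99:views", 1_000_001)   # e.g. updating a view count
    flush()                                   # later, the count reaches the database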

Write Around Cache

Some backend applications don’t frequently re-read the most recently written data. In such cases, a Write Around Cache is used.

Write Around Cache

In this policy, the database is updated without writing to the cache, so the Cache isn’t loaded with data that won’t be re-read. The trade-off is that if the application does query the most recently written data, it results in Cache misses.
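
A write-around sketch under the same stand-in assumptions: writes go straight to the database, and the cache is only populated when a key is actually read:

    database = {}   # stand-in for the system of record
    cache = {}      # only populated on reads

    def write_around(key, value):
        database[key] = value        # the write skips the cache entirely

    def read(key):
        if key in cache:
            return cache[key]        # hit
        value = database.get(key)    # recently written data is only found here
        if value is not None:
            cache[key] = value       # cache it now that we know it is being read
        return value

    write_around("report:2020-04", "rarely re-read data")
    print(read("report:2020-04"))    # the first read is a Cache miss by design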

Examples of Caches used in Distributed Systems

Following is a list of open-source in-memory data stores commonly used as caches:

  • Redis
  • Memcached
  • VoltDB
  • Aerospike
  • Apache Ignite
