Have you ever had a problem with your computer, and someone told you to fix it by clearing your browser cache or your DNS cache? 🧐
Well, in this article we will be looking at caches, and in my next article we will delve deep into implementing caches in Node.js using Redis.
What’s Caching? 🧐
Caching is the process of storing frequently accessed data in a cache or temporary storage location so that it can be retrieved more quickly when it is needed again. Simply put, caching means storing frequently demanded things closer to those asking for them; by doing that, you increase the access speed.
Analogy
Let’s take a simple analogy from the book Algorithms to Live By.
Imagine you are writing a research report or paper, and you need to consult a book (or several) from the library. You could go to the library each time you need a piece of information, but instead you will most likely take the book home with you and put it on your desk for faster access. Instead of making round trips to the library, which would slow down your progress, you can now grab the book straight from your desk.
Benefits of Caching
From the library analogy, you can tell that one major benefit of caching is quicker access to data. Caching offers several other benefits; here are some of the key advantages:
Faster Response Time: Caching allows web applications to serve data and resources quickly, leading to reduced response times. This is particularly beneficial for dynamic web content or data-intensive operations that involve database queries, API calls, or resource-intensive computations. By caching the results of such operations, subsequent requests can be handled more efficiently, resulting in faster response times (see the sketch after this list).
Reduced Latency: Caching brings the data closer to the application, reducing network latency and improving overall system performance.
Enhanced User Experience: Faster response times and improved performance directly translate into a better user experience. Reduced waiting times, smoother page loading, and seamless interactions lead to higher user satisfaction and engagement. Improved user experience can also positively impact business metrics, such as conversion rates, user retention, and overall customer satisfaction.
Redundancy and High Availability: Redundancy refers to the replication of cached data across multiple cache nodes or servers. By maintaining redundant copies of data, the caching system can ensure data availability even if individual cache instances fail or become unavailable.
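To make the first benefit concrete, here is a minimal cache-aside sketch in Node.js. Everything in it is illustrative: getUser, fetchUserFromDb, and the 60-second TTL are made-up names and numbers, and a plain Map stands in for a real cache.

```js
// Cache-aside: check the cache first, fall back to the slow source,
// then store the result for next time.
const cache = new Map();
const TTL_MS = 60 * 1000; // hypothetical freshness window

async function getUser(id) {
  const hit = cache.get(id);
  if (hit && Date.now() - hit.storedAt < TTL_MS) {
    return hit.value; // cache hit: the slow database call is skipped
  }
  const value = await fetchUserFromDb(id); // slow path, e.g. a database query
  cache.set(id, { value, storedAt: Date.now() }); // remember it for next time
  return value;
}

// Stand-in for a real database call.
async function fetchUserFromDb(id) {
  return { id, name: `user-${id}` };
}
```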
Low Latency & High Throughput
Latency refers to the time it takes for a system to respond to a request or input. Low latency means that the system responds quickly, with minimal delay. In other words, the time between a request being made and a response being received is short. For example, in a database system, low latency means that queries are executed quickly and data is returned to the user in a short amount of time. Low latency is important for applications that require real-time data processing, such as online gaming, financial trading, and real-time analytics.
Throughput refers to the rate at which a system can process data or requests. High throughput means that the system can handle a large volume of data or requests in a given period. In other words, the system can process a large number of transactions in parallel. For example, in a web server, high throughput means that the server can handle a large number of concurrent requests and serve web pages to users quickly. High throughput is important for applications that need to handle high volumes of data or requests, such as e-commerce websites, social networks, and content delivery networks.
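A rough, back-of-the-envelope example of how the two relate (the numbers here are made up for illustration): by Little's law, concurrency ≈ throughput × latency. A server that keeps 20 requests in flight with an average latency of 50 ms can sustain about 20 / 0.05 = 400 requests per second; halve the latency to 25 ms, say by serving cached results, and the same server can push roughly 800 requests per second.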
Cache Storage Eviction Strategies
Caches get full; after all, a cache is temporary storage, and space needs to be cleared for more important or useful data to be stored. Below are some eviction strategies used in caching:
Least Recently Used (LRU): LRU caching algorithm removes the least recently used items from the cache when it reaches its capacity limit. It ensures that the most frequently accessed items stay in the cache, optimizing cache hits and minimizing cache misses.
First In First Out (FIFO): FIFO caching removes the oldest items from the cache when it exceeds its capacity. It follows a simple principle of evicting items based on their insertion order.
Least Frequently Used (LFU): LFU caching algorithm removes the least frequently used items from the cache when it is full. It keeps track of the usage frequency of items and evicts the least frequently accessed ones.
Those are the major eviction strategies used when a cache is full; a minimal LRU sketch follows below.
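To make LRU concrete, here is a toy sketch in Node.js that leans on the fact that a JavaScript Map remembers insertion order; it is an illustration of the idea, not a production-grade cache.

```js
// Toy LRU cache: a Map keeps keys in insertion order, so the first
// key is always the least recently used one.
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined; // cache miss
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    if (this.map.size >= this.capacity) {
      // Evict the least recently used entry (the oldest key in the Map).
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
    this.map.set(key, value);
  }
}

const cache = new LRUCache(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // touches "a", so "b" is now least recently used
cache.set("c", 3); // evicts "b"
console.log(cache.get("b")); // undefined
```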
Illustration
A good way to explain how caching works in practice is with a frequently visited web application like YouTube, which serves lots of data and metadata such as video thumbnails.
The first time you visit YouTube.com, your browser knows nothing about it, so it downloads all of the resources that make up YouTube: the logo, icons, fonts, scripts, and all the thumbnails. On subsequent visits, these resources can be retrieved from the cache, making the webpage load much faster because your browser only needs to download content it hasn't seen before, which might be only the thumbnails of videos uploaded since your last visit.
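The browser decides what to keep based on HTTP headers the server sends. As a hedged sketch of the server side (assuming an Express app; the routes and durations are made up), telling browsers to cache responses looks roughly like this:

```js
const express = require("express");
const app = express();

// Let browsers cache static assets (logos, icons, fonts, scripts)
// for one day via the Cache-Control header.
app.use(express.static("public", { maxAge: "1d" }));

// Dynamic responses can set the header explicitly.
app.get("/api/popular-videos", (req, res) => {
  res.set("Cache-Control", "public, max-age=300"); // fresh for 5 minutes
  res.json({ videos: [] }); // placeholder payload
});

app.listen(3000);
```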
In-Memory vs On-Disk vs Distributed Storage
In-memory storage means that the data is stored in the computer's Random Access Memory (RAM), which is a type of volatile memory that is fast and efficient for accessing data quickly. When data is stored in memory, it can be accessed almost instantaneously, which makes it ideal for use cases that require low-latency data access, such as real-time data processing, caching, and high-performance computing. However, data stored in memory is lost when the computer is turned off, so it is typically used for temporary data storage or caching.
On-disk storage means that the data is stored on a physical disk, such as a hard disk drive (HDD) or solid-state drive (SSD). This type of storage is persistent, meaning that the data is not lost when the computer is turned off, and it can be retrieved even after a power outage or system failure. However, accessing data from disk is much slower than accessing data from memory, which can be a bottleneck in systems that require fast data access.
Distributed storage means using a separate cache server or service to store and retrieve data. It allows multiple application instances or servers to share a common cache, promoting scalability and enabling data sharing across the ecosystem.
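The next article digs into Redis properly, but as a quick, hedged preview of what talking to a distributed cache looks like from Node.js (using the node-redis client; the URL, key, and TTL here are arbitrary examples):

```js
const { createClient } = require("redis");

async function main() {
  // Connect to a cache server shared by all application instances.
  const client = createClient({ url: "redis://localhost:6379" });
  await client.connect();

  // Any instance can write an entry with an expiry...
  await client.set("greeting", "hello from the shared cache", { EX: 60 });

  // ...and any other instance can read the same entry.
  console.log(await client.get("greeting"));

  await client.quit();
}

main().catch(console.error);
```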
Caching Libraries for Node.js
Some popular caching libraries in Node.js include:
Node-cache: Node-cache is a simple and lightweight caching library for Node.js. It provides an in-memory cache with various features such as expiration, time-to-live (TTL) settings, and support for key-value pairs (see the usage sketch after this list).
Redis: Redis is a widely used in-memory data structure store that can be used as a caching solution in Node.js. It offers high performance and supports advanced caching features.
Memcached: Memcached is a popular distributed caching system that can be used with Node.js. It provides a simple and efficient key-value store and is known for its scalability and high-performance caching capabilities.
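For a taste of the first option, here is a small node-cache sketch; the keys and TTL values are arbitrary examples.

```js
const NodeCache = require("node-cache");

// stdTTL: default time-to-live in seconds for every entry;
// checkperiod: how often expired entries are cleaned up.
const cache = new NodeCache({ stdTTL: 100, checkperiod: 120 });

cache.set("user:42", { name: "Ada" });   // cached for 100 seconds
console.log(cache.get("user:42"));       // { name: 'Ada' }

cache.set("session:1", "abc", 10);       // per-entry TTL of 10 seconds
console.log(cache.ttl("session:1", 30)); // extend the TTL (returns true)
```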
Caching is an essential technique for optimizing the performance of web applications. By leveraging caching strategies and caching libraries, you can reduce response times, improve user experience, and increase the efficiency of your backend systems.
Having learnt about the effectiveness of caching, I have a question: if there are eviction strategies, then why do we still get notifications to clear our cache? 🧐 Drop your thoughts in the comment section.
Thank you for reading 🥰