
Tuesday, June 17, 2014

How does a cache work

In this post, I'll try to share my understanding of how a cache works and how a caching framework can be integrated into an application. From an implementation perspective, a cache is nothing but an interface with methods to get, put and remove a cached object. A sample cache interface is as follows:

public interface Cache<K, V> {

       /**
        * Returns a cached value associated with a key.
        * @param key the key
        * @return the cached value
        */
       public V get(final K key);
      
       /**
        * Puts a value associated with a key in cache.
        * @param key the key
        * @param value the value to cache
        * @return the cached value
        */
       public V put(final K key, final V value);
      
       /**
        * Removes a cached object.
        * @param key the key
        */
       public void remove(K key);
}


A concrete implementation adds many more methods as well. A cache always stores data as key-value pairs in memory, and the application sends the key to retrieve a value from the cache. Caching follows a very simple workflow: the cache is looked up using the key; if a value is available it is returned, otherwise the value is fetched from the primary storage (database, files etc.), the cache is updated, and the value is returned to the caller.
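To make the workflow concrete, here is a minimal get-or-load sketch built on the Cache interface above (my own illustration; the CacheLoader callback and the SimpleCacheHelper class are hypothetical names used only for this example):

// CacheLoader.java (hypothetical)
/**
 * Callback that knows how to fetch a value from the primary storage
 * (database, file, web service, ...).
 */
public interface CacheLoader<K, V> {
    V load(K key);
}

// SimpleCacheHelper.java (hypothetical cache-aside helper)
public class SimpleCacheHelper<K, V> {

    private final Cache<K, V> cache;
    private final CacheLoader<K, V> loader;

    public SimpleCacheHelper(Cache<K, V> cache, CacheLoader<K, V> loader) {
        this.cache = cache;
        this.loader = loader;
    }

    /** Returns the cached value, loading it from primary storage and caching it on a miss. */
    public V getOrLoad(K key) {
        V value = cache.get(key);
        if (value == null) {                // cache miss
            value = loader.load(key);       // fetch from the primary storage
            if (value != null) {
                cache.put(key, value);      // update the cache for the next lookup
            }
        }
        return value;                       // cache hit, or the freshly loaded value
    }
}

This style, where the application code drives the lookup itself, is often called cache-aside.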



Typically, we use a caching framework like Ehcache, which integrates with the application and serves data from the cache. The application should not put data directly into the cache; it should let the framework handle that. In a typical interaction, the application asks the framework for data, and the framework returns it from the cache or fetches and caches it on a miss.
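To sketch that in code (my own illustration, not Ehcache's actual API; it reuses the hypothetical CacheLoader from above), the framework can wrap the real cache behind the same Cache interface, so the application only ever calls get() and never puts data into the cache itself:

// ReadThroughCache.java (hypothetical)
public class ReadThroughCache<K, V> implements Cache<K, V> {

    private final Cache<K, V> backingCache;   // the real in-memory cache
    private final CacheLoader<K, V> loader;   // knows how to reach the primary storage

    public ReadThroughCache(Cache<K, V> backingCache, CacheLoader<K, V> loader) {
        this.backingCache = backingCache;
        this.loader = loader;
    }

    @Override
    public V get(K key) {
        V value = backingCache.get(key);
        if (value == null) {                  // miss: the framework loads and caches the value
            value = loader.load(key);
            if (value != null) {
                backingCache.put(key, value);
            }
        }
        return value;
    }

    @Override
    public V put(K key, V value) {            // applications normally never call this directly
        return backingCache.put(key, value);
    }

    @Override
    public void remove(K key) {
        backingCache.remove(key);
    }
}

The difference from the helper above is that the loading logic now sits behind the Cache interface itself, which is essentially what a read-through caching framework does for you.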




Cache Integration architectures

A caching framework can be standalone or distributed (I'll cover distributed caching in a separate post). Distributed caching is common in enterprise applications: it enables cached data to be shared across multiple nodes, thus providing better performance, high availability and scalability. A caching framework can be integrated with an enterprise application in a number of ways; I'll try to cover a few of them here.

Cache Integration using an independent Cache Server
This integration style uses an independent cache server, or a cluster of cache servers spanning multiple nodes, and the application servers connect to it over the network. The advantage of this architecture is that the cache can be scaled vertically as well as horizontally without any dependency on the application server. Memcached is a well-known example used in enterprise applications in this way.
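Just to sketch the idea in code (my own illustration; RemoteCacheClient and its methods are hypothetical stand-ins for a real client library such as a Memcached client), the application-side cache picks a node by hashing the key and delegates the call to it:

// RemoteCacheClient.java (hypothetical network client for a single cache node)
public interface RemoteCacheClient {
    Object get(String key);
    void set(String key, Object value);
    void delete(String key);
}

// ShardedRemoteCache.java (hypothetical client-side sharding over independent cache nodes)
import java.util.List;

public class ShardedRemoteCache implements Cache<String, Object> {

    private final List<RemoteCacheClient> nodes;

    public ShardedRemoteCache(List<RemoteCacheClient> nodes) {
        this.nodes = nodes;
    }

    // Pick a node deterministically from the key; real clients use consistent hashing.
    private RemoteCacheClient nodeFor(String key) {
        int index = (key.hashCode() & 0x7fffffff) % nodes.size();
        return nodes.get(index);
    }

    @Override
    public Object get(String key) {
        return nodeFor(key).get(key);
    }

    @Override
    public Object put(String key, Object value) {
        nodeFor(key).set(key, value);
        return value;
    }

    @Override
    public void remove(String key) {
        nodeFor(key).delete(key);
    }
}

Scaling out is then just a matter of adding cache nodes, without touching the application servers.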




Cache Integration using an Application Server

In this architecture, the application uses the cache that comes with the application server, so the cache is "tied" to the application server. WebSphere DynaCache is an example of this architecture. This type of integration requires very little configuration, as the application server takes responsibility for the cache.

Cache Integration using an Independent Server and a local cache


In this architecture, a local cache resides in the application server while the main cache server stands independent of it. The cached data is moved closer to the application that needs it, which further improves performance. ObjectStore is an example that utilizes this kind of architecture.
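A rough sketch of the idea (again my own illustration, reusing the Cache interface from above): check the local cache first, fall back to the independent cache server on a miss, and keep the local copy warm.

// TwoTierCache.java (hypothetical local/near cache in front of a remote cache)
public class TwoTierCache<K, V> implements Cache<K, V> {

    private final Cache<K, V> localCache;    // fast, in-process, limited size
    private final Cache<K, V> remoteCache;   // the independent cache server(s)

    public TwoTierCache(Cache<K, V> localCache, Cache<K, V> remoteCache) {
        this.localCache = localCache;
        this.remoteCache = remoteCache;
    }

    @Override
    public V get(K key) {
        V value = localCache.get(key);
        if (value == null) {
            value = remoteCache.get(key);      // local miss: try the remote tier
            if (value != null) {
                localCache.put(key, value);    // keep a copy close to the application
            }
        }
        return value;
    }

    @Override
    public V put(K key, V value) {
        remoteCache.put(key, value);           // the remote tier holds the shared copy
        localCache.put(key, value);
        return value;
    }

    @Override
    public void remove(K key) {
        localCache.remove(key);
        remoteCache.remove(key);
    }
}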

Among the architectures above, the first is the simplest and my preferred approach. The second depends on the application server, and not every application server on the market comes with a caching solution; so if my application uses the second approach and I later consider changing the application server, my choices will be limited. The last approach is a very elegant caching solution, but there are very few such products available and they tend to be expensive.
In the next post, I'll try to share my knowledge on distributed caching topologies.

Wednesday, June 4, 2014

Cache Replacement Algorithms

For the last few days, I have been studying cache replacement algorithms. After going through a lot of documents and papers, I thought of summarizing what I have read and understood. Although there are many more algorithms, I am summarizing only a few of the popular ones.

LRU (Least Recently Used)
In case of a cache hit, the cache returns a reference to the object and the object is moved to the first position in the cache (the most recently used position). On a cache miss, the object is read from the primary storage (or some other source) and then placed in the first position of the cache. If the cache is full, the last entry (the least recently used one) is removed before the new object is added.
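A compact way to sketch LRU in Java (my example, not tied to any framework) is java.util.LinkedHashMap in access order, evicting the eldest entry once a capacity limit is exceeded:

import java.util.LinkedHashMap;
import java.util.Map;

/** A tiny LRU cache: the least recently accessed entry is evicted first. */
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder = true orders entries by most recent access
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // called after every put(); evict when over capacity
    }
}

For example, with a capacity of 2, putting A and B, reading A, and then putting C evicts B, the least recently used entry.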

LRU2 (Least Recently Used 2)
This algorithm is a refinement of LRU. It evicts the object whose penultimate (second-to-last) access is least recent among all penultimate accesses. The objective is not to evict an object just because it has only been accessed once, but to gauge how frequently an object is accessed over time (its popularity) and to evict the less popular ones.

MRU (Most Recently Used)
The MRU algorithm is most useful in situations where the older an item is, the more likely it is to be accessed again. When we need to evict an object from the cache, we discard the one that was accessed most recently. This algorithm has very specific uses, for example a photo gallery: a user who has already viewed a photo is unlikely to view it again soon.

2Q
2Q is similar to LRU2 but improves upon it, making it a better and more adaptive cache replacement algorithm. When an object first enters the cache, 2Q places it in a special buffer, which is managed as a FIFO queue. If the object is referenced again while in the special buffer, it is moved into another list, which is managed as LRU. If an object is not referenced again while in the special buffer, it is eventually removed from the buffer.
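Here is a minimal sketch of that behaviour (my illustration of the simplified 2Q idea; the full algorithm additionally keeps a "ghost" queue of recently evicted keys): a FIFO buffer A1 for objects seen once, and an LRU map Am for objects referenced again.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified 2Q: new objects sit in a FIFO buffer (A1); objects hit again move to an LRU list (Am). */
public class TwoQueueCache<K, V> {

    private final int a1Capacity;
    private final int amCapacity;
    private final Map<K, V> a1 = new HashMap<>();         // values for keys in the FIFO buffer
    private final Deque<K> a1Order = new ArrayDeque<>();  // FIFO order of the buffer
    private final LinkedHashMap<K, V> am = new LinkedHashMap<>(16, 0.75f, true); // LRU list

    public TwoQueueCache(int a1Capacity, int amCapacity) {
        this.a1Capacity = a1Capacity;
        this.amCapacity = amCapacity;
    }

    public V get(K key) {
        if (a1.containsKey(key)) {          // referenced again while in the buffer: promote to Am
            V value = a1.remove(key);
            a1Order.remove(key);
            putIntoAm(key, value);
            return value;
        }
        return am.get(key);                 // LinkedHashMap refreshes the LRU position on access
    }

    public void put(K key, V value) {
        if (am.containsKey(key)) {          // already "popular": stays in Am
            am.put(key, value);
        } else if (a1.containsKey(key)) {
            a1.put(key, value);
        } else {
            if (a1.size() >= a1Capacity) {  // never referenced again: falls off the FIFO buffer
                a1.remove(a1Order.pollFirst());
            }
            a1.put(key, value);
            a1Order.addLast(key);
        }
    }

    private void putIntoAm(K key, V value) {
        if (am.size() >= amCapacity) {      // evict the least recently used entry of Am
            am.remove(am.keySet().iterator().next());
        }
        am.put(key, value);
    }
}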

LFU (Least Frequently Used)
In this algorithm, each object in the cache has a counter associated with it. The counter records how many times the object has been accessed. As a replacement policy, the objects that have been accessed the fewest times are the first to be replaced, so an object that is accessed many times tends to stay in the cache.
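A small LFU sketch (my example): keep a hit counter per key and evict the key with the smallest count when the cache is full.

import java.util.HashMap;
import java.util.Map;

/** A tiny LFU cache: evicts the entry with the smallest access count. */
public class LfuCache<K, V> {

    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> counts = new HashMap<>();

    public LfuCache(int capacity) {
        this.capacity = capacity;
    }

    public V get(K key) {
        V value = values.get(key);
        if (value != null) {
            counts.merge(key, 1, Integer::sum);   // count the hit
        }
        return value;
    }

    public void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            evictLeastFrequentlyUsed();
        }
        values.put(key, value);
        counts.merge(key, 1, Integer::sum);
    }

    private void evictLeastFrequentlyUsed() {
        K victim = null;
        int smallest = Integer.MAX_VALUE;
        for (Map.Entry<K, Integer> entry : counts.entrySet()) {
            if (entry.getValue() < smallest) {    // find the least frequently accessed key
                smallest = entry.getValue();
                victim = entry.getKey();
            }
        }
        values.remove(victim);
        counts.remove(victim);
    }
}

The linear scan in evictLeastFrequentlyUsed() is fine for a sketch; real LFU implementations keep the counters in a frequency list or priority queue instead.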

FIFO
First-In First-Out (FIFO) is considered the oldest cache replacement algorithm. It simply maintains a list of all objects in the cache such that the head of the list is the oldest arrival and the tail is the most recent arrival. FIFO performs poorly compared to LRU and is seldom used today in its original form.
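FIFO can be sketched with the same LinkedHashMap trick as LRU above, only in insertion order instead of access order (again my example):

import java.util.LinkedHashMap;
import java.util.Map;

/** A tiny FIFO cache: evicts the oldest arrival, no matter how often it has been accessed. */
public class FifoCache<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    public FifoCache(int capacity) {
        super(16, 0.75f, false);   // accessOrder = false keeps insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;  // drop the oldest arrival when full
    }
}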

Second Chance
Second Chance (SC) is an enhancement to FIFO. In SC, a reference bit is maintained for each object in the cache while the objects themselves are kept in a FIFO queue. When a new object is fetched from primary storage, it is appended at the tail of the queue with its reference bit set to zero. When an object in the cache is accessed, its reference bit is set to 1. The replacement policy checks the object at the head of the FIFO queue and replaces it if its reference bit is zero; otherwise the object at the head is moved to the tail and its reference bit is reset to zero. The policy keeps checking the object at the head until it finds one with a zero reference bit, which is then replaced. SC has the overhead of moving objects from the head to the tail of the queue, which makes it somewhat inefficient.
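Here is a sketch of that eviction step (my example), keeping each cached key in a FIFO queue together with its reference bit:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Second Chance: a FIFO queue whose head is spared once if its reference bit is set. */
public class SecondChanceCache<K, V> {

    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Boolean> referenced = new HashMap<>();
    private final Deque<K> fifo = new ArrayDeque<>();

    public SecondChanceCache(int capacity) {
        this.capacity = capacity;
    }

    public V get(K key) {
        V value = values.get(key);
        if (value != null) {
            referenced.put(key, true);        // the object has been accessed: set its bit to 1
        }
        return value;
    }

    public void put(K key, V value) {
        if (!values.containsKey(key)) {
            if (values.size() >= capacity) {
                evict();
            }
            fifo.addLast(key);                // new arrivals are appended at the tail
            referenced.put(key, false);       // with the reference bit set to zero
        }
        values.put(key, value);
    }

    private void evict() {
        while (true) {
            K head = fifo.pollFirst();
            if (Boolean.TRUE.equals(referenced.get(head))) {
                referenced.put(head, false);  // second chance: clear the bit ...
                fifo.addLast(head);           // ... and move the object to the tail
            } else {
                values.remove(head);          // bit is zero: replace this object
                referenced.remove(head);
                return;
            }
        }
    }
}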

CLOCK
CLOCK is functionally very similar to SC. The difference is that CLOCK uses a circular list instead of a FIFO queue, and the replacement algorithm cycles through it like the hand of a clock. The clock hand points at an object, and whenever a replacement is needed the policy starts checking the object pointed to by the hand, in the same way as SC. CLOCK eliminates the need to move objects from head to tail, thus providing better performance.

ARC (Adaptive Replacement Cache)
ARC is considered to be one of the best cache replacement algorithms. ARC uses two lists, L1 and L2. The first contains objects that have been seen only once recently, while the second contains objects that have been seen at least twice recently. So the list L1 captures recency while the list L2 captures frequency. The replacement policy is based on constant reshuffling of the L1 and L2 lists, and ARC adapts the target sizes of the lists to the workload. It is fast and self-tuning.

CAR (Clock with Adaptive Replacement)
CAR is inspired by the Adaptive Replacement Cache (ARC) algorithm and inherits all of its advantages, including its high performance. CAR maintains four doubly linked lists: T1, T2, B1 and B2. The clocks T1 and T2 contain the objects that are currently in the cache, while the lists B1 and B2 maintain history information about objects recently evicted from the cache. The clock T1 captures "recency" while the clock T2 captures "frequency". B1 and B2 are simple LRU lists: objects evicted from T1 are placed on B1, and those evicted from T2 are placed on B2.

Sunday, June 1, 2014

Caching basics

Caching
Caching is the technique of temporarily storing frequently used data that is expensive to fetch every time. Fetching data from a cache is much faster than fetching it from the original source.

Cache Hit:

Simply put, if the cache contains the data being looked up, it is known as a cache hit.
The percentage of requests that result in cache hits is known as the hit rate or hit ratio of the cache.

Cache Miss:
When the data being looked up is not available in the cache, it is known as a cache miss. In that case, the data is fetched from the primary storage and put into the cache.
In case of cache miss there can be two scenarios:

Free Space available
In this case the object will be fetched from primary storage and added to the cache.

No Free Space available
In this case the object will be retrieved from the primary storage and will replace an existing entry in the cache. Now, the entry to be replaced is governed by a cache replacement algorithm (LRU, MRU, LRU2 etc, details of which I’ll share in a later post).

Storage Cost
When a cache miss occurs, the data is fetched from the primary storage, loaded and placed in the cache. Data can also be prefetched at application startup. What matters here is how much space the cache is allowed to use (the space-versus-time tradeoff).

Miss Penalty
When we need to load the data, we also need to know how long loading it takes. This is known as the retrieval cost, or miss penalty.

Replacement Policy:
When a cache miss happens and the cache is full, the cache ejects an entry to make room for the newly fetched data from primary storage. The heuristic used to select the entry to eject is known as the replacement policy.


Average memory-access time = Hit time + Miss rate x Miss penalty

Hit time – time to access the cache
Hit ratio – percentage of the time the data is found in the cache
Miss ratio – (1 - hit ratio)
Miss penalty – time to access the primary storage
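For example (hypothetical numbers): with a hit time of 1 ms, a miss penalty of 100 ms and a miss rate of 5%, the average access time is 1 + 0.05 x 100 = 6 ms, so even a modest improvement in the hit ratio pays off quickly.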