Cache access patterns describe how an application accesses the cache. There
are a few primary strategies that all major cache providers (EHCache, Infinispan,
Coherence, etc.) support.
Cache Aside
In this access pattern, the application first looks for the requested data in
the cache. If the data is not there, it is the responsibility of the application
to fetch it from the datasource and put it into the cache. Thus, in the
cache-aside pattern, the application code uses the cache directly, invoking its
API to add any missing data.
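The flow above can be sketched in a few lines of Java. This is only a minimal illustration, not the API of any particular provider: the class name CacheAsideReader is hypothetical, a ConcurrentHashMap stands in for the cache, and a Function stands in for whatever DAO or repository call reads the datasource.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: the application code, not the cache, talks to the datasource.
class CacheAsideReader<K, V> {
    // Stand-in for the cache provider (assumption: a plain in-memory map).
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Stand-in for the datasource lookup (assumption: returns null when the key is unknown).
    private final Function<K, V> datasource;

    CacheAsideReader(Function<K, V> datasource) {
        this.datasource = datasource;
    }

    V get(K key) {
        V cached = cache.get(key);            // 1. look in the cache first
        if (cached != null) {
            return cached;                    // cache hit
        }
        V loaded = datasource.apply(key);     // 2. cache miss: the application reads the datasource
        if (loaded != null) {
            cache.put(key, loaded);           // 3. the application adds the missing data to the cache
        }
        return loaded;
    }
}
```

A call such as new CacheAsideReader<String, String>(id -> dao.findById(id)).get("42") would hit the datasource only the first time, where dao is whatever data access object the application already has. Note that two threads missing on the same key at the same time may both read the datasource in this sketch.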
Read Through
In this access pattern, the application requests data from the cache, and if the
data exists in the cache, it is returned to the application. If the data does
not exist in the cache (a cache miss), it is the responsibility of the cache
provider to look for it in the datasource. If the data exists in the datasource,
the cache provider fetches it, updates the cache and finally returns the data
to the application.
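To show how this differs from cache aside, here is a minimal Java sketch in which the loader lives inside the cache, so the application only ever calls get. The class ReadThroughCache and its loader parameter are hypothetical names for illustration; real providers expose their own loader interfaces.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through cache: the cache itself knows how to load missing entries.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    // Assumption: the loader returns null when the datasource has no such key.
    private final Function<K, V> loader;

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // On a miss, computeIfAbsent calls the loader and stores the result;
        // if the loader returns null, nothing is stored and null is returned.
        return entries.computeIfAbsent(key, loader);
    }
}
```

The application code shrinks to cache.get(key). The knowledge of how to reach the datasource is configured once, when the cache is created.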
Write Through
In this access pattern, whenever the application updates data in the cache, the
operation is not complete until the cache provider has also written the data
to the underlying datasource. The cache is therefore always in sync with the
underlying datasource. This pattern is easy to implement, but the disadvantage
is that writes are slower, because the datasource has to be accessed on every write.
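A minimal Java sketch of the idea follows. The names WriteThroughCache and writer are hypothetical, and the BiConsumer stands in for whatever call persists a value to the datasource. The sketch writes to the datasource first and to the cache second, which is one possible ordering, chosen so that a failed datasource write leaves the cache untouched.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical write-through cache: put() returns only after the datasource write has finished.
class WriteThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    // Assumption: the writer persists synchronously and throws on failure.
    private final BiConsumer<K, V> writer;

    WriteThroughCache(BiConsumer<K, V> writer) {
        this.writer = writer;
    }

    void put(K key, V value) {
        writer.accept(key, value);   // synchronous datasource write; the caller pays this latency
        entries.put(key, value);     // reached only if the datasource write succeeded
    }

    V get(K key) {
        return entries.get(key);
    }
}
```

Every put pays the full cost of the datasource write, which is the latency penalty described above.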
Write Behind
This access pattern is similar to Write Through, the only difference being that
the data is written to the datasource asynchronously. Writes to the datasource
can be configured to take place at a specific time, for example after one hour,
at midnight or at weekends, to avoid peak hours. While this pattern is hard to
implement, the write operation is very fast, because the application does not
wait on the datasource. Another big advantage of this pattern is that many
updates can be grouped into a single transaction, which further reduces the
cost of writing to the datasource.
The biggest challenge of this pattern is that the write to the datasource
happens after the write to the cache, and the data is written to the datasource
outside the original transaction. So there is always a risk of failure, and
transaction rollback has to be handled very carefully. Cache providers use
compensating actions, such as retrying up to a configured retry count, to deal
with transaction rollback. Another big challenge for the write-behind pattern is
the time gap between the cache transaction and the actual datasource
transaction, which may lead to out-of-order updates. Proper ordering of update
actions is required to mitigate this challenge.
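The queueing and batching described above might look roughly like the following Java sketch. Everything here is an assumption for illustration: WriteBehindCache, batchWriter and flushIntervalMillis are hypothetical names, a FIFO queue is used to preserve update order, and there is no retry or failure handling, which a real provider would need.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical write-behind cache: put() is fast, the datasource is updated later in batches.
class WriteBehindCache<K, V> implements AutoCloseable {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    // FIFO queue of pending updates, so the datasource sees them in the order they were made.
    private final BlockingQueue<Map.Entry<K, V>> pending = new LinkedBlockingQueue<>();
    // Assumption: writes a whole batch to the datasource, ideally in one transaction.
    private final Consumer<List<Map.Entry<K, V>>> batchWriter;
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Object putLock = new Object();

    WriteBehindCache(Consumer<List<Map.Entry<K, V>>> batchWriter, long flushIntervalMillis) {
        this.batchWriter = batchWriter;
        // Note: if flush() throws, the executor cancels later runs. This sketch does not retry.
        scheduler.scheduleWithFixedDelay(this::flush, flushIntervalMillis,
                flushIntervalMillis, TimeUnit.MILLISECONDS);
    }

    void put(K key, V value) {
        // The lock keeps the cache update and the queue entry in the same relative order.
        synchronized (putLock) {
            entries.put(key, value);
            pending.add(new SimpleImmutableEntry<>(key, value));
        }
    }

    V get(K key) {
        return entries.get(key);
    }

    // Drains everything queued so far and hands it to the datasource as one batch.
    synchronized void flush() {
        List<Map.Entry<K, V>> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();   // stop the periodic flush
        flush();                // write whatever is still queued
    }
}
```

Between a put and the next flush, the cache holds data that the datasource does not yet have. That window is where the failure and ordering risks described above come from.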
A few other cache access patterns exist that are specific to particular cache providers; they are basically hybrids of the four patterns above.
In the next post I'll try to share my knowledge on cache providers and try to do a comparison of the products.