Distributed caching has been around since the late 1990s and has evolved
continually since then. Many features have been added to distributed caching
frameworks, but simply accessing data as key/value pairs from memory did not
meet customers' expectations. Storing data as key/value pairs adds little value
by itself; processing the data is what adds value. Customers therefore expected
to move the computation closer to the data itself. Data grids addressed this
requirement by adding features for computing on the cached data. For example,
data grids provide features such as querying in-memory data with standard SQL
syntax, data indexing, map-reduce based processing, and support for various
complex data models (document, relational, etc.).
Distributed Caching
Since the late 1990s, distributed caching technologies have evolved
continuously by adding more and more features. The first generation of
distributed caches provided simple cache clusters with a sophisticated hashing
algorithm to keep track of the data. In the next generation, we find advanced
features like high availability using partitioned or replicated architectures,
ACID transactions, distributed locking, asynchronous events and active backups.
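To make the "hashing algorithm to keep track of the data" idea concrete, here is
a minimal sketch of one common approach, consistent hashing. This is an
assumption for illustration only: the post does not say which algorithm any
particular product uses, and the node names, the `HashRing` class and the
`replicas` parameter are hypothetical.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Hash that is identical across processes (unlike the built-in hash())."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring mapping cache keys to node names."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring at `replicas` pseudo-random points
        # so that keys spread roughly evenly across nodes.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Pick the first ring point clockwise from the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect(self._points, stable_hash(key)) % len(self._points)
        return self._ring[idx][1]


# Hypothetical three-node cache cluster.
ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.node_for("customer:42")
```

Any client can compute the owner of a key locally, without a central lookup
table. When a node leaves the ring, only the keys that node owned move to
another node; the rest stay where they were.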
Data Grid
Although distributed caching has evolved and matured over the years, what was
missing was the ability to run computations on cached data in memory. Data has
become more and more complex over the years. New requirements kept coming up,
such as dynamic scalability, database-like persistence, map-reduce based
processing, SQL-like querying, and support for different data models such as
document, JSON and relational. Data grids addressed these requirements and also
provided additional features such as monitoring and management, policy and
security enforcement, quality-of-service support and easy integration with
existing enterprise applications. The growing adoption of data grids is
bringing in more sophisticated requirements and higher customer expectations.
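The idea of moving computation to the data can be sketched as a tiny
map-reduce over partitions. This is a plain-Python illustration, not the API of
any data grid product: the `partitions` list stands in for the entries held by
each node, and the order data and function names are made up for the example.

```python
from collections import Counter
from functools import reduce

# Hypothetical partitions: each dict stands in for the entries on one grid node.
partitions = [
    {
        "order:1": {"country": "DE", "total": 40.0},
        "order:2": {"country": "US", "total": 15.5},
    },
    {"order:3": {"country": "DE", "total": 10.0}},
    {
        "order:4": {"country": "US", "total": 4.5},
        "order:5": {"country": "FR", "total": 30.0},
    },
]


def map_partition(entries):
    """Would run on the node: fold local entries into a small partial result."""
    partial = Counter()
    for order in entries.values():
        partial[order["country"]] += order["total"]
    return partial


def reduce_partials(left, right):
    """Would run on the caller: merge two partial results."""
    return left + right


# Only the small per-country partials travel back, not the orders themselves.
totals = reduce(reduce_partials, map(map_partition, partitions), Counter())
```

In a real grid the map step would execute inside each node's process, so the
network carries only the per-partition summaries rather than every cached
entry.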
The important thing to note here is that data grids are not backed by any
specification or industry standard, so their growth is driven entirely by
customer requirements. A few popular data grid tools in terms of customer
adoption are:
- VMware GemFire
- Oracle Coherence
- GigaSpaces XAP
Together with the previous 4 posts, I have tried to cover as much as possible
about caching. There are many specific areas, such as tool comparisons, which
I'll address going forward. Please comment if you find anything missing or need
more information or explanation.