Overview

When we talk about caching for performance we mean the "Level 2" cache, also called the "server cache". It is called the "Level 2 cache" because the persistence context is often referred to as the "Level 1 cache".

The goal of the L2 server cache is to improve performance significantly by not having to hit the database. Ebean has 2 types of caches: Bean caches and Query caches.

Bean Caches

Bean caches hold entity beans and are keyed by their Id values. The bean cache can be used with:

  • Find by id query
  • Find unique by "natural key" query

Query Caches

Query caches hold the results of queries (Lists, Sets, Maps of entity beans) and are keyed by the query hash value (effectively a hash of the query and its bind values). The entries in a query cache are invalidated by ANY change to the underlying table – insert, update or delete. This means that the query cache is only useful on entities that are infrequently modified (typically "lookup tables" such as countries, currencies, status codes etc).

The query cache can be used with:

  • Find list/set/map queries
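The keying and invalidation rules described above can be sketched in plain Java. This is a toy model, not Ebean's implementation: the class names QueryCacheSketch and QueryKey, the string-based query text and the List<String> result type are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative only: a toy query cache, not Ebean's implementation. */
final class QueryCacheSketch {

  /** Cache key derived from the query text and its bind values. */
  static final class QueryKey {
    private final String query;
    private final List<Object> binds;

    QueryKey(String query, Object... binds) {
      this.query = query;
      this.binds = Arrays.asList(binds.clone());
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof QueryKey)) {
        return false;
      }
      QueryKey other = (QueryKey) o;
      return query.equals(other.query) && binds.equals(other.binds);
    }

    @Override
    public int hashCode() {
      return Objects.hash(query, binds);
    }
  }

  // Hypothetical result type: each cached result is a list of row descriptions.
  private final Map<QueryKey, List<String>> results = new ConcurrentHashMap<>();

  List<String> get(QueryKey key) {
    return results.get(key);
  }

  void put(QueryKey key, List<String> rows) {
    results.put(key, rows);
  }

  /** ANY insert, update or delete on the underlying table clears every entry. */
  void onTableChange() {
    results.clear();
  }
}
```

Two queries with the same text and the same bind values map to the same entry, while a single change to the table empties the whole cache. That is why this style of cache only pays off for rarely modified tables.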

L2/local and L3/remote

The L2 cache can be thought of as having Near caches and Remote caches, where a "Near cache" is in the same process as the EbeanServer instance (and does not require network access) and a "Remote cache" requires network access.
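As a rough sketch of the near/remote distinction, the lookup order can be modelled as near cache first, then remote cache, then the database. The class name TieredLookupSketch is hypothetical, the remote store is represented by an ordinary Map, and populating both tiers after a miss is an assumption of this sketch rather than documented Ebean behaviour.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative only: near cache, then remote cache, then database. */
final class TieredLookupSketch<K, V> {

  // Near cache: lives in this process, no network access needed.
  private final Map<K, V> near = new ConcurrentHashMap<>();
  // Stand-in for a remote key/value store that would need a network call.
  private final Map<K, V> remote;
  // Stand-in for the database query used as the last resort.
  private final Function<K, V> database;

  TieredLookupSketch(Map<K, V> remote, Function<K, V> database) {
    this.remote = remote;
    this.database = database;
  }

  V find(K id) {
    V bean = near.get(id);
    if (bean != null) {
      return bean; // near hit: cheapest path
    }
    bean = remote.get(id);
    if (bean == null) {
      bean = database.apply(id); // miss on both tiers: hit the database
      if (bean == null) {
        return null;
      }
      remote.put(id, bean);
    }
    near.put(id, bean); // assumption: populate the near cache on the way back
    return bean;
  }

  boolean inNearCache(K id) {
    return near.containsKey(id);
  }
}
```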

L3 bean cache options

The following are the 3 main approaches when we get a miss on the local L2 bean cache.

The L2 bean cache is keyed by id value so really anything resembling a key value store (distributed map) works well for bean caching.

Note that ElasticSearch has the advantage that we update it with changes (as deltas to documents) rather than removing/invalidating an entry in a map, so with ElasticSearch we are always going to get a hit (and hence always avoid hitting the database). There is also the potential benefit that ElasticSearch documents will often be denormalised (containing embedded documents), so a hit against ElasticSearch will populate more of the object graph and avoid extra hits. For example, a hit on an order can also populate customer details such as the customer name.

L3 query cache options

The following are the 2 main approaches when we get a miss on the local L2 query cache.

The significant benefit of using ElasticSearch is that it can address both parts of the L2 cache meaning that we can use it for L2 query cache (by translating ORM queries into ElasticSearch queries) as well as the L2 bean cache.

We can update ElasticSearch as changes occur and use the performance of inverted indexes to perform very fast queries (avoiding the database hit like L2 query cache but not having the issue of frequent invalidation that L2 query cache suffers from).

ElasticSearch

The current emphasis for Ebean ORM is to utilise ElasticSearch as the best L3 cache option. This means that

Read consistency

With Ebean L2 caching there is no attempt to provide transactional read consistency

ACID/Transactional databases do a lot of things for us and this includes providing "read consistency" based on transaction isolation levels (MVCC databases like Postgres and Oracle emphasise this more).

With Ebean L2 caching there is no attempt to provide transactional read consistency. What that means is that if part of your application needs transactional read consistency then use the database to do that.

L2 cache works to eventual consistency

Instead with Ebean L2 caching the view is that eventual consistency is good enough, much simpler and a more natural approach when using ElasticSearch. Given what ElasticSearch is doing for us behind the scenes (flush, merge, optimise etc) in order to provide inverted index awesomeness we happily go with eventual consistency and the associated simplicity.

Near cache

With Ebean the view is that the cost of trying to make L2 caching read consistent is not worth it. Instead we treat the L2 cache as reasonably up to date. In practical terms this means that cache invalidation occurs in the background after successful commits. L2 near caches are invalidated across the cluster very quickly (within milliseconds) but this is not strictly transactional (there is a delay after the commit).
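The window between commit and invalidation can be made visible with a small single-threaded model. This is only a sketch under stated assumptions: the class EventualInvalidationSketch is hypothetical, and the "background" work is an explicit method call here so that the stale read is deterministic, whereas a real system would run it on another thread.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Illustrative only: cache invalidation that trails the commit. */
final class EventualInvalidationSketch {

  private final Map<Integer, String> database = new HashMap<>();
  private final Map<Integer, String> nearCache = new HashMap<>();
  private final Queue<Integer> pendingInvalidations = new ArrayDeque<>();

  /** Reads prefer the cache and fall back to the database on a miss. */
  String find(int id) {
    String cached = nearCache.get(id);
    if (cached != null) {
      return cached;
    }
    String loaded = database.get(id);
    if (loaded != null) {
      nearCache.put(id, loaded);
    }
    return loaded;
  }

  /** The commit changes the database and only queues the invalidation. */
  void commit(int id, String value) {
    database.put(id, value);
    pendingInvalidations.add(id);
  }

  /** Stand-in for the background step that runs shortly after the commit. */
  void runBackgroundInvalidation() {
    Integer id;
    while ((id = pendingInvalidations.poll()) != null) {
      nearCache.remove(id);
    }
  }
}
```

A read issued after commit but before runBackgroundInvalidation() still returns the old cached value. Code that cannot tolerate that window should read from the database inside a transaction instead.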

ElasticSearch

When we use ElasticSearch, the delay before a change (insert, update, delete) becomes visible in ElasticSearch is longer, with a default delay of 1 second. For various bean types and use cases we may choose to increase that delay to support more throughput or be more efficient overall. This is similar to decisions on how frequently database materialised views should be updated.

Table IUD invalidation (Bulk updates)

Bulk insert, update or delete events are processed by table. For a given table the bean types that depend on that table are determined and then for each bean type the L2 bean cache and L2 query cache are invalidated as necessary.

Cluster message

The message sent around the cluster contains the table name and boolean flags for insert, update and delete.

L2 query cache

  • Inserts, updates or deletes: Invalidate the entire L2 query cache for the related bean type

For any bulk statements (bulk table insert, update or delete statement) the entire L2 query cache is invalidated for the associated bean type. For example, a bulk update of the customer table invalidates the entire L2 query cache for the Customer bean type.

L2 bean cache

  • Inserts: Do not affect the L2 bean cache
  • Updates: Invalidate the entire L2 bean cache for the related bean type
  • Deletes: Invalidate the entire L2 bean cache for the related bean type
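The table-level rules above can be sketched as a small handler. Everything named here (TableIudSketch, TableIud, the register method, the use of plain maps as caches) is hypothetical and only mirrors the rules as stated: the query cache is cleared on any change, and the bean cache is cleared on updates and deletes but not on inserts.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: applying a table insert/update/delete message. */
final class TableIudSketch {

  /** The message: a table name plus flags for insert, update and delete. */
  static final class TableIud {
    final String table;
    final boolean inserts;
    final boolean updates;
    final boolean deletes;

    TableIud(String table, boolean inserts, boolean updates, boolean deletes) {
      this.table = table;
      this.inserts = inserts;
      this.updates = updates;
      this.deletes = deletes;
    }
  }

  private final Map<String, List<String>> beanTypesByTable = new HashMap<>();
  private final Map<String, Map<Object, Object>> beanCaches = new HashMap<>();
  private final Map<String, Map<Object, Object>> queryCaches = new HashMap<>();

  /** Records which bean types depend on a table and creates their caches. */
  void register(String table, String... beanTypes) {
    beanTypesByTable.put(table, Arrays.asList(beanTypes));
    for (String type : beanTypes) {
      beanCaches.computeIfAbsent(type, k -> new HashMap<>());
      queryCaches.computeIfAbsent(type, k -> new HashMap<>());
    }
  }

  Map<Object, Object> beanCache(String beanType) {
    return beanCaches.get(beanType);
  }

  Map<Object, Object> queryCache(String beanType) {
    return queryCaches.get(beanType);
  }

  void process(TableIud msg) {
    List<String> types =
        beanTypesByTable.getOrDefault(msg.table, Collections.<String>emptyList());
    for (String type : types) {
      if (msg.inserts || msg.updates || msg.deletes) {
        queryCaches.get(type).clear(); // any change clears the query cache
      }
      if (msg.updates || msg.deletes) {
        beanCaches.get(type).clear(); // inserts leave the bean cache alone
      }
    }
  }
}
```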

Bean IUD invalidation

Persisted beans are processed by bean type and id value.

Cluster message

The message sent around the cluster contains the bean type and 3 lists of ids - a list of id values for inserted, updated and deleted beans.

L2 query cache

  • Inserts, updates or deletes: Invalidate the entire L2 query cache for the bean type

Any bean persist event invalidates the entire L2 query cache for the associated bean type. For example, saving a customer bean invalidates the entire L2 query cache for the Customer bean type.

L2 bean cache

  • Inserts: Do not affect the L2 bean cache
  • Updates: An entry is updated with changes
  • Deletes: An entry is removed from the L2 bean cache based on the id value
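The bean-level rules can be sketched in the same style. The class BeanIudSketch is hypothetical and holds the caches for a single bean type. Representing a cached bean as a map of property values, and passing the changed values alongside the updated ids, are assumptions of this sketch: the text above says only that the cluster message carries id lists.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: applying per-bean insert/update/delete events. */
final class BeanIudSketch {

  // Bean cache for one bean type: id -> property values (hypothetical shape).
  final Map<Object, Map<String, Object>> beanCache = new HashMap<>();
  // Query cache for the same bean type: query key -> cached result.
  final Map<Object, Object> queryCache = new HashMap<>();

  /**
   * @param insertedIds ids of inserted beans
   * @param updated     updated ids mapped to their changed property values
   *                    (the values are an assumption of this sketch)
   * @param deletedIds  ids of deleted beans
   */
  void process(List<Object> insertedIds,
               Map<Object, Map<String, Object>> updated,
               List<Object> deletedIds) {
    if (!insertedIds.isEmpty() || !updated.isEmpty() || !deletedIds.isEmpty()) {
      queryCache.clear(); // any persist event clears the whole query cache
    }
    // Inserts: the bean cache is left untouched.
    for (Map.Entry<Object, Map<String, Object>> e : updated.entrySet()) {
      Map<String, Object> cached = beanCache.get(e.getKey());
      if (cached != null) {
        cached.putAll(e.getValue()); // update the existing entry in place
      }
    }
    for (Object id : deletedIds) {
      beanCache.remove(id); // remove by id value
    }
  }
}
```

Compared with the table-level handling, only the affected entries in the bean cache change, so unrelated cached beans of the same type survive a single save or delete.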

External invalidation

When you save/delete beans via Ebean.save() and Ebean.delete() etc, Ebean will automatically maintain its cache (removing cached beans and cached queries as appropriate). However, you may often find yourself modifying the database outside of Ebean.

For example, you could be using other frameworks, your own JDBC code, stored procedures, batch systems etc. When you do so (and you are using Ebean caching) then you can inform Ebean so that it invalidates appropriate parts of its cache.

// inform Ebean that some rows have been inserted and updated
// on the o_country table.
// ... Ebean will invalidate the appropriate caches
boolean inserts = true;
boolean updates = true;
boolean deletes = false;
Ebean.externalModification("o_country", inserts, updates, deletes);

Alternative:

ServerCacheManager also provides an explicit API for clearing caches.

// clearAll() caches via the ServerCacheManager ...
ServerCacheManager serverCacheManager = ebeanServer.getServerCacheManager();

// Clear all the caches on the default/primary EbeanServer
serverCacheManager.clearAll();

// clear both the bean and query cache
// for Country beans ...
serverCacheManager.clear(Country.class);