Java applications often use a cache to reduce the load on the database. Few people really understand how the cache works under the hood. Adding an annotation is not always enough; you need to understand how the system works. So in this article I will try to explain how the cache of the popular ORM framework Hibernate works. First, a little theory.
First of all, Hibernate has three levels of caching:
- First-level cache;
- Second-level cache;
- Query cache.
First level cache
The first-level cache is always bound to the session object. Hibernate uses this cache by default, and it cannot be disabled. Let's take a look at the following code right away:
SharedDoc persistedDoc = (SharedDoc) session.load(SharedDoc.class, docId);
System.out.println(persistedDoc.getName());
user1.setDoc(persistedDoc);

persistedDoc = (SharedDoc) session.load(SharedDoc.class, docId);
System.out.println(persistedDoc.getName());
user2.setDoc(persistedDoc);
You might expect two queries to be executed against the database. That is not the case. In this example only one query is executed, even though load() is called twice, because both calls happen within the same session. On the second attempt to load an entity with the same identifier, the session cache is used.
One important point: when you use the load() method, Hibernate does not fetch the data from the database until it is needed. In other words, the first call to load() returns a proxy object, or the entity itself if it is already in the session cache. That is why the code calls getName(): it forces the data to actually be fetched from the database. This also opens up a useful optimization. With a proxy object we can link two objects without querying the database at all, unlike with the get() method. The save(), update(), saveOrUpdate(), load(), get(), list(), iterate(), and scroll() methods always use the first-level cache. There is not much more to add here.
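The behavior described above can be sketched with a toy model in plain Java. This is not Hibernate code and not how Hibernate is implemented; the names ToyDatabase, DocProxy, and ToySession are hypothetical and exist only for illustration. The sketch assumes a session that keeps one proxy per identifier and a proxy that queries the database only on first access to its data.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes exist in Hibernate.
class ToyDatabase {
    private final Map<Long, String> rows = new HashMap<>();
    private int queryCount = 0;

    void insert(long id, String name) {
        rows.put(id, name);
    }

    // Every call simulates one SELECT.
    String selectName(long id) {
        queryCount++;
        return rows.get(id);
    }

    int queryCount() {
        return queryCount;
    }
}

// Stand-in for the proxy returned by load(): it knows the id but has fetched nothing yet.
class DocProxy {
    private final long id;
    private final ToyDatabase db;
    private String name;
    private boolean initialized = false;

    DocProxy(long id, ToyDatabase db) {
        this.id = id;
        this.db = db;
    }

    String getName() {
        if (!initialized) { // the first real access triggers the simulated SELECT
            name = db.selectName(id);
            initialized = true;
        }
        return name;
    }
}

// Stand-in for a session together with its first-level cache.
class ToySession {
    private final ToyDatabase db;
    private final Map<Long, DocProxy> firstLevelCache = new HashMap<>();

    ToySession(ToyDatabase db) {
        this.db = db;
    }

    // The same id within one session always yields the same instance.
    DocProxy load(long id) {
        return firstLevelCache.computeIfAbsent(id, key -> new DocProxy(key, db));
    }
}

class FirstLevelCacheDemo {
    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        db.insert(1L, "first");
        ToySession session = new ToySession(db);

        DocProxy a = session.load(1L);
        System.out.println("queries after load(): " + db.queryCount());
        System.out.println(a.getName());
        DocProxy b = session.load(1L);
        System.out.println(b.getName());
        System.out.println("same instance: " + (a == b));
        System.out.println("queries in total: " + db.queryCount());
    }
}
```

In this model the call to load() alone costs no query, and the two getName() calls together cost exactly one, which mirrors the single query in the example above.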
Second level cache
While the first-level cache is bound to the session object, the second-level cache is bound to the session factory object. This means its scope is much wider than that of the first-level cache. Example:
Session session = factory.openSession();
SharedDoc doc = (SharedDoc) session.load(SharedDoc.class, 1L);
System.out.println(doc.getName());
session.close();

session = factory.openSession();
doc = (SharedDoc) session.load(SharedDoc.class, 1L);
System.out.println(doc.getName());
session.close();
In this example two queries are executed, because the second-level cache is disabled by default. To enable it, add the following property to your JPA configuration file (persistence.xml):
<property name="hibernate.cache.provider_class" value="net.sf.ehcache.hibernate.SingletonEhCacheProvider"/>
Note the provider class. Hibernate itself does not implement caching as such. It only provides the structure for it, so you can plug in any implementation that meets the specification of our ORM framework. Popular implementations include the following:
- Ehcache
- OSCache
- Swarmcache
- JBoss TreeCache
On top of this, you will most likely also need to configure the cache implementation itself. For EHCache this is done in the ehcache.xml file. Finally, you need to tell Hibernate what exactly to cache. Fortunately, this is easy to do with annotations, like this:
@Entity
@Table(name = "shared_doc")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class SharedDoc {
    private Set<User> users;
}
Only after all these steps is the second-level cache enabled, and the example above executes only one query against the database.
Another important detail about the second-level cache is that Hibernate does not store the instances of your classes themselves. It stores the information as arrays of strings, numbers, and so on, and the object identifier acts as a pointer to this information. Conceptually, it is something like a Map in which the object id is the key and the data array is the value. You can picture it roughly like this:
1 -> { "Pupkin", 1, null , {1,2,5} }
This is quite reasonable, considering how much extra memory each object takes.
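As a rough illustration of the "id to array of values" idea, here is a toy model in plain Java. It is only a sketch: Doc and DehydratedCache are hypothetical names, the field layout is an assumption made for this example, and Hibernate's real cache entries are more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: Doc and DehydratedCache are hypothetical, not Hibernate classes.
class Doc {
    final long id;
    final String name;
    final int version;
    final Long parentId; // may be null

    Doc(long id, String name, int version, Long parentId) {
        this.id = id;
        this.name = name;
        this.version = version;
        this.parentId = parentId;
    }
}

class DehydratedCache {
    // id -> flat array of field values, for example 1 -> { "Pupkin", 1, null }
    private final Map<Long, Object[]> entries = new HashMap<>();

    // Keep only the field values, not the object itself.
    void put(Doc doc) {
        entries.put(doc.id, new Object[] { doc.name, doc.version, doc.parentId });
    }

    // Build a fresh object from the cached values on every read.
    Doc get(long id) {
        Object[] state = entries.get(id);
        if (state == null) {
            return null;
        }
        return new Doc(id, (String) state[0], (Integer) state[1], (Long) state[2]);
    }
}

class DehydratedCacheDemo {
    public static void main(String[] args) {
        DehydratedCache cache = new DehydratedCache();
        cache.put(new Doc(1L, "Pupkin", 1, null));

        Doc a = cache.get(1L);
        Doc b = cache.get(1L);
        System.out.println(a.name + " / " + b.name);
        System.out.println("same instance: " + (a == b));
    }
}
```

Because every read rebuilds the object from the stored values, two reads return equal data but different instances. That is one practical consequence of caching values rather than objects.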
In addition, remember that the associations of your class are not cached by default either. For example, for the SharedDoc class above, the users collection will be fetched from the database, not from the second-level cache. If you want to cache associations as well, the class should look like this:
@Entity
@Table(name = "shared_doc")
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class SharedDoc {
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
    private Set<User> users;
}
One last detail: the second-level cache is read only if the requested object was not found in the first-level cache.
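The lookup order can be sketched as follows. This is again a toy model in plain Java with hypothetical names (LayeredFactory, LayeredSession), not Hibernate's implementation. It assumes a factory-wide map shared by all sessions, a per-session map, and strings standing in for entity state.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical names, not Hibernate classes.
class LayeredFactory {
    final Map<Long, String> secondLevelCache = new HashMap<>(); // shared by all sessions
    final Map<Long, String> database = new HashMap<>();         // stand-in for a table
    int queryCount = 0;

    LayeredSession openSession() {
        return new LayeredSession(this);
    }
}

class LayeredSession {
    private final LayeredFactory factory;
    private final Map<Long, String> firstLevelCache = new HashMap<>(); // private to this session
    String lastSource; // where the most recent get() found its data

    LayeredSession(LayeredFactory factory) {
        this.factory = factory;
    }

    String get(long id) {
        // 1. The session cache is checked first.
        String name = firstLevelCache.get(id);
        if (name != null) {
            lastSource = "first-level";
            return name;
        }
        // 2. Then the factory-wide cache.
        name = factory.secondLevelCache.get(id);
        if (name != null) {
            lastSource = "second-level";
        } else {
            // 3. Only then the database; the result is stored for other sessions.
            factory.queryCount++;
            name = factory.database.get(id);
            lastSource = "database";
            if (name != null) {
                factory.secondLevelCache.put(id, name);
            }
        }
        if (name != null) {
            firstLevelCache.put(id, name);
        }
        return name;
    }
}

class LookupOrderDemo {
    public static void main(String[] args) {
        LayeredFactory factory = new LayeredFactory();
        factory.database.put(1L, "first");

        LayeredSession s1 = factory.openSession();
        s1.get(1L);
        System.out.println(s1.lastSource);
        s1.get(1L);
        System.out.println(s1.lastSource);

        LayeredSession s2 = factory.openSession();
        s2.get(1L);
        System.out.println(s2.lastSource);
        System.out.println("queries: " + factory.queryCount);
    }
}
```

In this model the first read in the first session goes to the database, a repeated read in the same session is served by the first-level cache, and a new session is served by the second-level cache, so only one query is counted in total.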
Query cache
Rewrite the first example like this:
Query query = session.createQuery("from SharedDoc doc where doc.name = :name");

SharedDoc persistedDoc = (SharedDoc) query.setParameter("name", "first").uniqueResult();
System.out.println(persistedDoc.getName());
user1.setDoc(persistedDoc);

persistedDoc = (SharedDoc) query.setParameter("name", "first").uniqueResult();
System.out.println(persistedDoc.getName());
user2.setDoc(persistedDoc);
The results of such queries are not stored in either the first-level or the second-level cache. This is where the query cache comes in. It is also disabled by default. To enable it, add the following line to the configuration file:
<property name="hibernate.cache.use_query_cache" value="true"/>
Also modify the example above by adding the following after the Query object is created (the same applies to Criteria):
Query query = session.createQuery("from SharedDoc doc where doc.name = :name");
query.setCacheable(true);
The query cache is similar to the second-level cache. Unlike it, however, the key is not an object identifier but the query together with its parameters, and the value is the set of identifiers of the objects that match the query. For this reason it only makes sense to use the query cache together with the second-level cache.
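A toy model in plain Java shows why the query cache depends on the second-level cache. This is a sketch under simplifying assumptions, not Hibernate code: ToyQueryCache and its fields are hypothetical, the key is just the query text plus the parameter value, and cache invalidation is not modeled at all.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not Hibernate classes.
class ToyQueryCache {
    final Map<Long, String> database = new HashMap<>();          // id -> name
    final Map<Long, String> secondLevelCache = new HashMap<>();  // id -> cached state
    final Map<String, List<Long>> queryCache = new HashMap<>();  // query + params -> matching ids
    final boolean entityCacheEnabled;
    int queryCount = 0;

    ToyQueryCache(boolean entityCacheEnabled) {
        this.entityCacheEnabled = entityCacheEnabled;
    }

    List<String> findByName(String name) {
        String key = "from SharedDoc doc where doc.name = :name|name=" + name;
        List<String> result = new ArrayList<>();
        List<Long> ids = queryCache.get(key);

        if (ids == null) {
            // Query cache miss: one simulated SELECT runs the actual query.
            queryCount++;
            ids = new ArrayList<>();
            for (Map.Entry<Long, String> row : database.entrySet()) {
                if (row.getValue().equals(name)) {
                    ids.add(row.getKey());
                    result.add(row.getValue());
                    if (entityCacheEnabled) {
                        secondLevelCache.put(row.getKey(), row.getValue());
                    }
                }
            }
            queryCache.put(key, ids); // only the identifiers are remembered
            return result;
        }

        // Query cache hit: each id still has to be turned back into data.
        for (Long id : ids) {
            String state = secondLevelCache.get(id);
            if (state == null) {
                queryCount++; // one simulated SELECT per id missing from the entity cache
                state = database.get(id);
            }
            result.add(state);
        }
        return result;
    }
}

class QueryCacheDemo {
    public static void main(String[] args) {
        for (boolean enabled : new boolean[] { true, false }) {
            ToyQueryCache cache = new ToyQueryCache(enabled);
            cache.database.put(1L, "first");
            cache.database.put(2L, "first");
            cache.database.put(3L, "other");

            cache.findByName("first");
            cache.findByName("first");
            System.out.println("entity cache " + enabled + ": " + cache.queryCount + " queries");
        }
    }
}
```

With the entity cache enabled, the second execution costs nothing in this model. With it disabled, the cached identifiers each trigger their own lookup, so the "cached" query ends up more expensive than the original one.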
Caching strategies
Caching strategies determine how the cache behaves in certain situations. There are four of them:
- Read-only
- Read-write
- Nonstrict-read-write
- Transactional
See the Hibernate documentation for details on each strategy.
Cache region
A region is a logical partition of your cache memory. Each region can have its own caching policy (for EhCache, in the same ehcache.xml). If no region is specified, the default region is used, which is named after the fully qualified name of the class being cached. In code it looks like this:
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE, region = "STATIC_DATA")
And for the query cache like this:
query.setCacheRegion("STATIC_DATA");
What else do you need to know?
During development, especially early on, it is very convenient to see whether particular queries are actually being cached. To do that, set the following properties on the session factory:
<property name="hibernate.show_sql" value="true"/>
<property name="hibernate.format_sql" value="true"/>
In addition, the session factory can generate and store statistics on the use of all objects, regions, and associations in the cache:
<property name="hibernate.generate_statistics" value="true"/>
<property name="hibernate.cache.use_structured_entries" value="true"/>
These statistics are exposed through the Statistics object for the factory and the SessionStatistics object for the session.
Session methods:
flush() - synchronizes the session objects with the database and at the same time updates the session cache itself.
evict() - removes an object from the session cache.
contains() - checks whether an object is in the session cache.
clear() - clears the entire session cache.
Conclusion
That's all. Naturally, working with a cache involves quite a few more nuances, as well as many problems, that are outside the scope of this article. But that is a topic for another article.