Announcing Predator - Precomputed Data Repositories

Today, the Micronaut team at Object Computing Inc (OCI) introduced Predator, a new open source project whose goal is to significantly improve the runtime and memory performance of data access for microservices and serverless applications, without sacrificing productivity compared to tools like GORM and Spring Data.

Data Access Tools History

We can trace the history of the data repository pattern back to 2004, when Ruby on Rails shipped with the ActiveRecord subsystem, an API that revolutionized how we think about data access in terms of developer productivity.

In 2007, the Grails team introduced the first ActiveRecord-like API for the JVM: GORM (part of Grails). GORM relied on the dynamic nature of Groovy to implement finder methods on top of Hibernate and gave JVM users the same productivity benefits.

Because GORM depends on the Groovy language, the Spring Data project was created in 2011. It allows Java developers to declare finder methods, such as findByTitle, in an interface and have the query logic implemented automatically at runtime.

How data access tools work

All of these implementations use the same pattern: at runtime they build a metamodel of the project's entities that models the relationships between your entity classes. In Spring Data this is the MappingContext, and in GORM it is also called the MappingContext. Both are constructed by scanning classes using reflection. (The similarity in naming is no accident. In 2010, I worked with the Spring Data team on an attempt to recreate GORM for Java, a project that eventually evolved into what is called Spring Data today.)
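The reflective scan described above can be sketched roughly as follows. This is a toy illustration only: the Book entity and the ReflectiveMetamodel class are hypothetical names, and a real MappingContext in Spring Data or GORM tracks far more (associations, identifiers, column mappings) than a map of property names to types.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity used only for this illustration.
class Book {
    private Long id;
    private String title;
    private int pages;
}

// A toy "mapping context": property name -> Java type, discovered reflectively at runtime.
class ReflectiveMetamodel {
    static Map<String, Class<?>> describe(Class<?> entityType) {
        Map<String, Class<?>> properties = new LinkedHashMap<>();
        for (Field field : entityType.getDeclaredFields()) {
            // Skip static and compiler-generated fields; keep instance state only.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            properties.put(field.getName(), field.getType());
        }
        return properties;
    }

    public static void main(String[] args) {
        // Prints the discovered properties; the iteration order of
        // getDeclaredFields() is not guaranteed by the Java specification.
        System.out.println(describe(Book.class));
    }
}
```

Every such scan costs time at startup and keeps the resulting model in memory, which is the overhead the rest of this article is concerned with.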

This metamodel is then used to translate a finder expression, such as bookRepository.findByTitle("The Stand"), into an abstract query model at runtime, using a combination of regular expression parsing and framework logic. An abstract query model is needed because the target query dialect differs for each database (SQL, JPA-QL, Cypher, BSON, etc.).
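As a rough sketch of that translation step, the following code parses a finder method name with a regular expression into a dialect-neutral query object and then renders it for two targets. EqualsQuery and FinderParser are hypothetical names invented for this example, the sketch handles only a single equality criterion, and the table name is assumed to be the lowercased entity name; real frameworks support a much richer grammar and configurable naming.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Dialect-neutral description of a single "property equals value" query (hypothetical model).
final class EqualsQuery {
    final String entity;
    final String property;

    EqualsQuery(String entity, String property) {
        this.entity = entity;
        this.property = property;
    }
}

class FinderParser {
    private static final Pattern FINDER = Pattern.compile("findBy([A-Z]\\w*)");

    // Turns a method name such as "findByTitle" into an abstract query.
    static EqualsQuery parse(String entity, String methodName) {
        Matcher matcher = FINDER.matcher(methodName);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a finder method: " + methodName);
        }
        String raw = matcher.group(1);
        // Decapitalize the property name: "Title" -> "title".
        String property = Character.toLowerCase(raw.charAt(0)) + raw.substring(1);
        return new EqualsQuery(entity, property);
    }

    // Rendering for plain SQL (assumes table name = lowercased entity name).
    static String toSql(EqualsQuery q) {
        return "SELECT * FROM " + q.entity.toLowerCase(Locale.ROOT)
                + " WHERE " + q.property + " = ?";
    }

    // Rendering for JPA-QL from the same abstract query.
    static String toJpaQl(EqualsQuery q) {
        return "SELECT e FROM " + q.entity + " e WHERE e." + q.property + " = :p1";
    }
}
```

The point of the intermediate EqualsQuery object is that the parsing logic is written once, while each database dialect only needs its own rendering method.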

Micronaut Repository Support

Since launching Micronaut a little over a year ago, the missing feature we have been asked about most is "GORM for Java" or Spring Data support. Many developers love the productivity these tools provide, as well as the ease of simply declaring interfaces that the framework implements. I would say that much of the success of Grails and Spring Boot can be attributed to GORM and Spring Data respectively.

Micronaut users working in Groovy have had GORM support from day one, while Java and Kotlin users were left with nothing and had to implement repositories on their own.

It would be technically possible, and frankly easier, to simply add a Micronaut module that configures Spring Data. However, by going down that path we would be shipping a subsystem built on all the techniques that Micronaut tries to avoid: heavy use of proxies, reflection, and high memory consumption.

Introducing Predator!

Predator, short for Precomputed Data Repositories, uses Micronaut's ahead-of-time (AoT) compilation APIs to move the entity metamodel, and the translation of finder expressions (such as findByTitle) into the appropriate SQL or JPA-QL, into your compiler. As a result, all that remains at runtime is a very thin, reflection-free layer that only has to execute the query and return the results.

The results are impressive: cold start time is significantly reduced, memory consumption is remarkably low, and performance improves sharply.

Today we are open-sourcing Predator under the Apache 2 license. It ships with two initial implementations (more are planned for the future): one for JPA (based on Hibernate) and one for SQL with JDBC.

The JDBC implementation is the one I am most pleased with, as it is completely free of reflection and uses no proxies and no dynamic class loading in your data access layer, which leads to improved performance. The runtime layer is so light that even equivalent repository code written by hand will not execute faster.
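To make the idea of a precomputed query more concrete, here is a hand-written stand-in for the kind of thin runtime layer this approach leaves behind. It is not Predator's actual output or internal design: the class names, the table layout, and the SQL string are all assumptions made for illustration. The only point is that the query text is a constant fixed at build time, so nothing has to be parsed or reflected on when the method is called.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical entity and repository, mirroring the findByTitle example in the text.
class Book {
    final long id;
    final String title;

    Book(long id, String title) {
        this.id = id;
        this.title = title;
    }
}

interface BookRepository {
    Book findByTitle(String title) throws SQLException;
}

// Stand-in for what a compile-time processor could produce (assumed shape, not Predator's).
class PrecomputedBookRepository implements BookRepository {
    // Decided at build time; no method-name parsing happens at runtime.
    static final String FIND_BY_TITLE_SQL = "SELECT id, title FROM book WHERE title = ?";

    private final Connection connection;

    PrecomputedBookRepository(Connection connection) {
        this.connection = connection;
    }

    @Override
    public Book findByTitle(String title) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(FIND_BY_TITLE_SQL)) {
            statement.setString(1, title);
            try (ResultSet rows = statement.executeQuery()) {
                // Map the row with direct calls instead of reflective field access.
                return rows.next()
                        ? new Book(rows.getLong("id"), rows.getString("title"))
                        : null;
            }
        }
    }
}
```

Given any open JDBC Connection, new PrecomputedBookRepository(connection).findByTitle("The Stand") would run the prepared statement directly, which is roughly what hand-written repository code would do as well.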

Predator Performance

Since Predator does not need to perform any query translation at runtime, the performance gain is significant. In the world of cloud computing, where you pay for the time your application runs or for each execution of a function, developers often overlook the performance of their data access mechanisms.

The following table summarizes the performance differences you can expect for a simple finder expression such as findByTitle compared to other implementations. All tests were run with a test harness on an 8-core Xeon iMac Pro under the same conditions. The tests are open source and can be found in the repository:

Implementation	Operations per second
Predator JDBC	225K ops/sec
Predator JPA	130K ops/sec
Spring Data JPA	90K ops/sec
GORM JPA	50K ops/sec
Spring Data JDBC	Finders not supported

Yes, you read that right. With Predator JDBC, you can expect a more than 4X performance increase over GORM and 2.5X over Spring Data.

And even if you use Predator JPA, you can count on a more than 2X performance improvement over GORM and up to a 40% increase over Spring Data JPA.
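For readers who want to check the ratios against the table, the arithmetic can be reproduced with a few lines of code. The figures are copied from the table above; the Speedups class itself is just an illustrative helper.

```java
import java.util.Locale;

class Speedups {
    // Throughput figures from the table, in thousands of operations per second.
    static final double PREDATOR_JDBC = 225;
    static final double PREDATOR_JPA = 130;
    static final double SPRING_DATA_JPA = 90;
    static final double GORM_JPA = 50;

    // How many times faster "fast" is than "slow".
    static double ratio(double fast, double slow) {
        return fast / slow;
    }

    public static void main(String[] args) {
        print("Predator JDBC vs GORM JPA", ratio(PREDATOR_JDBC, GORM_JPA));               // 4.50x
        print("Predator JDBC vs Spring Data JPA", ratio(PREDATOR_JDBC, SPRING_DATA_JPA)); // 2.50x
        print("Predator JPA vs GORM JPA", ratio(PREDATOR_JPA, GORM_JPA));                 // 2.60x
        print("Predator JPA vs Spring Data JPA", ratio(PREDATOR_JPA, SPRING_DATA_JPA));   // 1.44x
    }

    private static void print(String label, double value) {
        System.out.printf(Locale.ROOT, "%s: %.2fx%n", label, value);
    }
}
```

The last ratio, about 1.44, corresponds to the increase of roughly 40% over Spring Data JPA quoted above.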

Look at the difference in the size of the execution stack when using Predator compared to alternatives:

[Stack trace screenshots omitted: Predator JDBC, Predator JPA, Spring Data, GORM]

Predator JDBC uses only 15 frames by the time your query is executed, while Predator JPA uses 30 (mainly because of Hibernate), compared to 50+ stack frames in Spring Data or GORM. All of this is thanks to Micronaut's AOP mechanisms, which do not use reflection.

Shorter stack traces also make applications easier to debug. One of the advantages of doing most of the work during compilation is that errors can be detected before the application starts, which greatly improves the developer experience. For the most common mistakes, we get compilation errors immediately instead of runtime errors.

Compile time checks

Most implementations of the repository pattern do all of their work at runtime. This means that if the developer makes a mistake when defining a repository interface, the error will not surface until the application is actually running.

This robs us of some of the type-checking benefits of Java and makes for a poor developer experience. With Predator, this is not the case. Consider the following example:

 @JdbcRepository(dialect = Dialect.H2)
 public interface BookRepository extends CrudRepository<Book, Long> {
     Book findByTile(String title);
 }

Here the BookRepository declares a query on an entity named Book, which has a title property. Unfortunately, there is a mistake in this declaration: we named the method findByTile instead of findByTitle. Rather than letting this code run, Predator fails the compilation with an informative error message:

 Error:(9, 10) java: Unable to implement Repository method: BookRepository.findByTile(String title). Cannot use [Equals] criterion on non-existent property path: tile

Wherever possible, Predator checks repository declarations at compile time, so that an incorrect declaration does not turn into a runtime error.
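The essence of such a check can be sketched in plain Java. In Predator the validation happens inside the compiler; the FinderValidator class below is a hypothetical, simplified simulation that runs as ordinary code and only shows the logic: derive the property name from the method name and compare it against the properties the entity is known to have. The wording of the message is modeled on the compiler output quoted above.

```java
import java.util.Optional;
import java.util.Set;

class FinderValidator {
    private static final String PREFIX = "findBy";

    // Returns an error message if the finder refers to an unknown property, otherwise empty.
    static Optional<String> validate(String methodName, Set<String> entityProperties) {
        if (!methodName.startsWith(PREFIX) || methodName.length() == PREFIX.length()) {
            return Optional.empty(); // not a finder; nothing to check in this sketch
        }
        String raw = methodName.substring(PREFIX.length());
        // Decapitalize: "Tile" -> "tile".
        String property = Character.toLowerCase(raw.charAt(0)) + raw.substring(1);
        if (entityProperties.contains(property)) {
            return Optional.empty();
        }
        return Optional.of(
                "Cannot use [Equals] criterion on non-existent property path: " + property);
    }

    public static void main(String[] args) {
        Set<String> bookProperties = Set.of("id", "title");
        System.out.println(validate("findByTitle", bookProperties).isPresent()); // false
        System.out.println(validate("findByTile", bookProperties).isPresent());  // true
    }
}
```

Running the same logic during compilation, where the entity's properties are already known to the compiler, is what allows the typo to be reported as a build failure rather than as a runtime exception.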

Predator JDBC and GraalVM Substrate

Another reason to be excited about Predator is that it is compatible out of the box with GraalVM Substrate native images and does not require complex bytecode transformations during the build, unlike Hibernate on GraalVM.

By completely eliminating reflection and dynamic proxies from the data access layer, Predator greatly simplifies building data-driven applications that run on GraalVM.

The Predator JDBC sample application runs on Substrate without problems and produces a much smaller native image (25 MB smaller!) than Hibernate requires, thanks to a much thinner runtime layer.

We saw the same effect when we implemented compile-time Bean Validation for Micronaut 1.2. As soon as we removed the dependency on Hibernate Validator, the native image size dropped by 10 MB and the JAR size by 2 MB.

The advantage here is obvious: by doing more work at compile time and building leaner runtimes, you get a smaller native image and a smaller JAR file, which results in smaller microservices that are easier to deploy through Docker. The future of Java frameworks is more powerful compilers and smaller, lighter runtimes.

Predator and the future

We are just getting started with Predator and are extremely excited about the opportunities it opens up.

Initially we are starting with support for JPA and SQL, but in the future you can expect support for MongoDB, Neo4J, Reactive SQL, and other databases. Fortunately, this work is much simpler than it sounds, because most of Predator is actually based on the GORM source code, and we can reuse the logic from GORM for Neo4J and GORM for MongoDB to release these implementations sooner than you might expect.

Predator is the culmination of combining the various building blocks in Micronaut Core that made it possible, from the AoT APIs, which are also used to generate Swagger documentation, to the relatively new Bean Introspection support, which lets you analyze objects at runtime without reflection.

Micronaut provides building blocks for amazing things. Predator is one such thing, and we are just starting to work on some of the promising features of Micronaut 1.0.

Source: https://habr.com/ru/post/460839/
