At the main Siberian Java conference, JBreak 2018, held in Novosibirsk, Christian Thalinger from Twitter shared his practical experience of using Graal. Our company ("Peter-Service") sent the entire working group to the conference, and we went to this talk together. That is understandable, given that Graal is still considered a bold and potentially risky experiment (although it is very likely to land in JDK 10). It was very interesting to learn how this new technology performs in real production, and not just anywhere, but at a company of this level.

Christian Thalinger has been working on Java virtual machines for more than a dozen years, and JIT compilers are the core of his expertise. It was Christian who brought Graal to Twitter and initiated its current (very active, by his account) use in the company's production environment. According to Thalinger, this move saves the company a lot of money by reducing hardware costs.
In this interview with the JBreak organizers, Christian lucidly explains the basics: what Graal is and how to work with it. The talk in Novosibirsk, though, was more practice-oriented: its main goal was to show the audience how to start working with Graal simply and painlessly, and why it is worth trying.
To start with, a bit of theory. What is a JIT (just-in-time) compiler? Running a Java program involves several steps: first the source code is compiled into instructions for the JVM (bytecode), and then the JVM executes that bytecode, acting as an interpreter. The JIT compiler was created to speed up Java applications: it optimizes the running bytecode by translating it into low-level machine instructions while the program is executing.
HotSpot/OpenJDK uses two tiers of JIT compilation, both implemented in C++: C1 and C2 (also known as the client and server compilers). By default they work together: first a quick but shallow optimization pass is done by C1, and then the hottest methods are further optimized by C2.
In Java 9, JEP 243 introduced a mechanism for plugging a compiler written in Java into the JVM: JVMCI (Java Virtual Machine Compiler Interface). This is the interface through which Graal is attached. It should be said that Graal already shipped in Java 9 as part of JEP 295, ahead-of-time (AOT) compilation in the JVM. However, although the AOT compilation machinery uses Graal as its compiler, that JEP states that the initial integration of Graal into the JDK is limited to the Linux/x64 platform.
So, to try Graal, you need a JDK with AOT and JVMCI. And if you need to run on macOS or Windows, you will have to wait for the release of Java 10 (in the corresponding ticket, JDK-8172670, the fix version is set to 10).

Here Christian pointed out that the Graal version shipped in current JDK distributions is, to put it mildly, outdated (about a year old, if not more). But this is where the modularity of Java 9 comes to the rescue: thanks to it, we can build the latest Graal from source and plug it into the JVM with the --upgrade-module-path option. Since Graal development started long before the module system, a special tool called mx is used to build it, which to some extent mirrors the Java module system. The tool runs on Python 2.7; all the links can be found in the Graal repository on GitHub.
That is, we first download and install mx, then fetch the Graal sources and build them into a module with mx, which then replaces the original module in the JDK.
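The steps above can be sketched roughly like this (repository locations and mx commands as of early 2018; paths and the jar name are illustrative, and Python 2.7 must be on the PATH):

```shell
# 1. Get mx, the build tool
git clone https://github.com/graalvm/mx.git
export PATH=$PWD/mx:$PATH

# 2. Get the Graal sources and build the compiler
git clone https://github.com/oracle/graal.git
cd graal/compiler
mx build

# 3. Either run a JVM through mx directly...
mx vm -version
# ...or point an existing JDK 9 at the freshly built module
java --upgrade-module-path=/path/to/built/graal.jar -version
```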
At first glance these manipulations may seem complicated and labor-intensive, but in reality it is not as scary as it looks. And the ability to swap in a new Graal version without waiting for a JDK patch, or even a new JDK release, personally strikes me as more than convenient. At the very least, Christian showed how he built it all live on machines in the cloud. An error did occur while building Truffle (some additional dependencies had to be installed on the machine), but Graal itself built correctly and was used in that form, from which we conclude that Graal is completely independent of Truffle, and Truffle can be set aside.
Next: for the JVM to start using Graal, you need to set three additional flags:
-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler
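Putting these flags together with the module replacement, a full launch might look like this (the jar paths and application name are hypothetical):

```shell
java -XX:+UnlockExperimentalVMOptions \
     -XX:+EnableJVMCI \
     -XX:+UseJVMCICompiler \
     --upgrade-module-path=/path/to/built/graal.jar \
     -jar myapp.jar
```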
Since Graal is, in essence, an ordinary Java application, it also needs to compile and warm itself up (so-called bootstrapping). In the default (on-demand) mode this happens in parallel with the application's startup, and Graal uses C1 to optimize its own code.
It is also possible to run this initialization explicitly before the application starts, and in that case you can even instruct Graal to optimize itself. However, this usually takes noticeably longer and brings no significant benefit. Graal initializes a little more slowly than C1/C2 and uses spare CPU capacity more actively, because it has more classes to compile. But these differences are not that large and are practically leveled out, lost in the general noise of application startup.
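A sketch of the bootstrap modes described above (the -XX:+BootstrapJVMCI flag is part of HotSpot's JVMCI support; the jvmci.CompileGraalWithC1Only property is a Graal option whose exact name may differ between versions, and myapp.jar is a placeholder):

```shell
# Default (on-demand): Graal compiles itself lazily, with C1, while the app runs
java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler \
     -jar myapp.jar

# Explicit bootstrap before the application starts
java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler \
     -XX:+BootstrapJVMCI -jar myapp.jar

# Additionally ask Graal to optimize itself instead of relying on C1
# (usually takes longer for little benefit, as noted above)
java -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler \
     -XX:+BootstrapJVMCI -Djvmci.CompileGraalWithC1Only=false -jar myapp.jar
```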
In addition, since Graal is written in Java, it uses the Java heap during initialization (C1/C2, by contrast, allocate their memory natively via malloc). The bulk of this consumption happens at application startup. Both Graal and C1/C2 use free cores when compiling. Graal's memory consumption can be traced by enabling GC logging (at the moment there is no isolation of Graal's heap usage from the main application heap).
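GC logging in JDK 9 goes through unified logging; a minimal way to watch heap behavior (Graal's own allocations show up mixed in with the application's, since they share one heap; myapp.jar is a placeholder):

```shell
# Basic GC events
java -Xlog:gc -jar myapp.jar

# More detail: heap occupancy around each collection
java -Xlog:gc+heap=debug -jar myapp.jar
```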
Well, we learned how to set it all up - it's time to understand why. What are the benefits of using Graal?
Christian answered this question with a practical example. He ran a couple of benchmarks from a single project written in Scala: one CPU-bound, the other more memory-intensive. On the CPU-bound benchmark, Graal showed a noticeable slowdown, about a second on average, due to the longer start (the benchmark itself ran for about 5 seconds). But on the second benchmark Graal did quite well: ~20 seconds versus ~28 on C1/C2. And this despite the fact that, as Christian noted, Graal does not handle Scala as well as it could (because of the dynamic structure of the bytecode Scala generates). So one can hope that with a pure Java application things would be even better.
Moreover, the GC logs showed that with Graal the application performed far fewer garbage collections (roughly half as many). This is attributed to Graal's more effective escape analysis, which reduces the number of objects allocated on the heap.
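The effect of escape analysis is easy to check on any HotSpot, since it is enabled by default even with plain C1/C2: turning it off makes the difference in allocation rate visible in the GC log (benchmark.jar below is a placeholder for any allocation-heavy workload):

```shell
# With escape analysis (default): short-lived objects can be scalar-replaced
# and never reach the heap
java -Xlog:gc -jar benchmark.jar

# Without it: every allocation hits the heap, so collections come more often
java -XX:-DoEscapeAnalysis -Xlog:gc -jar benchmark.jar
```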

Summarizing my personal impressions, I will say that the talk struck me as fairly well-rounded and not at all an advertisement in the spirit of "everyone switch to Graal now". Clearly there is no magic pill, and everything is always determined by the real application; Christian himself acknowledges that the specific numbers, of course, depend on the specific benchmarks. Those who decide to try Graal will in any case have to experiment by trial and error, run things and (surely) find bugs (and ideally then fix them and submit pull requests to the Graal repo).
But in general, given the current trend toward microservices and stateless applications, and consequently garbage that mostly lives in the young generation, Graal looks very promising when applied more actively (and correctly).
So, if a project can be migrated to Java 9 with little effort (or written on it from scratch), I would definitely try Graal. I was even pleased that the talk focused specifically on Graal as a JIT compiler, because that is the capacity in which an ordinary Java developer needs it (that is, without Truffle and the other parts of GraalVM, which Oracle recently combined into a development and runtime framework for various JVM-based languages). It would be interesting to measure the memory cost and see how noticeable the difference between the standard C1/C2 and Graal is. On the other hand, applications are allocated a fairly generous amount of memory these days, and the bulk of it is consumed at startup (which nowadays usually means container initialization plus application launch), so these figures are probably not that significant in any case.
Here you can download the presentation from the talk. To tell the truth, I was so taken with the idea that I plan to repeat all the steps Christian took, but try running standard Java benchmark suites directly (for example, DaCapo and SPECjvm2008 - I am not that strong in Java benchmarking, so I would appreciate it if someone suggested more adequate options in the comments or by private message). And, closer to the specifics of my day-to-day work, I will try to sketch a simple web application (for example, Spring Boot + Jetty + PostgreSQL), drive it under load and compare the numbers. I promise to share the results with the community.