Review of Ruslan Cheryomin's talk at JPoint 2016
Good afternoon, everyone. Roman Poborchy is back on the JUG.ru blog, and today we are reviewing the talk by Ruslan (cheremin) Cheryomin on escape analysis and scalarization:
The slides can be viewed and downloaded here. The traditional disclaimer: only the talk under review is about Java, not this article itself.
Plot
Here most of my impressions are positive: everything that I think should be in the talk is in it. The changes I propose are mostly cosmetic.
Problem statement
I don't know about you, but Ruslan completely convinced me that it is useful to understand how escape analysis works in Java. If you write code with this knowledge in mind, there will be fewer objects on the heap, less GC work, and in general the world will be a better place. The question "why" is answered.
However, the joke picture that appears at 03:37 somewhat overshadows the statistics. After the first viewing I remembered only this graph and was about to complain that there were no figures on how much you can gain "on average". On the second viewing it turned out that the numbers (~15% of allocations can be eliminated) are mentioned. If a similar "normal" case could be found and shown, it would stick in memory better. Of course, it is not always easy to find such data from your own practice on demand, but still.
Conclusions
The conclusions and recommendations follow naturally from the logic of the story; they are practically useful and concrete. We do not merely learn what counts as doing things "well" (for example, writing small methods) and why it should be done. We also learn that this "well" almost always has an exact quantitative measure, and that you can even redefine it if necessary! What more could one wish for?
Perhaps viewers could be warned that something will break if we override the compiler defaults often and mindlessly, but that would require examples and time and would blur the focus.
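To make "an exact quantitative measure" tangible, here is a minimal sketch that reads a few JIT settings from a running JVM. It rests on assumptions: this review does not reproduce the thresholds from the talk, so the flag names below (MaxInlineSize, FreqInlineSize, DoEscapeAnalysis, EliminateAllocations) are simply HotSpot options I take to be the relevant ones, and the class name JitThresholdProbe is made up for illustration. On a JVM that does not expose a flag, the sketch reports "unavailable".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public final class JitThresholdProbe {

    /** Returns the current value of a HotSpot flag, or "unavailable" if it cannot be read. */
    public static String read(String flagName) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(flagName).getValue();
        } catch (RuntimeException e) {
            // Unknown flag or a JVM without this diagnostic bean.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        // Assumption: these are the settings that matter for inlining and scalarization.
        String[] flags = {
            "MaxInlineSize", "FreqInlineSize", "DoEscapeAnalysis", "EliminateAllocations"
        };
        for (String flag : flags) {
            System.out.println(flag + " = " + read(flag));
        }
    }
}
```

Running it with and without an override such as -XX:MaxInlineSize=... on the command line shows whether the default was actually changed, which is a cheap sanity check before blaming or praising the compiler.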
Omission
What I miss here is a description of the experimental methodology: the results of the runs first appear on slide 16 (at 13:38), and right away there are 12 runs, limited not by the number of calls but by time. Why? Does this mean something, or is it just how things turned out historically? Why is the time 5 seconds in some cases (slides 16, 17, 19 and several others), 3 seconds in others (slides 68, 73 and several others), and not specified at all in yet others (slides 89, 90, 91)?
I am sure it is enough to explain the methodology once here, simply to remove possible distrust, and then all later mentions will not need the repetitive details; more on this in the section on slides.
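For readers who want to picture what "a run limited by time, not by the number of calls" can look like, here is a minimal sketch. It is not the speaker's harness: the class TimedAllocationRun, the workload, and the use of com.sun.management.ThreadMXBean to count allocated bytes are all my assumptions, and the 100 ms budget is shortened from the 5 or 3 seconds seen on the slides.

```java
import java.lang.management.ManagementFactory;

public final class TimedAllocationRun {

    /** Outcome of one run: calls completed within the budget and bytes allocated meanwhile. */
    public static final class RunResult {
        public final long iterations;
        public final long allocatedBytes;

        RunResult(long iterations, long allocatedBytes) {
            this.iterations = iterations;
            this.allocatedBytes = allocatedBytes;
        }

        public double bytesPerIteration() {
            return iterations == 0 ? 0.0 : (double) allocatedBytes / iterations;
        }
    }

    /** Bytes allocated so far by the current thread, or -1 if the JVM does not report it. */
    static long threadAllocatedBytes() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (bean instanceof com.sun.management.ThreadMXBean) {
            return ((com.sun.management.ThreadMXBean) bean)
                    .getThreadAllocatedBytes(Thread.currentThread().getId());
        }
        return -1;
    }

    /** Calls the workload repeatedly until the time budget is used up. */
    public static RunResult runFor(long millis, Runnable workload) {
        long budgetNanos = millis * 1_000_000L;
        long start = System.nanoTime();
        long before = threadAllocatedBytes();
        long iterations = 0;
        do {
            workload.run();
            iterations++;
        } while (System.nanoTime() - start < budgetNanos);
        long after = threadAllocatedBytes();
        return new RunResult(iterations, Math.max(0, after - before));
    }

    // Hypothetical workload: a short-lived object that never leaves the method.
    private static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static long sink;

    public static void workload() {
        Point p = new Point(7, 35);
        sink += p.x + p.y;
    }

    public static void main(String[] args) {
        final int runs = 12;          // as on the slides
        final long runMillis = 100;   // shortened for the sketch
        for (int i = 0; i < runs; i++) {
            RunResult r = runFor(runMillis, TimedAllocationRun::workload);
            System.out.printf("run %d: %d iterations, %.2f bytes/iteration%n",
                    i, r.iterations, r.bytesPerIteration());
        }
    }
}
```

A time budget keeps every run the same length whatever the speed of the code, at the cost of a different number of calls per run; a single sentence to that effect on the first results slide would already answer the questions above.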
A gun hanging on the stage
The gun was not hung by the speaker himself: someone in the audience shouted it out from their seat. But, one way or another, Φ-functions came up twice (in particular, for about half a minute starting at 16:40). Compilers are one of those subjects that take fundamental knowledge to understand, and our audience does not always have it. Since two or three people in the know had found each other, it was worth trying to convey to everyone else, in general terms, what it was about, so that they did not feel left out.
I wondered whether it would be possible, with no slides prepared in advance, to explain in simple terms and in about a minute what these Φ-functions are and roughly how they relate to the topic of the talk. I use myself as the benchmark, that is, a person who took a special course on compilers twenty years ago and has since dealt with them only as a user. When I watched the talk I, of course, did not remember what they were. So, let us look at slide 23 and try to explain what is happening there:
Here's what I got:
The internal code representation used in the HotSpot JIT is in static single assignment (SSA) form. This means that each variable is assigned exactly once, and if the source writes to a variable several times, the compiler creates a new version of that variable for each write. This is needed for a great many of the optimizations that the compiler performs on its internal representation. So in our code, v in the if branch, v in the else branch, and v in the return statement are, from the compiler's point of view, three different variables. Φ-functions are needed exactly for cases like the return statement. A Φ-function is an abstraction that assigns a value to a variable depending on which execution branch brought us to this point. The beauty and the point of this function is that it is used without breaking up the basic block, although by human logic it contains a conditional statement inside. But the function, like the conditional inside it, is not real (Φ is in fact short for "phony", fake): it is a permitted cheat. Nevertheless, we see empirically that this construct blocks the escape analysis implemented in the JIT.
The text on its own turns out rather heavy, but if you prepare in advance you can draw a diagram and need not be afraid to talk about SSA form: at the conceptual level it is simple. Implementations probably hold many non-obvious details, and with some audiences you would need more digressions to explain what an internal representation and a basic block are, but nonetheless.
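In place of a diagram, the same idea can be sketched in code. This is a hypothetical example of my own, not the code from slide 23: the class PhiSketch, the holder Pair, and both methods are invented for illustration, and the SSA names v1, v2, v3 in the comments are a simplification of what a real compiler builds.

```java
public final class PhiSketch {

    /** Tiny holder; a candidate for scalar replacement as long as it does not escape. */
    static final class Pair {
        final int first;
        final int second;

        Pair(int first, int second) {
            this.first = first;
            this.second = second;
        }
    }

    /** One source variable v, written in two branches and read after they merge. */
    public static int diff(boolean flag, int a, int b) {
        Pair v;
        if (flag) {
            v = new Pair(a, b);   // SSA view: v1 = new Pair(a, b)
        } else {
            v = new Pair(b, a);   // SSA view: v2 = new Pair(b, a)
        }
        // SSA view: v3 = Φ(v1, v2), i.e. v1 or v2 depending on the branch taken
        return v.first - v.second;
    }

    /** Same result, rewritten so that no object reference is merged after the branches. */
    public static int diffNoMerge(boolean flag, int a, int b) {
        if (flag) {
            Pair v = new Pair(a, b);
            return v.first - v.second;
        } else {
            Pair v = new Pair(b, a);
            return v.first - v.second;
        }
    }

    public static void main(String[] args) {
        System.out.println(diff(true, 5, 3));         // prints 2
        System.out.println(diffNoMerge(false, 5, 3)); // prints -2
    }
}
```

Both methods compute the same thing; they differ only in whether a reference to a freshly created object passes through a merge point. Whether the allocation is actually eliminated in either form depends on the JVM version and settings, so it has to be measured rather than assumed.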
If a real compiler engineer reads this post, they may, without even taking their well-thumbed Dragon Book off the shelf, point out the errors and at the same time explain how to tell it correctly.
Slides
The story is easy to follow, and the slides help. Where there is a lot of text or a large dump of console output, the interesting parts are highlighted. There are many code examples, and they follow one another faster and faster as the talk goes on; this gets a little tiring, but it is nothing critical. A couple of things are still worth pointing out.
Laser pointer
Those who have read the previous review already know everything about using a laser pointer. Ruslan's talk takes place in the same wide hall with two screens as Sergey's, and in exactly the same way, at any given moment about a third of the audience cannot see the pointer.
In each case, the slide can be modified to make it immediately clear what is being discussed. For example, slide 13 (appears at 10:20) is very easy to fix:
It is enough not to show the whole text at once but to reveal the lines one by one as the speaker gets to them.
Repeated elements
While watching the talk, I kept wanting to do something about the quantitative results of the experiments, which appear in the same format on many slides:
What is wrong with them?
It is a lot of dense text, which is hard to read even if you wanted to.
The interesting moment comes not between runs but somewhere inside one, usually near the beginning of the first. So results grouped by run (0.01 bytes per iteration) look odd, even though they are easier to obtain.
The viewer gets instant information about the outcome of the experiment from the color. That is fine, but it can cause difficulties for people with red-green color vision deficiency (which, unfortunately, is the most common kind).
How to simplify it?
The most telling picture would be a graph where the X axis shows the call number (counted continuously across all runs) and the Y axis shows the number of bytes allocated on the heap by that call. In the bad case the graph is a horizontal line; in the good case it is horizontal for the first so many (thousand) calls and then drops to zero. It would also show how quickly the optimization kicks in, something like this:
In this form it would be interesting to follow the dynamics of what happens with the probability function (slides 89-91). There the graph would most likely turn out to be something interesting:
Perhaps I am misunderstanding something, and the data for such a graph cannot be obtained in principle, even with a stretch. In that case you can try greatly reducing the size of the runs and building the graph from them.
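The "many small runs" fallback can be sketched as follows. Again, this is an illustration under assumptions rather than the speaker's tooling: the class AllocationSeries, its workload, and the use of com.sun.management.ThreadMXBean as the allocation counter are mine. Each output line is one point of the proposed graph: the number of the first call in a batch and the average bytes allocated per call within that batch.

```java
import java.lang.management.ManagementFactory;

public final class AllocationSeries {

    /** Bytes allocated so far by the current thread, or -1 if the JVM does not report it. */
    static long threadAllocatedBytes() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (bean instanceof com.sun.management.ThreadMXBean) {
            return ((com.sun.management.ThreadMXBean) bean)
                    .getThreadAllocatedBytes(Thread.currentThread().getId());
        }
        return -1;
    }

    /** Returns one value per batch: average bytes allocated per call in that batch. */
    public static double[] measure(int batches, int callsPerBatch, Runnable workload) {
        double[] series = new double[batches]; // allocated up front, outside the measurement
        for (int b = 0; b < batches; b++) {
            long before = threadAllocatedBytes();
            for (int i = 0; i < callsPerBatch; i++) {
                workload.run();
            }
            long after = threadAllocatedBytes();
            series[b] = Math.max(0, after - before) / (double) callsPerBatch;
        }
        return series;
    }

    // Hypothetical workload: a short-lived object that never leaves the method.
    private static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static long sink;

    public static void workload() {
        Point p = new Point(7, 35);
        sink += p.x + p.y;
    }

    public static void main(String[] args) {
        final int batches = 200;
        final int callsPerBatch = 500;
        double[] series = measure(batches, callsPerBatch, AllocationSeries::workload);
        System.out.println("firstCall,bytesPerCall");
        for (int b = 0; b < series.length; b++) {
            System.out.println((long) b * callsPerBatch + "," + series[b]);
        }
    }
}
```

The smaller the batch, the finer the resolution along the X axis, but also the larger the share of measurement noise in each point, so the batch size is something to tune by eye.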
Regular reviews
If you want feedback on your own talk, I will be happy to give it.
What is needed for this?
Link to the video recording of the speech.
Link to slides.
A request from the author. We will not review anything without the speaker's own consent.
All this needs to be sent to the Habr user p0b0rchy, that is, to me. I promise that the review will be constructive and polite and will highlight the positives, not just what needs to be improved.