"The work of engineers - to make a complaint" - Interview with Sergey Kuksenko from the Java Performance Team

Imagine you came to a JUG.ru or CodeFreeze meetup, or to a Java conference, where Sergey "Walrus" Kuksenko, a developer from the Java Performance Team, has just given a talk. For some reason all the other attendees have left, and you and Sergey are alone. He is in no hurry and has time to answer your questions, of which you have accumulated a great many...

Today we have an absolute exclusive: a long interview with Sergey Kuksenko! From the interview you will learn:

how the Java Performance Team works
which areas of Java are seeing active performance work
why JUG meetups and conferences need hardcore talks
what a performance engineer should know
what highload is and where its boundary lies
what is happening with Java strings right now
which way runtime tuning is evolving


- Sergey, you often give talks at our events, and each time your talks get harder in terms of material and go deeper and deeper into hardware matters. When you speak, you look people in the eye. Do you see understanding there, or is it lost as you go deeper?

Kuksenko: That is the most important feedback I try to catch during my talks. When I see the audience respond, I get pleasure from knowing the talk was not wasted. On average, if I see at least a dozen active listeners, I consider the talk a success. This happens quite often.

- So there is a certain share of the audience you aim for?

Kuksenko: Yes. I am conveying information, and I need it to reach someone, so that I am not giving the talk just for myself.

- And how do you make that choice when the audience has different backgrounds and different levels of understanding? Do you say in advance, "I will speak at a fixed level of complexity," or do you adjust the complexity?

Kuksenko: I announce in advance that the talk will be rather complex. If I have information about the audience and know there will not be enough interested people, then I will not give the talk; I will not take part in such weak conferences.

- How do you find out information about the audience?

Kuksenko: As a rule, after the fact.

- As a conference organizer, what can I tell you about the audience that might affect your talk?

Kuksenko: I do not know the answer to that question. As a rule, I judge the level of the audience by attending the conference and then decide for myself whether I will come again.

- You have been speaking in Russia for at least four years. Did you start in 2012?

Kuksenko: Somewhere in 2011.

- Do you see the audience growing? Not in numbers, but in how well it understands what you are talking about?

Kuksenko: Honestly, no. What I do see is some fatigue: people are getting tired of the complex talks we give year after year. Well, at least I think so. I see that quite interesting people, whom I remember from various conferences, are leaving this area. Either their careers take off, or they move into management, or something else. And I do not yet see a serious replacement for them.

Perhaps it was the effect of the first talks, when people had a hunger for technical material and asked a lot of questions. Now I see the audience's interest in complex technical talks declining.

- That is interesting, because I really like that you and Lyosha (Aleksey Shipilev - author's note) do not repeat yourselves. In 2011-2012 the theme was performance, and then it broadened a little.

Kuksenko: We have said it all. Why repeat it? Since 2012, nothing has changed in this area.

- How actively is performance science developing now, and performance in Java in particular?

Kuksenko: I cannot speak for performance science. I do not know what is happening in science; probably something is developing. Performance is not so much a science as applied engineering: we have something real, and that real thing must be squeezed into some time frame, some requirements, and so on.

In Java, changes are made in various areas. There is no global plan along the lines of "By 2020 universal happiness will arrive, and any operation will take one picosecond." We have a product, we find spots to improve, we tweak something here and there... New things are invented, we adapt to new hardware, and so on. It is the usual process.

- How quickly does hardware change, so that people doing performance work can use new hardware features?

Kuksenko: Every two years Intel rolls out a new microarchitecture with fairly interesting features. But if you look at more or less serious studies, it turns out that some super-duper new feature came out, then a new architecture... and how much did we gain on average across the board? Well, plus 6%.

New features appear periodically, and software does not always keep up. It tries to catch up, especially at the Java level, where we have to offer more or less architecture-independent solutions, so we are always chasing the hardware.

- Vector instructions have been in modern processors for many years, and the Java virtual machine is often accused of not using them.

Kuksenko: The problem is that, as a rule, all these instructions are tailored to specific scenarios and use cases, and recognizing those use cases in abstract code that was not written for a particular architecture is still an open question. Automatic vectorization is a solved problem for simple cases. But take a step to the left or a step to the right, and the vectorization algorithms stop working.
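To make the "simple cases" distinction concrete, here is a minimal Java sketch. The class and method names are hypothetical and are not taken from the interview. The first loop has independent iterations over contiguous arrays, which is the textbook shape that auto-vectorizers target. The second has a loop-carried dependency (each iteration needs the previous result), which is a standard obstacle to straightforward vectorization. Whether a particular JIT compiler actually vectorizes either loop depends on the JVM version, its flags, and the hardware; the sketch only illustrates the two shapes.

```java
final class VectorizationSketch {
    // Element-wise addition: every iteration is independent of the others,
    // and memory is accessed with unit stride.
    static void add(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Prefix sum: iteration i depends on the running total from iteration i - 1,
    // so the iterations cannot simply be executed side by side.
    static void prefixSum(float[] a, float[] out) {
        float running = 0f;
        for (int i = 0; i < a.length; i++) {
            running += a[i];
            out[i] = running;
        }
    }
}
```

Both methods compute what their names say regardless of how the JIT compiles them; the only difference is how amenable the loop body is to being executed several elements at a time.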

It is clear why hardware needs this: there are millions of developers who can tailor their code to a specific platform, down to nearly assembler level, using intrinsics and other native code, and get what they need. In Java we have a slightly different goal, although we do not shy away from this either: the most critical things are hand-tuned in exactly the same way.

- Can you give an example of something that is tuned by hand?

Kuksenko: First, string operations. Most of them are tuned for specific hardware, and there will soon be some small updates in this area, with more and more to follow. That is one area.

The second area, which is classically always tuned by hand, is cryptography, because hardware gets new instructions aimed at faster and more secure cryptography (for example, recent Intel hardware has a true random number generator). From a security standpoint, it would obviously not be good to ignore these hardware capabilities. But with cryptography the question is not application performance but application security. By using a true random number generator we increase entropy and strengthen our protection; by using clever vectorized instructions for the encryption itself, we can fit, for example, more complex algorithms into the same time frame, which also improves security. Here performance plays an indirect role. And this raises a big question: how much does a real user need vectorization?

- So it is not at all certain that it is needed? It is just that at every conference, whenever Java and performance come up, vectorization is the favorite question. C programmers of all kinds like to point a finger and say that ...

Kuksenko: And how often do C programmers really achieve good vectorization in their products? In the entire history of my talks at JPoint and Joker, I got exactly one piece of feedback, at the last Joker, after I showed an example: "Here is an example, here is how we speed it up, and here a bit of vectorization kicked in." A man came up to me afterwards and said: "Yes, cool, that was a great demonstration. I wanted to speed up one spot, and now I will make sure it gets vectorized." In all this time I have come across exactly one person who really needs it and knows his own tasks.

About performance hardcore

- When you cover performance basics, it is clear that more or less everyone needs them. But when you cover very advanced things, for example the case where the CPU is 100% loaded, as in your recent talks, what share of people really need it? It feels like one in a hundred.

Kuksenko: If not fewer. That is the audience we aim for.

- How do you see the educational side of such talks?

Kuksenko: I do not see them as educational. I see them as introductory, just showing that "there is a lot out there that can be done." A person who needs it will understand that they can move in that direction and not stagnate.

- A couple of days ago I spoke with Oleg Bunin, whom you know, the man behind the Highload conference. I asked him what Highload is in his opinion, and he said: "Highload begins when what happens in the hardware becomes important to you. As long as you don't care what your code runs on, it's not Highload; as soon as you start to care, it's Highload." How would you comment on that?

Kuksenko: To me, Highload is just another buzzword. As I have always said in my talks, performance is a binary metric: the client is either satisfied or not. After that our internal kitchen begins: how to measure it, how much money to spend on it, whether to buy more hardware or, on the contrary, to tune knobs in the software, and so on. So the question of where Highload begins and ends is a question of where we draw the boundaries between colors. The rainbow spectrum is continuous, yet we say "this is green and this is red," although we cannot point to the exact border where green begins and where red begins. Still, green and red are obvious to everyone except the color-blind. It is the same with Highload, and with everything else. But there is one small requirement that we voice in all our talks: "If you want to deal with the performance of your application, if you need it, you have to understand how everything works from top to bottom, the whole stack: how your application works, how the application server works, the operating system, the hardware, the Ethernet cable, and so on. If you know how it all works, then you can achieve something and get some gain there."

- You and Lyosha opened your famous 2011-2012 series of talks with a story about software engineering versus performance engineering and the difference between them; it was one of your first slides. What about the toolset? As a Java engineer, I have a rough idea of the toolset of a typical Java engineer, maybe even a Java enterprise engineer. What is a performance engineer's toolset? Which tools do you in particular use in your daily work?

Kuksenko: bash! First of all, everything depends on the task. The classic toolset, if you are working with an external benchmark, is some kind of profiler attached to it. It can be any profiler; by and large they are all the same. And clearly products like VisualVM or Mission Control have an advantage.

If you have to go to a lower level, you start using the standard tool called Oracle Solaris Studio Performance Analyzer (a long name), which is quite effective. Second, for small things and a quick overview, there is Linux perf. In practice, lately I mostly use this combination: perf for an overview, and Oracle Solaris Studio Performance Analyzer for more or less serious digging. Other utilities are unnecessary. Occasionally maybe JFR (Java Flight Recorder, with Mission Control), but only to look at things that do not go beyond the Java level.

- And what about Amplifier?

Kuksenko: I cannot say anything about Amplifier, because I have never used Amplifier. I used the product five or six years ago, when it was simply called Intel VTune. For work on Intel hardware under Windows it was perfect at the time, but in those years there were problems on non-Intel hardware and on anything other than Windows. I have heard that in VTune Amplifier the team has made serious progress on automatic analysis: it no longer just shows tons of numbers but tries to highlight the key problems.

- That is, in essence, it gives an interpretation?

Kuksenko: Yes, it tries to produce some kind of classification of the problems it finds. I saw it only in passing, in a presentation; I have not used it in practice.

- So it is not clear how much of it is true and how much is marketing?

Kuksenko: If you believe the presentation, it is true and it should work, but it is tied to Intel hardware.

- Ten years ago AMD was a player everywhere: laptops, desktops, servers. Now we hear less and less about AMD, but on the other hand the ARM architecture has emerged very seriously in recent years. How often do you deal with ARM of any kind in your everyday work, and are there any differences?

Kuksenko: In my daily work I have dealt with ARM exactly zero times. I once ran some experiments of my own on ARM: when I was preparing one of my talks, I was curious to make parallel measurements on ARM, so I made them and compared them with Intel architectures. Since I do not really work with ARM, I have nothing to say about it; I have no basis. That said, the platform is quite promising and is pushing Intel out of various niches quite aggressively.

- It feels like ARM is developing in a much more interesting way, because what you said about 6% growth gives the feeling that Intel has stagnated a little. Performance does not seem to be growing. Energy efficiency is perhaps growing, some new kinds of encryption support appear, but the growth feels like growth in breadth. A five-year-old laptop is still relevant in terms of performance.

Kuksenko: In terms of the performance the end user actually uses, it is exactly the same. Five years ago you could watch your videos on YouTube, and now you watch your videos on YouTube and see no difference. Five years ago you went on Twitter and wrote some email, and you do the same now. You do not notice a difference. The point is that the underlying platforms long ago reached the performance the end user requires. Of course, then come all kinds of 3D, Blu-ray and ultra-high-definition video, but that is a separate area, not a matter of computer performance as such. But actually, thanks to my colleague Lyosha Shipilev, who recently posted an industry retrospective on Twitter: he wrote that he had run a certain benchmark whose score he remembers well from our work at Intel 10 years ago, and he noted that over those 10 years the benchmark became 50 times faster on his laptop than on a server machine 10 years ago. I think a 50-fold performance increase in 10 years is quite normal progress.
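The two figures quoted in this interview (about 6% per two-year microarchitecture generation, and 50 times in 10 years for one benchmark) are easier to compare once they are put on the same scale. The sketch below is plain compound-growth arithmetic; the class and method names are hypothetical. The two figures measure different things (an average across workloads versus one benchmark on a laptop compared with an older server), so this illustrates the arithmetic and is not a claim about hardware.

```java
final class GrowthArithmetic {
    // Constant yearly factor that compounds to totalFactor over the given number of years.
    static double annualizedFactor(double totalFactor, int years) {
        return Math.pow(totalFactor, 1.0 / years);
    }

    // Total factor after the given number of compounding steps of perStepFactor each.
    static double compounded(double perStepFactor, int steps) {
        return Math.pow(perStepFactor, steps);
    }

    public static void main(String[] args) {
        // 50x over 10 years as a yearly factor: roughly 1.48, i.e. close to +48% per year.
        System.out.printf("yearly factor for 50x in 10 years: %.3f%n",
                annualizedFactor(50.0, 10));
        // Five two-year generations at +6% each: roughly 1.34 in total over 10 years.
        System.out.printf("five generations at +6%% each: %.3f%n",
                compounded(1.06, 5));
    }
}
```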

- So there is progress after all? How significant is it? Do you remember which benchmark it was?

Kuksenko: I remember it well, but I think we will not discuss it. When you interview Aleksey Shipilev, you had better ask him about the relevance of that benchmark, because he wrote a replacement for it.

Strings

- Let's talk about String. It is no secret that various studies are under way on how to store strings more compactly. It is no secret that in recent years this class has begun to change: the substring() replacement in JDK 7u6, and so on. How risky is it to change a core class of the platform? How different is working on this class from working on any other, and how much harder is it to get changesets accepted there? It is a very visible area, and it is surprising that active work is going on there now.

Kuksenko: The question here is not "let's come up with some change and then see what happens," but what gain we get from it. If we have a class that we know takes up 50% of memory in any application, if not more, then let's finally do something about it. It is high time, especially since we already had earlier work, experiments that were never hidden.

- Why was work in this area stalled before and has now become more active? Over the years, many questions about strings have accumulated.

Kuksenko: I would not say that work in this area was stalled. Work was being done. Perhaps it was not always carried through to the end and remained experimental, but it was available to experiment with. It is just that a vision of how this should be done has now taken shape. It is very easy to optimize when you have 10 usage patterns and everyone stays within those 10 patterns.

You sit down, analyze those 10 patterns, optimize their behavior, and get a gain. But when your class is used in a thousand, a million different ways, and the cost of a mistake is huge, you obviously will not just rush in and write something off the cuff. First you need to see what the gain is for these uses, make sure those other uses do not get worse and ideally get better, and check that nothing breaks, and so on. That is where all the effort, all the work goes.

Oracle has never concealed, and neither did Sun, that the largest share of man-hours in developing Java goes not into development itself but into QA.

- As of today, in your estimation and without getting personal, how strong is QA in Java, in OpenJDK? That is, how often do they miss something serious, in your opinion? Or is that hard to say?

Kuksenko: It's hard for me to say. I try not to go into this area because it is too big. We have great specialists, I think they will answer much better. QA-, .

— JDK 7 update 6 substring(). : , , , . : , ? , . , , -.

: -, , … -, , . , , : « JDK 7 Update 6, , , », ., , , Offset , , – , .

— ?

: – . , , , , , , . , , , - . , , « , - ». , , , , , , . , – , , . . , .

Stream API

— Java 8. Stream', Stream API, Bulk Data Operations, . ? . , , , , , – , ? Stream' - ?

: , , Java 8, Stream', JDK 9 . , , , , , JDK 8. Stream' , , , . - , .

. , , , , Stream API . , , .

— , Stream API (Collections) API?

: , Stream API API Stream API , , Stream . Question: what will we gain by making these some expenses? Without parallelization from the point of view of performance, there is nothing to win, except for a more compact and better readable code. The issue of parallelization, this was also much talked about, and various examples, patterns, graphs were written, when it is better to go.
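The point that, without parallelization, a stream mainly buys more compact and readable code can be illustrated with a small Java sketch. The class and method names are hypothetical. All three methods compute the same value (the sum of squares of the even numbers from 1 to n); they differ only in how the computation is expressed. Whether the parallel variant is actually faster depends on the data size, the per-element cost, and the machine, so the sketch makes no performance claim.

```java
import java.util.stream.IntStream;

final class StreamSumSketch {
    // Classic loop: explicit iteration, filter, and accumulation.
    static long loopSum(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            if (i % 2 == 0) {
                sum += (long) i * i;
            }
        }
        return sum;
    }

    // Sequential stream: the same computation written as a pipeline.
    static long streamSum(int n) {
        return IntStream.rangeClosed(1, n)
                .filter(i -> i % 2 == 0)
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    // Parallel stream: the same pipeline with parallel() added.
    static long parallelStreamSum(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .mapToLong(i -> (long) i * i)
                .sum();
    }
}
```

The pipeline has no shared mutable state, which is what makes switching between the sequential and parallel forms safe here.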

, , , , , : 100 ( , — .. apangin ), , , 99%. 100 , . 10 , … .

— 100 – ?

: , . Stream API , . hotspot JDK. , , , , , , , .

— . , , , Oracle , Sun, OpenJDK, . , Oracle , , - , , , , , , , , - , . . – , , , . , , ?

: - , , . , 100 , , . , . , , , , , , , -, , , , . – , , , , , , , , , , , , Java, . , .

— , ?

: , .

— , , ?

: Oracle Labs, , High Performance, High Concurrency . Oracle Labs – , , , , , .

— , , . GC – , – , Stream' – , VM – , JIT – ...

: ? – . — . . computer science, .

Team

— Java, , Oracle Java SE Performance Team. , , , .

: . , , JDK, , . : , . : , , - , . , , .

— – ?

: , .

— ?

: Lambda, Jigsaw, Application Data Sharing. String - . - G1 Garbage Collector.

— performance-, (, , )? , concurrency-interest , . community, ?

: : , . , , Hotspot, - , . - OpenJDK mailing list , . , Hotspot Class Libraries, , . , - , , , , , , .

— …

: , .

— , , , ?

: , , , - Hotspot, , , - , . Oracle, , , , , , , , , , .

— , ? - ? Java, - ?

: .

— ? , , , , Java – , , , . , ?

: . , - Highload, – . , , , . , a) , , b) , , c) , . , . , , . , , Java SE , 10 .

— , , ?

: , , . , , Garbage Collector Class Libraries, , . , - , , , . , . , .

— , Oracle Java , VM? , - , ?

: – . - , , - .

— , , , Java- Rocket Science, , , , , , Google V8, , .NET, ?

: .NET. . Google V8 – .

— , JVM- Dalvik . , – , VM . , Android', VM- ?

: – . , , , - . Java, .

— , - , , , , , - .

: . , .

— , , , .

: API , . , , API, , . , , , . API from scratch, , . That's how we live.

JVM

— , … , – , - .

: . .

— Oracle ?

: -?

— ? 20 VM , . VM- Oracle JDK? .

: Embedded VM, embedded, , .

— Hotspot?

: Hotspot, Embedded Hotspot, .

— footprint - ?

: , .

— ARM, ?

: ARM VM ARM. , Hotspot. JRockit, - . , , , . , , -, , JRockit Java 6.

— JRockit?

: .

— 2010 – Sun Oracle, JRockit Hotspot, - , , , , ? ?

: Oracle, Oracle , . , . – .

— Hotspot, JRockit.

: .

— ? , , , , , , .

: , .

— , ? , .

: . , , , . . , « Hotspot JRockit», 70-80%, , . : « JRockit, Hotspot», – . : « JRockit Hotspot, ?». , , , , , , . , , Hotspot, QA (00:48:34), , JRockit Hotspot, .

— Metaspace, Flight Recorder Mission Control. , JRockit Hotspot?

: , , , , , , JRockit. 100% , …

— ?

: . , VM .

— , JIT-, C2-, , ? , JIT - .

: , , , JIT , JIT- C2. , - , , . -, , , . – . , , .

— ?

: , , . – -, . .

— ?

: , , , , . C C++ .

— ? Java?

: , Graal – , , , .

— , , , .

: . , Oracle. , GPU , . , – . , , , Hotspot, .
— GC. Shenandoah, RedHat . - , ?

: , .

— ? ?

: , RedHat Shenandoah Hotspot, - .

— , Garbage First Collector, , CMS Collector. , , , , ?

: , , . – «»

— Garbage First , , , , ?

: , Garbage First, , . Java One , Garbage First. Java One, . , , , : « , ?», .

, Garbage First , - . , , - . , Garbage First . , .
? -, CMS- , . , , : «, , , – ». , Garbage First , . , Garbage First, , – - . – Java 7 Java 8, Java 7…

— end of life… , .

: - . – .

— . , .

: : , , « . Works? Works. ».

— , , , , - , , , . GC? , , , . Garbage First , . , . , . ?

: , , . , – «». , . 10 . , , , . , , , , «». , , , , , , . , , . Java Hotspot, Garbage Collector, , , - – , – , , , , .

— . , , . – Hotspot, GC, Runtime, , – , , -, , – , - , , , , . , GC – – , – .

: . , , , , , , - , , , , , - : « 60 /», . ?

— – ?

: . 60 /, 60 /, , ? , . , , , , , - , , CMS- , , -, , .

— , , ...

: . , « , . , ».

— ?

: .

— , , , , , , , «» , , , – - . , , , , , Runtime ?

: , -, , , Runtime , .

. , Runtime, , , , , , , . , 100 . , , : « , , , , , ». , , , , , , . , , , , . . , , - , – .

JPoint

— , JPoint . , . , , ? , ?

: - , . , , , , .

, «», , , , , , , - , .. , , , -. , - , , , . , , , , .

— , , , - ?

: , . , . , , 40% . , , , , , .

— - .

: , : , , , .

PS:
, , .
JPoint — .

Source: https://habr.com/ru/post/255219/
