I spent the whole night editing a new episode. This time the topic is much more recent: the story of contributing a commit to an experimental technology, SubstrateVM. And the level of madness has climbed to new heights.
Really looking forward to your comments! A reminder: if you want to improve something in this post, the best way is to file your changes on GitHub. I would say “like and subscribe to the new channel”, but all its episodes will end up in your Java hub anyway, won't they?
A technical note: there is one splice in the video near the end. I was recording uncompressed video, and my M.2 SSD, a mere five hundred gigabytes, filled up quickly. No other hard disk could sustain that data rate. So I had to go offline for half an hour and, with great effort, find an extra fifty gigabytes to record the last few minutes. I got them by deleting files accumulated by Google Chrome. I wrote my opinion of the recording software on FB right while recording; there is a lot of pain in it.
Another technically interesting bit: YouTube has for some reason blocked live streaming for me, even though the account has not a single strike or penalty. Let's hope it is just a glitch and everything comes back in 90 days.
This article contains quotes of code owned by Oracle. You may not use this code yourself (unless you have read the original license and it permits that under some terms, for example the GPL). This is not a joke. Also, you have been warned.
Prologue (the tale itself comes later)
Many have already heard plenty of stories that “the new Java will be written in Java” and wondered how that could be. There is a programmatic document, Project Metropolis, and the corresponding letter from John Rose, but it is all rather vague.
It sounds like some kind of creepy, bloody magic. Yet in the very thing you can try right now, not only is there no magic, everything is as blunt as the back of a shovel knocking your teeth out. Of course there are nuances, but those will come much later.
I will show this with one instructive story that happened over the summer. Like the school essay "How I spent my summer".
First, a small remark. The project at Oracle Labs that currently does ahead-of-time compilation is GraalVM. The component that does the actual goodies and turns Java code into an executable file is SubstrateVM, or SVM for short. Do not confuse it with the same abbreviation used by data scientists (support vector machine). It is SVM, as the key part, that we will talk about from here on.
Formulation of the problem
So, "how I spent my summer". I was on vacation, mashing F5 on the Graal GitHub, and came across this issue:
Someone wants os.version to return the correct value.
Well, did I want to fix a bug? No sooner said than done.
First, here is what the output looks like on real Java: 4.15.0-32-generic . Yes, this is a fresh Ubuntu LTS, Bionic.
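The test program itself is not shown here. A minimal sketch of what it presumably looks like, assuming the class is called Main as in the commands used in this post and that it does nothing but print the property:

```java
// Minimal sketch of the test program; the exact original source is an assumption.
class Main {
    public static void main(String[] args) {
        // On a regular JVM under Linux this prints the kernel release.
        System.out.println(System.getProperty("os.version"));
    }
}
```

On a regular JVM the property is always populated, so anything other than a version string here points at the runtime rather than at the program.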
Now let's try to do the same on the SVM:
$ ls
Main.java
$ javac -cp . Main.java
$ ls
Main.class Main.java
$ native-image Main
Build on Server(pid: 18438, port: 35415)
   classlist:     151.77 ms
       (cap):   1,662.32 ms
       setup:   1,880.78 ms
error: Basic header file missing (<zlib.h>). Make sure libc and zlib headers are available on your system.
Error: Processing image build request failed
Well, yes. That is because I created a completely new virtual machine specifically for a “clean” test.
$ sudo apt-get install zlib1g-dev libc6 libc6-dev
$ native-image Main
Build on Server(pid: 18438, port: 35415)
   classlist:     135.17 ms
       (cap):     877.34 ms
       setup:   1,253.49 ms
  (typeflow):   4,103.97 ms
   (objects):   1,441.97 ms
  (features):      41.74 ms
    analysis:   5,690.63 ms
    universe:     252.43 ms
     (parse):   1,024.49 ms
    (inline):     819.27 ms
   (compile):   4,243.15 ms
     compile:   6,356.02 ms
       image:     632.29 ms
       write:     236.99 ms
     [total]:  14,591.30 ms
The absolute build times may look terrifying. But first, that is by design: very heavy optimizations are applied here. And second, what do you expect from a feeble virtual machine.
And finally, the moment of truth:
$ ./main
null
It seems our reporter did not lie: it really does not work.
First approach: stealing properties from the host
Then I ran a global search for os.version and found that all these properties live in the class SystemPropertiesSupport .
I will not write out the full path to the file, because SVM has the built-in ability to generate proper projects for IntelliJ IDEA and Eclipse. This is very cool and nothing like the torment you have to endure with OpenJDK. Let the IDE open classes for us. So:
Then, without engaging my brain at all, I simply went and added one more entry to this set:
"os.arch", "os.name", "os.version"
I rebuild, run, and get the coveted line 4.15.0-32-generic . Hooray!
But here is the problem: now this code returns 4.15.0-32-generic on every machine it runs on. Even where uname -a reports an older kernel version, on an old Ubuntu.
It becomes clear that these values are baked into the resulting file at compile time. And indeed, one should read the comments carefully:
/** System properties that are taken from the VM hosting the image generator. */
private static final String[] HOSTED_PROPERTIES
Other methods are needed.
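To make the effect concrete, here is a small plain-Java simulation of "properties frozen at build time". It is not SVM code: the class HostedSnapshotDemo and the key demo.os.version are hypothetical, and changing the property stands in for moving the binary to another machine.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this does not use or mirror the SVM sources.
class HostedSnapshotDemo {
    // Property names copied from the "build" environment (made-up key).
    static final String[] HOSTED = {"demo.os.version"};

    // Values frozen at the moment the "image" is built.
    private final Map<String, String> frozen = new HashMap<>();

    HostedSnapshotDemo() {
        for (String key : HOSTED) {
            frozen.put(key, System.getProperty(key));
        }
    }

    String get(String key) {
        return frozen.get(key);
    }

    public static void main(String[] args) {
        System.setProperty("demo.os.version", "4.15.0-32-generic"); // the build machine
        HostedSnapshotDemo image = new HostedSnapshotDemo();        // "build the image"
        System.setProperty("demo.os.version", "3.13.0-24-generic"); // "run on another machine"
        System.out.println(image.get("demo.os.version"));           // prints 4.15.0-32-generic
    }
}
```

The snapshot keeps answering with the build machine's value, which is exactly the symptom described above.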
Findings
If you want a system property from “regular Java” to appear in SVM, it is very easy to do: register the desired property in the right place, and that is it.
You can work in an IDE that supports both Java and Python, for example IntelliJ IDEA Ultimate with the Python plugin, or the same in Eclipse.
Second approach
If you dig further into SystemPropertiesSupport , you find a much more reasonable thing:
/** System properties that are lazily computed at run time on first access. */
private final Map<String, Supplier<String>> lazyRuntimeValues;
Among other things, using these properties does not hold up the build of the executable. Clearly, if we cram a lot into HOSTED_PROPERTIES , everything will slow down.
Lazy properties are registered in the obvious way, by a reference to the method that returns the value:
All these method references point to abstract methods, and the same this::userDirValue is implemented for each supported platform. In this case, those are PosixSystemPropertiesSupport and WindowsSystemPropertiesSupport .
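The registration pattern can be sketched in plain Java roughly as follows. All class names here (DemoPropertiesSupport and its subclasses) are hypothetical stand-ins, the returned values are hard-coded, and the Windows variant simply throws, which is only my assumption about what "not supported" might look like.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical names throughout; a sketch of the pattern, not the SVM sources.
abstract class DemoPropertiesSupport {
    private final Map<String, Supplier<String>> lazyRuntimeValues = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();

    DemoPropertiesSupport() {
        // Registration by method reference; the platform subclass supplies the body.
        lazyRuntimeValues.put("user.dir", this::userDirValue);
        lazyRuntimeValues.put("os.version", this::osVersionValue);
    }

    String getProperty(String key) {
        String value = cache.get(key);
        if (value == null) {
            Supplier<String> supplier = lazyRuntimeValues.get(key);
            if (supplier == null) {
                return null; // unknown property
            }
            value = supplier.get(); // computed on first access only
            cache.put(key, value);
        }
        return value;
    }

    protected abstract String userDirValue();

    protected abstract String osVersionValue();
}

class PosixDemoPropertiesSupport extends DemoPropertiesSupport {
    int osVersionCalls = 0; // counts how often the value is actually computed

    @Override
    protected String userDirValue() {
        return "/home/demo"; // placeholder value
    }

    @Override
    protected String osVersionValue() {
        osVersionCalls++;
        return "4.15.0-32-generic"; // placeholder; a real version would ask the OS
    }
}

class WindowsDemoPropertiesSupport extends DemoPropertiesSupport {
    @Override
    protected String userDirValue() {
        throw new UnsupportedOperationException("not implemented in this sketch");
    }

    @Override
    protected String osVersionValue() {
        throw new UnsupportedOperationException("not implemented in this sketch");
    }
}
```

Calling getProperty("os.version") twice on the POSIX variant computes the value once and serves the second call from the cache, which is the whole point of the lazy map.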
If out of curiosity you go to the Windows implementation, you will see something sad:
As you can see, Windows is not yet supported :-) However, the real issue is that executable generation for Windows has not been finished yet, so supporting these methods would really be wasted effort.
That is, we need to implement the following method:
And then support it in the two or three available implementations.
But what to write there?
Findings
If you want to add a new property computed at run time, it is a matter of writing one method. The result may depend on the current operating system; the switching mechanism already works and needs no attention.
A bit of archaeology
The first thing that comes to mind is to peek at the implementation in OpenJDK and brazenly copy-paste it. A little archaeology and looting never hurt a brave explorer!
Feel free to open any Java project in IDEA, write System.getProperty("os.version") , and Ctrl+click through to the implementation of the getProperty() method. It turns out that all of this simply lives in Properties .
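A quick way to see this without an IDE is the sketch below; the class name PropertiesPeek is made up, and it only relies on the public System.getProperties() and Properties.getProperty() calls.

```java
import java.util.Properties;

// Hypothetical helper: checks that System.getProperty reads from the system Properties object.
class PropertiesPeek {
    static boolean sameSource(String key) {
        Properties props = System.getProperties();
        String direct = System.getProperty(key);
        String viaProps = props.getProperty(key);
        return direct != null && direct.equals(viaProps);
    }

    public static void main(String[] args) {
        System.out.println(sameSource("os.version"));
    }
}
```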
It would seem enough to copy the place where these Properties are filled in and, laughing defiantly, escape into the sunset. Unfortunately, we run into a problem:
Now we dive a little deeper and open the full OpenJDK sources.
If someone does not have them yet, you can browse them on the web or download them. Be warned: they are fetched from here , still via Mercurial, and it will still take about half an hour.
The file we need is at src/java.base/share/native/libjava/System.c .
Notice that this is a file path, not just a name? That's right, you can shove your shiny new fashionable IDEA, bought for $200 a year. You can try CLion , but to avoid irreversible mental damage it is better to just take Visual Studio Code . It already highlights something, yet does not understand what it sees (at least it does not underline everything in red).
In turn, they are taken from src/java.base/unix/native/libjava/java_props_md.c . Each platform has its own such file; they are switched via #define .
And here it begins. There are many platforms. Necrophilia like AIX can be ignored, because GraalVM does not officially support it (as far as I know, GNU/Linux, macOS and Windows are planned first). GNU/Linux and macOS support <sys/utsname.h> , which has ready-made functions for obtaining the name and version of the operating system. On macOS, however, OpenJDK does things its own way:
It reports the name "Mac OS X" (although it has long been macOS);
It depends on the macOS version. Before 10.9 the SDK had no operatingSystemVersion function, and you had to read SystemVersion.plist by hand;
To read it, ObjC extensions are used, roughly like this:
// Fallback if running on pre-10.9 Mac OS
if (osVersionCStr == NULL) {
    NSDictionary *version = [NSDictionary dictionaryWithContentsOfFile :
                             @"/System/Library/CoreServices/SystemVersion.plist"];
    if (version != NULL) {
        NSString *nsVerStr = [version objectForKey : @"ProductVersion"];
        if (nsVerStr != NULL) {
            osVersionCStr = strdup([nsVerStr UTF8String]);
        }
    }
}
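For readers who do not speak Objective-C: the fallback reads the ProductVersion key out of a property list. A purely illustrative Java equivalent is sketched below. It is not what OpenJDK or SVM does; the class PlistVersionDemo is hypothetical, and the regular expression assumes the plist is in its XML form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: extract ProductVersion from an XML property list.
class PlistVersionDemo {
    private static final Pattern PRODUCT_VERSION =
            Pattern.compile("<key>ProductVersion</key>\\s*<string>([^<]*)</string>");

    // Returns the version string, or null if the key is absent.
    static String productVersion(String plistXml) {
        Matcher m = PRODUCT_VERSION.matcher(plistXml);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sample = "<dict>\n"
                + "  <key>ProductName</key>\n  <string>Mac OS X</string>\n"
                + "  <key>ProductVersion</key>\n  <string>10.8.5</string>\n"
                + "</dict>";
        System.out.println(productVersion(sample)); // prints 10.8.5
    }
}
```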
If the initial idea was to rewrite this by hand in good style, it quickly crashed against reality. What if I slip up somewhere in this jungle of noodles, it breaks for someone, and I get hanged in the central square? No thanks. Copy-paste it is.
Findings
An IDE is not needed;
Any contact with C++ is painful, unpleasant, and incomprehensible at first glance.
Is copy-paste acceptable?
This was an important question that determined the amount of further suffering. I really did not want to rewrite it by hand, but going to court for license violations would be even worse. So I went to GitHub and asked Codrut Stancu about it directly. Here is what he said :
“Reusing OpenJDK code, for example by copy-paste, is a normal thing in terms of licensing. However, you need a very good reason for it. If the feature can be implemented by reusing the JDK code without copying, for example by patching it with a substitution, that would be much better.”
That sounds like official copy-paste permission!
Had a nice chat...
I started porting this piece of code, but ran into my own laziness. To test on different macOS versions, you need to find at least one machine with the ancient 10.8 Mountain Lion. I have two Apple devices of my own and one of a friend's, plus you can deploy onto some VMware trial.
But I was lazy. And that laziness saved me.
I went to the chat and asked Chris Seaton which toolchain is the right one for the build: which operating system version, C++ compiler, and so on are supported.
In response I got a surprised silence in the chat and Chris's reply that he did not understand the point of the question.
It took a fair amount of time before Chris understood what I wanted to do, and asked me not to do that .
That is the whole idea of SVM: SVM is pure Java, and nobody wants C++ code from OpenJDK in it. That is the last thing we want.
The example of the math libraries did not convince him. At the very least, they are written in C, and including C++ would mean bringing an entirely new language into the code base. And that is a no-no.
What to do? Write in System Java .
And if a call to the C/C++ platform SDK cannot be avoided, it must be a single system call wrapped in a C API. The data is pulled into Java, and then the business logic is written strictly in Java, even if the platform SDK has convenient ready-made ways to do it differently on the C++ side.
I sighed and began to study the source code in order to figure out how this can be done differently.
Findings
Discuss every unclear detail with the people in the chat . They answer, as long as the questions are not completely idiotic. Although this example shows that Chris is ready to discuss idiotic questions too, even when it does not save him any time;
C++ is not present in the project at all. There is no reason to believe anyone will let you sneak it in;
Instead, you need to write in System Java, using C only as a last resort (for example, when calling the platform SDK).
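The rule "one thin native call, all logic in Java" can be illustrated with the sketch below. Everything in it is hypothetical: KernelInfo stands in for the single wrapped C call (something like uname), and majorMinor is made-up "business logic" that exists only to show where such logic is supposed to live.

```java
// Hypothetical stand-in for the single system call wrapped in a C API.
interface KernelInfo {
    String release(); // raw data only, no processing on the native side
}

// All interpretation of the raw data happens in Java.
class OsVersionLogic {
    // Made-up example logic: reduce "4.15.0-32-generic" to "4.15".
    static String majorMinor(String release) {
        String[] parts = release.split("[.\\-]");
        if (parts.length < 2) {
            return release; // nothing to cut, return as is
        }
        return parts[0] + "." + parts[1];
    }

    public static void main(String[] args) {
        KernelInfo fake = () -> "4.15.0-32-generic"; // fake native layer for the demo
        System.out.println(majorMinor(fake.release())); // prints 4.15
    }
}
```

Because the native side returns only raw data, the parsing can be tested on any machine with a fake KernelInfo, without touching the platform SDK.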
Fiddler is not needed
A fiddler is not needed, dear. He only eats extra fuel.
Here I felt some sadness, because look. Where we have <sys/utsname.h> and can simply trust its answer, everything is easy and simple.
But on Windows, where it is not available, what do you do?
Call cmd builtins or Windows utilities? They output text in Russian, which then has to be parsed. That is rock bottom, and the result may not match what the real OpenJDK would return in this place.
Take it from the registry? Even there are nuances: for example, between Windows 7 and 10 the way the numbers are stored in the registry changed, and on Windows 10 you have to either glue the major and minor components together by hand, or simply answer that this is Windows 10 with a single number. Which of these is more correct (will not set users' backsides on fire) is unclear.
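The "glue major and minor by hand" option could look roughly like the sketch below. It works on a plain Map instead of the real registry, the class WindowsVersionDemo is hypothetical, and the value names (CurrentVersion, CurrentMajorVersionNumber, CurrentMinorVersionNumber) are the ones I remember from the Windows NT CurrentVersion key, so treat them as an assumption to verify.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for values read from the registry.
class WindowsVersionDemo {
    static String osVersion(Map<String, String> registry) {
        String major = registry.get("CurrentMajorVersionNumber");
        String minor = registry.get("CurrentMinorVersionNumber");
        if (major != null && minor != null) {
            return major + "." + minor; // newer layout: glue the parts by hand
        }
        return registry.get("CurrentVersion"); // older layout: one ready-made string
    }

    public static void main(String[] args) {
        Map<String, String> win7 = new HashMap<>();
        win7.put("CurrentVersion", "6.1");

        Map<String, String> win10 = new HashMap<>();
        win10.put("CurrentVersion", "6.3");
        win10.put("CurrentMajorVersionNumber", "10");
        win10.put("CurrentMinorVersionNumber", "0");

        System.out.println(osVersion(win7));  // prints 6.1
        System.out.println(osVersion(win10)); // prints 10.0
    }
}
```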
Interestingly, I first noticed that everything had been fixed in master ( os.version stopped returning null in the test), and only then noticed the pull request. The problem is that this commit is not marked as a pull request on GitHub: it is a plain commit with the line PullRequest: graal/1885 . The thing is, the folks at Oracle Labs do not use GitHub internally; they need it only to interact with external committers. All of us who are not lucky enough to work at Oracle Labs have to subscribe to notifications about new commits in the repository and read them all.
But now you can relax and see how this feature is implemented correctly .
Let's see what this beast is, System Java.
As I said earlier, everything is as simple as the back of a shovel when someone tries to knock your teeth out with it. And just as painful. Let's look at a quote from the pull request:
@Override
protected String osVersionValue() {
    if (osVersionValue != null) {
        return osVersionValue;
    }
    /* On OSX Java returns the ProductVersion instead of kernel release info. */
    CoreFoundation.CFDictionaryRef dict = CoreFoundation._CFCopyServerVersionDictionary();
    if (dict.isNull()) {
        dict = CoreFoundation._CFCopySystemVersionDictionary();
    }
    if (dict.isNull()) {
        return osVersionValue = "Unknown";
    }
    CoreFoundation.CFStringRef dictKeyRef = DarwinCoreFoundationUtils.toCFStringRef("MacOSXProductVersion");
    CoreFoundation.CFStringRef dictValue = CoreFoundation.CFDictionaryGetValue(dict, dictKeyRef);
    CoreFoundation.CFRelease(dictKeyRef);
    if (dictValue.isNull()) {
        dictKeyRef = DarwinCoreFoundationUtils.toCFStringRef("ProductVersion");
        dictValue = CoreFoundation.CFDictionaryGetValue(dict, dictKeyRef);
        CoreFoundation.CFRelease(dictKeyRef);
    }
    if (dictValue.isNull()) {
        return osVersionValue = "Unknown";
    }
    osVersionValue = DarwinCoreFoundationUtils.fromCFStringRef(dictValue);
    CoreFoundation.CFRelease(dictValue);
    return osVersionValue;
}
In other words, we write in Java word for word what we would have written in C.
Look at how DarwinExecutableName is written:
@Override
public Object apply(Object[] args) {
    /* Find out how long the executable path is. */
    final CIntPointer sizePointer = StackValue.get(CIntPointer.class);
    sizePointer.write(0);
    if (DarwinDyld._NSGetExecutablePath(WordFactory.nullPointer(), sizePointer) != -1) {
        VMError.shouldNotReachHere("DarwinExecutableName.getExecutableName: Executable path length is 0?");
    }
    /* Allocate a correctly-sized buffer and ask again. */
    final byte[] byteBuffer = new byte[sizePointer.read()];
    try (PinnedObject pinnedBuffer = PinnedObject.create(byteBuffer)) {
        final CCharPointer bufferPointer = pinnedBuffer.addressOfArrayElement(0);
        if (DarwinDyld._NSGetExecutablePath(bufferPointer, sizePointer) == -1) {
            /* Failure to find executable path. */
            return null;
        }
        final String executableString = CTypeConversion.toJavaString(bufferPointer);
        final String result = realpath(executableString);
        return result;
    }
}
All these CIntPointer , CCharPointer , PinnedObject , and whatnot.
To my taste, this is inconvenient and ugly. You have to work by hand with pointers that look like Java classes. You have to call the matching release in time so that memory does not leak.
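The shape of that code, "ask for the size, allocate, ask again, release on exit", can be reproduced in plain Java without any SVM classes. In the sketch below everything is a stand-in: fakeGetPath imitates a C-style function that returns -1 when the buffer is too small, and Handle imitates a pinned resource that must be released.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical simulation of the two-call buffer protocol; no SVM or native code involved.
class TwoCallDemo {
    // Imitates a C-style API: returns -1 and stores the needed size if the buffer
    // is missing or too small, otherwise fills the buffer and returns 0.
    static int fakeGetPath(byte[] buffer, int[] size) {
        byte[] path = "/usr/local/bin/main".getBytes(StandardCharsets.UTF_8);
        if (buffer == null || size[0] < path.length) {
            size[0] = path.length;
            return -1;
        }
        System.arraycopy(path, 0, buffer, 0, path.length);
        return 0;
    }

    // Imitates a pinned resource that has to be released explicitly.
    static final class Handle implements AutoCloseable {
        static int open = 0; // number of handles not yet released
        final byte[] data;

        Handle(byte[] data) {
            this.data = data;
            open++;
        }

        @Override
        public void close() {
            open--;
        }
    }

    static String executablePath() {
        int[] size = {0};
        // First call: only find out how long the path is.
        if (fakeGetPath(null, size) != -1) {
            throw new IllegalStateException("size query unexpectedly succeeded");
        }
        // Second call: allocate a correctly sized buffer and ask again.
        byte[] buffer = new byte[size[0]];
        try (Handle pinned = new Handle(buffer)) {
            if (fakeGetPath(pinned.data, size) == -1) {
                return null;
            }
            return new String(pinned.data, StandardCharsets.UTF_8);
        } // the handle is released here, even on early return
    }

    public static void main(String[] args) {
        System.out.println(executablePath()); // prints /usr/local/bin/main
    }
}
```

The try-with-resources block plays the role that it plays around PinnedObject in the quoted code: the release happens on every exit path, so nothing has to be remembered by hand.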
But if these measures seem unjustified to you, take another look at the GC implementation in .NET and be horrified at what C++ leads to if you do not stop in time. Remember, it is one huge CPP file more than a megabyte in size. There are some descriptions of how it works, but they are clearly not enough for an external contributor to understand. The code above, ugly as it looks, is quite understandable and can be analyzed with static analysis tools for Java.
As for the substance of the commit, I have questions about it. At the very least, there is no Windows support. When code generation for Windows appears, I will try to take on this task.
Findings
You have to write in System Java. Praise it, call it sweet bread. There are no other options anyway;
Subscribe to notifications from the repository on GitHub and read the commits, otherwise an important PR will fly past;
Whenever possible, ask those responsible for the area about any big feature. Many things are already implemented but not yet known to the general public. There is a risk of reinventing the wheel, and a much worse one than the guys at Oracle Labs made;
When you take on a feature, be sure to tell the responsible person on GitHub. If they do not answer, write an email; the addresses of all team members are easy to google.
I want to remind you that Oleg Shelayev, the only official GraalVM evangelist at Oracle, will come to the next Joker conference . Not just "the only Russian-speaking one", but "the only one at all". The title of the talk ( “Compiling Java ahead-of-time with GraalVM” ) hints that it will not do without SubstrateVM.
You can chat with Oleg and Oleg in our Telegram chat: @graalvm_ru . Unlike issues on GitHub, there you can communicate in any form, and no one will be banned ( but that is not certain ).
Also, a reminder that every week, together with the “Debriefing” podcast, we put out an issue of “Java-digest”. For example, this was the last digest . From time to time there is GraalVM news too (in fact, I do not turn the whole issue into a GraalVM news release only out of respect for the audience :-)