
GraalVM: C and Scala, all mixed together

I don't know about you, but lately I've been impressed by articles about new Java technologies: Graal, Truffle and all that. It used to go like this: you invented a language, wrote an interpreter, rejoiced at how good the language was and despaired at how slow it was, then wrote a native compiler and/or a JIT for it, and after that you still needed a debugger... Well, there is LLVM, and thanks for that. After reading this article, I got the (somewhat grotesque) impression that once you have written a special kind of interpreter, the job is basically done. It feels as if compiler writers now have a "Make It Awesome" button. No, of course JIT-compiled languages start slowly and need time to warm up. But, in the end, a programmer's time and skill are not free either: what kind of IT world would we live in if we still wrote everything in assembly language? Maybe everything would fly (provided the programmer laid out the instructions competently), but I have some doubts about the overall complexity of the programs we would actually be using...


In general, I understand perfectly well that in the dilemma "programmer's time vs. perfection of the resulting product ("handcrafted")" the boundary can be pushed back and forth until the end of time, so today let's just try to use the traditional SQLite library without loading native code in its pure form. We will use a ready-made Truffle language implementation for LLVM IR called Sulong.


Disclaimer: this article should be read not as a story told by a pro for beginners, but as a kind of lab report by a novice who is just trying to get comfortable with the technology. One more thing: LLVM IR cannot be considered completely platform independent.


So, we will need the SQLite sources themselves, glue code written in Java, er, Scala (well, sorry...), plus GraalVM with its tooling and Clang (we will use it to compile SQLite into LLVM IR, which we will then load from our Scala code).


Let me say up front that everything here happens on Ubuntu 18.04 LTS (64-bit). I would like to believe there will be no big problems on Mac OS X either, but I am not sure that Graal with all the necessary components is available for Windows. Still, even if it is not there yet, it will probably appear later.


Preparation


  1. Download our guinea pig, SQLite (in fact, everything is already in the repository attached to the article).
  2. Read the official article SQLite In 5 Minutes Or Less. Since SQLite serves only as an example here, that is exactly what we need. How To Compile SQLite is also useful.
  3. Download GraalVM Community Edition from here and unpack it. I would not recommend giving in to the temptation to add it to PATH: why would we need a node and an lli that are merely "identical to the natural ones"?
  4. Install clang; in my case it is Clang 6 from the Ubuntu repository.

My test project also uses the sbt build system. For editing the project I personally prefer IntelliJ IDEA Community with the standard Scala plugin.


And here I stepped on the first rake: the GraalVM website says that this is just a directory with a JDK. Well, if so, let's add it to IDEA as a plain JDK. "1.8", said IDEA. Hmm, strange. In a console, go to the Graal directory and run bin/javac -version: indeed 1.8. Well, eight it is, no big deal. The real problem is that IDEA does not see the org.graalvm packages and the like, and we will need them. So go to File -> Other Settings -> Default Project Structure..., and in the JDK settings we see that the Classpath contains jar files from jre/lib and jre/lib/ext. I did not check every one of them. But here is what we supposedly need:


    trosinenko@trosinenko-pc:~/tmp/graal/graalvm-1.0.0-rc1/jre/lib$ find . -name '*.jar'
    ./truffle/truffle-dsl-processor.jar
    ./truffle/truffle-api.jar
    ./truffle/truffle-nfi.jar
    ./truffle/locator.jar
    ./truffle/truffle-tck.jar
    ./polyglot/polyglot-native-api.jar
    ./boot/graaljs-scriptengine.jar
    ./boot/graal-sdk.jar
    ./management-agent.jar
    ./rt.jar
    ./jsse.jar
    ./resources.jar
    ./jvmci/jvmci-hotspot.jar
    ./jvmci/graal.jar
    ./jvmci/jvmci-api.jar
    ./installer/installer.jar
    ./ext/cldrdata.jar
    ./ext/sunjce_provider.jar
    ./ext/nashorn.jar
    ./ext/sunec.jar
    ./ext/zipfs.jar
    ./ext/sunpkcs11.jar
    ./ext/jaccess.jar
    ./ext/localedata.jar
    ./ext/dnsns.jar
    ./jce.jar
    ./svm/builder/objectfile.jar
    ./svm/builder/svm.jar
    ./svm/builder/pointsto.jar
    ./svm/library-support.jar
    ./graalvm/svm-driver.jar
    ./graalvm/launcher-common.jar
    ./graalvm/sulong-launcher.jar
    ./graalvm/graaljs-launcher.jar
    ./charsets.jar
    ./jvmci-services.jar
    ./security/policy/unlimited/US_export_policy.jar
    ./security/policy/unlimited/local_policy.jar
    ./security/policy/limited/US_export_policy.jar
    ./security/policy/limited/local_policy.jar

From the full listing we can see a few more subdirectories and, judging by what gets added for a regular JDK, ./security is of no interest to us. So, using the "+", open-the-directory, shift-click, click, OK method, we add the contents of the truffle, polyglot, boot and graalvm subdirectories. If something turns out to be missing later, we will add more; that is routine...


Creating a Scala project


So, it seems IDEA is configured. Let's try to create an sbt project. There are no pitfalls here, everything is intuitive; the main thing is not to forget to select our new JDK.


Now just create a new Scala file and copy-paste (well, creatively rework) the code given in the Polyglot reference under Start Language: Java, after clicking LLVM under Target Language.


By the way, note the abundance of other Start Languages: JavaScript, R, Ruby and even plain C, but that is a completely different story, one I have not read yet...


    object SQLiteTest {
      val polyglot = Context.newBuilder().allowAllAccess(true).build()
      val file: File = ???
      val source = Source.newBuilder("llvm", file).build()
      val cpart = polyglot.eval(source)
      ???
    }

We deliberately neither inherit our object from App nor make its fields private, so that they can be accessed from the Scala console (its configuration has already been added to the project).


As a result, we have copied almost all (a whole 80%) of an example consisting of a whole five meaningful lines, so it is time to sit back and finally read the Javadoc to find out what we have just written. Merely calling main() is kind of boring anyway, and our model example is SQLite, so we need to understand what to write in place of the fifth line. The Polyglot reference is nice, but we need API documentation. To find it, you have to wander around the repository: there are readme files, and they contain links to the Javadoc.


In the meantime, while the meaning of what we wrote is still unclear, let's ask JS for the answer to the Ultimate Question: choose the Scala console configuration in IDEA, and...


    scala> import org.graalvm.polyglot.Context
    import org.graalvm.polyglot.Context

    scala> val polyglot = Context.newBuilder().allowAllAccess(true).build()
    polyglot: org.graalvm.polyglot.Context = org.graalvm.polyglot.Context@68e24e7

    scala> polyglot.eval("js", "6 * 7")
    res0: org.graalvm.polyglot.Value = 42

... well, it works, and there is the answer. The question itself is left as an exercise for the reader.


Let's go back to the example code. The polyglot variable holds a context inhabited by different languages: some are disabled, some are enabled, and some have even been lazily initialized. In this harsh world you have to ask permission even to access files, so in the example we simply turn the restrictions off with allowAllAccess(true).


Next we create a Source object with our LLVM bitcode. We specify the language and the file from which to load this "source code". You can also use a source string directly (we have already seen this), a URL (including resources inside a JAR file), or simply a java.io.Reader instance. Then we evaluate the resulting source in the context and get a Value. According to the documentation for this method, we will never get null back, but we may get a Value that represents Null. But we still need to load something specific, so...


Building SQLite


... but as a replacement for fopen()
- from About SQLite. As you can see, letting SQLite run inside GraalVM was not a terrible mistake on the developers' part.


Following the advice of the already mentioned part of the SQLite documentation, as well as the Graal instructions, we put together a command line. Here it is:


    clang -g -c -O1 -emit-llvm sqlite3. \
      -DSQLITE_OMIT_LOAD_EXTENSION \
      -DSQLITE_THREADSAFE=0 \
      -o ../../sqlite3.bc

At least -O1 is required for the code to work correctly inside Sulong, and -g preserves names for us (for these two and the other options, see the documentation for details). We use SQLITE_OMIT_LOAD_EXTENSION so that our test example does not depend on libdl.so (it is not obvious at first glance how we would satisfy that dependency), and since it is equally unclear how to link against pthread, or why we would, we disable thread safety (otherwise it fails at startup). That's all.


Running our project


Now we know what to write in the second line:


  val file: File = new File("./sqlite3.bc") 

Now we can pull the functions we need out of the library:


    val sqliteOpen = cpart.getMember("sqlite3_open")
    val sqliteExec = cpart.getMember("sqlite3_exec")
    val sqliteClose = cpart.getMember("sqlite3_close")
    val sqliteFree = cpart.getMember("sqlite3_free")

And it works: all that remains is to call them in the right order, and that's it! Well, for example, sqlite3_open takes a string with the file name and a pointer to a pointer to a structure (whose internals do not interest us in the slightest). Hmm... and how do we form the second argument? We need a function that creates pointers, and it is probably Sulong-specific. Add sulong.jar to the Classpath, restart the whole sbt shell. And nothing. Long story short, I found nothing smarter than to create a lib directory in the root of the sbt project (the standard directory for unmanaged jars) and run in it


 find ../../graalvm-1.0.0-rc1/jre/languages/ -name '*.jar' -exec ln -s {} . \; 

After an sbt refresh the compilation succeeded. Except that now nothing runs... OK, we put the Classpath back the way it was. And here I was thinking I would quickly finish off the fifth line. Well, I could retell the Javadoc for each of the five lines, call it a small article, and everyone would say: "What is this, Twitter?"...


Three hours or so went by while I kept trying to wrap the second argument of sqlite3_open...


At some point it dawned on me that this is like the joke: "Why are you starting with War and Peace? Read Kolobok, that is about your level"... So sqlite3.c was temporarily replaced with test.c:


 void f(int *x) { *x = 42; } 

After poking around a bit more in all kinds of type conversion APIs of varying degrees of privateness, I was, to put it mildly, tired. Only jokes were left in my head. For example: "iOS is an intuitive system. Logic is powerless to understand it; you need intuition." And indeed, what is the main principle of GraalVM and all that? Everything should be transparent and effortless, so you should discard any experience with FFI and think like the developer of a convenient system. We need a container holding an int. Pass new java.lang.Integer(0), and it writes to address zero. But remember what we were taught in C basics: the difference between an array and a pointer to its zeroth element is rather nominal. In effect, the function f simply takes an array of ints and writes a value to the zeroth element. Let's try:


    scala> val x = Array(new java.lang.Integer(12))
    x: Array[Integer] = Array(12)

    scala> SQLiteTest.cpart.getMember("f").execute(x)
    res0: org.graalvm.polyglot.Value = LLVMTruffleObject(null:0)

    scala> x
    res1: Array[Integer] = Array(42)

Ta-da!!!
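The array-versus-pointer equivalence that made this work can be checked in plain C as well. This is a minimal sketch that assumes nothing beyond the f from test.c; call_with_array is a hypothetical helper name introduced here, and it makes no claim about how Sulong maps the Java array internally:

```c
#include <stddef.h>

/* Same function as in test.c: writes 42 through the pointer. */
void f(int *x) { *x = 42; }

/* Hypothetical helper: passes a one-element array where a pointer is
 * expected. The array decays to a pointer to its element 0. */
int call_with_array(void) {
    int box[1] = { 12 };
    f(box);          /* identical to f(&box[0]) */
    return box[0];   /* the callee overwrote the only element */
}
```

Calling call_with_array() from any main function returns the value that f stored in the array, mirroring what the Scala console showed for x.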


Now, it would seem, we could quickly write the query function and be done, but whatever you pass as the second argument, be it Array(new Object) or Array(Array(new Object)), it refuses to work, swearing at strlen inside the LLVM bitcode O_O (by the way, LLVM IR, unlike ordinary machine code from .so files, is fairly well typed).


Only later did I accept that simply passing a java.lang.String, or even an Array[Byte], as the first argument to execute() is too intuitive to work, and reworking our void f() confirmed this.
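The strlen complaint hints at what the native side expects: a C string is a run of bytes ending in a zero byte, while a JVM byte array carries its length separately. The sketch below is plain C and only illustrates that conversion; bytes_to_cstring is a hypothetical name, and nothing here is taken from Sulong's actual implementation:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies `len` bytes into a freshly allocated buffer
 * and appends the terminating zero byte that strlen() relies on.
 * Returns NULL if allocation fails; the caller frees the result. */
char *bytes_to_cstring(const unsigned char *bytes, size_t len) {
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, bytes, len);
    out[len] = '\0';
    return out;
}
```

Without that final zero byte, strlen would keep reading past the end of the data, which is one plausible reason why arbitrary JVM objects are rejected in that position.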


As a result, in the built-in Sulong bindings ( SQLiteTest.polyglot.getBindings("llvm") ) I found a function with the promising name __sulong_byte_array_to_native. Let's try:


    scala> val str = SQLiteTest.polyglot.getBindings("llvm")
             .getMember("__sulong_byte_array_to_native")
             .execute("toc.db".getBytes)
    str: org.graalvm.polyglot.Value = LLVMTruffleObject(null:139990504321152)

    scala> val db = new Array[Object](1)
    db: Array[Object] = Array(null)

    scala> SQLiteTest.sqliteOpen.execute(str, db)
    res0: org.graalvm.polyglot.Value = 0

    scala> val str = SQLiteTest.polyglot.getBindings("llvm")
             .getMember("__sulong_byte_array_to_native")
             .execute("toc123.db".getBytes)
    str: org.graalvm.polyglot.Value = LLVMTruffleObject(null:139990517528064)

    scala> SQLiteTest.sqliteOpen.execute(str, db)
    res1: org.graalvm.polyglot.Value = 0

It works!!! Oh, but why does it also work with a wrong file name?.. With bated breath we look into the project directory, and there is already a brand-new toc123.db. Hooray!
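Why a one-element Array[Object] is enough for the second argument becomes clearer from the C side: sqlite3_open returns its result through an out-parameter, a pointer to a slot that the callee fills in. Here is a minimal sketch of that pattern with made-up names (struct handle and open_handle are hypothetical stand-ins, not SQLite's API):

```c
#include <stdlib.h>

/* Hypothetical opaque handle, standing in for a type like sqlite3. */
struct handle { int id; };

/* Out-parameter pattern: the status comes back as the return value,
 * the handle comes back through *out. Returns 0 on success. */
int open_handle(const char *name, struct handle **out) {
    if (name == NULL || out == NULL) {
        return 1;
    }
    *out = malloc(sizeof **out);
    if (*out == NULL) {
        return 1;
    }
    (*out)->id = 7;  /* arbitrary value for the demo */
    return 0;
}
```

A C caller declares struct handle *h = NULL; and passes &h, a one-slot container, which is the role the one-element array plays on the Scala side.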


So, let's rewrite the example from the SQLite documentation in Scala:


    def query(dbFile: String, queryString: String): Unit = {
      val filenameStr = toCString(dbFile)
      val ptrToDb = new Array[Object](1)
      val rc = sqliteOpen.execute(filenameStr, ptrToDb)
      val db = ptrToDb.head
      if (rc.asInt() != 0) {
        println(s"Cannot open $dbFile: ${sqliteErrmsg.execute(db)}!")
        sqliteClose.execute(db)
      } else {
        val zErrMsg = new Array[Object](1)
        val execRc = sqliteExec.execute(db, toCString(queryString), ???, zErrMsg)
        if (execRc.asInt != 0) {
          val errorMessage = zErrMsg.head.asInstanceOf[Value]
          assert(errorMessage.isString)
          println(s"Cannot execute query: ${errorMessage.asString}")
          sqliteFree.execute(errorMessage)
        }
        sqliteClose.executeVoid(db)
      }
    }

There is just one snag here: the callback. Well, when nobody is watching, an engineering student specifies a core made of wood, and I will try to write the callback in JavaScript:


    val callback = polyglot.eval("js",
      """function(unused, argc, argv, azColName) {
        |  print("argc = " + argc);
        |  print("argv = " + argv);
        |  print("azColName = " + azColName);
        |  return 0;
        |}
      """.stripMargin)
    // ...
        val execRc = sqliteExec.execute(db, toCString(queryString), callback, Int.box(0), zErrMsg)

And here is what we get:


    io.github.trosinenko.SQLiteTest.query("toc.db", "select * from toc;")
    argc = 5
    argv = foreign {}
    azColName = foreign {}
    argc = 5
    argv = foreign {}
    azColName = foreign {}
    argc = 5
    argv = foreign {}
    azColName = foreign {}
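The foreign {} placeholders are what the JavaScript side shows for the two char ** arguments. For comparison, this is the shape of that data in C: argv and azColName are parallel arrays of argc C strings, and a value may be NULL. The sketch below does not call SQLite; print_row and demo_row are hypothetical names, and the row is made up for illustration:

```c
#include <stdio.h>

/* Callback with the same parameter list as the JavaScript function above:
 * one call per result row, column names and values in parallel arrays. */
int print_row(void *unused, int argc, char **argv, char **azColName) {
    (void)unused;
    for (int i = 0; i < argc; ++i) {
        /* A NULL entry stands for an SQL NULL value. */
        printf("%s = %s\n", azColName[i], argv[i] ? argv[i] : "NULL");
    }
    return 0;  /* with sqlite3_exec, a non-zero return aborts the query */
}

/* Hypothetical driver: feeds one made-up two-column row to the callback. */
int demo_row(void) {
    char *names[]  = { "name", "status" };
    char *values[] = { "sqlite3", NULL };
    return print_row(NULL, 2, values, names);
}
```

So to print anything useful, the callback has to index into both arrays and turn each char * into a string, which is exactly what the JavaScript version cannot do with an opaque foreign object.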

Well, there is not quite enough magic. Besides, it turns out that in case of an error zErrMsg holds some obscure object that cannot itself be converted to a string. Well then, let's build and load one more file, lib.bc, and in its source, lib.c, write the following:


    #include <polyglot.h>

    void *fromCString(const char *str) {
      return polyglot_from_string(str, "UTF-8");
    }

Why polyglot_from_string is not accessible directly through the bindings I did not figure out, so we pull it out this way and write a wrapper:


    val lib_fromCString = lib.getMember("fromCString")
    def fromCString(ptr: Value): String = {
      if (ptr.isNull)
        "<null>"
      else
        lib_fromCString.execute(ptr).asString()
    }

Well, returning error messages is sorted out; as for the callback, let's write it in Scala after all:


    val lib_copyToArray = lib.getMember("copy_to_array_from_pointers")
    val callback = new ProxyExecutable {
      override def execute(arguments: Value*): AnyRef = {
        val argc = arguments(1).asInt()
        val xargv = new Array[Long](argc)
        val xazColName = new Array[Long](argc)
        lib_copyToArray.execute(xargv, arguments(2))
        lib_copyToArray.execute(xazColName, arguments(3))
        (0 until argc) foreach { i =>
          val name = fromCString(polyglot.asValue(xazColName(i) ^ 1))
          val value = fromCString(polyglot.asValue(xargv(i) ^ 1))
          println(s"$name = $value")
        }
        println("========================")
        Int.box(0)
      }
    }

Meanwhile, to our lib.c we add this bit of magic for transferring a C array into Polyglot:


    void copy_to_array_from_pointers(void *arr, void **ptrs) {
      int size = polyglot_get_array_size(arr);
      for(int i = 0; i < size; ++i) {
        polyglot_set_array_element(arr, i, ((uintptr_t)ptrs[i]) ^ 1);
      }
    }

Note the ^ 1 applied to the pointer: it is needed because someone is too clever, namely polyglot_set_array_element is a variadic function with exactly three arguments that accepts both primitive types and pointers to Polyglot values. As a result, it works:


    io.github.atrosinenko.SQLiteTest.query("toc.db", "select * from toc;")
    name = sqlite3
    type = object
    status = 0
    title = Database Connection Handle
    uri = c3ref/sqlite3.html
    ========================
    name = sqlite3_int64
    type = object
    status = 0
    title = 64-Bit Integer Types
    uri = c3ref/int64.html
    ========================
    name = sqlite3_uint64
    type = object
    status = 0
    title = 64-Bit Integer Types
    uri = c3ref/int64.html
    ========================
    ...
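The ^ 1 round trip can be reproduced without Polyglot. The sketch below is plain C with hypothetical helper names (tag_pointer, untag_pointer, copy_tagged); it shows only that XOR with 1 is its own inverse, so a value flipped on the C side and flipped again on the Scala side is the original address. Why the flipped value is no longer mistaken for a pointer to a Polyglot value is not something this code demonstrates:

```c
#include <stdint.h>

/* Flip the lowest bit so the value travels as a plain integer. */
uintptr_t tag_pointer(const void *p) {
    return (uintptr_t)p ^ 1;
}

/* Flip it back: XOR with the same constant is its own inverse. */
void *untag_pointer(uintptr_t tagged) {
    return (void *)(tagged ^ 1);
}

/* Plain-C analogue of copy_to_array_from_pointers: stores the tagged
 * form of each pointer into an integer array of the same length. */
void copy_tagged(uintptr_t *out, void **ptrs, int size) {
    for (int i = 0; i < size; ++i) {
        out[i] = tag_pointer(ptrs[i]);
    }
}
```

The stored integers differ from the real addresses by exactly one bit, which is why the Scala callback applies ^ 1 once more before handing each value to fromCString.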

It remains to add the main method:


    def main(args: Array[String]): Unit = {
      query(args(0), args(1))
      polyglot.close()
    }

in which, strictly speaking, the context has to be closed; I did not do this in the object itself, because after SQLiteTest is initialized we of course still need the context for the Scala console.


This concludes my story, and the reader is invited to:


  1. Try compiling all of this into a native binary with SubstrateVM, as if there were no Scala here at all
  2. (*) Do the same, but with profile-guided optimization

The resulting files:


SQLiteTest.scala
    package io.github.atrosinenko

    import java.io.File

    import org.graalvm.polyglot.proxy.ProxyExecutable
    import org.graalvm.polyglot.{Context, Source, Value}

    object SQLiteTest {
      val polyglot: Context = Context.newBuilder().allowAllAccess(true).build()

      def loadBcFile(file: File): Value = {
        val source = Source.newBuilder("llvm", file).build()
        polyglot.eval(source)
      }

      val cpart: Value = loadBcFile(new File("./sqlite3.bc"))
      val lib: Value = loadBcFile(new File("./lib.bc"))

      val sqliteOpen: Value = cpart.getMember("sqlite3_open")
      val sqliteExec: Value = cpart.getMember("sqlite3_exec")
      val sqliteErrmsg: Value = cpart.getMember("sqlite3_errmsg")
      val sqliteClose: Value = cpart.getMember("sqlite3_close")
      val sqliteFree: Value = cpart.getMember("sqlite3_free")

      val bytesToNative: Value = polyglot.getBindings("llvm").getMember("__sulong_byte_array_to_native")
      def toCString(str: String): Value = {
        bytesToNative.execute(str.getBytes())
      }

      val lib_fromCString: Value = lib.getMember("fromCString")
      def fromCString(ptr: Value): String = {
        if (ptr.isNull)
          "<null>"
        else
          lib_fromCString.execute(ptr).asString()
      }

      val lib_copyToArray: Value = lib.getMember("copy_to_array_from_pointers")
      val callback: ProxyExecutable = new ProxyExecutable {
        override def execute(arguments: Value*): AnyRef = {
          val argc = arguments(1).asInt()
          val xargv = new Array[Long](argc)
          val xazColName = new Array[Long](argc)
          lib_copyToArray.execute(xargv, arguments(2))
          lib_copyToArray.execute(xazColName, arguments(3))
          (0 until argc) foreach { i =>
            val name = fromCString(polyglot.asValue(xazColName(i) ^ 1))
            val value = fromCString(polyglot.asValue(xargv(i) ^ 1))
            println(s"$name = $value")
          }
          println("========================")
          Int.box(0)
        }
      }

      def query(dbFile: String, queryString: String): Unit = {
        val filenameStr = toCString(dbFile)
        val ptrToDb = new Array[Object](1)
        val rc = sqliteOpen.execute(filenameStr, ptrToDb)
        val db = ptrToDb.head
        if (rc.asInt() != 0) {
          println(s"Cannot open $dbFile: ${fromCString(sqliteErrmsg.execute(db))}!")
          sqliteClose.execute(db)
        } else {
          val zErrMsg = new Array[Object](1)
          val execRc = sqliteExec.execute(db, toCString(queryString), callback, Int.box(0), zErrMsg)
          if (execRc.asInt != 0) {
            val errorMessage = zErrMsg.head.asInstanceOf[Value]
            println(s"Cannot execute query: ${fromCString(errorMessage)}")
            sqliteFree.execute(errorMessage)
          }
          sqliteClose.execute(db)
        }
      }

      def main(args: Array[String]): Unit = {
        query(args(0), args(1))
        polyglot.close()
      }
    }

lib.c
    #include <polyglot.h>

    void *fromCString(const char *str) {
      return polyglot_from_string(str, "UTF-8");
    }

    void copy_to_array_from_pointers(void *arr, void **ptrs) {
      int size = polyglot_get_array_size(arr);
      for(int i = 0; i < size; ++i) {
        polyglot_set_array_element(arr, i, ((uintptr_t)ptrs[i]) ^ 1);
      }
    }

Link to the repository.



Source: https://habr.com/ru/post/358700/

