
What's up with JEP 303, or inventing invokedynamic

Bloggers and authors who try to stay at the cutting edge have already written a lot about Project Amber in Java 10. These articles always mention local variable type inference and the enum and lambda improvements, and sometimes pattern matching and data classes. Yet JEP 303: Intrinsics for the LDC and INVOKEDYNAMIC Instructions is unfairly passed over, perhaps because few people understand what it is about. Curiously, though, the folks from NIX_Solutions fantasized about this feature on Habr a year ago.


It is widely known that the Java virtual machine has had an interesting instruction called invokedynamic (indy for short) since version 7. Many have heard of it, but few know what it actually does. Some know that it is used to compile lambda expressions and method references in Java 8. Some have heard that it is used for string concatenation in Java 9. These are useful applications of indy, but its original purpose is slightly different: to make a dynamic call, where the same place in the code can invoke different code at different times. Neither lambdas nor string concatenation use this capability: there the behavior is generated on the first call and stays fixed until the program exits (a ConstantCallSite is always used). Let's see what else can be done.


Suppose we want to write a method that multiplies two long values and returns a BigInteger. What could be hard about that? It is a one-liner:


    static BigInteger multiplyNaive(long l1, long l2) {
        return BigInteger.valueOf(l1).multiply(BigInteger.valueOf(l2));
    }

We benchmarked it and saw that it takes, say, 40 nanoseconds. That looks like a lot of overhead. In practice the product of two longs very often fits in a long as well. In such cases, why create two full-fledged BigInteger objects and multiply them when we could multiply first and then wrap the result in a single BigInteger? Something like this:


    static BigInteger multiplyIncorrect(long l1, long l2) {
        return BigInteger.valueOf(l1 * l2);
    }

This version runs about twice as fast, in roughly 20 nanoseconds. But what if it is wrong? If an overflow does occur, the result is silently lost. Can we check whether an overflow happened? It turns out we can. The Math.multiplyExact method multiplies but throws an exception on overflow. Its Java implementation is not entirely trivial, but there is no need to look at it. It is in fact a JVM intrinsic: the JIT compiler can turn the call into a short sequence of machine instructions. On x86 these are imul (multiply) and jo (jump on overflow), and what happens after the jump depends on how we handle the exception. The point is that when there is no overflow, the check costs us almost nothing. Let's write it like this:


    static BigInteger multiplyOverflow(long l1, long l2) {
        try {
            return BigInteger.valueOf(Math.multiplyExact(l1, l2));
        } catch (ArithmeticException e) {
            return BigInteger.valueOf(l1).multiply(BigInteger.valueOf(l2));
        }
    }

If we feed it only small numbers, we get the coveted 20 nanoseconds. Great. But if we feed it large numbers, it costs not 20 or even 40 nanoseconds but about 20 thousand. Exceptions come at a steep price.
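For reference, the silent wraparound of plain multiplication and the checked behavior of Math.multiplyExact can be observed directly. This is a minimal sketch; the class name OverflowDemo and the helper overflows are made up for illustration and do not appear in the original code.

```java
import java.math.BigInteger;

final class OverflowDemo {
    /** Returns true when a * b does not fit in a long, using Math.multiplyExact as the check. */
    static boolean overflows(long a, long b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long big = 1L << 32;
        System.out.println(overflows(3, 4));     // small product: fits in a long
        System.out.println(overflows(big, big)); // 2^64 does not fit in a long
        System.out.println(big * big);           // plain multiplication wraps around silently
        // The exact product needs BigInteger arithmetic
        System.out.println(BigInteger.valueOf(big).multiply(BigInteger.valueOf(big)));
    }
}
```

Note that this helper pays the full exception price on every overflow, which is exactly the cost measured above.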


So let's start with the fast implementation, and if an overflow happens even once, switch to the slow one:


    private static boolean fast = true;

    static BigInteger multiplySwitch(long a, long b) {
        if (fast) {
            try {
                return BigInteger.valueOf(Math.multiplyExact(a, b));
            } catch (ArithmeticException ex) {
                fast = false;
            }
        }
        return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
    }

Great: this gives us 20 nanoseconds on small numbers and 40 nanoseconds on large ones. But a benchmark is not a real application. In a real application you multiply in many different places. Most likely overflow never happens in most of them and happens only in a few. For example, you have the following code:


 return multiplySwitch(bigNum, bigNum).add(multiplySwitch(smallNum, smallNum)); 

By the program's logic, the second call always gets small numbers, while the first often gets large ones. Yet our multiplier will switch to the slow implementation in both places, which is not very nice.


Let's make the flag non-static by wrapping the multiplier in an object:


    class DynamicMultiplier {
        private boolean fast = true;

        BigInteger multiply(long a, long b) {
            if (fast) {
                try {
                    return BigInteger.valueOf(Math.multiplyExact(a, b));
                } catch (ArithmeticException ex) {
                    fast = false;
                }
            }
            return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
        }
    }

Then we can create a static field for each multiplication in the code, and each one will track whether overflows happen at that particular place:


    static final DynamicMultiplier DYNAMIC1 = new DynamicMultiplier();
    static final DynamicMultiplier DYNAMIC2 = new DynamicMultiplier();

    return DYNAMIC1.multiply(bigNum, bigNum).add(DYNAMIC2.multiply(smallNum, smallNum));
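To make the per-site behavior visible, here is a small self-contained sketch. It repeats the DynamicMultiplier logic and adds an isFast() accessor, which is a hypothetical helper introduced only so the state can be inspected. The wrapper class PerSiteDemo is likewise an assumption.

```java
import java.math.BigInteger;

final class PerSiteDemo {
    /** Same logic as the DynamicMultiplier in the text, plus an accessor for inspection. */
    static final class DynamicMultiplier {
        private boolean fast = true;

        BigInteger multiply(long a, long b) {
            if (fast) {
                try {
                    return BigInteger.valueOf(Math.multiplyExact(a, b));
                } catch (ArithmeticException ex) {
                    fast = false; // this instance stays on the slow path from now on
                }
            }
            return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
        }

        // Hypothetical helper, not part of the original class
        boolean isFast() {
            return fast;
        }
    }

    public static void main(String[] args) {
        DynamicMultiplier bigSite = new DynamicMultiplier();
        DynamicMultiplier smallSite = new DynamicMultiplier();
        bigSite.multiply(Long.MAX_VALUE, 2); // overflows, so only this instance switches
        smallSite.multiply(3, 4);            // never overflows
        System.out.println(bigSite.isFast());
        System.out.println(smallSite.isFast());
    }
}
```

Each instance carries its own flag, so an overflow at one place in the code no longer slows down the others.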

We are already close to the goal. This implementation has its inconveniences: a separate static field has to be created for every multiplication call. Besides, we would rather not initialize these fields before they are actually used. What if the method containing the multiplications is never executed? Then each field would need lazy initialization (preferably thread-safe, too). This is roughly what invokedynamic does for us: it associates a hidden static field with each call and makes sure it is initialized lazily and thread-safely. This field has a special type, the call site (CallSite). By and large it is simply a reference to the target code to execute, represented by a MethodHandle. But if the call site is mutable, it can replace this MethodHandle whenever it wants. A mutable call site can be created with the MutableCallSite class (or VolatileCallSite, if you need guarantees that changes are visible to other threads). It is convenient to extend one of these classes to provide the desired behavior. Let's write our own call site to solve our problem. It is somewhat verbose, but let's try:


    static class MultiplyCallSite extends MutableCallSite {
        // Type: takes two longs, returns a BigInteger
        static final MethodType TYPE = MethodType.methodType(BigInteger.class, long.class, long.class);
        private static final MethodHandle FAST;
        private static final MethodHandle SLOW;

        static {
            try {
                FAST = MethodHandles.lookup().findVirtual(MultiplyCallSite.class, "fast", TYPE);
                SLOW = MethodHandles.lookup().findStatic(MultiplyCallSite.class, "slow", TYPE);
            } catch (NoSuchMethodException | IllegalAccessException e) {
                throw new InternalError(e); // should never happen
            }
        }

        MultiplyCallSite(MethodType type) {
            super(type);
            // Start with FAST, bound to this call site
            setTarget(FAST.bindTo(this).asType(type));
        }

        BigInteger fast(long a, long b) {
            try {
                return BigInteger.valueOf(Math.multiplyExact(a, b));
            } catch (ArithmeticException ex) {
                // Switch to the slow implementation: SLOW is static, so no binding to this is needed
                setTarget(SLOW.asType(type()));
                return slow(a, b);
            }
        }

        static BigInteger slow(long a, long b) {
            return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
        }
    }

The asType() conversions come in handy if the expression type at the call site does not exactly match our type (for example, int arguments are passed instead of long). In principle, we can then use this class without indy, just as we did above:


    static final MethodHandle MULTIPLIER1 = new MultiplyCallSite(MultiplyCallSite.TYPE).dynamicInvoker();
    static final MethodHandle MULTIPLIER2 = new MultiplyCallSite(MultiplyCallSite.TYPE).dynamicInvoker();

    try {
        return ((BigInteger) MULTIPLIER1.invokeExact(bigNum, bigNum))
                .add((BigInteger) MULTIPLIER2.invokeExact(smallNum, smallNum));
    } catch (Throwable throwable) {
        throw new InternalError(throwable); // should never happen
    }

Here dynamicInvoker is a dynamic MethodHandle that fetches the current target from the call site on every call. Despite the verbosity, all this runs as fast as the previous example with DynamicMultiplier, because the JIT compiler knows a lot about MethodHandles and inlines through them very well.
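The retargeting mechanism itself can be shown in isolation with the standard java.lang.invoke API. The sketch below is a stripped-down illustration, not the multiplier from the text: the class RetargetDemo and the methods first() and second() are made-up names. It is single-threaded, so MutableCallSite's weaker visibility guarantees do not matter here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class RetargetDemo {
    static String first() {
        return "first";
    }

    static String second() {
        return "second";
    }

    /** Calls the same dynamic invoker before and after retargeting the call site. */
    static String[] run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle first = lookup.findStatic(RetargetDemo.class, "first", type);
            MethodHandle second = lookup.findStatic(RetargetDemo.class, "second", type);

            MutableCallSite site = new MutableCallSite(first);
            // The invoker is obtained once and always delegates to the site's current target
            MethodHandle invoker = site.dynamicInvoker();

            String before = (String) invoker.invokeExact();
            site.setTarget(second); // swap the code behind the same invoker
            String after = (String) invoker.invokeExact();
            return new String[] {before, after};
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```

The invoker handle never changes, yet the second call runs different code. An indy instruction gives the same effect without keeping the invoker in an explicit field.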


But where is our indy? The catch is that even in Java 9 you cannot write a Java program that produces an arbitrary indy instruction in the bytecode. Indy is used only in the very specific places we have already mentioned: lambdas, method references, string concatenation. Our MultiplyCallSite can be used with indy, but only if we generate the bytecode with a library such as ASM. Plain Java code will not do.


This is what JEP 303 aims at: letting people use indy anywhere and in any way, and also load objects such as MethodHandle with a single ldc bytecode instruction. For this purpose an Intrinsics class has been created, which the javac compiler treats specially. These are bytecode intrinsics (the method call is replaced with a specific bytecode instruction). Do not confuse them with JIT compiler intrinsics (where the method call is replaced with machine instructions). Helper classes implementing the Constable interface have also been created: for a call to collapse into a single bytecode instruction, all of its arguments must be composed of such Constables and be known at compile time.
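The prototype classes named here exist only in the Amber branch, so they cannot be tried on a stock JDK. To get a feel for describing a method handle nominally, the sketch below uses a different, released API: the java.lang.constant package available in JDK 12 and later. It is an illustration of the idea under that assumption, not the JEP 303 prototype API, and the class name DescriptorDemo is made up. It resolves the descriptor at run time instead of emitting an ldc.

```java
import java.lang.constant.ClassDesc;
import java.lang.constant.ConstantDescs;
import java.lang.constant.DirectMethodHandleDesc;
import java.lang.constant.MethodHandleDesc;
import java.lang.constant.MethodTypeDesc;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

final class DescriptorDemo {
    // Nominal description of the method type (long, long) -> long
    static final MethodTypeDesc TYPE =
            MethodTypeDesc.of(ConstantDescs.CD_long, ConstantDescs.CD_long, ConstantDescs.CD_long);

    // Nominal description of the static method Math.multiplyExact(long, long)
    static final DirectMethodHandleDesc MULTIPLY_EXACT = MethodHandleDesc.ofMethod(
            DirectMethodHandleDesc.Kind.STATIC,
            ClassDesc.of("java.lang.Math"),
            "multiplyExact",
            TYPE);

    /** Resolves the descriptor to a real MethodHandle and invokes it. */
    static long resolveAndCall(long a, long b) {
        try {
            MethodHandle handle =
                    (MethodHandle) MULTIPLY_EXACT.resolveConstantDesc(MethodHandles.lookup());
            return (long) handle.invokeExact(a, b);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(TYPE.descriptorString());
        System.out.println(resolveAndCall(6, 7));
    }
}
```

The descriptors are built from names and strings only, with no reference to the live classes, which is the property that lets a compiler fold them into constant pool entries.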


Using ldc, by the way, simplifies our MultiplyCallSite:


    static class MultiplyCallSite extends MutableCallSite {
        // The type is now a MethodTypeConstant (J = long)
        private static final MethodTypeConstant TYPE = MethodTypeConstant.of(
                ClassConstant.of("Ljava/math/BigInteger;"),
                ClassConstant.of("J"),
                ClassConstant.of("J"));
        // The class is named by its descriptor rather than through MultiplyCallSite.class
        private static final ClassConstant ME = ClassConstant.of("LIndyTest$MultiplyCallSite;");
        // One ldc each, no static initializer with lookups
        private static final MethodHandle FAST = Intrinsics.ldc(MethodHandleConstant.ofVirtual(
                ME, "fast", TYPE));
        private static final MethodHandle SLOW = Intrinsics.ldc(MethodHandleConstant.ofStatic(
                ME, "slow", TYPE));
        ...
    }

Since some MethodHandle objects can be referenced directly from the class file's constant pool, Intrinsics.ldc simply generates such a constant and loads it with the ldc instruction. We still need a bootstrap method that constructs our call site:


    public static CallSite multiplyFactory(MethodHandles.Lookup lookup, String name, MethodType type) {
        return new MultiplyCallSite(type);
    }
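What the JVM does with a bootstrap method can be imitated by hand using only the standard API. The sketch below is an illustration under simplifying assumptions: the class BootstrapDemo, its bootstrap method and the helper linkAndCall are made-up names, and the call site is a ConstantCallSite rather than the MultiplyCallSite from the text. It calls the bootstrap method once with the type of an imaginary (int, int) call site and then invokes the resulting target.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.math.BigInteger;

final class BootstrapDemo {
    static BigInteger slow(long a, long b) {
        return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
    }

    /** Bootstrap method: receives the call site's type and returns the CallSite to link. */
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = lookup.findStatic(BootstrapDemo.class, "slow",
                MethodType.methodType(BigInteger.class, long.class, long.class));
        // asType adapts (long, long) to whatever the call site actually passes, e.g. (int, int)
        return new ConstantCallSite(target.asType(type));
    }

    /** Imitates linking an indy instruction whose arguments are two ints. */
    static BigInteger linkAndCall(int a, int b) {
        try {
            MethodType siteType = MethodType.methodType(BigInteger.class, int.class, int.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "foo", siteType);
            return (BigInteger) site.dynamicInvoker().invokeExact(a, b);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(linkAndCall(Integer.MAX_VALUE, Integer.MAX_VALUE));
    }
}
```

With a real indy instruction the JVM makes the bootstrap call itself, once, the first time the instruction is executed, and caches the returned call site.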

It is also convenient to create a BootstrapSpecifier constant that points to it:


    public static final BootstrapSpecifier MULT = BootstrapSpecifier.of(MethodHandleConstant.ofStatic(
            ClassConstant.of("LIndyTest;"), "multiplyFactory",
            // the method type can also be given as a single descriptor string
            MethodTypeConstant.of("(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;")));

In essence, this MULT constant is all a user needs to know about the library code. Everything else is implementation detail that need not concern you. Now for the main event: finally generating an indy instruction!


    try {
        // the name "foo" is not used by our bootstrap method, so it can be anything
        return ((BigInteger) Intrinsics.invokedynamic(MULT, "foo", bigNum, bigNum))
                .add((BigInteger) Intrinsics.invokedynamic(MULT, "foo", smallNum, smallNum));
    } catch (Throwable throwable) {
        throw new InternalError(throwable); // should never happen
    }

And it really works! The patched compiler replaces each call with an indy instruction, and we get the same result, but without additional explicit static fields.


Admittedly, it does not look very pretty yet. But once user code has been compiled with indy, you can change the library implementation as much as you like. For example, you can add an intermediate implementation that detects overflow without an exception (slower when there is no overflow, but much faster when there is). You can then collect statistics and switch to it if a given call site often sees both small and large numbers. You can also optimize for the call site signature. Say, if a method is actually called somewhere with (int, int) arguments, you know that a long overflow is impossible. For such a signature you can return a ConstantCallSite that simply multiplies the two values without any overflow checks. These changes can be made after the library has been published, and everything compiled earlier will run faster.
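Both ideas can be sketched with the standard API. The code below is an illustration, not the author's implementation: the class SpecializedDemo and all method names in it are assumptions, and it needs Java 9 or later for Math.multiplyHigh. The bootstrap method returns a check-free ConstantCallSite for the (int, int) signature and otherwise falls back to a multiplication that detects overflow without throwing.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.math.BigInteger;

final class SpecializedDemo {
    private static final MethodType INT_TYPE =
            MethodType.methodType(BigInteger.class, int.class, int.class);
    private static final MethodType LONG_TYPE =
            MethodType.methodType(BigInteger.class, long.class, long.class);

    /** The product of two ints always fits in a long, so no check is needed. */
    static BigInteger multiplyInts(int a, int b) {
        return BigInteger.valueOf((long) a * b);
    }

    /** Detects overflow without exceptions by comparing the high and low halves of the product. */
    static BigInteger multiplyChecked(long a, long b) {
        long low = a * b;
        long high = Math.multiplyHigh(a, b);
        // The product fits in a long iff the high half is just the sign extension of the low half
        if (high == (low >> 63)) {
            return BigInteger.valueOf(low);
        }
        return BigInteger.valueOf(a).multiply(BigInteger.valueOf(b));
    }

    /** Picks an implementation based on the signature of the call site being linked. */
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        if (type.equals(INT_TYPE)) {
            return new ConstantCallSite(
                    lookup.findStatic(SpecializedDemo.class, "multiplyInts", INT_TYPE));
        }
        MethodHandle general = lookup.findStatic(SpecializedDemo.class, "multiplyChecked", LONG_TYPE);
        return new ConstantCallSite(general.asType(type));
    }

    /** Imitates an (int, int) call site linked through the bootstrap method. */
    static BigInteger callInts(int a, int b) {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "foo", INT_TYPE);
            return (BigInteger) site.dynamicInvoker().invokeExact(a, b);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    /** Imitates a (long, long) call site linked through the bootstrap method. */
    static BigInteger callLongs(long a, long b) {
        try {
            CallSite site = bootstrap(MethodHandles.lookup(), "foo", LONG_TYPE);
            return (BigInteger) site.dynamicInvoker().invokeExact(a, b);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Callers see only the bootstrap method, so a library could start with one strategy and later ship another without recompiling user code.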


To experiment with this API, you will have to clone the Amber hg forest yourself, switch to the constant-folding branch, and build OpenJDK via configure and make (the build instructions are here). After building, run javac with the -XDdoConstantFold option.


Perhaps this API interests you, but experimenting with it now seems like a waste of time. The API is clearly raw, requires a pile of boilerplate, and will obviously change ten more times. Maybe it is better to wait until everything settles down? No, that approach is wrong. If the API interests you, experiment now, because right now you can influence exactly how it changes those ten times. Try it now, and if you have ideas or comments, write to amber-dev. If you show up in a couple of years, nobody will want to change anything.



Source: https://habr.com/ru/post/328240/

