public double octaPow(double a) {
    return Math.pow(a, 8);
}

public double octaPow(double a) {
    return a * a * a * a * a * a * a * a;
}

public double octaPow(double a) {
    return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
}

public double octaPow(double a) {
    a *= a;
    a *= a;
    return a * a;
}
Determine which of these methods are fast and which are slow (JRE 1.8.0_161). Below the cut: a benchmark, pieces of assembly, and an analysis of the JVM's optimizations.
Comparing double values with == in most cases does not make sense.

public static void main(String[] args) {
    double value = 1e15;
    double delta = 0.0001;
    System.out.println(value + delta == value); // true

    double a = 1.010101;
    double b = 101.0101;
    double c = 10101.01;
    System.out.println((a * b) * c != a * (b * c)); // true
}
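Since exact equality is unreliable, one common alternative is to compare doubles with a tolerance expressed in ULPs (units in the last place). The following is only a minimal sketch: the class name DoubleCompare, the helper nearlyEqual, and the tolerance of 4 ULPs are illustrative assumptions, not something taken from the benchmark.

```java
public class DoubleCompare {

    /**
     * Returns true if a and b differ by at most maxUlps units in the last
     * place, measured at the larger of the two magnitudes.
     * (Hypothetical helper, written for illustration only.)
     */
    static boolean nearlyEqual(double a, double b, int maxUlps) {
        if (Double.isNaN(a) || Double.isNaN(b)) {
            return false; // NaN is never equal to anything
        }
        if (a == b) {
            return true; // exact match, including equal infinities and +0.0 / -0.0
        }
        if (Double.isInfinite(a) || Double.isInfinite(b)) {
            return false; // an infinity is only "near" the same infinity
        }
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= maxUlps * Math.ulp(scale);
    }

    public static void main(String[] args) {
        double a = 1.010101;
        double b = 101.0101;
        double c = 10101.01;
        double left = (a * b) * c;
        double right = a * (b * c);
        System.out.println("exact equality:    " + (left == right));
        System.out.println("within 4 ULPs:     " + nearlyEqual(left, right, 4));
    }
}
```

The right tolerance depends on how many rounding steps separate the two values, so the constant should be chosen per use case rather than copied.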
All options are equally fast, because Java has a cool JIT compiler!
The second or fourth option is the fastest, because it is just simple multiplication.
public double mathOctaPow(double a) {
    return Math.pow(a, 8);
}

public double plainOctaPow(double a) {
    return a * a * a * a * a * a * a * a;
}

public double trickyMathOctaPow(double a) {
    return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
}

public double trickyPlainOctaPow(double a) {
    a *= a;
    a *= a;
    return a * a;
}
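Before comparing speed, it can be useful to check how closely the four variants agree numerically, since each one rounds at different points. The sketch below is an assumption-laden illustration: the class OctaPowCheck, the helper ulpDistance, and the sample input 1.2345 are made up for this example, and the four methods are copied as static methods for convenience.

```java
public class OctaPowCheck {

    static double mathOctaPow(double a) {
        return Math.pow(a, 8);
    }

    static double plainOctaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    static double trickyMathOctaPow(double a) {
        return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
    }

    static double trickyPlainOctaPow(double a) {
        a *= a;
        a *= a;
        return a * a;
    }

    /**
     * Number of representable doubles between x and y.
     * Only meaningful for finite values of the same sign
     * (hypothetical helper, for illustration).
     */
    static long ulpDistance(double x, double y) {
        return Math.abs(Double.doubleToLongBits(x) - Double.doubleToLongBits(y));
    }

    public static void main(String[] args) {
        double a = 1.2345; // arbitrary positive sample input
        double reference = plainOctaPow(a);
        System.out.println("mathOctaPow:        " + ulpDistance(reference, mathOctaPow(a)) + " ULPs");
        System.out.println("trickyMathOctaPow:  " + ulpDistance(reference, trickyMathOctaPow(a)) + " ULPs");
        System.out.println("trickyPlainOctaPow: " + ulpDistance(reference, trickyPlainOctaPow(a)) + " ULPs");
    }
}
```

The variants are expected to stay within a handful of ULPs of each other, but they are not guaranteed to be bit-identical.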
The disassembled code is printed using the following set of JVM options:
-XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,<>.<> -XX:PrintAssemblyOptions=intel
plainOctaPow
Let's start with plainOctaPow. In fact, the code a * a * a * a * a * a * a * a is evaluated as ((((((a * a) * a) * a) * a) * a) * a) * a and is compiled into the following instructions (the double a parameter is in the xmm0 register):

0x0000000002c96a3e: vmovapd xmm1, xmm0
0x0000000002c96a42: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a46: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a4a: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a4e: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a52: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a56: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a5a: vmulsd xmm1, xmm1, xmm0
0x0000000002c96a5e: vmovapd xmm0, xmm1
vmovapd xmm1, xmm2 - copies aligned packed double-precision floating-point values (from here on simply called double) from the xmm2 register to the xmm1 register. Since the XMM registers are 128 bits wide, it moves two doubles at a time. The instruction also supports the YMM and ZMM registers, which are 256 and 512 bits wide, respectively.
vmulsd xmm1, xmm2, xmm3 - multiplies the low double of xmm2 by the low double of xmm3 and places the result in the low double of the xmm1 register. Unlike the previous instruction, it is scalar: it multiplies exactly one pair of doubles. Its packed counterpart, vmulpd, multiplies two doubles at once in XMM registers and four or eight in YMM and ZMM registers, respectively.

As a result, we get seven consecutive multiplications by a. In this case the compiler cannot break the left-associativity (floating-point multiplication is not associative), so it cannot optimize the resulting code in any way.
trickyPlainOctaPow
The trickyPlainOctaPow() method is, as expected, compiled into the following set of instructions:

0x0000000002b501be: vmovapd xmm1, xmm0
0x0000000002b501c2: vmulsd xmm1, xmm1, xmm0
0x0000000002b501c6: vmovapd xmm0, xmm1
0x0000000002b501ca: vmulsd xmm0, xmm0, xmm1
0x0000000002b501ce: vmovapd xmm1, xmm0
0x0000000002b501d2: vmulsd xmm1, xmm1, xmm0
0x0000000002b501d6: vmovapd xmm0, xmm1

Only three multiplications remain, plus the vmovapd instructions that prepare the second operand of each multiplication. The resulting code is approximately twice as fast if you count the notional latency of each instruction.
mathOctaPow
Let's look at the Math.pow() method:

public static double pow(double a, double b) {
    return StrictMath.pow(a, b);
}

Note that the exponent is an arbitrary double. For this reason, the implementation of the function can no longer be as simple as with ordinary multiplication.
StrictMath.pow() is a native method:

public static native double pow(double a, double b);

So, formally, a call to Math.pow() comes down to calling a native method through JNI, which, as you know, is expensive. On the other hand, intrinsic functions are widely used in the JDK (see the full list of intrinsics in HotSpot). Among them is _dpow, an intrinsic that replaces the call to Math.pow().
This is the code that the JIT compiler generates for the mathOctaPow()
method: 0x0000000002aaacd0: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff68] # 0x0000000002aaac40 ; {section_word} 0x0000000002aaacd8: vmovsd QWORD PTR [rsp],xmm1 0x0000000002aaacdd: fld QWORD PTR [rsp] 0x0000000002aaace0: vmovsd QWORD PTR [rsp],xmm0 0x0000000002aaace5: fld QWORD PTR [rsp] 0x0000000002aaace8: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002aaacf2: fld QWORD PTR [rax] 0x0000000002aaacf4: fucomip st,st(2) 0x0000000002aaacf6: jp 0x0000000002aaad0f 0x0000000002aaacfc: jne 0x0000000002aaad0f 0x0000000002aaad02: fxch st(1) 0x0000000002aaad04: ffree st(0) 0x0000000002aaad06: fincstp 0x0000000002aaad08: fmul st,st(0) 0x0000000002aaad0a: jmp 0x0000000002aab166 0x0000000002aaad0f: fldz 0x0000000002aaad11: fucomip st,st(1) 0x0000000002aaad13: ja 0x0000000002aaad96 0x0000000002aaad19: fld st(1) 0x0000000002aaad1b: fld st(1) 0x0000000002aaad1d: sub rsp,0x8 0x0000000002aaad21: fstcw WORD PTR [rsp] 0x0000000002aaad25: mov eax,DWORD PTR [rsp] 0x0000000002aaad28: or eax,0x300 0x0000000002aaad2e: push rax 0x0000000002aaad2f: fldcw WORD PTR [rsp] 0x0000000002aaad32: pop rax 0x0000000002aaad33: fyl2x 0x0000000002aaad35: sub rsp,0x8 0x0000000002aaad39: fld st(0) 0x0000000002aaad3b: frndint 0x0000000002aaad3d: fsubr st(1),st 0x0000000002aaad3f: fistp DWORD PTR [rsp] 0x0000000002aaad42: f2xm1 0x0000000002aaad44: fld1 0x0000000002aaad46: faddp st(1),st 0x0000000002aaad48: mov eax,DWORD PTR [rsp] 0x0000000002aaad4b: mov ecx,0xfffff800 0x0000000002aaad50: add eax,0x3ff 0x0000000002aaad56: mov edx,eax 0x0000000002aaad58: shl eax,0x14 0x0000000002aaad5b: add edx,0x1 0x0000000002aaad5e: cmove eax,ecx 0x0000000002aaad61: cmp edx,0x1 0x0000000002aaad64: cmove eax,ecx 0x0000000002aaad67: test ecx,edx 0x0000000002aaad69: cmovne eax,ecx 0x0000000002aaad6c: mov DWORD PTR [rsp+0x4],eax 0x0000000002aaad70: mov DWORD PTR [rsp],0x0 0x0000000002aaad77: fmul QWORD PTR [rsp] 0x0000000002aaad7a: add rsp,0x8 0x0000000002aaad7e: fldcw WORD PTR [rsp] 0x0000000002aaad81: add rsp,0x8 
0x0000000002aaad85: fucomi st,st(0) 0x0000000002aaad87: jp 0x0000000002aaae36 0x0000000002aaad8d: ffree st(2) 0x0000000002aaad8f: ffree st(1) 0x0000000002aaad91: jmp 0x0000000002aab166 0x0000000002aaad96: fld st(1) 0x0000000002aaad98: frndint 0x0000000002aaad9a: fucomi st,st(2) 0x0000000002aaad9c: jne 0x0000000002aaae36 0x0000000002aaada2: sub rsp,0x8 0x0000000002aaada6: fistp QWORD PTR [rsp] 0x0000000002aaada9: fld st(1) 0x0000000002aaadab: fld st(1) 0x0000000002aaadad: fabs 0x0000000002aaadaf: sub rsp,0x8 0x0000000002aaadb3: fstcw WORD PTR [rsp] 0x0000000002aaadb7: mov eax,DWORD PTR [rsp] 0x0000000002aaadba: or eax,0x300 0x0000000002aaadc0: push rax 0x0000000002aaadc1: fldcw WORD PTR [rsp] 0x0000000002aaadc4: pop rax 0x0000000002aaadc5: fyl2x 0x0000000002aaadc7: sub rsp,0x8 0x0000000002aaadcb: fld st(0) 0x0000000002aaadcd: frndint 0x0000000002aaadcf: fsubr st(1),st 0x0000000002aaadd1: fistp DWORD PTR [rsp] 0x0000000002aaadd4: f2xm1 0x0000000002aaadd6: fld1 0x0000000002aaadd8: faddp st(1),st 0x0000000002aaadda: mov eax,DWORD PTR [rsp] 0x0000000002aaaddd: mov ecx,0xfffff800 0x0000000002aaade2: add eax,0x3ff 0x0000000002aaade8: mov edx,eax 0x0000000002aaadea: shl eax,0x14 0x0000000002aaaded: add edx,0x1 0x0000000002aaadf0: cmove eax,ecx 0x0000000002aaadf3: cmp edx,0x1 0x0000000002aaadf6: cmove eax,ecx 0x0000000002aaadf9: test ecx,edx 0x0000000002aaadfb: cmovne eax,ecx 0x0000000002aaadfe: mov DWORD PTR [rsp+0x4],eax 0x0000000002aaae02: mov DWORD PTR [rsp],0x0 0x0000000002aaae09: fmul QWORD PTR [rsp] 0x0000000002aaae0c: add rsp,0x8 0x0000000002aaae10: fldcw WORD PTR [rsp] 0x0000000002aaae13: add rsp,0x8 0x0000000002aaae17: fucomi st,st(0) 0x0000000002aaae19: pop rax 0x0000000002aaae1a: jp 0x0000000002aaae36 0x0000000002aaae20: ffree st(2) 0x0000000002aaae22: ffree st(1) 0x0000000002aaae24: test eax,0x1 0x0000000002aaae29: je 0x0000000002aab166 0x0000000002aaae2f: fchs 0x0000000002aaae31: jmp 0x0000000002aab166 0x0000000002aaae36: ffree st(0) 0x0000000002aaae38: 
fincstp 0x0000000002aaae3a: mov QWORD PTR [rsp-0x28],rsp 0x0000000002aaae3f: sub rsp,0x80 0x0000000002aaae46: mov QWORD PTR [rsp+0x78],rax 0x0000000002aaae4b: mov QWORD PTR [rsp+0x70],rcx 0x0000000002aaae50: mov QWORD PTR [rsp+0x68],rdx 0x0000000002aaae55: mov QWORD PTR [rsp+0x60],rbx 0x0000000002aaae5a: mov QWORD PTR [rsp+0x50],rbp 0x0000000002aaae5f: mov QWORD PTR [rsp+0x48],rsi 0x0000000002aaae64: mov QWORD PTR [rsp+0x40],rdi 0x0000000002aaae69: mov QWORD PTR [rsp+0x38],r8 0x0000000002aaae6e: mov QWORD PTR [rsp+0x30],r9 0x0000000002aaae73: mov QWORD PTR [rsp+0x28],r10 0x0000000002aaae78: mov QWORD PTR [rsp+0x20],r11 0x0000000002aaae7d: mov QWORD PTR [rsp+0x18],r12 0x0000000002aaae82: mov QWORD PTR [rsp+0x10],r13 0x0000000002aaae87: mov QWORD PTR [rsp+0x8],r14 0x0000000002aaae8c: mov QWORD PTR [rsp],r15 0x0000000002aaae90: sub rsp,0x100 0x0000000002aaae97: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002aaae9e: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002aaaea6: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002aaaeae: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002aaaeb6: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002aaaebe: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002aaaec6: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002aaaece: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002aaaed6: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002aaaee1: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002aaaeec: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002aaaef7: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002aaaf02: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002aaaf0d: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002aaaf18: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002aaaf23: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002aaaf2e: sub rsp,0x100 0x0000000002aaaf35: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002aaaf3a: 
vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002aaaf40: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002aaaf46: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002aaaf4c: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002aaaf52: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002aaaf58: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002aaaf5e: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002aaaf64: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002aaaf6d: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002aaaf76: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002aaaf7f: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002aaaf88: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002aaaf91: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002aaaf9a: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002aaafa3: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002aaafac: sub rsp,0x10 0x0000000002aaafb0: fstp QWORD PTR [rsp] 0x0000000002aaafb3: fstp QWORD PTR [rsp+0x8] 0x0000000002aaafb7: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002aaafbc: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002aaafc2: sub rsp,0x20 0x0000000002aaafc6: test esp,0xf 0x0000000002aaafcc: je 0x0000000002aaafe4 0x0000000002aaafd2: sub rsp,0x8 0x0000000002aaafd6: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002aaafdb: add rsp,0x8 0x0000000002aaafdf: jmp 0x0000000002aaafe9 0x0000000002aaafe4: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002aaafe9: add rsp,0x20 0x0000000002aaafed: vmovsd QWORD PTR [rsp],xmm0 0x0000000002aaaff2: fld QWORD PTR [rsp] 0x0000000002aaaff5: add rsp,0x10 0x0000000002aaaff9: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002aaaffe: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002aab004: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002aab00a: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002aab010: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002aab016: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002aab01c: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002aab022: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002aab028: 
vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002aab031: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002aab03a: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002aab043: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002aab04c: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002aab055: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002aab05e: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002aab067: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002aab070: add rsp,0x100 0x0000000002aab077: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002aab07e: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002aab086: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002aab08e: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002aab096: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002aab09e: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002aab0a6: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002aab0ae: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002aab0b6: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002aab0c1: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002aab0cc: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002aab0d7: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002aab0e2: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002aab0ed: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002aab0f8: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002aab103: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002aab10e: add rsp,0x100 0x0000000002aab115: mov r15,QWORD PTR [rsp] 0x0000000002aab119: mov r14,QWORD PTR [rsp+0x8] 0x0000000002aab11e: mov r13,QWORD PTR [rsp+0x10] 0x0000000002aab123: mov r12,QWORD PTR [rsp+0x18] 0x0000000002aab128: mov r11,QWORD PTR [rsp+0x20] 0x0000000002aab12d: mov r10,QWORD PTR [rsp+0x28] 0x0000000002aab132: mov r9,QWORD PTR [rsp+0x30] 0x0000000002aab137: mov r8,QWORD PTR [rsp+0x38] 
0x0000000002aab13c: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002aab141: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002aab146: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002aab14b: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002aab150: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002aab155: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002aab15a: mov rax,QWORD PTR [rsp+0x78] 0x0000000002aab15f: add rsp,0x80 0x0000000002aab166: fstp QWORD PTR [rsp] 0x0000000002aab169: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::mathOctaPow@4 (line 55)
The first instruction loads the constant 8.0 into the xmm1 register, and the xmm0 register already contains the value a. Next comes the body of the intrinsic function.
trickyMathOctaPow
Instead of a single Math.pow() call, we now have as many as three. The JIT compiler replaced the body of the trickyMathOctaPow()
method with three consecutive _dpow implementations. 0x0000000002a70b14: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff44] # 0x0000000002a70a60 ; {section_word} 0x0000000002a70b1c: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a70b21: fld QWORD PTR [rsp] 0x0000000002a70b24: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70b29: fld QWORD PTR [rsp] 0x0000000002a70b2c: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a70b36: fld QWORD PTR [rax] 0x0000000002a70b38: fucomip st,st(2) 0x0000000002a70b3a: jp 0x0000000002a70b53 0x0000000002a70b40: jne 0x0000000002a70b53 0x0000000002a70b46: fxch st(1) 0x0000000002a70b48: ffree st(0) 0x0000000002a70b4a: fincstp 0x0000000002a70b4c: fmul st,st(0) 0x0000000002a70b4e: jmp 0x0000000002a70faa 0x0000000002a70b53: fldz 0x0000000002a70b55: fucomip st,st(1) 0x0000000002a70b57: ja 0x0000000002a70bda 0x0000000002a70b5d: fld st(1) 0x0000000002a70b5f: fld st(1) 0x0000000002a70b61: sub rsp,0x8 0x0000000002a70b65: fstcw WORD PTR [rsp] 0x0000000002a70b69: mov eax,DWORD PTR [rsp] 0x0000000002a70b6c: or eax,0x300 0x0000000002a70b72: push rax 0x0000000002a70b73: fldcw WORD PTR [rsp] 0x0000000002a70b76: pop rax 0x0000000002a70b77: fyl2x 0x0000000002a70b79: sub rsp,0x8 0x0000000002a70b7d: fld st(0) 0x0000000002a70b7f: frndint 0x0000000002a70b81: fsubr st(1),st 0x0000000002a70b83: fistp DWORD PTR [rsp] 0x0000000002a70b86: f2xm1 0x0000000002a70b88: fld1 0x0000000002a70b8a: faddp st(1),st 0x0000000002a70b8c: mov eax,DWORD PTR [rsp] 0x0000000002a70b8f: mov ecx,0xfffff800 0x0000000002a70b94: add eax,0x3ff 0x0000000002a70b9a: mov edx,eax 0x0000000002a70b9c: shl eax,0x14 0x0000000002a70b9f: add edx,0x1 0x0000000002a70ba2: cmove eax,ecx 0x0000000002a70ba5: cmp edx,0x1 0x0000000002a70ba8: cmove eax,ecx 0x0000000002a70bab: test ecx,edx 0x0000000002a70bad: cmovne eax,ecx 0x0000000002a70bb0: mov DWORD PTR [rsp+0x4],eax 0x0000000002a70bb4: mov DWORD PTR [rsp],0x0 0x0000000002a70bbb: fmul QWORD PTR [rsp] 0x0000000002a70bbe: add rsp,0x8 0x0000000002a70bc2: fldcw WORD PTR 
[rsp] 0x0000000002a70bc5: add rsp,0x8 0x0000000002a70bc9: fucomi st,st(0) 0x0000000002a70bcb: jp 0x0000000002a70c7a 0x0000000002a70bd1: ffree st(2) 0x0000000002a70bd3: ffree st(1) 0x0000000002a70bd5: jmp 0x0000000002a70faa 0x0000000002a70bda: fld st(1) 0x0000000002a70bdc: frndint 0x0000000002a70bde: fucomi st,st(2) 0x0000000002a70be0: jne 0x0000000002a70c7a 0x0000000002a70be6: sub rsp,0x8 0x0000000002a70bea: fistp QWORD PTR [rsp] 0x0000000002a70bed: fld st(1) 0x0000000002a70bef: fld st(1) 0x0000000002a70bf1: fabs 0x0000000002a70bf3: sub rsp,0x8 0x0000000002a70bf7: fstcw WORD PTR [rsp] 0x0000000002a70bfb: mov eax,DWORD PTR [rsp] 0x0000000002a70bfe: or eax,0x300 0x0000000002a70c04: push rax 0x0000000002a70c05: fldcw WORD PTR [rsp] 0x0000000002a70c08: pop rax 0x0000000002a70c09: fyl2x 0x0000000002a70c0b: sub rsp,0x8 0x0000000002a70c0f: fld st(0) 0x0000000002a70c11: frndint 0x0000000002a70c13: fsubr st(1),st 0x0000000002a70c15: fistp DWORD PTR [rsp] 0x0000000002a70c18: f2xm1 0x0000000002a70c1a: fld1 0x0000000002a70c1c: faddp st(1),st 0x0000000002a70c1e: mov eax,DWORD PTR [rsp] 0x0000000002a70c21: mov ecx,0xfffff800 0x0000000002a70c26: add eax,0x3ff 0x0000000002a70c2c: mov edx,eax 0x0000000002a70c2e: shl eax,0x14 0x0000000002a70c31: add edx,0x1 0x0000000002a70c34: cmove eax,ecx 0x0000000002a70c37: cmp edx,0x1 0x0000000002a70c3a: cmove eax,ecx 0x0000000002a70c3d: test ecx,edx 0x0000000002a70c3f: cmovne eax,ecx 0x0000000002a70c42: mov DWORD PTR [rsp+0x4],eax 0x0000000002a70c46: mov DWORD PTR [rsp],0x0 0x0000000002a70c4d: fmul QWORD PTR [rsp] 0x0000000002a70c50: add rsp,0x8 0x0000000002a70c54: fldcw WORD PTR [rsp] 0x0000000002a70c57: add rsp,0x8 0x0000000002a70c5b: fucomi st,st(0) 0x0000000002a70c5d: pop rax 0x0000000002a70c5e: jp 0x0000000002a70c7a 0x0000000002a70c64: ffree st(2) 0x0000000002a70c66: ffree st(1) 0x0000000002a70c68: test eax,0x1 0x0000000002a70c6d: je 0x0000000002a70faa 0x0000000002a70c73: fchs 0x0000000002a70c75: jmp 0x0000000002a70faa 0x0000000002a70c7a: 
ffree st(0) 0x0000000002a70c7c: fincstp 0x0000000002a70c7e: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a70c83: sub rsp,0x80 0x0000000002a70c8a: mov QWORD PTR [rsp+0x78],rax 0x0000000002a70c8f: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a70c94: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a70c99: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a70c9e: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a70ca3: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a70ca8: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a70cad: mov QWORD PTR [rsp+0x38],r8 0x0000000002a70cb2: mov QWORD PTR [rsp+0x30],r9 0x0000000002a70cb7: mov QWORD PTR [rsp+0x28],r10 0x0000000002a70cbc: mov QWORD PTR [rsp+0x20],r11 0x0000000002a70cc1: mov QWORD PTR [rsp+0x18],r12 0x0000000002a70cc6: mov QWORD PTR [rsp+0x10],r13 0x0000000002a70ccb: mov QWORD PTR [rsp+0x8],r14 0x0000000002a70cd0: mov QWORD PTR [rsp],r15 0x0000000002a70cd4: sub rsp,0x100 0x0000000002a70cdb: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a70ce2: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a70cea: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a70cf2: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a70cfa: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a70d02: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a70d0a: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a70d12: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a70d1a: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002a70d25: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a70d30: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a70d3b: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a70d46: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a70d51: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a70d5c: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a70d67: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a70d72: sub rsp,0x100 0x0000000002a70d79: vmovdqu XMMWORD PTR 
[rsp],xmm0 0x0000000002a70d7e: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a70d84: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a70d8a: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a70d90: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a70d96: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a70d9c: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a70da2: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a70da8: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a70db1: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a70dba: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a70dc3: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a70dcc: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a70dd5: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a70dde: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a70de7: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002a70df0: sub rsp,0x10 0x0000000002a70df4: fstp QWORD PTR [rsp] 0x0000000002a70df7: fstp QWORD PTR [rsp+0x8] 0x0000000002a70dfb: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a70e00: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a70e06: sub rsp,0x20 0x0000000002a70e0a: test esp,0xf 0x0000000002a70e10: je 0x0000000002a70e28 0x0000000002a70e16: sub rsp,0x8 0x0000000002a70e1a: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a70e1f: add rsp,0x8 0x0000000002a70e23: jmp 0x0000000002a70e2d 0x0000000002a70e28: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a70e2d: add rsp,0x20 0x0000000002a70e31: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70e36: fld QWORD PTR [rsp] 0x0000000002a70e39: add rsp,0x10 0x0000000002a70e3d: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a70e42: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a70e48: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a70e4e: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a70e54: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a70e5a: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a70e60: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a70e66: vmovdqu xmm7,XMMWORD PTR 
[rsp+0x70] 0x0000000002a70e6c: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a70e75: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a70e7e: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a70e87: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a70e90: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a70e99: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a70ea2: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a70eab: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a70eb4: add rsp,0x100 0x0000000002a70ebb: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a70ec2: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a70eca: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a70ed2: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a70eda: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002a70ee2: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a70eea: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a70ef2: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a70efa: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a70f05: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a70f10: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a70f1b: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a70f26: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a70f31: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a70f3c: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a70f47: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a70f52: add rsp,0x100 0x0000000002a70f59: mov r15,QWORD PTR [rsp] 0x0000000002a70f5d: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a70f62: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a70f67: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a70f6c: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a70f71: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a70f76: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a70f7b: 
mov r8,QWORD PTR [rsp+0x38] 0x0000000002a70f80: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a70f85: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a70f8a: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a70f8f: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a70f94: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a70f99: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a70f9e: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a70fa3: add rsp,0x80 0x0000000002a70faa: fstp QWORD PTR [rsp] 0x0000000002a70fad: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@4 (line 63) 0x0000000002a70fb2: vmovsd xmm1,QWORD PTR [rip+0xfffffffffffffaae] # 0x0000000002a70a68 ; {section_word} 0x0000000002a70fba: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a70fbf: fld QWORD PTR [rsp] 0x0000000002a70fc2: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a70fc7: fld QWORD PTR [rsp] 0x0000000002a70fca: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a70fd4: fld QWORD PTR [rax] 0x0000000002a70fd6: fucomip st,st(2) 0x0000000002a70fd8: jp 0x0000000002a70ff1 0x0000000002a70fde: jne 0x0000000002a70ff1 0x0000000002a70fe4: fxch st(1) 0x0000000002a70fe6: ffree st(0) 0x0000000002a70fe8: fincstp 0x0000000002a70fea: fmul st,st(0) 0x0000000002a70fec: jmp 0x0000000002a71448 0x0000000002a70ff1: fldz 0x0000000002a70ff3: fucomip st,st(1) 0x0000000002a70ff5: ja 0x0000000002a71078 0x0000000002a70ffb: fld st(1) 0x0000000002a70ffd: fld st(1) 0x0000000002a70fff: sub rsp,0x8 0x0000000002a71003: fstcw WORD PTR [rsp] 0x0000000002a71007: mov eax,DWORD PTR [rsp] 0x0000000002a7100a: or eax,0x300 0x0000000002a71010: push rax 0x0000000002a71011: fldcw WORD PTR [rsp] 0x0000000002a71014: pop rax 0x0000000002a71015: fyl2x 0x0000000002a71017: sub rsp,0x8 0x0000000002a7101b: fld st(0) 0x0000000002a7101d: frndint 0x0000000002a7101f: fsubr st(1),st 0x0000000002a71021: fistp DWORD PTR [rsp] 0x0000000002a71024: f2xm1 0x0000000002a71026: fld1 0x0000000002a71028: faddp st(1),st 0x0000000002a7102a: mov eax,DWORD 
PTR [rsp] 0x0000000002a7102d: mov ecx,0xfffff800 0x0000000002a71032: add eax,0x3ff 0x0000000002a71038: mov edx,eax 0x0000000002a7103a: shl eax,0x14 0x0000000002a7103d: add edx,0x1 0x0000000002a71040: cmove eax,ecx 0x0000000002a71043: cmp edx,0x1 0x0000000002a71046: cmove eax,ecx 0x0000000002a71049: test ecx,edx 0x0000000002a7104b: cmovne eax,ecx 0x0000000002a7104e: mov DWORD PTR [rsp+0x4],eax 0x0000000002a71052: mov DWORD PTR [rsp],0x0 0x0000000002a71059: fmul QWORD PTR [rsp] 0x0000000002a7105c: add rsp,0x8 0x0000000002a71060: fldcw WORD PTR [rsp] 0x0000000002a71063: add rsp,0x8 0x0000000002a71067: fucomi st,st(0) 0x0000000002a71069: jp 0x0000000002a71118 0x0000000002a7106f: ffree st(2) 0x0000000002a71071: ffree st(1) 0x0000000002a71073: jmp 0x0000000002a71448 0x0000000002a71078: fld st(1) 0x0000000002a7107a: frndint 0x0000000002a7107c: fucomi st,st(2) 0x0000000002a7107e: jne 0x0000000002a71118 0x0000000002a71084: sub rsp,0x8 0x0000000002a71088: fistp QWORD PTR [rsp] 0x0000000002a7108b: fld st(1) 0x0000000002a7108d: fld st(1) 0x0000000002a7108f: fabs 0x0000000002a71091: sub rsp,0x8 0x0000000002a71095: fstcw WORD PTR [rsp] 0x0000000002a71099: mov eax,DWORD PTR [rsp] 0x0000000002a7109c: or eax,0x300 0x0000000002a710a2: push rax 0x0000000002a710a3: fldcw WORD PTR [rsp] 0x0000000002a710a6: pop rax 0x0000000002a710a7: fyl2x 0x0000000002a710a9: sub rsp,0x8 0x0000000002a710ad: fld st(0) 0x0000000002a710af: frndint 0x0000000002a710b1: fsubr st(1),st 0x0000000002a710b3: fistp DWORD PTR [rsp] 0x0000000002a710b6: f2xm1 0x0000000002a710b8: fld1 0x0000000002a710ba: faddp st(1),st 0x0000000002a710bc: mov eax,DWORD PTR [rsp] 0x0000000002a710bf: mov ecx,0xfffff800 0x0000000002a710c4: add eax,0x3ff 0x0000000002a710ca: mov edx,eax 0x0000000002a710cc: shl eax,0x14 0x0000000002a710cf: add edx,0x1 0x0000000002a710d2: cmove eax,ecx 0x0000000002a710d5: cmp edx,0x1 0x0000000002a710d8: cmove eax,ecx 0x0000000002a710db: test ecx,edx 0x0000000002a710dd: cmovne eax,ecx 0x0000000002a710e0: mov 
DWORD PTR [rsp+0x4],eax 0x0000000002a710e4: mov DWORD PTR [rsp],0x0 0x0000000002a710eb: fmul QWORD PTR [rsp] 0x0000000002a710ee: add rsp,0x8 0x0000000002a710f2: fldcw WORD PTR [rsp] 0x0000000002a710f5: add rsp,0x8 0x0000000002a710f9: fucomi st,st(0) 0x0000000002a710fb: pop rax 0x0000000002a710fc: jp 0x0000000002a71118 0x0000000002a71102: ffree st(2) 0x0000000002a71104: ffree st(1) 0x0000000002a71106: test eax,0x1 0x0000000002a7110b: je 0x0000000002a71448 0x0000000002a71111: fchs 0x0000000002a71113: jmp 0x0000000002a71448 0x0000000002a71118: ffree st(0) 0x0000000002a7111a: fincstp 0x0000000002a7111c: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a71121: sub rsp,0x80 0x0000000002a71128: mov QWORD PTR [rsp+0x78],rax 0x0000000002a7112d: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a71132: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a71137: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a7113c: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a71141: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a71146: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a7114b: mov QWORD PTR [rsp+0x38],r8 0x0000000002a71150: mov QWORD PTR [rsp+0x30],r9 0x0000000002a71155: mov QWORD PTR [rsp+0x28],r10 0x0000000002a7115a: mov QWORD PTR [rsp+0x20],r11 0x0000000002a7115f: mov QWORD PTR [rsp+0x18],r12 0x0000000002a71164: mov QWORD PTR [rsp+0x10],r13 0x0000000002a71169: mov QWORD PTR [rsp+0x8],r14 0x0000000002a7116e: mov QWORD PTR [rsp],r15 0x0000000002a71172: sub rsp,0x100 0x0000000002a71179: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a71180: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a71188: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a71190: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a71198: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a711a0: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a711a8: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a711b0: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a711b8: vextractf128 XMMWORD PTR 
[rsp+0x80],ymm8,0x1 0x0000000002a711c3: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a711ce: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a711d9: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a711e4: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a711ef: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a711fa: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a71205: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a71210: sub rsp,0x100 0x0000000002a71217: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a7121c: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a71222: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a71228: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a7122e: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a71234: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a7123a: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a71240: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a71246: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a7124f: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a71258: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a71261: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a7126a: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a71273: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a7127c: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a71285: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 0x0000000002a7128e: sub rsp,0x10 0x0000000002a71292: fstp QWORD PTR [rsp] 0x0000000002a71295: fstp QWORD PTR [rsp+0x8] 0x0000000002a71299: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a7129e: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a712a4: sub rsp,0x20 0x0000000002a712a8: test esp,0xf 0x0000000002a712ae: je 0x0000000002a712c6 0x0000000002a712b4: sub rsp,0x8 0x0000000002a712b8: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a712bd: add rsp,0x8 0x0000000002a712c1: jmp 0x0000000002a712cb 0x0000000002a712c6: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a712cb: add 
rsp,0x20 0x0000000002a712cf: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a712d4: fld QWORD PTR [rsp] 0x0000000002a712d7: add rsp,0x10 0x0000000002a712db: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a712e0: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a712e6: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a712ec: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a712f2: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a712f8: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a712fe: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a71304: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a7130a: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a71313: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a7131c: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a71325: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a7132e: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a71337: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a71340: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a71349: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a71352: add rsp,0x100 0x0000000002a71359: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a71360: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a71368: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a71370: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a71378: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 0x0000000002a71380: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a71388: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a71390: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a71398: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a713a3: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a713ae: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a713b9: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a713c4: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a713cf: vinsertf128 
ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a713da: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a713e5: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a713f0: add rsp,0x100 0x0000000002a713f7: mov r15,QWORD PTR [rsp] 0x0000000002a713fb: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a71400: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a71405: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a7140a: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a7140f: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a71414: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a71419: mov r8,QWORD PTR [rsp+0x38] 0x0000000002a7141e: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a71423: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a71428: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a7142d: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a71432: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a71437: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a7143c: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a71441: add rsp,0x80 0x0000000002a71448: fstp QWORD PTR [rsp] 0x0000000002a7144b: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@10 (line 63) 0x0000000002a71450: vmovsd xmm1,QWORD PTR [rip+0xfffffffffffff618] # 0x0000000002a70a70 ; {section_word} 0x0000000002a71458: vmovsd QWORD PTR [rsp],xmm1 0x0000000002a7145d: fld QWORD PTR [rsp] 0x0000000002a71460: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a71465: fld QWORD PTR [rsp] 0x0000000002a71468: movabs rax,0x6c4ba7d0 ; {external_word} 0x0000000002a71472: fld QWORD PTR [rax] 0x0000000002a71474: fucomip st,st(2) 0x0000000002a71476: jp 0x0000000002a7148f 0x0000000002a7147c: jne 0x0000000002a7148f 0x0000000002a71482: fxch st(1) 0x0000000002a71484: ffree st(0) 0x0000000002a71486: fincstp 0x0000000002a71488: fmul st,st(0) 0x0000000002a7148a: jmp 0x0000000002a718e6 0x0000000002a7148f: fldz 0x0000000002a71491: fucomip st,st(1) 0x0000000002a71493: ja 0x0000000002a71516 0x0000000002a71499: fld st(1) 0x0000000002a7149b: 
fld st(1) 0x0000000002a7149d: sub rsp,0x8 0x0000000002a714a1: fstcw WORD PTR [rsp] 0x0000000002a714a5: mov eax,DWORD PTR [rsp] 0x0000000002a714a8: or eax,0x300 0x0000000002a714ae: push rax 0x0000000002a714af: fldcw WORD PTR [rsp] 0x0000000002a714b2: pop rax 0x0000000002a714b3: fyl2x 0x0000000002a714b5: sub rsp,0x8 0x0000000002a714b9: fld st(0) 0x0000000002a714bb: frndint 0x0000000002a714bd: fsubr st(1),st 0x0000000002a714bf: fistp DWORD PTR [rsp] 0x0000000002a714c2: f2xm1 0x0000000002a714c4: fld1 0x0000000002a714c6: faddp st(1),st 0x0000000002a714c8: mov eax,DWORD PTR [rsp] 0x0000000002a714cb: mov ecx,0xfffff800 0x0000000002a714d0: add eax,0x3ff 0x0000000002a714d6: mov edx,eax 0x0000000002a714d8: shl eax,0x14 0x0000000002a714db: add edx,0x1 0x0000000002a714de: cmove eax,ecx 0x0000000002a714e1: cmp edx,0x1 0x0000000002a714e4: cmove eax,ecx 0x0000000002a714e7: test ecx,edx 0x0000000002a714e9: cmovne eax,ecx 0x0000000002a714ec: mov DWORD PTR [rsp+0x4],eax 0x0000000002a714f0: mov DWORD PTR [rsp],0x0 0x0000000002a714f7: fmul QWORD PTR [rsp] 0x0000000002a714fa: add rsp,0x8 0x0000000002a714fe: fldcw WORD PTR [rsp] 0x0000000002a71501: add rsp,0x8 0x0000000002a71505: fucomi st,st(0) 0x0000000002a71507: jp 0x0000000002a715b6 0x0000000002a7150d: ffree st(2) 0x0000000002a7150f: ffree st(1) 0x0000000002a71511: jmp 0x0000000002a718e6 0x0000000002a71516: fld st(1) 0x0000000002a71518: frndint 0x0000000002a7151a: fucomi st,st(2) 0x0000000002a7151c: jne 0x0000000002a715b6 0x0000000002a71522: sub rsp,0x8 0x0000000002a71526: fistp QWORD PTR [rsp] 0x0000000002a71529: fld st(1) 0x0000000002a7152b: fld st(1) 0x0000000002a7152d: fabs 0x0000000002a7152f: sub rsp,0x8 0x0000000002a71533: fstcw WORD PTR [rsp] 0x0000000002a71537: mov eax,DWORD PTR [rsp] 0x0000000002a7153a: or eax,0x300 0x0000000002a71540: push rax 0x0000000002a71541: fldcw WORD PTR [rsp] 0x0000000002a71544: pop rax 0x0000000002a71545: fyl2x 0x0000000002a71547: sub rsp,0x8 0x0000000002a7154b: fld st(0) 0x0000000002a7154d: 
frndint 0x0000000002a7154f: fsubr st(1),st 0x0000000002a71551: fistp DWORD PTR [rsp] 0x0000000002a71554: f2xm1 0x0000000002a71556: fld1 0x0000000002a71558: faddp st(1),st 0x0000000002a7155a: mov eax,DWORD PTR [rsp] 0x0000000002a7155d: mov ecx,0xfffff800 0x0000000002a71562: add eax,0x3ff 0x0000000002a71568: mov edx,eax 0x0000000002a7156a: shl eax,0x14 0x0000000002a7156d: add edx,0x1 0x0000000002a71570: cmove eax,ecx 0x0000000002a71573: cmp edx,0x1 0x0000000002a71576: cmove eax,ecx 0x0000000002a71579: test ecx,edx 0x0000000002a7157b: cmovne eax,ecx 0x0000000002a7157e: mov DWORD PTR [rsp+0x4],eax 0x0000000002a71582: mov DWORD PTR [rsp],0x0 0x0000000002a71589: fmul QWORD PTR [rsp] 0x0000000002a7158c: add rsp,0x8 0x0000000002a71590: fldcw WORD PTR [rsp] 0x0000000002a71593: add rsp,0x8 0x0000000002a71597: fucomi st,st(0) 0x0000000002a71599: pop rax 0x0000000002a7159a: jp 0x0000000002a715b6 0x0000000002a715a0: ffree st(2) 0x0000000002a715a2: ffree st(1) 0x0000000002a715a4: test eax,0x1 0x0000000002a715a9: je 0x0000000002a718e6 0x0000000002a715af: fchs 0x0000000002a715b1: jmp 0x0000000002a718e6 0x0000000002a715b6: ffree st(0) 0x0000000002a715b8: fincstp 0x0000000002a715ba: mov QWORD PTR [rsp-0x28],rsp 0x0000000002a715bf: sub rsp,0x80 0x0000000002a715c6: mov QWORD PTR [rsp+0x78],rax 0x0000000002a715cb: mov QWORD PTR [rsp+0x70],rcx 0x0000000002a715d0: mov QWORD PTR [rsp+0x68],rdx 0x0000000002a715d5: mov QWORD PTR [rsp+0x60],rbx 0x0000000002a715da: mov QWORD PTR [rsp+0x50],rbp 0x0000000002a715df: mov QWORD PTR [rsp+0x48],rsi 0x0000000002a715e4: mov QWORD PTR [rsp+0x40],rdi 0x0000000002a715e9: mov QWORD PTR [rsp+0x38],r8 0x0000000002a715ee: mov QWORD PTR [rsp+0x30],r9 0x0000000002a715f3: mov QWORD PTR [rsp+0x28],r10 0x0000000002a715f8: mov QWORD PTR [rsp+0x20],r11 0x0000000002a715fd: mov QWORD PTR [rsp+0x18],r12 0x0000000002a71602: mov QWORD PTR [rsp+0x10],r13 0x0000000002a71607: mov QWORD PTR [rsp+0x8],r14 0x0000000002a7160c: mov QWORD PTR [rsp],r15 0x0000000002a71610: sub 
rsp,0x100 0x0000000002a71617: vextractf128 XMMWORD PTR [rsp],ymm0,0x1 0x0000000002a7161e: vextractf128 XMMWORD PTR [rsp+0x10],ymm1,0x1 0x0000000002a71626: vextractf128 XMMWORD PTR [rsp+0x20],ymm2,0x1 0x0000000002a7162e: vextractf128 XMMWORD PTR [rsp+0x30],ymm3,0x1 0x0000000002a71636: vextractf128 XMMWORD PTR [rsp+0x40],ymm4,0x1 0x0000000002a7163e: vextractf128 XMMWORD PTR [rsp+0x50],ymm5,0x1 0x0000000002a71646: vextractf128 XMMWORD PTR [rsp+0x60],ymm6,0x1 0x0000000002a7164e: vextractf128 XMMWORD PTR [rsp+0x70],ymm7,0x1 0x0000000002a71656: vextractf128 XMMWORD PTR [rsp+0x80],ymm8,0x1 0x0000000002a71661: vextractf128 XMMWORD PTR [rsp+0x90],ymm9,0x1 0x0000000002a7166c: vextractf128 XMMWORD PTR [rsp+0xa0],ymm10,0x1 0x0000000002a71677: vextractf128 XMMWORD PTR [rsp+0xb0],ymm11,0x1 0x0000000002a71682: vextractf128 XMMWORD PTR [rsp+0xc0],ymm12,0x1 0x0000000002a7168d: vextractf128 XMMWORD PTR [rsp+0xd0],ymm13,0x1 0x0000000002a71698: vextractf128 XMMWORD PTR [rsp+0xe0],ymm14,0x1 0x0000000002a716a3: vextractf128 XMMWORD PTR [rsp+0xf0],ymm15,0x1 0x0000000002a716ae: sub rsp,0x100 0x0000000002a716b5: vmovdqu XMMWORD PTR [rsp],xmm0 0x0000000002a716ba: vmovdqu XMMWORD PTR [rsp+0x10],xmm1 0x0000000002a716c0: vmovdqu XMMWORD PTR [rsp+0x20],xmm2 0x0000000002a716c6: vmovdqu XMMWORD PTR [rsp+0x30],xmm3 0x0000000002a716cc: vmovdqu XMMWORD PTR [rsp+0x40],xmm4 0x0000000002a716d2: vmovdqu XMMWORD PTR [rsp+0x50],xmm5 0x0000000002a716d8: vmovdqu XMMWORD PTR [rsp+0x60],xmm6 0x0000000002a716de: vmovdqu XMMWORD PTR [rsp+0x70],xmm7 0x0000000002a716e4: vmovdqu XMMWORD PTR [rsp+0x80],xmm8 0x0000000002a716ed: vmovdqu XMMWORD PTR [rsp+0x90],xmm9 0x0000000002a716f6: vmovdqu XMMWORD PTR [rsp+0xa0],xmm10 0x0000000002a716ff: vmovdqu XMMWORD PTR [rsp+0xb0],xmm11 0x0000000002a71708: vmovdqu XMMWORD PTR [rsp+0xc0],xmm12 0x0000000002a71711: vmovdqu XMMWORD PTR [rsp+0xd0],xmm13 0x0000000002a7171a: vmovdqu XMMWORD PTR [rsp+0xe0],xmm14 0x0000000002a71723: vmovdqu XMMWORD PTR [rsp+0xf0],xmm15 
0x0000000002a7172c: sub rsp,0x10 0x0000000002a71730: fstp QWORD PTR [rsp] 0x0000000002a71733: fstp QWORD PTR [rsp+0x8] 0x0000000002a71737: vmovsd xmm0,QWORD PTR [rsp] 0x0000000002a7173c: vmovsd xmm1,QWORD PTR [rsp+0x8] 0x0000000002a71742: sub rsp,0x20 0x0000000002a71746: test esp,0xf 0x0000000002a7174c: je 0x0000000002a71764 0x0000000002a71752: sub rsp,0x8 0x0000000002a71756: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a7175b: add rsp,0x8 0x0000000002a7175f: jmp 0x0000000002a71769 0x0000000002a71764: call 0x000000006bf240d0 ; {runtime_call} 0x0000000002a71769: add rsp,0x20 0x0000000002a7176d: vmovsd QWORD PTR [rsp],xmm0 0x0000000002a71772: fld QWORD PTR [rsp] 0x0000000002a71775: add rsp,0x10 0x0000000002a71779: vmovdqu xmm0,XMMWORD PTR [rsp] 0x0000000002a7177e: vmovdqu xmm1,XMMWORD PTR [rsp+0x10] 0x0000000002a71784: vmovdqu xmm2,XMMWORD PTR [rsp+0x20] 0x0000000002a7178a: vmovdqu xmm3,XMMWORD PTR [rsp+0x30] 0x0000000002a71790: vmovdqu xmm4,XMMWORD PTR [rsp+0x40] 0x0000000002a71796: vmovdqu xmm5,XMMWORD PTR [rsp+0x50] 0x0000000002a7179c: vmovdqu xmm6,XMMWORD PTR [rsp+0x60] 0x0000000002a717a2: vmovdqu xmm7,XMMWORD PTR [rsp+0x70] 0x0000000002a717a8: vmovdqu xmm8,XMMWORD PTR [rsp+0x80] 0x0000000002a717b1: vmovdqu xmm9,XMMWORD PTR [rsp+0x90] 0x0000000002a717ba: vmovdqu xmm10,XMMWORD PTR [rsp+0xa0] 0x0000000002a717c3: vmovdqu xmm11,XMMWORD PTR [rsp+0xb0] 0x0000000002a717cc: vmovdqu xmm12,XMMWORD PTR [rsp+0xc0] 0x0000000002a717d5: vmovdqu xmm13,XMMWORD PTR [rsp+0xd0] 0x0000000002a717de: vmovdqu xmm14,XMMWORD PTR [rsp+0xe0] 0x0000000002a717e7: vmovdqu xmm15,XMMWORD PTR [rsp+0xf0] 0x0000000002a717f0: add rsp,0x100 0x0000000002a717f7: vinsertf128 ymm0,ymm0,XMMWORD PTR [rsp],0x1 0x0000000002a717fe: vinsertf128 ymm1,ymm1,XMMWORD PTR [rsp+0x10],0x1 0x0000000002a71806: vinsertf128 ymm2,ymm2,XMMWORD PTR [rsp+0x20],0x1 0x0000000002a7180e: vinsertf128 ymm3,ymm3,XMMWORD PTR [rsp+0x30],0x1 0x0000000002a71816: vinsertf128 ymm4,ymm4,XMMWORD PTR [rsp+0x40],0x1 
0x0000000002a7181e: vinsertf128 ymm5,ymm5,XMMWORD PTR [rsp+0x50],0x1 0x0000000002a71826: vinsertf128 ymm6,ymm6,XMMWORD PTR [rsp+0x60],0x1 0x0000000002a7182e: vinsertf128 ymm7,ymm7,XMMWORD PTR [rsp+0x70],0x1 0x0000000002a71836: vinsertf128 ymm8,ymm8,XMMWORD PTR [rsp+0x80],0x1 0x0000000002a71841: vinsertf128 ymm9,ymm9,XMMWORD PTR [rsp+0x90],0x1 0x0000000002a7184c: vinsertf128 ymm10,ymm10,XMMWORD PTR [rsp+0xa0],0x1 0x0000000002a71857: vinsertf128 ymm11,ymm11,XMMWORD PTR [rsp+0xb0],0x1 0x0000000002a71862: vinsertf128 ymm12,ymm12,XMMWORD PTR [rsp+0xc0],0x1 0x0000000002a7186d: vinsertf128 ymm13,ymm13,XMMWORD PTR [rsp+0xd0],0x1 0x0000000002a71878: vinsertf128 ymm14,ymm14,XMMWORD PTR [rsp+0xe0],0x1 0x0000000002a71883: vinsertf128 ymm15,ymm15,XMMWORD PTR [rsp+0xf0],0x1 0x0000000002a7188e: add rsp,0x100 0x0000000002a71895: mov r15,QWORD PTR [rsp] 0x0000000002a71899: mov r14,QWORD PTR [rsp+0x8] 0x0000000002a7189e: mov r13,QWORD PTR [rsp+0x10] 0x0000000002a718a3: mov r12,QWORD PTR [rsp+0x18] 0x0000000002a718a8: mov r11,QWORD PTR [rsp+0x20] 0x0000000002a718ad: mov r10,QWORD PTR [rsp+0x28] 0x0000000002a718b2: mov r9,QWORD PTR [rsp+0x30] 0x0000000002a718b7: mov r8,QWORD PTR [rsp+0x38] 0x0000000002a718bc: mov rdi,QWORD PTR [rsp+0x40] 0x0000000002a718c1: mov rsi,QWORD PTR [rsp+0x48] 0x0000000002a718c6: mov rbp,QWORD PTR [rsp+0x50] 0x0000000002a718cb: mov rbx,QWORD PTR [rsp+0x60] 0x0000000002a718d0: mov rdx,QWORD PTR [rsp+0x68] 0x0000000002a718d5: mov rcx,QWORD PTR [rsp+0x70] 0x0000000002a718da: mov rax,QWORD PTR [rsp+0x78] 0x0000000002a718df: add rsp,0x80 0x0000000002a718e6: fstp QWORD PTR [rsp] 0x0000000002a718e9: vmovsd xmm0,QWORD PTR [rsp] ;*invokestatic pow ; - ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark::trickyMathOctaPow@16 (line 63)
The _dpow intrinsic has an interesting feature: the handling of a "special case". Below is a fragment of library_call.cpp from the OpenJDK 8 sources:

//------------------------------inline_pow-------------------------------------
// Inline power instructions, if possible.
bool LibraryCallKit::inline_pow() {
  // Pseudocode for pow
  // if (y == 2) {
  //   return x * x;
  // } else {
  //   if (x <= 0.0) {
  //     long longy = (long)y;
  //     if ((double)longy == y) { // if y is long
  //       if (y + 1 == y) longy = 0; // huge number: even
  //       result = ((1&longy) == 0)?-DPow(abs(x), y):DPow(abs(x), y);
  //     } else {
  //       result = NaN;
  //     }
  //   } else {
  //     result = DPow(x,y);
  //   }
  //   if (result != result)? {
  //     result = uncommon_trap() or runtime_call();
  //   }
  //   return result;
  // }
  /* code omitted */
}

When the exponent is 2, the call is replaced with x * x. Let's find this check in the disassembled code, using the first call, Math.pow(a, 2), as an example:

0x0000000002a70b14: vmovsd xmm1,QWORD PTR [rip+0xffffffffffffff44] ; xmm1 = 2.0
0x0000000002a70b1c: vmovsd QWORD PTR [rsp],xmm1
0x0000000002a70b21: fld QWORD PTR [rsp]      ; push 2.0 onto the FPU register stack
0x0000000002a70b24: vmovsd QWORD PTR [rsp],xmm0
0x0000000002a70b29: fld QWORD PTR [rsp]      ; push a onto the FPU register stack
0x0000000002a70b2c: movabs rax,0x6c4ba7d0
0x0000000002a70b36: fld QWORD PTR [rax]      ; push the constant 2.0 onto the FPU register stack
0x0000000002a70b38: fucomip st,st(2)         ; compare 2.0 with 2.0 (the exponent)
0x0000000002a70b3a: jp 0x0000000002a70b53
0x0000000002a70b40: jne 0x0000000002a70b53
0x0000000002a70b46: fxch st(1)               ; keep only a on the FPU stack (this and the next two instructions)
0x0000000002a70b48: ffree st(0)
0x0000000002a70b4a: fincstp
0x0000000002a70b4c: fmul st,st(0)            ; a * a
0x0000000002a70b4e: jmp 0x0000000002a70faa
; code omitted
0x0000000002a70faa: fstp QWORD PTR [rsp]
0x0000000002a70fad: vmovsd xmm0,QWORD PTR [rsp] ; xmm0 = a * a
; code omitted
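The special case can be mirrored in plain Java. This is only a minimal sketch, not the HotSpot implementation: PowSpecialCase and powSketch are hypothetical names, and StrictMath.pow merely stands in for the general DPow path from the pseudocode.

```java
public class PowSpecialCase {

    // Hypothetical helper mirroring the first branch of the pseudocode:
    // an exponent of exactly 2.0 is replaced by a single multiplication.
    public static double powSketch(double x, double y) {
        if (y == 2.0) {
            return x * x; // fast path, corresponds to "fmul st,st(0)"
        }
        // Stand-in for the general path (DPow in the pseudocode).
        return StrictMath.pow(x, y);
    }

    public static void main(String[] args) {
        double a = 1234567.890;

        // Three nested squarings, as in trickyMathOctaPow.
        double viaSketch = powSketch(powSketch(powSketch(a, 2), 2), 2);

        // The same three squarings written by hand, as in trickyPlainOctaPow.
        double byHand = a * a;
        byHand = byHand * byHand;
        byHand = byHand * byHand;

        System.out.println("sketch:  " + viaSketch);
        System.out.println("by hand: " + byHand);
    }
}
```

With the fast path taken on every call, the nested version performs the same three multiplications as the hand-written one, which is consistent with options 3 and 4 showing nearly identical timings.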
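import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@Fork(value = 3, warmups = 0)
@Warmup(iterations = 5, time = 1_000, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 1_000, timeUnit = TimeUnit.MILLISECONDS)
@OutputTimeUnit(value = TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@State(Scope.Benchmark)
public class MathBenchmark {
    // Read from a non-final field so the JIT cannot constant-fold the input.
    public double a;

    @Setup
    public void setup() {
        a = 1234567.890;
    }

    // Each result goes to the Blackhole to prevent dead-code elimination.
    @Benchmark
    public void mathOctaPowBenchmark(Blackhole bh) {
        bh.consume(mathOctaPow(a));
    }

    @Benchmark
    public void plainOctaPowBenchmark(Blackhole bh) {
        bh.consume(plainOctaPow(a));
    }

    @Benchmark
    public void trickyMathOctaPowBenchmark(Blackhole bh) {
        bh.consume(trickyMathOctaPow(a));
    }

    @Benchmark
    public void trickyPlainOctaPowBenchmark(Blackhole bh) {
        bh.consume(trickyPlainOctaPow(a));
    }

    public double mathOctaPow(double a) {
        return Math.pow(a, 8);
    }

    public double plainOctaPow(double a) {
        return a * a * a * a * a * a * a * a;
    }

    public double trickyMathOctaPow(double a) {
        return Math.pow(Math.pow(Math.pow(a, 2), 2), 2);
    }

    public double trickyPlainOctaPow(double a) {
        a *= a;
        a *= a;
        return a * a;
    }
}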
Benchmark                                  Mode  Cnt   Score   Error  Units
MathBenchmark.mathOctaPowBenchmark         avgt   30  76,041 ± 0,428  ns/op
MathBenchmark.plainOctaPowBenchmark        avgt   30   4,174 ± 0,027  ns/op
MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   3,010 ± 0,014  ns/op
MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30   3,011 ± 0,015  ns/op
# JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark # Run progress: 0,00% complete, ETA 00:03:00 # Fork: 1 of 3 # Warmup Iteration 1: 77,026 ns/op # Warmup Iteration 2: 76,561 ns/op # Warmup Iteration 3: 77,623 ns/op # Warmup Iteration 4: 76,192 ns/op # Warmup Iteration 5: 76,012 ns/op Iteration 1: 75,947 ns/op Iteration 2: 75,739 ns/op Iteration 3: 75,864 ns/op Iteration 4: 76,179 ns/op Iteration 5: 75,934 ns/op Iteration 6: 75,783 ns/op Iteration 7: 75,820 ns/op Iteration 8: 75,898 ns/op Iteration 9: 75,798 ns/op Iteration 10: 76,053 ns/op # Run progress: 8,33% complete, ETA 00:02:48 # Fork: 2 of 3 # Warmup Iteration 1: 75,975 ns/op # Warmup Iteration 2: 76,008 ns/op # Warmup Iteration 3: 75,867 ns/op # Warmup Iteration 4: 76,061 ns/op # Warmup Iteration 5: 75,710 ns/op Iteration 1: 75,874 ns/op Iteration 2: 75,862 ns/op Iteration 3: 76,080 ns/op Iteration 4: 75,948 ns/op Iteration 5: 75,848 ns/op Iteration 6: 75,883 ns/op Iteration 7: 76,004 ns/op Iteration 8: 75,790 ns/op Iteration 9: 75,894 ns/op Iteration 10: 75,847 ns/op # Run progress: 16,67% complete, ETA 00:02:33 # Fork: 3 of 3 # Warmup Iteration 1: 75,778 ns/op # Warmup Iteration 2: 75,850 ns/op # Warmup Iteration 3: 75,878 ns/op # Warmup Iteration 4: 76,025 ns/op # Warmup Iteration 5: 76,450 ns/op Iteration 1: 75,791 ns/op Iteration 2: 75,941 ns/op Iteration 3: 75,652 ns/op Iteration 4: 75,795 ns/op Iteration 5: 75,906 ns/op Iteration 6: 78,971 ns/op Iteration 7: 76,055 ns/op Iteration 8: 75,736 ns/op Iteration 9: 75,816 ns/op Iteration 10: 77,537 ns/op Result 
"ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.mathOctaPowBenchmark": 76,041 ±(99.9%) 0,428 ns/op [Average] (min, avg, max) = (75,652, 76,041, 78,971), stdev = 0,640 CI (99.9%): [75,614, 76,469] (assumes normal distribution) # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark # Run progress: 25,00% complete, ETA 00:02:17 # Fork: 1 of 3 # Warmup Iteration 1: 4,622 ns/op # Warmup Iteration 2: 4,406 ns/op # Warmup Iteration 3: 4,169 ns/op # Warmup Iteration 4: 4,163 ns/op # Warmup Iteration 5: 4,153 ns/op Iteration 1: 4,141 ns/op Iteration 2: 4,144 ns/op Iteration 3: 4,141 ns/op Iteration 4: 4,141 ns/op Iteration 5: 4,149 ns/op Iteration 6: 4,136 ns/op Iteration 7: 4,143 ns/op Iteration 8: 4,136 ns/op Iteration 9: 4,140 ns/op Iteration 10: 4,134 ns/op # Run progress: 33,33% complete, ETA 00:02:02 # Fork: 2 of 3 # Warmup Iteration 1: 4,567 ns/op # Warmup Iteration 2: 4,267 ns/op # Warmup Iteration 3: 4,162 ns/op # Warmup Iteration 4: 4,155 ns/op # Warmup Iteration 5: 4,157 ns/op Iteration 1: 4,157 ns/op Iteration 2: 4,151 ns/op Iteration 3: 4,161 ns/op Iteration 4: 4,175 ns/op Iteration 5: 4,136 ns/op Iteration 6: 4,154 ns/op Iteration 7: 4,192 ns/op Iteration 8: 4,206 ns/op Iteration 9: 4,203 ns/op Iteration 10: 4,180 ns/op # Run progress: 41,67% complete, ETA 00:01:47 # Fork: 3 of 3 # Warmup Iteration 1: 4,569 ns/op # Warmup Iteration 2: 4,204 ns/op # Warmup Iteration 3: 4,172 ns/op # Warmup Iteration 4: 4,151 ns/op # Warmup Iteration 5: 4,159 ns/op Iteration 1: 4,141 ns/op Iteration 2: 4,175 ns/op Iteration 3: 4,182 ns/op Iteration 4: 4,205 ns/op Iteration 5: 4,246 ns/op Iteration 
6: 4,186 ns/op Iteration 7: 4,273 ns/op Iteration 8: 4,240 ns/op Iteration 9: 4,169 ns/op Iteration 10: 4,270 ns/op Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.plainOctaPowBenchmark": 4,174 ±(99.9%) 0,027 ns/op [Average] (min, avg, max) = (4,134, 4,174, 4,273), stdev = 0,040 CI (99.9%): [4,147, 4,201] (assumes normal distribution) # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark # Run progress: 50,00% complete, ETA 00:01:31 # Fork: 1 of 3 # Warmup Iteration 1: 3,396 ns/op # Warmup Iteration 2: 3,237 ns/op # Warmup Iteration 3: 3,156 ns/op # Warmup Iteration 4: 3,020 ns/op # Warmup Iteration 5: 3,001 ns/op Iteration 1: 2,995 ns/op Iteration 2: 3,012 ns/op Iteration 3: 3,014 ns/op Iteration 4: 2,997 ns/op Iteration 5: 3,025 ns/op Iteration 6: 3,015 ns/op Iteration 7: 3,004 ns/op Iteration 8: 2,999 ns/op Iteration 9: 3,033 ns/op Iteration 10: 3,003 ns/op # Run progress: 58,33% complete, ETA 00:01:16 # Fork: 2 of 3 # Warmup Iteration 1: 3,409 ns/op # Warmup Iteration 2: 3,230 ns/op # Warmup Iteration 3: 3,057 ns/op # Warmup Iteration 4: 3,027 ns/op # Warmup Iteration 5: 3,010 ns/op Iteration 1: 3,001 ns/op Iteration 2: 3,001 ns/op Iteration 3: 3,023 ns/op Iteration 4: 3,097 ns/op Iteration 5: 3,017 ns/op Iteration 6: 2,997 ns/op Iteration 7: 3,017 ns/op Iteration 8: 3,011 ns/op Iteration 9: 2,998 ns/op Iteration 10: 2,991 ns/op # Run progress: 66,67% complete, ETA 00:01:01 # Fork: 3 of 3 # Warmup Iteration 1: 3,476 ns/op # Warmup Iteration 2: 3,188 ns/op # Warmup Iteration 3: 2,998 ns/op # Warmup Iteration 4: 2,984 ns/op # Warmup Iteration 5: 3,023 ns/op Iteration 1: 
2,999 ns/op Iteration 2: 3,004 ns/op Iteration 3: 2,998 ns/op Iteration 4: 3,059 ns/op Iteration 5: 3,001 ns/op Iteration 6: 3,006 ns/op Iteration 7: 3,002 ns/op Iteration 8: 2,994 ns/op Iteration 9: 3,005 ns/op Iteration 10: 2,989 ns/op Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyMathOctaPowBenchmark": 3,010 ±(99.9%) 0,014 ns/op [Average] (min, avg, max) = (2,989, 3,010, 3,097), stdev = 0,022 CI (99.9%): [2,996, 3,025] (assumes normal distribution) # JMH version: 1.20 # VM version: JDK 1.8.0_161, VM 25.161-b12 # VM invoker: C:\Program Files\Java\jre1.8.0_161\bin\java.exe # VM options: <none> # Warmup: 5 iterations, 1000 ms each # Measurement: 10 iterations, 1000 ms each # Timeout: 10 min per iteration # Threads: 1 thread, will synchronize iterations # Benchmark mode: Average time, time/op # Benchmark: ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark # Run progress: 75,00% complete, ETA 00:00:45 # Fork: 1 of 3 # Warmup Iteration 1: 3,353 ns/op # Warmup Iteration 2: 3,169 ns/op # Warmup Iteration 3: 2,985 ns/op # Warmup Iteration 4: 3,004 ns/op # Warmup Iteration 5: 3,018 ns/op Iteration 1: 2,994 ns/op Iteration 2: 2,986 ns/op Iteration 3: 2,986 ns/op Iteration 4: 3,041 ns/op Iteration 5: 3,000 ns/op Iteration 6: 2,993 ns/op Iteration 7: 2,999 ns/op Iteration 8: 3,001 ns/op Iteration 9: 3,024 ns/op Iteration 10: 2,995 ns/op # Run progress: 83,33% complete, ETA 00:00:30 # Fork: 2 of 3 # Warmup Iteration 1: 3,371 ns/op # Warmup Iteration 2: 3,190 ns/op # Warmup Iteration 3: 3,010 ns/op # Warmup Iteration 4: 2,992 ns/op # Warmup Iteration 5: 2,995 ns/op Iteration 1: 2,993 ns/op Iteration 2: 3,007 ns/op Iteration 3: 2,999 ns/op Iteration 4: 3,006 ns/op Iteration 5: 2,992 ns/op Iteration 6: 3,009 ns/op Iteration 7: 3,013 ns/op Iteration 8: 3,012 ns/op Iteration 9: 3,010 ns/op Iteration 10: 3,000 ns/op # Run progress: 91,67% complete, ETA 00:00:15 # Fork: 3 of 3 # Warmup Iteration 1: 3,388 ns/op # Warmup Iteration 
2: 3,239 ns/op # Warmup Iteration 3: 3,046 ns/op # Warmup Iteration 4: 3,146 ns/op # Warmup Iteration 5: 3,008 ns/op Iteration 1: 3,023 ns/op Iteration 2: 3,048 ns/op Iteration 3: 3,039 ns/op Iteration 4: 3,094 ns/op Iteration 5: 3,024 ns/op Iteration 6: 3,004 ns/op Iteration 7: 2,991 ns/op Iteration 8: 3,025 ns/op Iteration 9: 3,006 ns/op Iteration 10: 3,006 ns/op Result "ru.gnkoshelev.jbreak2018.perf_tests.pow.MathBenchmark.trickyPlainOctaPowBenchmark": 3,011 ±(99.9%) 0,015 ns/op [Average] (min, avg, max) = (2,986, 3,011, 3,094), stdev = 0,023 CI (99.9%): [2,996, 3,026] (assumes normal distribution) # Run complete. Total time: 00:03:03 Benchmark Mode Cnt Score Error Units MathBenchmark.mathOctaPowBenchmark avgt 30 76,041 ± 0,428 ns/op MathBenchmark.plainOctaPowBenchmark avgt 30 4,174 ± 0,027 ns/op MathBenchmark.trickyMathOctaPowBenchmark avgt 30 3,010 ± 0,014 ns/op MathBenchmark.trickyPlainOctaPowBenchmark avgt 30 3,011 ± 0,015 ns/op
The difference between Math.pow(a, 2) and (a * a) turned out to be insignificant. With the _dpow intrinsic disabled, the results are as follows:

Benchmark                                  Mode  Cnt    Score   Error  Units
MathBenchmark.mathOctaPowBenchmark         avgt   30  195,222 ± 0,850  ns/op
MathBenchmark.plainOctaPowBenchmark        avgt   30    4,183 ± 0,030  ns/op
MathBenchmark.trickyMathOctaPowBenchmark   avgt   30   41,158 ± 0,381  ns/op
MathBenchmark.trickyPlainOctaPowBenchmark  avgt   30    3,081 ± 0,032  ns/op
Without the intrinsic, Math.pow() falls back to StrictMath.pow(). An interesting fact is that several chained calls to StrictMath.pow(x, 2) are still faster than a single StrictMath.pow(x, 8). This indicates that the implementation of the native method also has a special case for squaring.

The _dpow intrinsic in general deserves a separate chapter: judging by the changes in the OpenJDK repository, the intrinsic keeps changing from release to release, and the developers keep forgetting about this special case. Andrei Pangin (apangin) talked about this at the Joker 2016 conference in "Myths and facts about slow Java".

Options 3 and 4 are equally fast thanks to the special case in the implementation of the intrinsic, which essentially reduces the call to x * x.
Option 2 is slower because it performs more operations.
Option 1 is significantly slower because, despite the use of the intrinsic, the complex general-purpose logic for raising a number to a power of type double is invoked.
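The gap between options 2 and 4 comes down to the number of multiplications: seven sequential ones versus three squarings. A rough sketch of that difference is shown below. SquaringSketch, powNaive and powBySquaring are hypothetical names introduced only for illustration, and neither method tries to reproduce the Math.pow semantics for special values.

```java
public class SquaringSketch {

    // a^n (n >= 1) with n - 1 sequential multiplications,
    // the same evaluation order as plainOctaPow for n = 8.
    public static double powNaive(double a, int n) {
        double result = a;
        for (int i = 1; i < n; i++) {
            result *= a;
        }
        return result;
    }

    // a^(2^k) with k squarings, the same scheme as trickyPlainOctaPow for k = 3.
    public static double powBySquaring(double a, int k) {
        double result = a;
        for (int i = 0; i < k; i++) {
            result *= result;
        }
        return result;
    }

    public static void main(String[] args) {
        double a = 1234567.890;
        double naive = powNaive(a, 8);        // 7 multiplications
        double squared = powBySquaring(a, 3); // 3 multiplications

        // The two results may differ in the last bits because
        // floating-point multiplication is not associative.
        System.out.println(naive + " vs " + squared);
        System.out.println("relative difference: "
                + Math.abs(naive - squared) / squared);
    }
}
```

In both methods every multiplication depends on the previous one, so a shorter chain means less work, which matches the measured difference between plainOctaPow and trickyPlainOctaPow.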
Source: https://habr.com/ru/post/351812/