
Five ways to optimize code for Android 5.0 Lollipop

How do you make a program faster? One effective way is code optimization. If you know the features of the platform an application targets, you can find effective ways to speed it up.



Preliminary Information


ART (Android Runtime) is the new execution environment for Android applications. Android 5.0 Lollipop is the first release in which ART is used by default. It includes many enhancements aimed at improving performance. In this article we will look at some of the new features of ART, compare it with the previously used Dalvik runtime, and share five tips that will make your applications faster.

What's new in ART?


While profiling many Android applications running in the Dalvik environment, two key issues were identified that users pay special attention to. The first is the time it takes to launch an application. The second is the amount of visible "jank": at its worst, stuttering, jerky animation, or even an unexpected stop of the application. This usually happens because the application takes too long to prepare the next frame and simply cannot keep up with the refresh rate of the device screen. Frame pacing can also be a problem if the next frame is produced much faster or slower than the previous one; when that happens, the user sees the interface elements stutter. This makes interacting with the program far less pleasant than either users or developers would like. ART includes several new features designed to solve the problems described above.

The combined effect of these enhancements improves the experience both for applications written purely with the Android SDK and for programs that make heavy use of JNI calls. An additional benefit, from the user's point of view, is longer battery life: applications are compiled only once, run faster, and as a result consume less power in everyday use.

Performance comparison of ART and Dalvik


When ART first appeared, as a preview in Android KitKat 4.4, there was criticism of its performance compared with Dalvik. Such a comparison can hardly be called fair: an early preview of ART was being measured against a mature product that had been refined over many years. In those early tests, some applications did run more slowly under ART than under Dalvik.

Now we can compare the matured ART environment used in production devices with Dalvik. Since Android 5.0 only uses ART, a direct comparison of ART and Dalvik is possible only by first running the tests on a device with Android KitKat 4.4 installed to gather data for the Dalvik environment, then updating it to Android Lollipop 5.0 and running the same series of tests for the ART environment.

In preparing this material, we ran such tests on a SurfTab xintron i7.0 tablet, which is based on an Intel Atom processor. At first it ran Android 4.4.4, so the tests used Dalvik; then the device was updated to Android 5.0 and the performance of ART was tested.

Since the tests were performed on different versions of Android, it is possible that some of the improvements we observed come not from ART but from other changes in Android. However, based on our internal performance analysis, we can say that the use of ART is the main reason for the increase in system performance.

We used performance tests in which Dalvik, by aggressively optimizing code that runs many times, is able to gain an advantage. In addition, we tested the system using a gaming simulator developed by Intel.

Based on the data obtained, we can conclude that ART is superior to Dalvik in all of our tests. In some cases, this superiority is quite significant.


Relative test results for ART (Android Lollipop) and Dalvik (Android KitKat)

Details of the test applications that we used can be found at the following links:


IcyRocks version 1.0 is a device performance testing application created by Intel that simulates real games. For most of its calculations it uses the open source Cocos2D library and JBox2D (a Java physics engine). The application measures the average number of frames it manages to display per second (FPS) at various load levels, then calculates a final score as the geometric mean of the FPS values obtained in the different modes of operation. In addition, the program calculates the number of janky frames per second as the average across the different load levels. IcyRocks shows the superiority of ART over Dalvik.
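
IcyRocks' own source code is not shown here, but as a rough illustration of the metric just described, here is a minimal Java sketch of how such a combined score could be computed; the class and method names are hypothetical.

import java.util.List;

// Illustrative helper: the overall score is the geometric mean of the FPS
// values measured at different load levels, and jank is averaged across them.
final class BenchmarkScore {
    static double geometricMeanFps(List<Double> fpsPerLevel) {
        double logSum = 0.0;
        for (double fps : fpsPerLevel) {
            logSum += Math.log(fps);          // sum of logarithms
        }
        return Math.exp(logSum / fpsPerLevel.size());
    }

    static double averageJankPerSecond(List<Double> jankPerLevel) {
        double sum = 0.0;
        for (double jank : jankPerLevel) {
            sum += jank;
        }
        return sum / jankPerLevel.size();     // arithmetic mean
    }
}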


Relative Frames Per Second Performance Testing in ART and Dalvik Environments

Testing showed that frame times under ART are more consistent than under Dalvik, with fewer janky frames. As a result, the application interface runs more smoothly under ART.


Relative rates of janky frames per second when tested in the ART and Dalvik environments

These results let us say with confidence that today ART offers users a better application experience and higher performance than Dalvik.

Transferring software code from Dalvik to ART


The transition from Dalvik to ART is transparent: most applications that run in the Dalvik environment will work in the ART environment without any code changes. As a result, when users update the system, applications start to work faster without additional effort from developers. Nevertheless, it is worth testing your applications in the ART environment, especially if they use the Java Native Interface (JNI), because ART uses a stricter JNI error handling mechanism than Dalvik. You can find out more about it here.

Five tips for code optimization


The performance of most applications launched in the ART environment will increase simply because of the platform improvements described above. However, there is a set of recommendations you can follow to optimize applications for even greater performance. Each of the code optimization techniques described below comes with a simple code example illustrating how it works.

It is impossible to predict in advance how much of a performance gain a particular optimization will bring. All applications are different, and their final performance depends heavily on the rest of the code and on how they are used. However, we will explain why the proposed optimization techniques can improve application performance. To assess their impact on your application, apply them to your code and measure.

The recommendations we offer apply quite broadly, but here we focus on the fact that under ART these optimizations are picked up by the dex2oat compiler, which generates native executable code from dex files and optimizes it.

Tip #1. Whenever possible, use local variables instead of public class fields


By limiting the scope of variables, you will not only improve the readability of the code and reduce the number of potential errors, but also make it better suited for optimization.

In the non-optimized code block shown below, the value of the variable v has to be computed at run time. This is because the variable is accessible outside the m() method and can be changed anywhere in the code, so its value is unknown at compile time. The compiler cannot tell whether the call to some_global_call() will change the variable, since v can be modified by code outside the method.

In the optimized version of this example, v is a local variable, so its value can be computed at compile time. As a result, the compiler can place the value directly into the code it generates, avoiding the calculation at run time.
Non-optimized code:

class A {
  public int v = 0;

  public int m() {
    v = 42;
    some_global_call();
    return v * 3;
  }
}

Optimized code:

class A {
  public int m() {
    int v = 42;
    some_global_call();
    return v * 3;
  }
}

Tip #2. Use the final keyword to tell the compiler that a field value is a constant


The final keyword can be used to protect code from accidentally changing variables that should be constants. It can also improve performance, because it tells the compiler that it is dealing with a constant.

In the non-optimized code fragment, the value of v*v*v must be computed while the program runs, because the value of v may change. In the optimized version, declaring the variable with the final keyword and assigning it a value tells the compiler that the value will not change. The calculation can therefore be done at compile time, and the result, rather than the instructions to compute it, is placed in the generated code.
Non-optimized code:

class A {
  int v = 42;

  public int m() {
    return v * v * v;
  }
}

Optimized code:

class A {
  final int v = 42;

  public int m() {
    return v * v * v;
  }
}
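
A related point, added here as an illustration: fields that are both static and final and initialized with a constant expression are compile-time constants in Java, so the compiler can inline their values directly at the places where they are used.

class Config {
    // A static final field initialized with a constant expression is a
    // compile-time constant: its value can be inlined at every use site.
    static final int MAX_ITEMS = 64;

    static int capacityInBytes() {
        return MAX_ITEMS * 4;   // can be folded to 256 at compile time
    }
}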

Tip #3. Use the final keyword when declaring classes and methods


Since any method in Java can be polymorphic, declaring a method or class with the final keyword tells the compiler that the method cannot be overridden in any subclass.

In the non-optimized version of the code, the call to m() must first be resolved to the right implementation.
In the optimized code, because the m() method is declared with the final keyword, the compiler knows exactly which version of the method will be called. It can therefore skip method resolution and replace the call to m() with the body of the method, inlining it at the call site. The result is a performance gain.
Non-optimized code:

class A {
  public int m() {
    return 42;
  }

  public int f() {
    int sum = 0;
    for (int i = 0; i < 1000; i++)
      sum += m();   // the call to m() must be resolved at run time
    return sum;
  }
}

Optimized code:

class A {
  public final int m() {
    return 42;
  }

  public int f() {
    int sum = 0;
    for (int i = 0; i < 1000; i++)
      sum += m();   // m() is final, so the call can be inlined
    return sum;
  }
}
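
The same reasoning applies at the class level: a final class cannot be subclassed, so none of its methods can be overridden, which makes them equally good candidates for inlining. A minimal illustration (not from the original examples):

// Declaring the whole class final tells the compiler that no subclass
// can override any of its methods.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int lengthSquared() {
        return x * x + y * y;   // safe to inline: the method cannot be overridden
    }
}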

Tip #4. Avoid calling small methods via JNI


There are good reasons to use JNI calls: for example, you have C/C++ code or libraries that you want to reuse in a Java application, you are building a cross-platform application, or you want to gain performance by using low-level mechanisms. However, it is important to keep the number of JNI calls to a minimum, because each of them carries significant overhead. When JNI is used to improve performance, this overhead can cancel out the expected benefit. In particular, frequent JNI calls to short methods that do little computation can hurt performance, and if such calls sit inside a loop, the unnecessary load on the system only grows.

Code example

class A {
  public final int factorial(int x) {
    int f = 1;
    for (int i = 2; i <= x; i++)
      f *= i;
    return f;
  }

  public int compute() {
    int sum = 0;
    for (int i = 0; i < 1000; i++)
      sum += factorial(i % 5); // if factorial() were a small JNI method,
                               // the overhead of 1000 JNI calls inside this loop
                               // would outweigh any benefit from native code
    return sum;
  }
}
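
When native code genuinely is needed, one common way to reduce the overhead is to cross the JNI boundary once with a batch of work instead of once per item. The sketch below only illustrates that idea; the native library name and methods are hypothetical and not part of the original article.

class NativeMath {
    static {
        // Hypothetical native library; not part of the original example.
        System.loadLibrary("nativemath");
    }

    // Costly pattern: one JNI crossing per element.
    public native int factorialNative(int x);

    // Cheaper pattern: a single JNI crossing for the whole array.
    public native int sumOfFactorials(int[] values);

    public int compute(int[] values) {
        // One call instead of values.length calls keeps JNI overhead low.
        return sumOfFactorials(values);
    }
}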

Tip #5. Use standard libraries instead of implementing the same functionality in your own code


The standard Java libraries are heavily optimized. Using the built-in mechanisms of Java wherever possible gives the best performance: standard implementations can be much faster than hand-written ones, and trying to avoid the supposed overhead of a standard library call can actually degrade performance.
In the non-optimized version of the code, an attempt is made to avoid calling the standard Math.abs() function by using a hand-written implementation of absolute value. However, the code that calls the library function is faster, because ART replaces the call with an optimized internal implementation at compile time.
Non-optimized code:

class A {
  public static final int abs(int a) {
    int b;
    if (a < 0)
      b = -a;
    else
      b = a;
    return b;
  }
}

Optimized code:

class A {
  public static final int abs(int a) {
    return Math.abs(a);
  }
}
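
Another common case of the same tip, added here as an illustration rather than taken from the original article, is copying arrays: System.arraycopy() is typically faster than an element-by-element loop because it is implemented as an optimized intrinsic.

class CopyExample {
    // Hand-written copy: one element per loop iteration.
    static int[] copySlow(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++)
            dst[i] = src[i];
        return dst;
    }

    // Standard library copy: a single optimized call.
    static int[] copyFast(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }
}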

Testing optimization techniques


Let's find out what the performance difference is between the optimized and non-optimized code from Tip #2 when running it under ART. For the experiment we use an Asus Fonepad 8 tablet, built around an Intel Atom Z3530 CPU and updated to Android 5.0.

Here is the code we are testing:

public final class Ops {
  int v = 42;
  final int w = 42;

  public int testUnoptimized() {
    return v * v * v;
  }

  public int testOptimized() {
    return w * w * w;
  }
}

The difference between the testUnoptimized and testOptimized methods is that the second is optimized: the variable w it uses is declared with the final keyword.

During the tests, each method is called a specified number of times. The loops that make these calls run on a background thread, and once the tests are complete the results are displayed in the application's user interface.
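
As a rough illustration of the measurement loop just described, here is a minimal sketch; the class and method names are ours and not necessarily those of the original project.

public class BenchmarkRunner {
    private static final int ITERATIONS = 10_000_000;

    // Runs one benchmark on a background thread and prints the elapsed time in ms.
    static void runBenchmark(final Ops ops, final boolean optimized) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                long start = System.nanoTime();
                int sink = 0;   // keep the result live so the work is not optimized away
                for (int i = 0; i < ITERATIONS; i++) {
                    sink += optimized ? ops.testOptimized() : ops.testUnoptimized();
                }
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("elapsed: " + elapsedMs + " ms (sink=" + sink + ")");
                // In the real application the result would be posted back to the UI thread.
            }
        }).start();
    }
}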


Application interface for testing optimization results

The table shows the results of ten consecutive test runs in the release version of the application. Each individual value was obtained by calling the corresponding method 10 million times in a loop.

Comparison of execution speed of optimized and non-optimized code
Run        Optimized, ms    Non-optimized, ms
1          25               193
2          21               203
3          30               220
4          25               175
5          23               184
6          28               177
7          30               186
8          27               191
9          34               212
10         27               174
Average    27               191.5
As a result, the optimized method turned out to run, on average, about 7 times faster than the non-optimized one.

The source code of the project, suitable for import into Android Studio, can be found here.

Intel Optimizations in ART


Intel has worked with device OEMs to provide them with a version of Dalvik optimized for Intel processors. The same is happening with ART, so the performance of the new runtime will keep improving over time. Optimized versions of the code will be available either through the Android Open Source Project (AOSP) or directly from device manufacturers. As before, these optimizations are transparent to both users and developers: neither has to do anything extra to benefit from them.

To learn more about optimizing Android applications for devices based on Intel processors, visit the Intel Developer Zone.

Results


In this article we reviewed the main features of ART, the new runtime environment for Android applications. Other things being equal, it delivers better performance than Dalvik. But the speed of any particular application depends not only on the runtime but also on the developer. We hope our code optimization tips will help you write fast, pleasant-to-use applications.

Source: https://habr.com/ru/post/263873/

