
Intel GPA for Android: optimizing graphics in mobile applications

Continuing the series of articles on Intel INDE, I want to talk about a unique (I am not afraid of the word) developer tool created by Intel: Intel GPA (Graphics Performance Analyzers). I already mentioned it in the overview article on Intel INDE.



Intel GPA includes tools for analyzing performance, finding bottlenecks, and optimizing applications. The tools have an intuitive, user-friendly graphical interface, which lets developers get started right away, even without extensive experience in optimizing and debugging graphics applications.



Intel GPA supports performance analysis of applications for Windows and Android. The Windows version (DirectX) has been around for a long time, and many articles have been written about it. The Android version, intended for applications that use OpenGL, appeared relatively recently, and many developers are not even aware that it exists. I am going to fill this gap.




System requirements


Intel GPA is available for the following host operating systems.





Mobile device


You also need an Android device based on Intel Atom, because the application is analyzed directly on the device. The device does not need to be rooted or specially prepared; the only requirement is that it is detected by ADB (Android Debug Bridge).



Application


The profiled application must have the android:debuggable="true" flag set in the manifest:



 <application android:debuggable="true" … />




The application also needs the android.permission.INTERNET permission:



 <uses-permission android:name="android.permission.INTERNET" /> 
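Both manifest requirements can be checked before connecting the profiler. The following is a minimal sketch, assuming the manifest is available as plain-text XML (not the binary form packed inside an APK); the function name check_manifest and the sample manifest are hypothetical and only serve as an illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the android: attribute prefix in manifests.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def check_manifest(manifest_xml):
    """Report whether a plain-text manifest meets the two profiling requirements."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    debuggable = (
        app is not None
        and app.get("{%s}debuggable" % ANDROID_NS, "false").lower() == "true"
    )
    permissions = {
        el.get("{%s}name" % ANDROID_NS) for el in root.findall("uses-permission")
    }
    return {
        "debuggable": debuggable,
        "internet": "android.permission.INTERNET" in permissions,
    }


# Hypothetical manifest used only to demonstrate the check.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.game">
    <uses-permission android:name="android.permission.INTERNET" />
    <application android:debuggable="true" />
</manifest>
"""

print(check_manifest(SAMPLE_MANIFEST))
```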




Installing Intel GPA



There are two ways to install Intel GPA:



As part of Intel INDE


Go to the Intel INDE website, download and install the package manager (more on this in the overview article on Intel INDE).



Select GPA System Analyzer, click Download, wait for the download to complete, and install it.



From the Intel GPA home page


Open the Intel GPA home page, choose the package for your OS, then download and install it.



What's inside



GPA Performance Analyzers and GPA Frame Analyzer are installed along with GPA System Analyzer. All the tools in the package deserve attention, but to avoid overloading the article with information about all three, this time I will cover only GPA System Analyzer.



Getting started


First, connect the device to the host and make sure that it appears in the ADB device list:



Command line



 adb devices 
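If you want to automate this check, the output of adb devices can be parsed in a few lines. This is a minimal sketch, assuming adb is on the PATH; the function names and the sample serial numbers are hypothetical. A device is ready for profiling only when its state is reported as "device" (not "offline" or "unauthorized").

```python
import subprocess
from typing import List


def parse_adb_devices(output: str) -> List[str]:
    """Return the serial numbers of devices whose state is 'device'."""
    ready = []
    for line in output.splitlines():
        parts = line.split()
        # Device lines look like "<serial>\t<state>"; the header line and
        # daemon messages do not have "device" as their second field.
        if len(parts) >= 2 and parts[1] == "device":
            ready.append(parts[0])
    return ready


def connected_devices() -> List[str]:
    """Run 'adb devices' and return the serials that are ready to use."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Hypothetical output used only to demonstrate the parser.
SAMPLE_OUTPUT = (
    "List of devices attached\n"
    "Medfield12345678\tdevice\n"
    "emulator-5554\toffline\n"
)
print(parse_adb_devices(SAMPLE_OUTPUT))
```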


image



GPA System Analyzer



This tool lets you analyze application performance, find potential bottlenecks, and capture metrics for more detailed analysis. It displays various performance characteristics of the mobile platform, such as:







GPA System Analyzer lets you experiment with different rendering parameters without changing the code. The result is immediately visible on the screen of the mobile device, and the graphs (CPU load, GPU, FPS) show how the change affects performance. In addition, the tool can capture performance data for the current frame, which can then be analyzed in more detail with GPA Performance Analyzers and GPA Frame Analyzer.



Now run GPA System Analyzer. After launch, it should display a list of available devices:



image



If the device is detected by ADB but is not listed, try specifying the path to ADB manually. To do this, press Ctrl + F1 and enter the path to the folder that contains ADB.



image



After connecting to the device, you will see a list of applications installed on the device:



image



To launch an application and start analyzing it, simply click its name in the list.



After the application starts on the mobile device, you will see the following screen.



image



The column on the left shows the available metrics (Metrics) and the rendering options (State Overrides). On the right are graphs of the performance metrics (in this case, the CPU load of the analyzed application and FPS).



To add a graph for a performance metric, simply drag the row with its name to the graph area.



You can also combine two or more metrics on one graph (useful for monitoring related metrics) by holding down the Ctrl key while dragging the metric to the graph area:



image



Metrics



CPU






Device IO


The metrics listed below count read and write operations by all applications on the device, not only the application being profiled.







GPU






Memory






OpenGL






Power






Finding problems and ways to improve performance with metrics



As you can see, GPA System Analyzer lets you monitor almost any indicator of application performance. But not every metric points to a problem directly (as CPU load does: the higher, the worse). Some of them tell you something useful only when compared with others.



GPU Performance Metrics



TA Load and USSE Vertex Load


Ideally, both indicators should be balanced, which allows for better performance.



TA Load high, USSE Vertex Load low: the scene contains too many vertices; you can improve performance by simplifying objects.



TA Load low, USSE Vertex Load high: the vertex shader is too complex, and there is room to optimize the shader code.
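The comparison of these two metrics can be written down as a small rule of thumb. This is only a sketch: the function name and the 20-point gap that separates "balanced" from "unbalanced" are assumptions chosen for illustration, not values defined by Intel GPA.

```python
def vertex_stage_hint(ta_load, usse_vertex_load, gap=20.0):
    """Return a hint based on comparing TA Load with USSE Vertex Load.

    Both loads are percentages (0-100). `gap` is a hypothetical threshold for
    how far apart the two values must be before they count as unbalanced.
    """
    if ta_load - usse_vertex_load > gap:
        return "too many vertices: simplify the geometry"
    if usse_vertex_load - ta_load > gap:
        return "vertex shader too complex: optimize the shader code"
    return "balanced"


# Hypothetical readings taken from the graphs.
print(vertex_stage_hint(ta_load=80, usse_vertex_load=30))
print(vertex_stage_hint(ta_load=20, usse_vertex_load=75))
print(vertex_stage_hint(ta_load=50, usse_vertex_load=55))
```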



PB Primitives / Second


A very high value indicates that the problem is most likely the size of the vertex format.



PB Vertices / Second


A high value may indicate that a large amount of data is passed between the vertex and fragment shaders.



PB Vertices / Primitive


A high value indicates room for optimization by reducing the number of vertices in models, for example by reusing them through an index buffer.
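To illustrate vertex reuse, here is a minimal sketch that converts a plain triangle list into a list of unique vertices plus an index buffer. The function name and the sample quad are hypothetical; in a real application the two resulting arrays would be uploaded as a vertex buffer and an element (index) buffer.

```python
def build_index_buffer(triangle_vertices):
    """Deduplicate vertices and return (unique_vertices, indices)."""
    unique = []
    lookup = {}  # vertex -> position in `unique`
    indices = []
    for vertex in triangle_vertices:
        if vertex not in lookup:
            lookup[vertex] = len(unique)
            unique.append(vertex)
        indices.append(lookup[vertex])
    return unique, indices


# A quad drawn as two triangles: 6 vertices, 2 of them duplicated.
quad = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0),
    (0, 0, 0), (1, 1, 0), (0, 1, 0),
]
vertices, indices = build_index_buffer(quad)
triangles = len(indices) // 3
print("vertices per primitive before:", len(quad) / triangles)
print("vertices per primitive after:", len(vertices) / triangles)
```

For this quad the ratio drops from 3 vertices per triangle to 2, and the saving grows for meshes in which many triangles share each vertex.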



ISP Load


A high value may occur when a single Z-buffer is shared by multiple render targets (RT). To improve the situation, you can create a separate buffer for each RT.



TSP Load, Texture Unit Load, USSE Pixel Load


A high TSP Load indicates that performance can be improved by optimizing shaders (if USSE Pixel Load is high) or textures (if Texture Unit Load is high), for example by reducing resolution or using compression.



USSE Total Load, USSE Vertex Load, USSE Pixel Load


A high USSE Total Load indicates that performance can be improved by optimizing the vertex shaders (if USSE Vertex Load is high) or the fragment shaders (if USSE Pixel Load is high).



OpenGL metrics



Draw Calls & Indexed Draw Calls


In terms of performance, calling draw functions is expensive. High values suggest that performance can be improved by grouping vertices and drawing them with a single call.
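The idea of grouping can be sketched without any graphics API: meshes that share the same render state (texture, shader and so on) are merged into one vertex list, so each group needs a single draw call. The function name, the state keys and the mesh data below are hypothetical, and the sketch assumes static geometry that is already expressed in a common coordinate space.

```python
from collections import defaultdict


def batch_by_state(meshes):
    """Merge meshes that share a render state into one vertex list per state.

    `meshes` is a list of (state_key, vertices) pairs. Returns a dict that
    maps each state_key to the combined vertex list for that state.
    """
    batches = defaultdict(list)
    for state_key, vertices in meshes:
        batches[state_key].extend(vertices)
    return dict(batches)


# Hypothetical scene: four small meshes using two different textures.
meshes = [
    ("grass", [(0, 0, 0), (1, 0, 0), (0, 1, 0)]),
    ("rock", [(2, 0, 0), (3, 0, 0), (2, 1, 0)]),
    ("grass", [(4, 0, 0), (5, 0, 0), (4, 1, 0)]),
    ("grass", [(6, 0, 0), (7, 0, 0), (6, 1, 0)]),
]
batches = batch_by_state(meshes)
print("draw calls before:", len(meshes))
print("draw calls after:", len(batches))
```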



Buffer Creations


Buffer allocation is an expensive operation and should ideally happen during scene initialization. If this metric shows activity on the graph while the scene is running, you can make the code more efficient by moving buffer creation into the initialization or scene-loading stage.



Error Gets


Calls to glGetError degrade performance. In the final version of your application, this metric should be zero.



State Overrides



Another interesting feature is the ability to override the state of the analyzed application without changing the code. In effect, these are experiments that you can run on your application to understand how particular settings affect its performance.



Disable All


Disables all active overrides and displays the scene as is.



image



1x1 Scissor Rect


Disables pixel processing in the graphics pipeline. If FPS does not change, the problem is most likely in overly complex scene geometry or in the vertex shader.



image

(in this case, an empty scene will be displayed)



Disable Alpha Blending


Disables alpha blending. Transparency operations can seriously affect performance. This experiment shows how disabling blending affects FPS.



image



Disable Draw Calls


Ignores draw calls. This experiment helps you understand how the application would behave on a device with an infinitely fast graphics chip.



Disable Z-Test


The Z-buffer is used to discard objects that are completely or partially hidden behind other objects in the scene. Enabling this option should slow down the drawing of the scene. If it does not, you may be able to improve performance by sorting objects from near to far before drawing them.
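Sorting from near to far can be sketched as an ordinary sort by distance to the camera. The object layout (a dict with "name" and "position") and the function name are assumptions made for this example. The sketch applies to opaque objects only; transparent objects are normally drawn in the opposite order, from far to near.

```python
def sort_front_to_back(objects, camera_pos):
    """Return the objects ordered from nearest to farthest from the camera."""
    def distance_squared(obj):
        # Squared distance is enough for ordering and avoids a square root.
        return sum((p - c) ** 2 for p, c in zip(obj["position"], camera_pos))
    return sorted(objects, key=distance_squared)


# Hypothetical scene objects.
scene = [
    {"name": "mountain", "position": (0.0, 0.0, -90.0)},
    {"name": "player", "position": (0.0, 0.0, -2.0)},
    {"name": "tree", "position": (3.0, 0.0, -15.0)},
]
draw_order = sort_front_to_back(scene, camera_pos=(0.0, 0.0, 0.0))
print([obj["name"] for obj in draw_order])
```

Drawing nearer objects first lets the depth test reject fragments of objects hidden behind them before the fragment shader runs.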



Show wireframe


Enables wireframe display mode, which lets you visually assess the order of the objects and the complexity of the models.



image



Simple Fragment Shader


Replaces the fragment shader with a simple single-color one. If performance improves when this option is enabled, try optimizing the fragment shader code.



image



Texture 2x2


Replaces the textures in use with simple 2x2 ones. If performance improves when you turn on this option, you can optimize the application by optimizing its textures (reducing resolution, using compression).
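A quick calculation shows how much both approaches can save. This is a worked example under stated assumptions: the texture sizes are hypothetical, uncompressed RGBA8888 is taken as 32 bits per pixel, and ETC1 (used here only as one example of a compressed format, at 4 bits per pixel) stands in for "compression"; mipmaps are ignored.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Memory needed for one texture level, in bytes."""
    return width * height * bits_per_pixel // 8


RGBA8888_BPP = 32  # 8 bits for each of the four channels
ETC1_BPP = 4       # ETC1 stores each 4x4 pixel block in 8 bytes

original = texture_bytes(1024, 1024, RGBA8888_BPP)
half_res = texture_bytes(512, 512, RGBA8888_BPP)
compressed = texture_bytes(1024, 1024, ETC1_BPP)

print("1024x1024 RGBA8888:", original // 1024, "KiB")
print(" 512x512  RGBA8888:", half_res // 1024, "KiB")
print("1024x1024 ETC1:    ", compressed // 1024, "KiB")
```

Halving the resolution cuts memory and bandwidth by a factor of four, while the compressed format in this example cuts it by a factor of eight at the original resolution.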



image



This concludes the part about GPA System Analyzer. I hope this information helps you master the tool quickly and put what you have learned into practice.



Next time I will talk about a tool called GPA Frame Debugger, which lets you analyze OpenGL scenes in detail in a simple and intuitive form.

Source: https://habr.com/ru/post/223415/


