
Hello!
There are a huge number of OpenGL applications in the world, and it seems that Apple does not quite agree with that. Starting with iOS 12 and macOS Mojave, OpenGL has been deprecated. We integrated Apple Metal into MAPS.ME and are ready to share our experience and results. We will tell you how we refactored our graphics engine, what difficulties we had to face and, most importantly, how many FPS we get now.
If you are interested, or are thinking about adding Apple Metal support to your own graphics engine, read on.
The problem
Our graphics engine was designed to be cross-platform, and since OpenGL is, in fact, the only cross-platform graphics API for the set of platforms we care about (iOS, Android, macOS and Linux), we chose it as the basis. We did not add an extra abstraction layer to hide OpenGL-specific features but, fortunately, we left room for one to be implemented later.
With the arrival of the new generation of graphics APIs, Apple Metal and Vulkan, we of course considered adopting them in our application. However, the following stopped us:
- Vulkan could only work on Android and Linux, and Apple Metal only on iOS and macOS. We did not want to lose cross-platform support at the graphics API level: it would complicate development and debugging and increase the workload.
- An application using Apple Metal cannot be compiled and run on the iOS simulator (this is still the case, by the way), which would also complicate development and prevent us from getting rid of OpenGL completely.
- The Qt Framework, which we use to build internal tools, only supported OpenGL (it now supports Vulkan as well).
- Apple Metal did not have, and still does not have, a C++ API. This would force us to invent abstractions not only at runtime but also at build time, where one part of the engine is compiled as Objective-C++ and the other, considerably larger part as C++.
- We were not ready to make a separate engine or a separate code branch specifically for iOS.
- The implementation was estimated at no less than half a year of work for one graphics developer.
When Apple announced in the spring of 2018 that OpenGL was being deprecated, it became clear that we could not postpone any longer and that the problems described above had to be solved one way or another. In addition, we had long been working on optimizing both the speed and the energy consumption of the application, and Apple Metal, it seemed, could help with this.
Decision making
Almost immediately we noticed MoltenVK. This framework emulates the Vulkan API on top of Apple Metal, and its source code had recently been opened. Using MoltenVK, it seemed, would let us replace OpenGL with Vulkan and avoid a separate Apple Metal integration altogether. In addition, the Qt developers abandoned separate rendering support for Apple Metal in favor of MoltenVK. However, several things stopped us:
- the need to support Android devices on which Vulkan is unavailable;
- the inability to run on the iOS simulator without a fallback to OpenGL;
- the inability to use Apple tools for debugging, profiling and precompiling shaders, since MoltenVK generates Apple Metal shaders at runtime from SPIR-V or GLSL sources;
- the need to wait for MoltenVK updates and bug fixes whenever a new version of Metal is released;
- the impossibility of fine-grained optimizations that are specific to Metal but have no counterpart in Vulkan.
It turned out that we needed to keep OpenGL, and therefore could not avoid abstracting the engine from the graphics API. Apple Metal, OpenGL ES and, in the future, Vulkan would each be used to build independent internal components of the graphics engine that are completely interchangeable. OpenGL would play the role of a fallback in cases where Metal or Vulkan is unavailable for one reason or another.
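The idea of interchangeable backends with OpenGL as a fallback can be illustrated with a minimal C++ sketch. All names here (`ApiVersion`, `GraphicsContext`, `DeviceCaps`, `CreateContext`) are hypothetical and chosen for illustration; they are not taken from the MAPS.ME source, and the Metal class is only a stand-in for what would be an Objective-C++ wrapper in a real engine.

```cpp
#include <memory>
#include <string>

// Hypothetical identifiers for the supported graphics APIs.
enum class ApiVersion { OpenGLES3, Metal };

// The rest of the engine would talk only to this abstract interface.
class GraphicsContext
{
public:
  virtual ~GraphicsContext() = default;
  virtual ApiVersion GetApiVersion() const = 0;
  virtual std::string GetName() const = 0;
};

// Fallback backend: always assumed to be available in this sketch.
class OpenGLContext : public GraphicsContext
{
public:
  ApiVersion GetApiVersion() const override { return ApiVersion::OpenGLES3; }
  std::string GetName() const override { return "OpenGL ES 3.0"; }
};

// Stand-in for the Metal backend. A real implementation would live in an
// Objective-C++ file and hold the Metal device object.
class MetalContext : public GraphicsContext
{
public:
  ApiVersion GetApiVersion() const override { return ApiVersion::Metal; }
  std::string GetName() const override { return "Apple Metal"; }
};

// Hypothetical description of what the current device or simulator supports.
struct DeviceCaps
{
  bool metalAvailable;
};

// Returns the preferred backend when it is available, otherwise OpenGL.
std::unique_ptr<GraphicsContext> CreateContext(ApiVersion preferred, DeviceCaps const & caps)
{
  if (preferred == ApiVersion::Metal && caps.metalAvailable)
    return std::make_unique<MetalContext>();
  return std::make_unique<OpenGLContext>();
}
```

With such a factory, a request for Metal on the iOS simulator (where `metalAvailable` would be false) quietly yields the OpenGL backend, and the calling code does not need to know which one it received.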
The implementation plan was:
- Refactor the graphics engine to abstract it from the graphics API in use.
- Implement rendering on Apple Metal for the iOS version of the application.
- Run benchmarks of rendering speed and power consumption to understand whether modern, lower-level graphics APIs can benefit the product.
Key differences between OpenGL and Metal
To understand how to abstract the graphics API, let's first determine what key conceptual differences exist between OpenGL and Metal.
- It is believed, and rightly so, that Metal is a lower-level API. However, this does not mean that you have to write assembly or implement rasterization yourself. Metal is low-level in the sense that it performs very few implicit actions, so almost everything must be spelled out by the programmer. OpenGL does a lot implicitly, starting with maintaining an implicit reference to the OpenGL context and binding that context to the thread in which it was created.
- Metal performs no runtime validation of commands in release builds. In debug mode validation does exist, and it is noticeably better than in many other APIs, largely thanks to tight integration with Xcode. But once the program is shipped to users there is no validation at all, and the program simply crashes on the first error. OpenGL, by contrast, crashes only in the most extreme cases; its usual behavior is to ignore the error and carry on.
- Metal can precompile shaders and build libraries from them. In OpenGL, shaders are compiled from source while the program is running, and the specific low-level OpenGL implementation on a particular device is responsible for this. Differences and errors in shader compiler implementations sometimes lead to fantastic bugs, especially on Android devices from Chinese brands.
- OpenGL makes extensive use of a state machine, which adds side effects to almost every function. OpenGL functions are therefore not pure functions, and the order and history of calls often matter. Metal does not use state implicitly and does not keep it longer than rendering requires. State exists in the form of objects that are created and validated in advance.
Graphics Engine Refactoring and Embedding Metal
Refactoring the graphics engine mostly came down to finding the best way to get rid of the OpenGL-specific features that our engine actively relied on. From a certain stage onward, embedding Metal went on in parallel.
- As already noted, the OpenGL API has an implicit entity called the context. The context is bound to a specific thread, and an OpenGL function called on that thread finds and uses the context by itself. Metal, Vulkan and other APIs such as Direct3D do not work this way; they have similar explicit objects called a device or an instance. The user creates these objects and is responsible for passing them to the various subsystems. All graphics commands are issued through these objects.
We called our abstract object a graphics context. In the case of OpenGL it simply wraps calls to OpenGL commands, and in the case of Metal it holds the root MTLDevice interface through which Metal commands are issued.
Of course, we had to pass this object (and, since our rendering is multi-threaded, several such objects) through all subsystems.
We hid the creation and management of command queues and encoders inside the graphics context, so as not to spread through the engine entities that simply do not exist in OpenGL.
- The prospect of graphics command validation disappearing on users' devices frankly did not please us. Our QA department could not fully cover the wide range of devices and OS versions. Therefore we had to add extended logging in the places where we previously received a meaningful error from the graphics API. Of course, this validation was added only to the potentially dangerous and critical parts of the graphics engine, since covering the entire engine with diagnostic code is practically impossible and generally harmful to performance. The new reality is that testing on users and debugging from logs are now a thing of the past, at least where rendering is concerned.
- Our previous shader system could not be refactored, so we had to rewrite it completely. The point here is not only the precompilation of shaders and their validation at build time. In OpenGL, parameters are passed to shaders through so-called uniform variables. Passing structured data is available only since OpenGL ES 3.0, and since we still support OpenGL ES 2.0, we simply did not use that method. Metal made us use data structures to pass parameters, and for OpenGL we had to invent a mapping of structure fields to uniform variables. In addition, we had to rewrite each of the shaders in Metal Shading Language.
- With state objects, we had to resort to a trick. In OpenGL, all states are, as a rule, set immediately before rendering, whereas in Metal a state must be an object that is created and validated in advance. Our engine, obviously, used the OpenGL approach, and refactoring it to create state objects ahead of time would have been comparable to rewriting the engine completely. To cut this knot, we created a state cache inside the graphics context. The first time a unique combination of state parameters occurs, a state object is created in Metal and placed in the cache. The second and subsequent times, the object is simply retrieved from the cache. This works for our maps because the number of different combinations of state parameters is not too large (about 20-30). For a complex game engine, this method is unlikely to work.
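The state cache from the last item of the list can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `StateKey`, `StateObject` and `StateCache` are hypothetical names, the set of parameters in the key is invented for the example, and `StateObject` merely stands in for a validated backend object such as a Metal pipeline state.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <tuple>

// Hypothetical combination of state parameters that identifies a state object.
struct StateKey
{
  uint32_t programId;
  bool blendingEnabled;
  bool depthTestEnabled;
  uint8_t depthFunction;

  // Strict weak ordering so the key can be used in std::map.
  bool operator<(StateKey const & rhs) const
  {
    return std::tie(programId, blendingEnabled, depthTestEnabled, depthFunction) <
           std::tie(rhs.programId, rhs.blendingEnabled, rhs.depthTestEnabled, rhs.depthFunction);
  }
};

// Stand-in for an expensive, pre-validated backend state object.
struct StateObject
{
  StateKey key;
};

class StateCache
{
public:
  // Returns the cached object for the key, creating it on first use.
  std::shared_ptr<StateObject> Get(StateKey const & key)
  {
    auto const it = m_cache.find(key);
    if (it != m_cache.end())
      return it->second;

    // First occurrence of this combination: create the object and remember it.
    auto object = std::make_shared<StateObject>(StateObject{key});
    m_cache.emplace(key, object);
    ++m_createdCount;
    return object;
  }

  // Number of objects actually created, i.e. the number of unique keys seen.
  std::size_t CreatedCount() const { return m_createdCount; }

private:
  std::map<StateKey, std::shared_ptr<StateObject>> m_cache;
  std::size_t m_createdCount = 0;
};
```

The rendering code can keep setting states right before drawing, OpenGL-style, and call `Get` with the resulting combination. With only a few dozen unique combinations the cache stays small and the creation cost is paid once per combination.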
As a result, after about 5 months of work, we were able to launch MAPS.ME for the first time with full rendering on Apple Metal. It was time to find out what we had achieved.
Rendering speed testing
Experimental technique
In the experiment we used Apple devices of different generations, all updated to iOS 12. The same user scenario was run on all of them: map navigation (panning and zooming). The scenario was scripted to guarantee that the processes inside the application were almost identical on every run on every device. As the test location we chose the Los Angeles area, one of the most heavily loaded areas in MAPS.ME.
First the script was run with rendering on OpenGL ES 3.0, then on the same device with rendering on Apple Metal. Between runs the application was completely unloaded from memory.
The following indicators were measured:
- FPS (frames per second) for the entire frame;
- FPS for the part of the frame that deals only with rendering, excluding data preparation and other per-frame operations;
- the percentage of slow frames (longer than ~30 ms), i.e. those that the human eye can perceive as stutter.
When measuring FPS, drawing directly to the device screen was excluded, since vertical synchronization with the screen refresh rate makes it impossible to obtain reliable results. Instead, the frame was rendered to a texture in memory. To synchronize the CPU and GPU, the OpenGL version made an additional call to glFinish, and the Apple Metal version called waitUntilCompleted on MTLFrameCommandBuffer.
| Metric | iPhone 6s, OpenGL | iPhone 6s, Metal | iPhone 7+, OpenGL | iPhone 7+, Metal | iPhone 8, OpenGL | iPhone 8, Metal |
|---|---|---|---|---|---|---|
| FPS | 106 | 160 | 159 | 221 | 196 | 298 |
| FPS (rendering only) | 157 | 596 | 247 | 597 | 271 | 833 |
| Slow frames (<30 fps) | 4.13% | 1.25% | 5.45% | 0.76% | 1.5% | 0.29% |

| Metric | iPhone X, OpenGL | iPhone X, Metal | iPad Pro 12.9", OpenGL | iPad Pro 12.9", Metal |
|---|---|---|---|---|
| FPS | 145 | 210 | 104 | 137 |
| FPS (rendering only) | 248 | 705 | 147 | 463 |
| Slow frames (<30 fps) | 0.15% | 0.15% | 17.52% | 4.46% |

| Metric | iPhone 6s | iPhone 7+ | iPhone 8 | iPhone X | iPad Pro 12.9" |
|---|---|---|---|---|---|
| Frame speedup on Metal (N times) | 1.5 | 1.39 | 1.52 | 1.45 | 1.32 |
| Rendering speedup on Metal (N times) | 3.78 | 2.41 | 3.07 | 2.84 | 3.15 |
| Improvement in slow frames (N times) | 3.3 | 7.17 | 5.17 | 1 | 3.93 |
Results analysis
On average, frame performance with Apple Metal increased by 43%. The minimum gain, 32%, was recorded on the iPad Pro 12.9", and the maximum, 52%, on the iPhone 8. A trend is visible: the lower the screen resolution, the more Apple Metal outperforms OpenGL ES 3.0.
If we look only at the part of the frame that is directly responsible for rendering, the rendering speed on Apple Metal increased 3 times on average. This points to a significantly better organization and, as a result, higher efficiency of the Apple Metal API compared to OpenGL ES 3.0.
The number of slow frames (longer than ~30 ms) on Apple Metal decreased by about 4 times. This means that animations and movement around the map feel smoother. The worst result was recorded on the iPad Pro 12.9" with a resolution of 2732 x 2048 pixels: OpenGL ES 3.0 gives about 17.5% of slow frames, while Apple Metal gives only 4.5%.
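The averages quoted in this analysis can be rechecked from the FPS values in the tables above. The following C++ sketch is only a worked calculation, not part of the original benchmark; the helper names are hypothetical, and it assumes the averages are plain arithmetic means of the per-device ratios.

```cpp
#include <vector>

// One device: the same metric measured on OpenGL ES 3.0 and on Apple Metal.
struct Measurement
{
  double openGL;
  double metal;
};

// Speedup for "higher is better" metrics such as FPS.
double Speedup(Measurement const & m) { return m.metal / m.openGL; }

// Arithmetic mean of per-device speedups.
double AverageSpeedup(std::vector<Measurement> const & measurements)
{
  if (measurements.empty())
    return 0.0;

  double sum = 0.0;
  for (auto const & m : measurements)
    sum += Speedup(m);
  return sum / static_cast<double>(measurements.size());
}

// FPS values copied from the tables, in the order:
// iPhone 6s, iPhone 7+, iPhone 8, iPhone X, iPad Pro 12.9".
std::vector<Measurement> const kFrameFps = {
    {106, 160}, {159, 221}, {196, 298}, {145, 210}, {104, 137}};
std::vector<Measurement> const kRenderingFps = {
    {157, 596}, {247, 597}, {271, 833}, {248, 705}, {147, 463}};
```

`AverageSpeedup(kFrameFps)` comes out at roughly 1.44 and `AverageSpeedup(kRenderingFps)` at roughly 3.06, which is in line with the figures of about 43% and 3 times given in the text.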
Energy Testing
Experimental technique
Power consumption was tested on an iPhone 8 running iOS 12. The same user scenario was run: map navigation (panning and zooming) for 1 hour. The scenario was scripted to guarantee that the processes inside the application were almost identical on every run. The Los Angeles area was again chosen as the test location.
We used the following approach to measure energy consumption. The device was not connected to a charger. Power consumption logging was enabled in the developer settings. Before the experiment the device was fully charged. The experiment ended when the script finished. At that point the battery charge level was recorded, and the energy logs were imported into the battery profiling utility in Xcode. We recorded what share of the charge was spent on the GPU. In addition, we made rendering heavier for this test by enabling the metro map layer and full-screen anti-aliasing.
Screen brightness was the same in all runs. No processes other than system ones and MAPS.ME were running. Airplane mode was turned on, and Wi-Fi and GPS were turned off. Several control measurements were also performed.
As a result, Metal was compared with OpenGL on each of the indicators, and the resulting ratios were then averaged to obtain a single aggregate estimate.
| Metric | OpenGL | Metal | Gain |
|---|---|---|---|
| Battery charge spent | 32% | 28% | 12.5% |
| Battery usage by the GPU (Xcode profiling) | 1.95% | 1.83% | 6.16% |
Results analysis
On average, the power consumption of the version that renders on Apple Metal improved slightly. The GPU accounts for only a small share of our application's power consumption, about 2%, because MAPS.ME cannot be called a heavy GPU user. The small gain is probably achieved by reducing the CPU cost of preparing commands for the GPU, which, unfortunately, cannot be isolated with the profiling tools.
Results
Embedding Metal cost us 5 months of development. Two developers were involved, although they almost always worked in turns. We clearly gained a lot in rendering performance and won a little in power consumption. In addition, we gained the ability to embed new graphics APIs, Vulkan in particular, with much less effort. We also went through almost the entire graphics engine and, as a result, found and fixed a few old bugs and performance problems.
To the question of whether our project really needs rendering on Apple Metal, we are ready to answer yes. It is not so much that we love innovation, or that Apple may finally drop OpenGL. It is simply 2018, and OpenGL appeared back in 1992; it is time to take the next step.
PS So far we have not enabled the feature on all iOS devices. To turn it on manually, type ?metal in the search bar and restart the application. To switch rendering back to OpenGL, enter ?gl and restart the application.
PPS MAPS.ME is an open-source project. You can find the source code on GitHub.