
My name is Mitya Kurkin, and I lead development of the iOS messengers at Mail.Ru Group. Today I will talk about our experience speeding up iOS apps. High speed is very important for 99% of applications. This is especially true on mobile platforms, where computing power and, with it, battery life are very limited. That is why every self-respecting developer tries to optimize the application's performance and eliminate the various delays that add up to its total response time.
Measurement
Before changing anything, record the current state of affairs: measure how much time is currently lost in the problem areas. The measurement method must be reproducible; otherwise comparing this data with later results is meaningless. How do you measure? Situations differ, but a stopwatch is always at hand. It is, admittedly, the least accurate option.
You can measure with a profiler. If the graph has characteristic regions (a drop or a peak in load), you can measure those. This option gives a more accurate result. The graph also shows the influence of outside factors. For example, if you profile all processes on the device, you can see what other applications are doing and whether that affects your measurement. If the profiler gives you nothing to latch onto, you can measure with your own logs. This can be even more accurate, but it requires changing the application, which may in some way affect its behavior. In this case the influence of outside factors is also not visible.
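A minimal sketch of measuring with your own logs, written in plain C (which compiles unchanged inside an Objective-C source file). The names `now_ms`, `PerfTimer`, `perf_start` and `perf_stop` are hypothetical helpers, not part of any SDK, and the sketch assumes a platform where `clock_gettime` with `CLOCK_MONOTONIC` is available. A monotonic clock is used so that changes to the system date do not skew the reading.

```c
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime under strict -std=c99/c11 */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Current time in milliseconds from a monotonic clock,
   which is not affected by changes to the system date. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Hypothetical helper: remembers a label and the start time. */
typedef struct {
    const char *label;
    double start_ms;
} PerfTimer;

static PerfTimer perf_start(const char *label) {
    PerfTimer t = { label, now_ms() };
    return t;
}

/* Logs the elapsed time to stderr and returns it in milliseconds. */
static double perf_stop(const PerfTimer *t) {
    assert(t != NULL);
    double elapsed = now_ms() - t->start_ms;
    fprintf(stderr, "[perf] %s: %.3f ms\n", t->label, elapsed);
    return elapsed;
}
```

Typical use is `PerfTimer t = perf_start("open chat");` before the operation and `perf_stop(&t);` after it. Repeating the run several times and comparing the median rather than a single reading helps keep the measurement reproducible.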
Ideal result
To understand whether a particular case can be sped up at all, it helps to know in advance what the minimum possible time is. The quickest way is to look at how a similar feature performs in a competitor's application. This gives a reference point for how fast such an operation can be. You also need to check whether the chosen technologies can reach the desired speed at all. To do that, strip away everything you can, so that only the bare minimum of the function remains:
- disable processes running in parallel;
- replace variables with constants;
- instead of loading the full screen, show only its header;
- reduce the network operation to a fixed sequence of constant network requests;
- you can even create a bare application that performs only this fixed function.
If this gives the desired speed, you can bring the disabled elements back one at a time and watch how each affects performance. If the result is still unsatisfactory even after all of the above, more radical measures are needed: replacing the libraries you use, reducing traffic volume at the expense of its quality, changing the protocol, and so on.
Profiler
When optimizing, simply reading the code can easily lead you astray. You may find some "heavy" operation that is hard to optimize but can be improved a little. So, after spending a lot of time and applying the latest algorithms, you succeed. Yet the ideal is still as far away as before, or even farther. In reality the problem may lie in the most unexpected places that raise no suspicion at all: somewhere completely unnecessary work is being done, or there is simply a bug that causes a hang. So you first need to measure what actually takes the time. Apple's tools give us that opportunity.

To speed up an application you primarily need Time Profiler. Its interface is fairly clear: at the top is a graph of CPU load, at the bottom a call tree showing how much time each method consumed. There is a breakdown by thread, filters, range selection, various sort orders and much more.
To work with this data effectively, you need to understand how it is calculated. Take this graph:

At high magnification, it looks like this:

The profiler measures elapsed time by periodically sampling the application's state. If the application is using the CPU at the moment of a sample, every method on the call stack is credited with CPU time. Summing these samples gives a call tree annotated with elapsed time:

Instruments lets you adjust the sampling frequency:

It follows that the more samples catch the CPU in use, the higher the level on the original graph. But time during which the CPU is idle does not contribute to the result at all. How, then, do you find cases where the application is waiting for an event, such as the response to an HTTP request? The "Record Waiting Threads" setting can help here. With it, samples are also recorded for states in which the CPU is not being used. In the bottom table, a column with the number of samples per function is switched on automatically in place of the elapsed time. The set of columns can be customized, but by default either the time or the number of samples is shown.
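The sampling arithmetic described above can be sketched in plain C as follows. This is an illustration only, not how Instruments is implemented: the `Sample` structure, the function names in the stacks and the 1 ms interval are all made up for the example. Each sample credits every function on its call stack; the time figure counts only samples taken while the thread was on the CPU, while the sample count includes waiting samples too.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 4

/* One profiler sample: a call stack (outermost frame first)
   and whether the thread was running on the CPU at that moment. */
typedef struct {
    const char *stack[MAX_DEPTH]; /* unused slots are NULL */
    int on_cpu;
} Sample;

/* Counts the samples whose stack contains `func`.
   With cpu_only != 0, waiting samples are skipped. */
static int count_samples(const Sample *samples, int n,
                         const char *func, int cpu_only) {
    assert(samples != NULL && func != NULL);
    int count = 0;
    for (int i = 0; i < n; ++i) {
        if (cpu_only && !samples[i].on_cpu) continue;
        for (int d = 0; d < MAX_DEPTH && samples[i].stack[d]; ++d) {
            if (strcmp(samples[i].stack[d], func) == 0) {
                ++count;
                break; /* credit each function once per sample */
            }
        }
    }
    return count;
}

/* Estimated CPU time: on-CPU samples times the sampling interval. */
static double cpu_time_ms(const Sample *samples, int n,
                          const char *func, double interval_ms) {
    return count_samples(samples, n, func, 1) * interval_ms;
}

/* Made-up data: five samples taken 1 ms apart. */
static const Sample kSamples[] = {
    { { "main", "someMethod", NULL,   NULL }, 1 },
    { { "main", "someMethod", "wait", NULL }, 0 },
    { { "main", "someMethod", "wait", NULL }, 0 },
    { { "main", "someMethod", "wait", NULL }, 0 },
    { { "main", "render",     NULL,   NULL }, 1 },
};
static const int kSampleCount = sizeof kSamples / sizeof kSamples[0];
```

With this data `someMethod` is credited with 1 ms of CPU time but appears in 4 of the 5 samples, which is the kind of gap between the time and sample columns that the "Record Waiting Threads" setting is meant to expose.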
Consider this example:
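// Blocks the calling thread until -nothing: has finished on the background thread.
- (void)someMethod {
    [self performSelector:@selector(nothing:)
                 onThread:[self backThread]
               withObject:nil
            waitUntilDone:YES];
}

// Busy work that runs on the background thread.
- (void)nothing:(id)object {
    for (int i = 0; i < 10000000; ++i) {
        [NSString stringWithFormat:@"get%@", @"Some"];
    }
}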
Profiling the application while this code runs gives something like this:

The figure shows that the main thread accounts for 94 ms (2.3%) by time, but for 9276 samples (27%) by sample count. The difference will not always be this noticeable, however. How do you look for such cases in a real application? The mode that displays the graph by thread helps here:

In this mode you can see when threads start, when they do work and when they "sleep". Besides viewing the graph at the top, you can also turn on the list of samples (Sample List) in the bottom table. By looking through the regions where the main thread "sleeps", you can find the culprit behind an interface hang.
Do not stop at system calls
When measuring, it is very easy to run up against system calls. It looks as if all the time is spent in system code. What can you do? The main thing is not to stop there. As long as you can, keep digging into these calls. Dig deep enough and you may well find a callback or a delegate that calls your code, and the noticeable loss of time turns out to be caused by it.
Turn off
So, the suspect is found. Before rewriting it, check how the application behaves without it entirely.
Sometimes there are many potential culprits for a slowdown, and measuring all of them with the profiler is difficult. For example, the problem reproduces reliably only for some users, and not always, but only in a certain situation. Disabling the suspected modules is a very good way to find out quickly whether you are moving in the right direction and whether you are measuring and optimizing the right places.
If everything is already disabled and the result is still far from ideal, try approaching from the other side: create an empty application and build up its functionality step by step.
Consider the technical features of devices
Device specifications change, and this is worth taking into account. For example, starting with the iPhone 4S, iPhones use multi-core processors. Multithreading is more effective there, because work can be spread across several cores. On a single-core processor, however, it can slow down the overall result: only one core is available, yet extra resources are spent on thread context switching.
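One way to take the core count into account, sketched in plain C. It assumes `sysconf(_SC_NPROCESSORS_ONLN)` is available, as it is on POSIX-like systems; in Objective-C code, `NSProcessInfo`'s `activeProcessorCount` gives the same kind of number. The policy in `worker_count_for_cores` is a hypothetical example, not a recommendation drawn from measurements.

```c
#include <assert.h>
#include <unistd.h>

/* Number of cores currently online, or 1 if it cannot be determined. */
static int online_cores(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int)n : 1;
}

/* Hypothetical policy: on a single core use one worker, since extra
   threads only add context switches; otherwise use one worker per core,
   but never more workers than there are tasks. */
static int worker_count_for_cores(int cores, int tasks) {
    assert(tasks >= 0);
    if (tasks == 0) return 0;
    if (cores <= 1) return 1;
    return tasks < cores ? tasks : cores;
}
```

A caller would pass `online_cores()` as the first argument. Whether one worker per core is actually the best choice for a given workload still has to be confirmed by measurement.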
Be careful when adopting large frameworks
The bigger and more powerful the mechanism you bring in, the more it takes over, the less you control the situation and, accordingly, the less flexible the application becomes. In our case, we were firmly tied to CoreData. It is a great technology. Full support for migrations, NSFetchedResultsController and caching is very tempting. But take application launch. To initialize the CoreData stack, you must at the very least open the database and load the model. If you use sqlite without CoreData, there is no model to load. In our case the model contains 26 entities. Loading it takes considerable time, especially on older devices, where launch speed is felt most acutely.
Our applications are under active development, so we constantly need to add entities to the database. Thanks to the convenient migration mechanism this causes no problems. But we already have almost 40 migrations. First of all, this greatly affects the size of the application: in total, the migrations add about 30%. In addition, migrations are applied sequentially, so the more of them there are, the longer a migration takes. And this again affects launch speed.
We also ran into a problem with deletion. With our model and a fairly large database, a deletion that touched all entities took about 10 minutes. After turning on the magic CoreData debugging option, the com.apple.CoreData.SQLDebug launch argument, we saw a huge number of SELECTs and UPDATEs and a few DELETEs. The main problem is that NSManagedObjectContext has no deleteObjects method. Objects can be deleted only one at a time, although SQL itself can delete in bulk via DELETE ... WHERE someValue IN ... On top of that, for each object its key is first selected and only then is the object deleted. Dependent objects are removed the same way.
In our case, matters are made worse by the fact that mobile users, as a rule, will not wait that long and "kill" the application. The result is a corrupted database.
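For comparison, this is roughly what a batched delete looks like when talking to SQLite directly. The sketch, in plain C, only builds the parameterized statement text; `build_batch_delete` is a hypothetical helper, the table name `message` and column `id` used below are made up, and binding the values and executing the statement through the SQLite API is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "DELETE FROM <table> WHERE <column> IN (?, ?, ...)" with `count`
   placeholders into `out`. Returns the statement length, or -1 if `count`
   is not positive or the buffer is too small (the contents of `out` are
   then unspecified). Table and column names are not escaped, so they must
   come from trusted code, never from user input. */
static int build_batch_delete(char *out, size_t cap,
                              const char *table, const char *column,
                              int count) {
    assert(out != NULL && table != NULL && column != NULL);
    if (count <= 0) return -1;

    int len = snprintf(out, cap, "DELETE FROM %s WHERE %s IN (", table, column);
    if (len < 0 || (size_t)len >= cap) return -1;

    for (int i = 0; i < count; ++i) {
        const char *piece = (i == 0) ? "?" : ", ?";
        size_t piece_len = strlen(piece);
        if ((size_t)len + piece_len >= cap) return -1;
        memcpy(out + len, piece, piece_len);
        len += (int)piece_len;
    }

    if ((size_t)len + 1 >= cap) return -1; /* room for ')' and the NUL */
    out[len++] = ')';
    out[len] = '\0';
    return len;
}
```

Calling `build_batch_delete(sql, sizeof sql, "message", "id", 3)` produces `DELETE FROM message WHERE id IN (?, ?, ?)`, a single statement in place of a select and a delete per object. SQLite caps the number of parameters per statement (`SQLITE_MAX_VARIABLE_NUMBER`), so a long list of keys would need to be split into chunks.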
Conclusions
As you can see, there are quite a few ways to optimize the speed of mobile applications. But when you are surrounded by numbers and graphs, do not lose touch with reality. Try to keep the application in working order so that the effect of an optimization can be felt under real-world conditions. Unfortunately, developers often either pay too little attention to optimization or get carried away with it. The main thing to remember is that optimization should bring results the user can actually feel. Optimization for its own sake, where the effect is homeopathic, is a waste of time and effort. Everything in moderation.