
Profiling the project build

A couple of months ago, I added profiling to our build system (our JamPlus fork). It was built on the Chrome Tracing view that I described earlier, so adding support for it to Jam was easy. Jam is written in C, so I just found a suitable C profiling library (it was minitrace) and, in literally a few lines, instrumented the places that interested me (the build itself).

[image: build profile in the Chrome Tracing view]

There is nothing outstanding here. However... as soon as you get the first profiling results, they often make you think, and you start to notice things.

Things seen


Once, while doing something unrelated to the topic of this article, I happened to look at the profiler output for a new build of my product. Experience suggests that linking takes the lion's share of a C++ build. This time, however, that was not the case:
[image: build trace showing a long gap before the linking stage]

A big delay is easily seen in the diagram before the linking stage. Most of the code is already compiled, and only one C++ file is still being compiled. I was busy with another task at the time, so I just added a ticket to our task board to figure it out later. Some time later, I was building another component of our product and again looked at the build profiler output:

[image: build trace of another component]

And here things looked frankly bad. The total build time was about 10 minutes, and for almost 7 of them just one file was being compiled (for 5 of those minutes, nothing else was compiling at all). At that point, it became clear that the problem had reached a scale that no longer allowed it to be ignored or postponed.

The average compile time for a C++ file in this project and configuration was about 2 seconds. There were a couple of files that took 30 seconds, but 400+ seconds was beyond all reasonable limits. What was going on?

I did some experiments and found out that:


Whether our approach was ideal is a separate question, but at the time it gave us enough advantages not to abandon it out of the blue. Still, something had to be done about compilation speed.

Speeding up compilation


One simple change that was entirely within the build system's control was excluding the slowly compiled files from unity builds. The whole point of unity builds is to save a little on compiler launches and on preprocessing common header files. For files whose compilation takes 30+ seconds this gain is minimal, while having to wait several minutes on every build because of a file "stuck" at the end is a significant problem.

It would be nice to somehow force the build system to start compiling the "slow" files as early as possible: start earlier, finish earlier. The ideal option would be to attach analysis of historical data from previous builds to the build system and generate the compilation queue automatically based on it. But that was not necessary in this particular case: simply excluding the files from unity builds moved them to the front of the queue in our build system. That was enough for us.

This trick did not speed up the 7-minute compilation of that "bad" file by a single second, but it was easy to do, and it immediately gave about one minute of total gain on the entire build (which had taken 10 minutes before).

And after that, I did something I had no particular hopes for: in that "slow" file, I broke the largest template function into several smaller ones (some of which no longer needed to be templates). Trivial refactoring. Some IDEs can do this in the "select a piece of code with the mouse, right click, Extract Function" mode. But this was C++, and the code, as I said, contained many macros and templates, so I had to do everything manually.

After extracting about five functions, the compile time of the problem file dropped from 420 seconds to 70. Six times faster!

Of course, extracting functions means they are no longer inlined, and we pay the cost of calling them (argument passing, jump, return). At the same time, the calling function can now use registers more freely (for better or worse), the total amount of generated code shrinks, and so on. We measured the speed of the modified code on different platforms and concluded that the performance changes were insignificant. So this time, it worked!

Of course, a minute to compile a single file is still a lot. But further attempts to speed up compilation would have required significant changes to the design of our math library, and that called for more careful planning.

The build after the changes already looks better. There are no more heaps of processor cores idling while compilation finishes on one of them. Linking is still sequential, but that is nothing new. The total build time fell from 10 minutes to 5 minutes 10 seconds, i.e. it became almost 2 times faster.

[image: build trace after the changes]

Moral


Source: https://habr.com/ru/post/338672/

