(Static) Selection of optimal containers in C ++ programs

Hello. Today I would like to talk again about static analysis. And again about C ++. But unlike PVS-Studio, we will not search for any errors in our programs (although they are looking not only for errors), but for places that are not optimally written. And one of these places is the choice of a container for data in the program. If I am interested in you, then welcome under the cat!

Problem

At CoreHard 2018 Autumn (a very good conference, come) I told you that C ++ compilers are poorly optimized at the moment. And one of my complaints was that compilers cannot optimize the use of containers in our programs. Let's look at a few code examples.

void foo() { std::vector<int> v(42); }

It would seem that in such a simple case the compiler should be able to optimize this function and simply throw out a variable declaration of type std :: vector, since starting from C ++ 14 the compiler is allowed to remove dynamic memory allocations, but the compiler does not. The reason is that at the moment only one C ++ compiler implements optimization for the removal of dynamic allocations - Clang. All the other compilers just do not know how to do it yet. But even Clang can do this in a limited number of cases.

In this case, we could replace std :: vector with std :: array, provided that the size of the vector to be allocated is not too large, since we may simply not have enough stack for such a replacement. Such a replacement will remove a rather expensive memory allocation on the heap, and the advantage is that when using std :: array, the compiler can already throw std :: array out of a function at all!
')
If we are talking about performance optimization, I propose to consider the following example:

 void foo() { std::vector<int> v; for(int i = 0; i < 42; ++i) { v.insert(v.begin(), 42); } for(const auto val : v) { std::cout << val << ' '; } }

In this case, we see the use of extremely inefficient in the case of std :: vector operation - insertion at the beginning of the container. All C ++ programmers know that it is extremely bad to do, as it causes all elements to shift each time, which leads to high costs of copying / moving. Much nicer in this case would be to replace std :: list, which doesn’t care where the insertion takes place, or std :: deque (although it is in this case you can perfectly see that you don’t just need to use insert. But this is just an example, not more :)

Let's look at another sample code:

 void foo() { std::list<int> v; for(int i = 0; i < 3; ++i) { v.push_front(i); } for(const auto val : v) { std::cout << val << ' '; } }

In this case, we can notice that we can replace std :: list without pain (yes, I know that rarely anyone uses it) with std :: forward_list. In this case, we will absolutely lose nothing, but we will get memory savings. Naturally, the compiler now does not optimize them.

A similar trick can also be done in the following example:

 void foo() { std::deque<int> v; for(int i = 0; i < 3; ++i) { v.push_back(i); } for(int i = 0; i < 3; ++i) { std::cout << v.back() << ' '; v.pop_back(); } }

Here we can see that what we really need is not std :: deque, but std :: stack. You can’t call this optimization, because std :: stack is an adapter, and by default it uses std :: deque inside (unless the user specifies otherwise). Here you can talk more about semantic optimization, i.e. simplifying code to understand. From my point of view, this is also important. If you ask, “Maybe such a replacement also gives a performance boost?”, I will answer “Maybe. See implementation details in your version of the standard library. ”

Well, I think enough examples. Each of you can come up with a lot of them too.

Means used

To implement the static analyzer, I used the Clang Static Analzyer (CSA) and Clang Tidy, which are included in the LLVM distribution. I chose these tools, because I consider them the most promising among the open means of static analysis. In addition, Clang provides one of the highest quality parsers of the C ++ language, which other static analyzers cannot boast of (unless of course they use libclang).

Both CSA and Clang Tidy are static analyzers, both of which are part of LLVM. What is the difference? The difference is that Clang Tidy is designed to write simple checks, which basically consist in finding a pattern on an abstract syntax tree, displaying a warning and possibly replacing it automatically with another one. You can read more about Clang Tidy here .

CSA is designed to write more “serious” and resource-intensive (both in terms of implementation and in terms of execution time / memory consumption) checks. There, for example, a symbolic execution mechanism is available.

I decided to implement the check in CSA, since it does not seem banal to me, moreover, in the future it will become harder and harder. And it was decided to run through Clang Tidy, since this static analyzer has many integrations with various IDEs.

How are we going to (try) to solve problems

First of all, it is worthwhile to introduce a couple of quite strong restrictions, which are mainly related to the fact that this is only a prototype so far:

Analysis only at the level of functions; This restriction means that there will be no analysis between functions, as well as between translation units. The restriction on the analysis between functions was imposed to simplify the implementation of this analysis and in the future can be relatively easily corrected by running a static analysis for the entire translation unit, and not just for each function. The limitation on the analysis between translation units is imposed by the existing limitations in the CSA, which will soon be fixed (commits are already merging into the upstream);
Support for only a limited number of containers. This is relatively easily fixed in the future by adding new rules for new containers.
Use for the analysis tree only abstract syntax. Since prototyping is the simplest type of analysis. For more accurate results, of course, you can try using at least a symbolic implementation, but this method has its drawbacks. More details about the methods can be found here .

Now in the prototype the following simple algorithm is implemented:

First, in the abstract syntax tree, we find the vertices that are responsible for declaring container type variables that we support.
Then we find the operations that relate to these containers, classify them and store this information in a temporary cache.
After reaching the end of the function, we analyze the collected statistics and, based on the predefined rules, issue a recommendation on the use of a container.

The classification of operations on containers at the moment is as follows (will be expanded in the future):

Adding an item to the beginning of the container.
Add an item to the middle of the container.
Add an item to the end of the container.
Remove item from the beginning of the container.
Remove the item from the middle of the container.
Remove item from end of container.

The classification is currently incomplete, and even this list does not work quite correctly. For example, the insert operation, even if it is performed at the beginning, the analyzer classifies as an insert in the middle, although in fact this is not at all the case.

Combating False Positives

In any static analysis, the main headache is the false positives. If there are too many, useful messages are lost in the trash. Therefore, in this case we have to act very conservatively and issue warnings only in cases when we are really confident in our diagnostics and we can quite say that there is really something wrong in some place in the code.

If we are talking about compiler optimization, then it is still sadder - correct optimization cannot change the behavior of the program according to the Standard C ++ (otherwise the cost to such an optimizer). And pessimization should not introduce optimization either :) So here we have to be much more careful in our decisions.

In this analyzer, this struggle resulted in the fact that if the analyzer sees that some unsupported operation is currently being performed, the analysis for this container is disabled.

Disadvantages and possible solutions

This method has several problems.

The first problem is that for the analyzer at the moment all the code branches are equally probable. More precisely, he does not even know about such a thing as different branches of code execution.
This translates into problems with analysis for something like this:

 void foo(int* ptr, std::vector<int>& v) { if(ptr == nullptr) { v.insert(v.begin(), 42); } else { v.push_back(84); } }

Most likely, in our application, these code branches are not equal to the likelihood of execution, since in the real world, usually the pointer usually indicates something normal, and not nullptr. In the same LLVM there are static heuristics on this account. For example, it takes into account both the above case with comparing pointers with nullptr, and comparing each other for equality of the values of two floating point variables, and some other interesting cases. But it looks more and more like crutches, and from my point of view, the real solution to this problem is to add dynamic analysis or instrumentation.

The second problem is the lack of support for custom containers. We live in the world of C ++, they love cycling here (let's leave the discussion of the reasons for this not always a bad thing beyond the scope of this article) everything, including our containers. Examples are all the same LLVM, LibreOffice and many others. In this regard, the question arises - how to analyze and containers are not from the library STL? After all, I would like to include analysis for as many containers as possible.

There are different ways to solve the problem.

The first is that the user annotates his containers in some way (a special kind of comment, C ++ attributes, something else). With this method, the problem is that you need to understand how to conduct annotation in general, what information we need for a qualitative analysis. Another problem may be the modification of the code of the containers themselves, which is not always possible.

The second method offers the user a mechanism for writing their own rules. At the moment, the analyzer has the rules embedded into the source code of the analyzer itself, and if the user wants to add his own rules, he will need to download the source code of the analyzer, build it, figure out how to write checks, write, rebuild, and so on. You can provide the user with a way to set his checks on some DSL, where the user writes only checks for his containers, and the analyzer itself is engaged in the whole routine. I consider this method as more promising than the previous one.

Also, automatic replacement of containers is not supported, as this functionality is not in the CSA (but is in Clang Tidy). But in difficult cases it is not always a trivial task to perform autochange, and the analyzer is more likely to work in a semi-manual mode.

Possible uses

I see several applications for this type of analysis:

As a static analyzer. Everything is simple - another static analysis check that you run as your heart desires (by hand, automatically in the IDE during development, on CI, etc.), where you may be given a hint that somewhere you could pick up a better container.
Like compiler optimization. In some cases, we can guarantee that replacing the container will not adversely affect performance. For example, replacing std :: vector for small sizes known at compile time for std :: array or replacing std :: list with std :: forward_list when we don’t need doubling and we don’t take the size from the list. The compiler could replace containers with more optimal ones and without our knowledge, as it already does for a very large number of things.
As a dynamic analyzer. This is the direction that seems to me the most promising for this type of analysis. After all, with the help of knowledge about the program execution profile, for example, we can get such important information for us as the probabilities of each code branch. And this is necessary for a more accurate assessment. And with such an analysis one can already think in the direction of integration with PGO ...

It is also worth noting that this method is applicable of course not only for C ++ programs. I would really like to see this kind of static analysis \ optimization in the compiler and for other programming languages. For example, the SAP static analyzer for ABAP already knows how to carry out a static analysis of optimality at a basic level, which is good news. If you know similar projects for other programming languages - write in the comments and I will add to the article!

Works in similar directions

For the world C ++, I have not found such analyzers anywhere. For the ABAP world, I have already mentioned the analyzer above, which can find inefficient operations for some part of standard containers, but as far as I know, quite a simple static analysis is implemented there.

Much more interesting work is Chameleon - a dynamic analyzer for Java, which is very cleverly done. They twirled the JVM a little, and during the work they collect various statistics on the use of containers, and depending on the current load profile they choose certain containers and replace them automatically during the work. Unfortunately, the source is closed and there is no chance to get them (I tried).

I also recommend looking at the various works (there are a lot of them) on SETL . In them, the authors also often raised questions about the automatic selection of the container.

Links

Current implementation on GitHub
C ++ Russia 2017: Yuri Efimochev, clang-tidy: a journey inside C ++ Abstract Syntax Tree
Chameleon: Adaptive Selection of Collections
Clang Static Analyzer guide
Russian-language chat on the development of compilers in the Telegram. If you are interested - go, it is very interesting there. Just be careful with flood - they immediately punish him :)

Instead of a conclusion, I would like to focus on the fact that this is only a prototype so far and it has too many “holes” in its implementation. In this article, I just want to share with you the idea of such an analysis and its popularization. Well, maybe someone will be interested in this topic and there will be a desire to connect to the project - I will be just happy! In addition, you can always collect this analyzer from yourself to try it on your test examples.

If you have something to supplement the material, faced with a similar task, or simply have some information that may be useful on this topic - please do not hesitate and share this information in the comments.

Thanks for attention!

Source: https://habr.com/ru/post/455379/

All Articles