In the wake of C ++ Siberia: a dragon in a bag

Conferences are different. Some gather huge crowds of spectators, others may be of interest only to one and a half specialists.

Another thing is funny: it often happens that the hall gathers a large number of listeners who are curious about the topic, they ask questions and later enthusiastically talk about their experiences to their colleagues. At the same time, the recording of this event collects disproportionately less views than the seals on YouTube. I suppose that videos are banal lost in the vast video hosting sites and cannot find viewers. This annoying fact must be corrected!

In fact, the post is not about that.
')
It just so happened that I had the opportunity to speak at the aforementioned conference , where I told my fingers and priples what LLVM is , why SSA notation is interesting, what IR code is and, finally, how does it seem that C ++ programs that are seemingly at first glance It turns out to provoke indefinite behavior .

By the way, this report can be put in the fifth number in a series of articles about the Smalltalk virtual machine . Many asked for more details about LLVM. In general, we kill all the hares at once. Interested, I suggest to “sit back”, optionally pour something interesting and listen. I promise that I will not take more than an hour.

Oh yeah, under the cut, you can find explanations of those moments that were not given due attention at the conference. I tried to answer the frequently asked questions and analyze the LLVM IR listings in detail. In principle, the text part of the article can be read as an independent work, however, I hoped that the reader would turn to it after watching the video.

Problem with strict aliasing violation

In the report, I mentioned the situation with the conversion of pointers to different types, which can violate the rule strict aliasing. Unfortunately, I called the problem, but there is no solution.

So, the code:

float invert(float f) { uint32_t* raw = reinterpret_cast<uint32_t*>(&f); *raw ^= (1 << 31); //   return * reinterpret_cast<float*>(raw); }

As mentioned, the problem is the unsafe conversion of a pointer to float to a pointer to uint32_t . If the -no-strict-aliasing option was specified during compilation, then the code will work exactly as intended, but if not ... How to solve the problem without shooting the limbs?

There are three solutions to this problem - two are correct and one is conditionally safe.

The correct solution is number one - copy

Copying memory regions is guaranteed safe operation. In this case, the compiler will not try to make assumptions about the nature of pointers and the possibility of their intersection in memory:

 float invert(float f) { uint32_t raw = 0; memcpy(&raw, &f, sizeof(float)); raw ^= (1 << 31); //   memcpy(&f, &raw, sizeof(float)); return f; }

What's interesting: the compiler will definitely notice that the copied regions are always set uniquely and with a fixed size, and therefore can replace the call to the system function memcpy() with register operations, if both values are on his hands (and so, most likely will be). Thus, there is no overhead to use a function call.

The correct solution is number two - use char *

The char type and pointers to it are interpreted by the compiler in a special way. First, the standard requires that the char type always occupies exactly 1 byte of memory. Unlike numeric types, the size of char is set strictly.

Secondly, the compiler allows the char* pointer to store addresses of arbitrary chunks of memory, that is, to point to objects of different types. By standard, char* is considered compatible (“aliases everything”) with all other pointers in strict aliasing terms. Working with memory through char* safe, subject to endianness and alignment.

So with great reservations (on x86), you can write this:

 float invert(float f) { char* const raw = reinterpret_cast<char*>(&f) + sizeof(float) - 1; *raw ^= 0x80; //   return * reinterpret_cast<float*>(raw); }

Of course, the real code must take into account the order of bytes on the platform and select the desired byte for the operation.

Conditionally safe solution - use union

We come to the most controversial part, which has always caused a lot of controversy.

First, I’ll give the code:

 float invert(float f) { union { uint32_t as_int; float as_float; }; as_float = f; //   as_int ^= (1 << 31); //   return as_float; //   }

So, the standard says that you can not do that. According to the standard, union can only be used to save and reuse memory for different types of data. The standard considers that only the value that was written before should always be read. Writing one type and then reading another - undefined behavior.

In nature, there is a huge amount of code that violates this rule. If compilers followed the letter of the law, then everything would be absolutely bad. Fortunately, and perhaps unfortunately, all compilers known to me turn a blind eye to such a prank. Accordingly, the code will work. But the decision is bad because it is based on a blind belief that everything will be fine and “it works for me”.

Such are the kitty pies ...

Debriefing with IR code

In the second part of the article, I will give a detailed analysis of the IR code for the array counting algorithm discussed in the report.

To begin with, the listing itself in the form in which it was presented on slide 21:

 1 ; Function Attrs: nounwind readonly 2 define i32 @sum_array(int*, int)(i32* nocapture readonly %input, i32 %length) #0 { 3 %1 = icmp sgt i32 %length, 0 ;     ? 4 br i1 %1, label %.lr.ph, label %._crit_edge 5 ._crit_edge: 6 %sum.0.lcssa = phi i32 [ 0, %0 ], [ %4, %.lr.ph ] 7 ret i32 %sum.0.lcssa ;   8 .lr.ph: 9 %i.02 = phi i32 [ %5, %.lr.ph ], [ 0, %0 ] 10 %sum.01 = phi i32 [ %4, %.lr.ph ], [ 0, %0 ] 11 ;            12 %2 = getelementptr inbounds i32, i32* %input, i32 %i.02 13 %3 = load i32, i32* %2, align 4 14 ;      15 %4 = add nsw i32 %3, %sum.01 ;   sum 16 %5 = add nuw nsw i32 %i.02, 1 ;   i 17 ;   18 %exitcond = icmp eq i32 %5, %length 19 ;      20 br i1 %exitcond, label %._crit_edge, label %.lr.ph 21 }

So, the listing starts with a function declaration with the name " sum_array(int*, int) ", which takes two parameters with the types i32* and i32 and returns i32 . Yes, all that is written in quotes and there is a name. LLVM does not impose restrictions on naming identifiers. The only requirement is the uniqueness of the string. Therefore, clang for simplicity of perception places the entire prototype of the function in the name.

As in C-like languages, the function declaration first comes with the return value type, then the name itself, and then the parameters. The second pair of parentheses is a section describing the function parameters. About the types we have already said, it remains to deal with keywords.

The nocapture keyword tells LLVM that the function does not save the passed pointer and does not write it to external memory. This information can be used by the analyzer to determine the fact that the pointer does not “leak”. A typical use is escape analysis and optimization, which turns heap allocation into a stack, if the optimizer can prove that the pointer does not leave the execution context. The result is a minus one memory allocation for each call.

The readonly keyword has the same semantics as the const specifier when declaring a pointer to a constant in C ++. Thus it is guaranteed that the function does not change the contents of the memory according to such a pointer.

Lines 3 and 4 are the quick cut-off, if 0 was passed to the %length parameter, lines 6 and 7 are the exit point of the function.

 3 %1 = icmp sgt i32 %length, 0 ;   %length   4 br i1 %1, label %.lr.ph, label %._crit_edge ;   ,    %.lr.ph,   %._crit_edge 5 ._crit_edge: ;    0,       %0  %4,    %.lr.ph (. ) 6 %sum.0.lcssa = phi i32 [ 0, %0 ], [ %4, %.lr.ph ] ;  %sum.0.lcssa     7 ret i32 %sum.0.lcssa

The following is the main body of the function - the algorithm itself for counting the sum of the elements of an array:

 8 .lr.ph: 9 %i.02 = phi i32 [ %5, %.lr.ph ], [ 0, %0 ] ;   —  0,       (%5) 10 %sum.01 = phi i32 [ %4, %.lr.ph ], [ 0, %0 ] ;   —  0,      (%4) 11 ;      %input   %i.02 12 %2 = getelementptr inbounds i32, i32* %input, i32 %i.02 13 %3 = load i32, i32* %2, align 4 14 ;      15 %4 = add nsw i32 %3, %sum.01 ;    —     (%3)   (%sum.01) 16 %5 = add nuw nsw i32 %i.02, 1 ;    —  1     %i.02 17 ;   —        %length 18 %exitcond = icmp eq i32 %5, %length 19 ;     —    %._crit_edge,    %.lr.ph 20 br i1 %exitcond, label %._crit_edge, label %.lr.ph

I think everything here should be clear from the comments in the listing itself. Nevertheless, it is worth noting a couple of points.

First, novice LLVM programmers are often confused by the “magic” instruction getelementpointer (GEP). In fact, all it does is calculate the offset of the field in the data type, taking into account the base address of the object and a series of indices - the paths to the elements. In the case of an array, we have only one dimension — a linear sequence of elements. Accordingly, the element offset by index is calculated trivially. In the case of a complex structure with nested elements, it is necessary to set the field index at each nesting level.

For details, I suggest contacting the LLVM manual for this instruction and a special article designed to resolve misunderstandings.

Secondly, you should pay attention to the specifiers nsw and nuw in the add instructions on lines 15 and 16 .

Literally, they tell LLVM that the execution result does not imply signed ( n o s igned w rap) and unsigned ( n o u nsigned w rap) overflows. They allow you to speed up the code at the cost of undefined behavior, if the assumption is false.

The concept of value poisoning is closely associated with these concepts and with UB, about which one should also be read.

Conclusion

Finally, I want to sincerely thank Sergey Platonov - sermp . Without him, this event would not take place. Especially when you consider what it cost him. Thank you, Seryoga!

From my point of view, C ++ Siberia is one of the best C ++ conferences in Siberia. The level of reports is very high, almost everything is interesting to hear.

I especially liked the reports:

Of course, the report by Eric Niebler , which allows you to take a fresh look at C ++
An excellent introduction to future concepts was given by Alexander Fokin
Alexander Granin once again blew the whole brain with functionalism! Lenses in C ++ , how do you, eh?
An interesting solution to the ODR problem was suggested by Alexey Kutumov (by the way, he also overlaps with clang / LLVM!)
Finally, Yevgeny Ryzhkov from all here at the well-known PVS Studio team spoke about the use of static analysis on the example of the Unreal Engine. The report should be of interest primarily to project managers, since it mainly affects management issues. But I also highly recommend programmers to listen to it.

... That's all. I hope you enjoyed the show and learned something new for yourself. If not, it is always helpful to repeat and test yourself. See you!

PS: It would be great if the authors of the above reports wrote their articles with additions. It is definitely worth it.

PPS: The guys from Novosibirsk State University (NSU) were asked to tell about LLVM in a more accessible form for students. The lecture itself will be next week. If anyone wants to attend - welcome. The event will also stream online.

Source: https://habr.com/ru/post/267239/

All Articles