What is Strict Aliasing and why should we care? Part 2

(OR Type Punning, Undefined Behavior and Alignment, Oh My!)

Friends, very little time is left before the launch of the new cohort of the “C++ Developer” course. The time has come to publish the translation of the second part of the material, which explains what type punning is.

What is type punning?

We have gotten to the point where we may ask why we would want to alias at all. The answer is usually to type pun, and the methods commonly used for that often violate strict aliasing rules.

Sometimes we want to circumvent the type system and interpret an object as a different type. Reinterpreting a segment of memory as another type is called type punning. Type puns are useful for tasks that need access to the underlying representation of an object in order to view, transport or manipulate it. Typical areas where type punning is used are compilers, serialization, networking code, etc.
Traditionally this has been accomplished by taking the address of the object, casting it to a pointer of the type we want to reinterpret it as, and then accessing the value; in other words, by aliasing. For example:

    int x = 1;

    // In C
    float *fp = (float*)&x;                    // invalid aliasing

    // In C++
    float *fp = reinterpret_cast<float*>(&x);  // invalid aliasing

    printf("%f\n", *fp);

As we saw earlier, this is not a valid aliasing, so it invokes undefined behavior. But traditionally compilers did not take advantage of strict aliasing rules, and this type of code usually just worked; developers have unfortunately gotten used to doing things this way. A common alternate method for type punning is through unions, which is valid in C but undefined behavior in C++ (see example):

    union u1 {
        int n;
        float f;
    };

    union u1 u;
    u.f = 1.0f;

    printf("%d\n", u.n);  // UB (undefined behavior) in C++: "n is not the active member"

This is not valid in C++, and some consider the purpose of unions to be solely for implementing variant types, and feel that using unions for type punning is an abuse.

How do we type pun correctly?

The standard-blessed method for type punning in both C and C++ is memcpy. This may seem a little heavy-handed, but the optimizer should recognize the use of memcpy for type punning, optimize it away, and generate a register-to-register move. For example, if we know that int64_t is the same size as double:

 static_assert( sizeof( double ) == sizeof( int64_t ) ); // C++17

We can use memcpy:

    void func1( double d ) {
        std::int64_t n;
        std::memcpy(&n, &d, sizeof d);
        //…

At a sufficient optimization level, any decent modern compiler generates code identical to that of the previously mentioned reinterpret_cast method or the union method of type punning. Examining the generated code, we see that it uses just a register mov (example).
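To make the memcpy idiom concrete, here is a minimal sketch that completes the fragment above in both directions. The helper names `double_to_bits` and `bits_to_double` are hypothetical (they do not appear in the original article), and the example value in the note below assumes IEEE 754 doubles.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::int64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: copy the object representation of a double into an
// integer of the same size. memcpy copies bytes, so no double is ever
// accessed through an int64_t lvalue and strict aliasing is not violated.
inline std::int64_t double_to_bits(double d) {
    std::int64_t n;
    std::memcpy(&n, &d, sizeof d);
    return n;
}

// The inverse operation, written the same way.
inline double bits_to_double(std::int64_t n) {
    double d;
    std::memcpy(&d, &n, sizeof n);
    return d;
}
```

On a platform with IEEE 754 doubles, `double_to_bits(1.0)` yields 0x3FF0000000000000, and passing that result to `bits_to_double` gives back 1.0.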

Type punning arrays

But what if we want to type pun an array of unsigned char into a series of unsigned int values and then perform an operation on each unsigned int value? We can use memcpy to pun the unsigned char array into a temporary of type unsigned int. The optimizer will still manage to see through the memcpy, optimize away both the temporary and the copy, and operate directly on the underlying data (example):

    // A simple operation that just returns the value back
    int foo( unsigned int x ) { return x; }

    // Assume len is a multiple of sizeof(unsigned int)
    int bar( unsigned char *p, size_t len ) {
        int result = 0;

        for( size_t index = 0; index < len; index += sizeof(unsigned int) ) {
            unsigned int ui = 0;
            std::memcpy( &ui, &p[index], sizeof(unsigned int) );

            result += foo( ui );
        }

        return result;
    }

In this example, we take an unsigned char *p, assume that it points to multiple chunks of sizeof(unsigned int) bytes, type pun each chunk as an unsigned int, compute foo() on each punned chunk, sum the results into result, and return the final value.

The assembly for the loop body shows that the optimizer reduces the body to a direct access of the underlying unsigned char array as an unsigned int, adding it directly into eax:

 add eax, dword ptr [rdi + rcx]

The same code, but using reinterpret_cast to type pun (this violates strict aliasing):

    // Assume len is a multiple of sizeof(unsigned int)
    int bar( unsigned char *p, size_t len ) {
        int result = 0;

        for( size_t index = 0; index < len; index += sizeof(unsigned int) ) {
            unsigned int ui = *reinterpret_cast<unsigned int*>(&p[index]);

            result += foo( ui );
        }

        return result;
    }
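One way to keep the memcpy-based loop readable is to wrap the copy in a small helper. The following is a sketch only: `load_as` and `sum_words` are hypothetical names introduced here for illustration, and the helper assumes T is trivially copyable and default-constructible.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: read a T out of raw bytes without violating strict
// aliasing or alignment requirements. Assumes T is default-constructible.
template <typename T>
T load_as(const unsigned char* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based punning requires a trivially copyable type");
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Sum the buffer as a sequence of unsigned int values.
// Assumes len is a multiple of sizeof(unsigned int).
inline unsigned int sum_words(const unsigned char* p, std::size_t len) {
    unsigned int result = 0;
    for (std::size_t index = 0; index < len; index += sizeof(unsigned int)) {
        result += load_as<unsigned int>(p + index);
    }
    return result;
}
```

Because the helper never dereferences a pointer of the wrong type, it can be called on any byte offset, aligned or not, and a reviewer only has to audit the one memcpy.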

C++20 and bit_cast

In C++20 we have bit_cast, which gives us a simple and safe way to type pun and is also usable in a constexpr context.

Below is an example of how to use bit_cast to type pun an unsigned integer to a float (example):

    std::cout << bit_cast<float>(0x447a0000) << "\n";  // assuming sizeof(float) == sizeof(unsigned int)
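In the published C++20 standard this facility is std::bit_cast, declared in the <bit> header. The sketch below goes in the opposite direction (float to integer) and falls back to memcpy when the standard library does not advertise std::bit_cast. The function name `float_bits` is hypothetical, and the value in the note below assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>   // declares std::bit_cast in C++20
#  endif
#endif

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: return the bits of a float as a 32-bit integer.
// Uses std::bit_cast when the library reports it via the
// __cpp_lib_bit_cast feature-test macro, otherwise falls back to memcpy.
inline std::uint32_t float_bits(float f) {
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<std::uint32_t>(f);
#else
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
#endif
}
```

With IEEE 754 floats, `float_bits(1000.0f)` is 0x447a0000, the same bit pattern used in the example above. Both branches have the same observable behavior; only the std::bit_cast branch could additionally be made constexpr.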

In the case where the To and From types do not have the same size, we are required to use an intermediate struct. We will use a struct containing a character array of sizeof(unsigned int) bytes (assuming a 4-byte unsigned int) as the From type and unsigned int as the To type:

    struct uint_chars {
        unsigned char arr[sizeof( unsigned int )] = {};  // Assume sizeof( unsigned int ) == 4
    };

    // Assume len is a multiple of 4
    int bar( unsigned char *p, size_t len ) {
        int result = 0;

        for( size_t index = 0; index < len; index += sizeof(unsigned int) ) {
            uint_chars f;
            std::memcpy( f.arr, &p[index], sizeof(unsigned int) );
            // Named ui so that it does not shadow the accumulator result
            unsigned int ui = bit_cast<unsigned int>(f);

            result += foo( ui );
        }

        return result;
    }

Unfortunately, we need this intermediate type; this is a current limitation of bit_cast.

Alignment

In previous examples we have seen that violating strict aliasing rules can lead to stores being optimized away. Violating strict aliasing rules can also lead to violations of alignment requirements. Both the C and C++ standards state that objects are subject to alignment requirements, which restrict where objects can be allocated (in memory) and therefore accessed. C11 section 6.2.8 Alignment of objects says:

Complete object types have alignment requirements which place restrictions on the addresses at which objects of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type: stricter alignment can be requested using the _Alignas keyword.

The C++17 draft standard, in [basic.align] paragraph 1:

Object types have alignment requirements (6.7.1, 6.7.2) which place restrictions on the addresses at which an object of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type; stricter alignment can be requested using the alignment specifier (10.6.2).

Both C99 and C11 are explicit that a conversion that results in an unaligned pointer is undefined behavior. Section 6.3.2.3 Pointers says:

A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. ...

Although C++ is not as explicit, I believe this sentence from [basic.align] paragraph 1 is sufficient:

... An object type imposes an alignment requirement on every object of that type; ...

Example

So let's assume:

alignof (char) and alignof (int) are 1 and 4 respectively
sizeof (int) is 4

Thus, type punning a char array of size 4 as an int violates strict aliasing, and may also violate alignment requirements if the array has an alignment of 1 or 2 bytes.

    char arr[4] = { 0x0F, 0x0, 0x0, 0x00 };  // Could be allocated on a 1 or 2 byte boundary
    int x = *reinterpret_cast<int*>(arr);    // Undefined behavior

This can lead to reduced performance or a bus error in some situations. Using alignas to force the array to the same alignment as int prevents the violation of the alignment requirements (the strict aliasing violation remains):

    alignas(alignof(int)) char arr[4] = { 0x0F, 0x0, 0x0, 0x00 };
    int x = *reinterpret_cast<int*>(arr);
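As a minimal sketch of how the alignment of a buffer can be inspected, and how memcpy sidesteps both the alignment and the aliasing problem, consider the helpers below. The names `is_aligned_for` and `read_int` are hypothetical, and the address check assumes that converting a pointer to std::uintptr_t yields the numeric address, which is implementation-defined but holds on mainstream platforms.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p satisfies the alignment requirement of T.
// Assumes the pointer-to-integer conversion yields the numeric address.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: read an int from any address, aligned or not.
// memcpy has no alignment requirement on its arguments and does not
// access the bytes through an int lvalue.
inline int read_int(const char* p) {
    int x;
    std::memcpy(&x, p, sizeof x);
    return x;
}
```

For a buffer declared as `alignas(int) char buf[8]`, `is_aligned_for<int>(buf)` is true, while `is_aligned_for<int>(buf + 1)` is false whenever alignof(int) is greater than 1. `read_int(buf + 1)` is well defined in either case.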

Atomicity

Another unexpected penalty of unaligned accesses is that they break atomicity on some architectures. Atomic stores may not appear atomic to other threads on x86 if they are misaligned.

Catching strict aliasing violations

We do not have many good tools for catching strict aliasing violations in C++. The tools that we have will catch some cases of violations and some cases of misaligned loads and stores.

gcc with the -fstrict-aliasing and -Wstrict-aliasing flags can catch some cases, although not without false positives and false negatives. For example, the following cases will generate a warning in gcc (example):

    int a = 1;
    short j;
    float f = 1.f;  // Originally not initialized, but TIS caught that it
                    // was being accessed with an indeterminate value below

    printf("%i\n", j = *(reinterpret_cast<short*>(&a)));
    printf("%i\n", j = *(reinterpret_cast<int*>(&f)));

although it will not catch this additional case (example):

    int *p;

    p = &a;
    printf("%i\n", j = *(reinterpret_cast<short*>(p)));

Although clang accepts these flags, it does not appear to actually implement the warnings.

Another tool we have available is ASan, which can catch misaligned loads and stores. Although these are not directly strict aliasing violations, they are a common result of them. For example, the following cases will generate run-time errors when built with clang using -fsanitize=address:

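    int *x = new int[2];            // 8 bytes: [0,7].
    int *u = (int*)((char*)x + 6);  // Regardless of the alignment of x, this will not be an aligned address
    *u = 1;                         // Access to range [6-9]
    printf( "%d\n", *u );           // Access to range [6-9]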

The last tool I recommend is C++ specific and, strictly speaking, not a tool but a coding practice: do not allow C-style casts. Both gcc and clang will produce a diagnostic for C-style casts when -Wold-style-cast is used. This forces any undefined type puns to use reinterpret_cast. In general, reinterpret_cast should be a flag for closer code review.
It is also easier to search the code base for reinterpret_cast to perform an audit.

For C, we have all the tools already covered, and we also have tis-interpreter, a static analyzer that exhaustively analyzes a program for a large subset of the C language. Consider the C version of the earlier example, where using -fstrict-aliasing misses one case (example):

    int a = 1;
    short j;
    float f = 1.0;

    printf("%i\n", j = *((short*)&a));
    printf("%i\n", j = *((int*)&f));

    int *p;

    p = &a;
    printf("%i\n", j = *((short*)p));

tis-interpreter is able to catch all three. The following example invokes tis-kernel as tis-interpreter (the output is edited for brevity):

    ./bin/tis-kernel -sa example1.c
    ...
    example1.c:9:[sa] warning: The pointer (short *)(& a) has type short *. It violates strict aliasing
                      rules by accessing a cell with effective type int.
    ...

    example1.c:10:[sa] warning: The pointer (int *)(& f) has type int *. It violates strict aliasing rules by
                      accessing a cell with effective type float.
                      Callstack: main
    ...

    example1.c:15:[sa] warning: The pointer (short *)p has type short *. It violates strict aliasing rules by
                      accessing a cell with effective type int.

And finally there is TySan, which is currently in development. This sanitizer adds type checking information to a shadow memory segment and checks accesses to see whether they violate the aliasing rules. The tool should potentially be able to catch all aliasing violations, but may have a large run-time overhead.

Conclusion

We have learned about the aliasing rules in both C and C++, what it means that the compiler expects us to follow these rules strictly, and the consequences of not doing so. We learned about some tools that will help us catch some misuses of aliasing. We have seen that a common use of aliasing is type punning, and we learned how to type pun correctly.

Optimizers are slowly getting better at type-based aliasing analysis and already break some code that relies on strict aliasing violations. We can expect the optimizations to only get better and to break more code that used to just work.

We have standard-conformant methods for type punning, and in release builds (and sometimes debug builds) these methods should be cost-free abstractions. We have several tools for catching strict aliasing violations, but for C++ they will catch only a small fraction of the cases, while for C, with tis-interpreter, we should be able to catch most violations.

Thanks to those who gave feedback on this article: JF Bastien, Christopher Di Bella, Pascal Cuoq, Matt P. Dziubinski, Patrice Roy and Ólafur Waage.
Of course, in the end, all errors are the author's.

The translation of this rather large article has come to an end; the first part can be read here. As usual, we invite you to the open day, which will be held on March 14 by Dmitry Shebordayev, head of technology development at Rambler & Co.

Source: https://habr.com/ru/post/443602/
