One of the users of the Visual C ++ compiler gave the following code example and asked why his conditional loop runs infinitely, although at some point the condition should stop and the loop should end:
#include <windows.h> int x = 0, y = 1; int* ptr; DWORD CALLBACK ThreadProc(void*) { Sleep(1000); ptr = &y; return 0; } int main(int, char**) { ptr = &x; // starts out pointing to x DWORD id; HANDLE hThread = CreateThread(nullptr, 0, ThreadProc, 0, &id); // , ptr // while (*ptr == 0) { } return 0; }
For those who are not familiar with platform-specific functions, here is the equivalent in pure C ++:
#include <chrono> #include <thread> int x = 0, y = 1; int* ptr = &x; void ThreadProc() { std::this_thread::sleep_for(std::chrono::seconds(1)); ptr = &y; } int main(int, char**) { ptr = &x; // starts out pointing to x std::thread thread(ThreadProc); // , ptr // while (*ptr == 0) { } return 0; }
Next, the user brought his understanding of the program:
The conditional loop has been turned by the compiler into an infinite one. I see it on the generated assembler code, which once loads the value of the ptr pointer into the register (at the start of the cycle), and then at each iteration compares the value of this register with zero. Since the reloading of the value from ptr never happens again, the cycle never ends.
')
I understand that declaring a ptr as “volatile int *” should cause the compiler to discard the optimization and read the ptr value at each iteration of the loop, which fixes the problem. But still I would like to know why the compiler cannot be smart enough to do such things automatically? Obviously, a global variable used in two different threads can be changed, which means it cannot simply be cached in a register. Why the compiler can not immediately generate the correct code?
Before answering this question, let's start with a little niggle: “volatile int * ptr” does not declare the variable ptr as a “pointer for which optimization is prohibited”. This is "a regular pointer to a variable for which optimization is prohibited." What the author of the above question meant was to be declared as "int * volatile ptr".
And now back to the main issue. What is going on here?
Even the quickest look at the code will tell us that there are no variables of the type std :: atomic, no use of std :: memory_order (neither explicit nor implicit). This means that any attempt to access ptr or * ptr from two different streams leads to undefined behavior. Intuitively, you can think of it this way: “The compiler optimizes each stream as if it works in the program alone. The only points where the standard compiler MUST think about accessing data from different streams is using std :: atomic or std :: memory_order. ”
This explains why the program behaved differently than the programmer expected. From the moment you allow for undefined behavior, absolutely nothing can be guaranteed.
But okay, let's think about the second part of his question: why is the compiler not smart enough to recognize this situation and automatically turn off optimization with loading the pointer value into the register? Well, the compiler automatically applies all possible optimizations that do not conflict with the standard. It would be strange to require him to be able to read the thoughts of the programmer and turn off some optimizations that do not contradict the standard, which, according to the programmer, should have changed the logic of the program for the better. “Oh, what if this cycle expects a change in the value of a global variable in another thread, although it has not announced this explicitly? I'll take it, I will slow it down a hundred times to be ready for this situation! ” Should it be so? Hardly.
But suppose that we add to the compiler a rule like “If optimization has led to the appearance of an infinite loop, then we need to cancel it and assemble the code without optimization”. Or even like this: “Sequentially cancel individual optimizations until the result is a non-infinite loop.” In addition to the amazing surprises that this will bring, will it give any benefit at all?
Yes, in this theoretical case we will not get an infinite loop. It will break if some other thread writes a non-zero value to * ptr. It will also be interrupted if another thread writes a non-zero value to the variable x. It is becoming unclear how deeply the dependency analysis should be worked out in order to “catch” all the cases that may affect the situation. Since the compiler does not actually run the created program and does not analyze its behavior at runtime, the only solution would be to assume that in general no references to global variables, pointers and links can be optimized.
int limit; void do_something() { ... if (value > limit) value = limit;
This is completely contrary to the spirit of C ++. The language standard says that if you modify a variable and expect to see this modification in another thread, you must EXACTLY say: use an atomic operation or streamlining memory access (usually using the synchronization object).
So please do just that.