
C++ thread safety analysis

Writing multithreaded applications is not easy. Some static analysis tools help by letting developers state explicit policies for thread behavior and by checking automatically that the implementation follows them. This makes it possible to catch race conditions and deadlocks. This article describes the thread safety analysis for C++ code that is built into the Clang compiler. It is enabled with the -Wthread-safety command line option. The approach is widely used at Google, where the benefits have led various teams to adopt it voluntarily. Contrary to popular opinion, the need for additional code annotations did not become a burden; on the contrary, it made the code easier to maintain and evolve.

Foreword

Writing multithreaded applications is not easy, because developers have to keep in mind every possible interleaving of threads and every way they may use shared resources. Experience shows that programmers need tool support here. Many libraries and frameworks impose thread safety requirements on their users, but these requirements are not always stated clearly. Even when everything is well documented, the requirement is only a request written in the documentation, not a strict automatic check.

Static code analysis tools help developers define thread safety policies and check them when building a project. Examples of such policies are “the mutex mu must always be held when accessing the variable accountBalance” or “the draw() method must be called only from the GUI thread”. Defining policies formally gives two main advantages:

  1. The compiler can issue warnings when it detects policy violations. Finding an error at compile time is much cheaper than debugging failing unit tests or, even worse, chasing intermittent bugs in production code.
  2. Explicit thread safety specifications serve as documentation. Such documentation is very important for libraries and SDKs, because programmers need to know how to use them correctly. This information can of course be placed in comments, but practice shows that such comments tend to become obsolete, since they are not always updated together with the code.


This article describes the implementation in Clang. The analysis was originally developed for GCC, but the GCC version is no longer maintained. In Clang it is implemented as a compiler warning. At Google, the entire C++ codebase is currently compiled with thread safety analysis enabled by default.


It works as follows: in addition to the type of a variable (int, float, etc.), the programmer can optionally declare how access to that variable is controlled in a multithreaded environment. Clang uses annotations for this. Annotations can be written either in the GNU attribute style (i.e. __attribute__((...))) or in the C++11 attribute style (i.e. [[...]]). For portability, the attributes are usually hidden behind macros that expand to something only when the code is compiled with Clang. The examples in this article assume such macros. The attribute names themselves are listed in the Clang documentation.
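The macro layer mentioned above could look roughly like the sketch below. It is modeled on the pattern in the Clang documentation, but the helper name THREAD_ANNOTATION, the Mutex wrapper around std::mutex, and the counter example are assumptions made for illustration; they are not the "mutex.h" used in the article.

```cpp
#include <cassert>
#include <mutex>

// Real attributes under Clang, no-ops elsewhere.
// THREAD_ANNOTATION is a hypothetical helper name chosen for this sketch.
#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))
#else
#define THREAD_ANNOTATION(x)
#endif

#define CAPABILITY(x)     THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x)     THREAD_ANNOTATION(guarded_by(x))
#define PT_GUARDED_BY(x)  THREAD_ANNOTATION(pt_guarded_by(x))
#define REQUIRES(...)     THREAD_ANNOTATION(requires_capability(__VA_ARGS__))
#define ACQUIRE(...)      THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define RELEASE(...)      THREAD_ANNOTATION(release_capability(__VA_ARGS__))
#define TRY_ACQUIRE(...)  THREAD_ANNOTATION(try_acquire_capability(__VA_ARGS__))
#define NO_THREAD_SAFETY_ANALYSIS THREAD_ANNOTATION(no_thread_safety_analysis)

// Assumed stand-in for an annotated mutex type: a thin wrapper over std::mutex.
class CAPABILITY("mutex") Mutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }
  void unlock() RELEASE() { m_.unlock(); }
  bool tryLock() TRY_ACQUIRE(true) { return m_.try_lock(); }

 private:
  std::mutex m_;
};

// Example use: a counter that may only be touched while holding counterMu.
Mutex counterMu;
int counter GUARDED_BY(counterMu) = 0;

int increment() {
  counterMu.lock();
  int v = ++counter;  // OK: counterMu is held here
  counterMu.unlock();
  assert(v > 0);      // sanity check on the value read under the lock
  return v;
}
```

On compilers other than Clang the macros expand to nothing, so the same source builds everywhere and only Clang performs the check when -Wthread-safety is passed.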

The example below demonstrates the basic use of the technique on a classic bank account. The GUARDED_BY attribute requires the mutex mu to be held when reading or writing balance, which guarantees that updates to it are atomic. Similarly, the REQUIRES macro requires the caller of withdrawImpl to lock mu before the call; only then is the change to the balance in the method body considered safe.

In the example, the depositImpl() method has no REQUIRES attribute and does not lock mu before changing the balance, so compiling this code produces a warning about a potential error in that method. The analysis does not check whether the caller of depositImpl() holds the mutex, so the requirement must be stated explicitly with REQUIRES. A second warning is reported in transferFrom(), which must hold b.mu to call b.withdrawImpl() but locks this->mu instead. The analysis understands that these are two different mutexes in two different objects. Finally, a third warning is reported in withdraw(), where we forget to unlock mu after changing the balance. Every lock operation must have a matching unlock operation; the analysis also detects double locks and double unlocks. A function may, if necessary, lock without unlocking (or unlock without locking), but this behavior must be annotated explicitly.

Code example:
    #include "mutex.h"

    class BankAcct {
      Mutex mu;
      int balance GUARDED_BY(mu);

      void depositImpl(int amount) {
        // WARNING! Must lock mu.
        balance += amount;
      }

      void withdrawImpl(int amount) REQUIRES(mu) {
        // OK. Caller must have locked mu.
        balance -= amount;
      }

    public:
      void withdraw(int amount) {
        mu.lock();
        // OK. We've locked mu.
        withdrawImpl(amount);
        // WARNING! Failed to unlock mu.
      }

      void transferFrom(BankAcct& b, int amount) {
        mu.lock();
        // WARNING! Must lock b.mu.
        b.withdrawImpl(amount);
        // OK. depositImpl() has no requirements.
        depositImpl(amount);
        mu.unlock();
      }
    };
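For comparison, here is one way the listing above could be rewritten so that the three warnings described in the text should no longer apply. This is a sketch under assumptions, not the authors' fix: the Mutex wrapper and macro definitions stand in for "mutex.h", and deposit() and currentBalance() are hypothetical helpers added so the class can be exercised.

```cpp
#include <cassert>
#include <mutex>

#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))  // hypothetical helper name
#else
#define THREAD_ANNOTATION(x)
#endif
#define CAPABILITY(x) THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x) THREAD_ANNOTATION(guarded_by(x))
#define REQUIRES(...) THREAD_ANNOTATION(requires_capability(__VA_ARGS__))
#define ACQUIRE(...)  THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define RELEASE(...)  THREAD_ANNOTATION(release_capability(__VA_ARGS__))

// Assumed stand-in for the article's "mutex.h".
class CAPABILITY("mutex") Mutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }
  void unlock() RELEASE() { m_.unlock(); }

 private:
  std::mutex m_;
};

class BankAcct {
  Mutex mu;
  int balance GUARDED_BY(mu) = 0;

  // Both helpers now state their requirement explicitly.
  void depositImpl(int amount) REQUIRES(mu) { balance += amount; }
  void withdrawImpl(int amount) REQUIRES(mu) { balance -= amount; }

 public:
  void deposit(int amount) {
    assert(amount >= 0);  // illustrative precondition, not from the article
    mu.lock();
    depositImpl(amount);
    mu.unlock();
  }

  void withdraw(int amount) {
    mu.lock();
    withdrawImpl(amount);
    mu.unlock();  // every lock is paired with an unlock
  }

  int currentBalance() {
    mu.lock();
    int b = balance;
    mu.unlock();
    return b;
  }

  void transferFrom(BankAcct& b, int amount) {
    b.withdraw(amount);  // locks and unlocks b.mu
    deposit(amount);     // locks and unlocks this->mu
  }
};
```

In this sketch transferFrom() takes the two locks one after the other rather than together, which sidesteps lock-ordering questions but means the transfer as a whole is not atomic; whether that is acceptable depends on the application.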


The thread safety analysis was originally designed for cases like the one above. But requiring a mutex when accessing certain objects is not the only property worth checking. Another common scenario is assigning roles to threads, for example “worker thread” or “GUI thread”. The same concepts that apply to mutexes can also be applied to thread roles. The example below shows a Widget class that is used from two threads. One thread handles events (for example, mouse clicks) and the other does the rendering. The draw() method must be called only from the rendering thread and must never block the thread that handles user input. The analysis warns if draw() is called from the wrong thread. The rest of the article talks about mutexes, but similar examples can be given for thread roles.

    #include "ThreadRole.h"

    ThreadRole Input_Thread;
    ThreadRole GUI_Thread;

    class Widget {
    public:
      virtual void onClick() REQUIRES(Input_Thread);
      virtual void draw() REQUIRES(GUI_Thread);
    };

    class Button : public Widget {
    public:
      void onClick() override {
        depressed = true;
        draw();  // WARNING!
      }
    };


Basic concepts

Clang's thread safety analysis is built on a calculus of capabilities. To read or write a particular memory location, a thread must hold the capability (or permission) to do so. A capability can be thought of as a key or token that a thread must present in order to read or write. A capability may be “unique” or “shared”. A unique capability cannot be copied, so only one thread can hold it at any time. A shared capability can have several copies held by different threads. The analysis follows the “one writer / many readers” discipline: to write to a memory location a thread must hold the unique capability, while to read it a thread may hold either the unique capability or one of the shared copies. In other words, many threads can read a location at the same time, because they can share the capability, but only one thread can write at a time. Moreover, a thread cannot write while another thread is reading that location, since a capability cannot be both shared and unique at once.

This approach makes it possible to verify that the program is free of race conditions, where a race condition is defined as several threads accessing the same memory location at the same time with at least one of them writing. Since a write requires the thread to hold the unique capability, no other thread can access that memory at the same time.

Uniqueness and linear logic


Linear logic is a formal theory that can express statements such as “you cannot have your cake and eat it too”. A unique, or linear, variable must be used exactly once: it cannot be copied, used several times, or left unused. A unique object is produced at one point in the program and consumed later. Functions that have access to the object but do not consume it can only pass it on. For example, if std::stringstream were a linear type, programs would be written as follows:

    std::stringstream ss;          // produce ss
    auto& ss2 = ss << "Hello";     // consume ss
    auto& ss3 = ss2 << "World.";   // consume ss2
    return ss3.str();              // consume ss3


Note that each stream variable is used exactly once. The linear type system does not know that ss and ss2 refer to the same data; a call to << conceptually consumes one stream and produces another with a new name. Attempting to use ss again is an error. Likewise, returning without calling ss3.str() is an error, because ss3 would be produced but never consumed.

Naming of capabilities

Passing unique capabilities explicitly, as in the example above, would be extremely tedious, since every read and every write would introduce a new name. Instead, Clang's thread safety analysis tracks capabilities as anonymous objects that are passed implicitly. The resulting type system is formally equivalent to linear logic, but is much easier to use in practice.

Each capability is associated with a named C++ object that identifies the capability and provides the operations that produce and consume it. The C++ object itself is not unique. For example, if mu is a mutex, then mu.lock() produces a unique anonymous capability of type Cap<mu>. Similarly, mu.unlock() implicitly takes and consumes a capability of type Cap<mu>. Operations that read or write data guarded by mu follow a capability-passing protocol: they take an implicit parameter of type Cap<mu> and produce an implicit result of the same type.
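To make the protocol concrete, the sketch below spells out by hand what the analysis tracks implicitly. Everything in it is hypothetical: Cap is modeled as a move-only token, and ExplicitMutex, writeGuarded, and demoExplicitPassing are names invented for illustration. C++ move semantics only approximate linearity, since the compiler does not reject a use after std::move or an unused token, which is part of why Clang tracks capabilities in the analysis instead.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

class ExplicitMutex;

// Move-only token standing in for the conceptual Cap<mu>.
class Cap {
 public:
  Cap(Cap&&) = default;
  Cap& operator=(Cap&&) = default;
  Cap(const Cap&) = delete;             // a unique capability cannot be copied
  Cap& operator=(const Cap&) = delete;

 private:
  friend class ExplicitMutex;           // only the mutex can mint a token
  Cap() {}
};

class ExplicitMutex {
 public:
  Cap lock() { m_.lock(); return Cap(); }  // produce the capability
  void unlock(Cap) { m_.unlock(); }        // consume the capability

 private:
  std::mutex m_;
};

// A guarded write takes the token as a parameter and hands it back as a result.
inline Cap writeGuarded(Cap cap, int& target, int value) {
  target = value;
  return cap;
}

inline int demoExplicitPassing() {
  ExplicitMutex mu;
  int data = 0;
  Cap c1 = mu.lock();                              // produce c1
  Cap c2 = writeGuarded(std::move(c1), data, 42);  // consume c1, produce c2
  mu.unlock(std::move(c2));                        // consume c2
  assert(data == 42);
  return data;
}
```

Each step needs a fresh name (c1, c2), which is exactly the bookkeeping that the implicit, anonymous capabilities remove.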

Thread safety annotations

This section briefly describes all the main annotations that are supported by static analysis of thread safety in Clang.

GUARDED_BY (...) and PT_GUARDED_BY (...)

GUARDED_BY is an attribute applied to a data member of a class. It declares that access to the member is protected by the given capability. Read operations require at least a shared capability; write operations require a unique capability. PT_GUARDED_BY works the same way, except that it is intended for pointers and smart pointers: it protects the data pointed to, not the pointer itself.

    Mutex mu;
    int *p2 PT_GUARDED_BY(mu);

    void test() {
      *p2 = 42;      // Warning!
      p2 = new int;  // OK (no GUARDED_BY).
    }


REQUIRES (...) and REQUIRES_SHARED (...)

REQUIRES is a function attribute. It requires the caller to hold a unique capability. More than one capability can be listed. REQUIRES_SHARED works the same way, but the required capability may be either unique or shared. Formally, REQUIRES specifies that the function takes the capability as an implicit argument and returns it as an implicit result.

    Mutex mu;
    int a GUARDED_BY(mu);

    void foo() REQUIRES(mu) {
      a = 0;  // OK.
    }

    void test() {
      foo();  // Warning! Requires mu.
    }


ACQUIRE (...) and RELEASE (...)

The ACQUIRE attribute indicates that the function produces a unique capability (or several), for example by obtaining it from another thread. The caller must not hold the capability on entry and holds it when the function returns. The RELEASE attribute indicates that the function consumes a unique capability (for example, by handing it to another thread). The caller must hold the capability on entry and no longer holds it when the function returns.
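As an illustration, ACQUIRE and RELEASE can annotate functions that lock on behalf of their caller. The sketch below assumes a Mutex wrapper over std::mutex and a hypothetical Session class with begin(), end(), and peek(); none of these names come from the article.

```cpp
#include <cassert>
#include <mutex>

#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))  // hypothetical helper name
#else
#define THREAD_ANNOTATION(x)
#endif
#define CAPABILITY(x) THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x) THREAD_ANNOTATION(guarded_by(x))
#define REQUIRES(...) THREAD_ANNOTATION(requires_capability(__VA_ARGS__))
#define ACQUIRE(...)  THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define RELEASE(...)  THREAD_ANNOTATION(release_capability(__VA_ARGS__))

// Assumed annotated mutex wrapper.
class CAPABILITY("mutex") Mutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }
  void unlock() RELEASE() { m_.unlock(); }

 private:
  std::mutex m_;
};

class Session {
  Mutex mu;
  int state GUARDED_BY(mu) = 0;

 public:
  // Caller does not hold mu on entry and holds it on return.
  void begin() ACQUIRE(mu) {
    mu.lock();
    state = 1;
  }

  // Caller holds mu on entry and no longer holds it on return.
  void end() RELEASE(mu) {
    assert(state == 1);  // sanity check: end() should follow begin()
    state = 0;
    mu.unlock();
  }

  // May only be called between begin() and end().
  int peek() REQUIRES(mu) { return state; }
};

inline int runSession() {
  Session s;
  s.begin();         // acquires s.mu
  int v = s.peek();  // OK: s.mu is held
  s.end();           // releases s.mu
  return v;
}
```

If the call to s.begin() were removed, the analysis would be expected to flag both s.peek() and s.end(), because neither would have the capability it needs.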

ACQUIRE_SHARED and RELEASE_SHARED

These attributes work in the same way as described above, but they create and use “shared” capabilities.
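A reader/writer lock is the typical place where the shared variants appear. The following sketch wraps std::shared_mutex (C++17) in an annotated class; SharedMutex, Config, and their method names are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <shared_mutex>

#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))  // hypothetical helper name
#else
#define THREAD_ANNOTATION(x)
#endif
#define CAPABILITY(x)        THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x)        THREAD_ANNOTATION(guarded_by(x))
#define REQUIRES_SHARED(...) THREAD_ANNOTATION(requires_shared_capability(__VA_ARGS__))
#define ACQUIRE(...)         THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define ACQUIRE_SHARED(...)  THREAD_ANNOTATION(acquire_shared_capability(__VA_ARGS__))
#define RELEASE(...)         THREAD_ANNOTATION(release_capability(__VA_ARGS__))
#define RELEASE_SHARED(...)  THREAD_ANNOTATION(release_shared_capability(__VA_ARGS__))

// Assumed annotated reader/writer lock built on std::shared_mutex.
class CAPABILITY("mutex") SharedMutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }                        // unique (writer)
  void unlock() RELEASE() { m_.unlock(); }
  void readerLock() ACQUIRE_SHARED() { m_.lock_shared(); }    // shared (reader)
  void readerUnlock() RELEASE_SHARED() { m_.unlock_shared(); }

 private:
  std::shared_mutex m_;
};

class Config {
  SharedMutex mu;
  int value GUARDED_BY(mu) = 0;

  // Reading needs at least the shared capability.
  int readLocked() const REQUIRES_SHARED(mu) { return value; }

 public:
  void set(int v) {
    assert(v >= 0);  // illustrative invariant, not from the article
    mu.lock();       // writing needs the unique capability
    value = v;
    mu.unlock();
  }

  int get() {
    mu.readerLock();
    int v = readLocked();
    mu.readerUnlock();
    return v;
  }
};
```

Writing to value inside get() would be expected to draw a warning, since only the shared capability is held there.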

CAPABILITY (...)

The CAPABILITY attribute can be applied to a struct, class, or typedef. It declares that objects of this type can be used to identify capabilities. For example, the mutex class in Google's libraries is defined as follows:

    class CAPABILITY("mutex") Mutex {
    public:
      void lock() ACQUIRE(this);
      void readerLock() ACQUIRE_SHARED(this);
      void unlock() RELEASE(this);
      void readerUnlock() RELEASE_SHARED(this);
    };


Mutexes are ordinary C++ objects. However, each mutex has an associated capability. The lock() and unlock() methods acquire and release this capability. Note that Clang does not attempt to verify that these methods actually perform the corresponding operations on the mutex. The annotations apply only to the interface of the mutex class and express how its methods produce and consume capabilities.

TRY_ACQUIRE (b, ...) and TRY_ACQUIRE_SHARED (b, ...)

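These function or method attributes indicate that the function tries to acquire the given capability and returns a boolean value indicating success or failure. The first argument, b, must be true or false and specifies which return value means that the capability was acquired.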
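A short sketch of how a try-lock might be annotated and used follows. The Mutex wrapper, the global counter, and the functions tryIncrement() and readCounter() are assumptions made for this example.

```cpp
#include <cassert>
#include <mutex>

#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))  // hypothetical helper name
#else
#define THREAD_ANNOTATION(x)
#endif
#define CAPABILITY(x)    THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x)    THREAD_ANNOTATION(guarded_by(x))
#define ACQUIRE(...)     THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define RELEASE(...)     THREAD_ANNOTATION(release_capability(__VA_ARGS__))
#define TRY_ACQUIRE(...) THREAD_ANNOTATION(try_acquire_capability(__VA_ARGS__))

// Assumed annotated mutex wrapper.
class CAPABILITY("mutex") Mutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }
  void unlock() RELEASE() { m_.unlock(); }
  // A return value of true means the capability was acquired.
  bool tryLock() TRY_ACQUIRE(true) { return m_.try_lock(); }

 private:
  std::mutex m_;
};

Mutex mu;
int counter GUARDED_BY(mu) = 0;

// Increments the counter only if the lock is free; never blocks.
bool tryIncrement() {
  if (!mu.tryLock()) {
    return false;  // capability not held on this path, so counter is untouched
  }
  ++counter;       // OK: tryLock() returned true, so mu is held
  mu.unlock();
  return true;
}

int readCounter() {
  mu.lock();
  int c = counter;
  mu.unlock();
  assert(c >= 0);  // sanity check on the value read under the lock
  return c;
}
```

The analysis follows the branch on the return value, so touching counter before the check, or on the failure path, would be expected to produce a warning.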

NO_THREAD_SAFETY_ANALYSIS

This attribute disables static thread safety analysis for the specified function. This can be useful either when the function by definition should not be thread-safe, or in cases where the logic of the function is so complex that the static analysis fails.
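Conditional locking is one pattern the analysis cannot follow, because it does not relate the two tests of the same flag. The sketch below turns the analysis off for such a function; the Mutex wrapper and the names total, addTo(), and readTotal() are assumptions for illustration.

```cpp
#include <cassert>
#include <mutex>

#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))  // hypothetical helper name
#else
#define THREAD_ANNOTATION(x)
#endif
#define CAPABILITY(x) THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x) THREAD_ANNOTATION(guarded_by(x))
#define ACQUIRE(...)  THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define RELEASE(...)  THREAD_ANNOTATION(release_capability(__VA_ARGS__))
#define NO_THREAD_SAFETY_ANALYSIS THREAD_ANNOTATION(no_thread_safety_analysis)

// Assumed annotated mutex wrapper.
class CAPABILITY("mutex") Mutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }
  void unlock() RELEASE() { m_.unlock(); }

 private:
  std::mutex m_;
};

Mutex mu;
int total GUARDED_BY(mu) = 0;

// With needLock == false the caller is trusted to hold mu already.
// The body is not checked, so that contract rests on the programmer.
void addTo(int amount, bool needLock) NO_THREAD_SAFETY_ANALYSIS {
  assert(amount >= 0);  // illustrative precondition, not from the article
  if (needLock) mu.lock();
  total += amount;
  if (needLock) mu.unlock();
}

int readTotal() {
  mu.lock();
  int t = total;
  mu.unlock();
  return t;
}
```

Because the attribute silences every check inside the function, it is usually worth keeping such functions small and splitting the code into a locking wrapper and a REQUIRES-annotated helper where that is possible.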

Negative Requirements

All the requirements described above were “positive”: they state which capability must be held when a function is called. There are also “negative” requirements, which state which capabilities must not be held. Positive requirements prevent race conditions, while negative requirements help prevent deadlocks. Many mutex implementations are not reentrant, because reentrancy comes at a significant performance cost. For such mutexes, calling lock() a second time results in a deadlock. To avoid this, a function can state explicitly that the capability it is about to acquire must not already be held by the caller. A negative capability is written with the “!” operator:

    Mutex mu;
    int a GUARDED_BY(mu);

    void clear() REQUIRES(!mu) {
      mu.lock();
      a = 0;
      mu.unlock();
    }

    void reset() {
      mu.lock();
      // Warning! Caller cannot hold 'mu'.
      clear();
      mu.unlock();
    }
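A version of this example that respects the negative requirement might look like the sketch below. It assumes a Mutex wrapper that provides operator! so that the expression !mu type-checks, and a hypothetical Register class; as far as the Clang documentation describes it, checking of negative requirements has to be switched on separately with -Wthread-safety-negative.

```cpp
#include <cassert>
#include <mutex>

#if defined(__clang__)
#define THREAD_ANNOTATION(x) __attribute__((x))  // hypothetical helper name
#else
#define THREAD_ANNOTATION(x)
#endif
#define CAPABILITY(x) THREAD_ANNOTATION(capability(x))
#define GUARDED_BY(x) THREAD_ANNOTATION(guarded_by(x))
#define REQUIRES(...) THREAD_ANNOTATION(requires_capability(__VA_ARGS__))
#define ACQUIRE(...)  THREAD_ANNOTATION(acquire_capability(__VA_ARGS__))
#define RELEASE(...)  THREAD_ANNOTATION(release_capability(__VA_ARGS__))

// Assumed annotated mutex wrapper.
class CAPABILITY("mutex") Mutex {
 public:
  void lock() ACQUIRE() { m_.lock(); }
  void unlock() RELEASE() { m_.unlock(); }
  // Lets "!mu" appear inside an annotation; it has no runtime purpose.
  const Mutex& operator!() const { return *this; }

 private:
  std::mutex m_;
};

class Register {
  Mutex mu;
  int a GUARDED_BY(mu) = 5;

 public:
  void clear() REQUIRES(!mu) {
    mu.lock();
    a = 0;
    mu.unlock();
  }

  // reset() no longer locks around clear(); it passes the negative
  // requirement on to its own callers instead.
  void reset() REQUIRES(!mu) {
    clear();  // OK: mu is not held here
  }

  int get() REQUIRES(!mu) {
    mu.lock();
    int v = a;
    mu.unlock();
    assert(v >= 0);  // sanity check on the value read under the lock
    return v;
  }
};
```

Negative requirements tend to propagate up the call chain in this way, which is one reason they are kept optional.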


Results and conclusions


Thread safety analysis of C++ code is currently widely used in Google products. It is enabled by default for every build of every module. More than 20,000 C++ files carry correct annotations following the rules above; the total number of annotations has reached 140,000 and grows every day. Using the annotations is voluntary at Google, so their wide adoption is a sign that Google engineers genuinely find them useful.

Since race conditions and deadlocks are very insidious, Google uses both static analysis and dynamic analysis tools such as ThreadSanitizer. These tools were found to complement each other well. Dynamic analysis does not require annotations and can therefore be applied more widely. However, it can detect problems only on the execution paths that actually ran during the analysis, so its effectiveness depends directly on test coverage. Static analysis is less flexible, but it covers all possible execution paths. In addition, static analysis reports problems at compile time, which is much more efficient.

Although the need to write annotations by hand may seem a disadvantage, we found that annotations greatly simplify maintaining and evolving the code. Annotations are used especially widely in libraries and APIs, where they also serve as machine-checked documentation. The developers and the users of a library usually belong to different teams, so those who use the library in a real project will not necessarily understand its threading protocol in full. Documentation may be missing or outdated, which makes mistakes easy. With annotations, the locking protocol becomes part of the API and the compiler warns about errors in its use.

Annotations also proved effective at enforcing internal design constraints as software evolves. For example, the initial design of a thread-safe class may require a mutex to be held on every access to its private data. Over time, new people join the team who are unaware of this requirement, and they (or an incidental refactoring) may break it. When analyzing the change history, we found several places where an engineer had added a new method to a class and forgotten to lock the required mutex when accessing guarded data. Afterwards that engineer (or someone else) had to spend a long and painful time debugging the race condition and fixing the bug. Had the constraints been expressed as annotations, the problem would have been caught the first time the code was compiled.

Admittedly, annotations have a maintenance cost. We found that about 50% of the compiler warnings were caused not by bugs in the code but by forgotten, outdated, or incorrectly used annotations (such as a missing REQUIRES on getters and setters). In this respect thread safety annotations are similar to the const qualifier. How to regard these errors depends on your point of view. At Google they are considered documentation errors. Since an API is read frequently by many engineers, it is very important to keep public interfaces up to date. If obviously incorrect uses of annotations are excluded, the remaining false positive rate is quite low, less than 5%. Such cases mostly involve access to the same memory location through different pointers, conditionally acquired mutexes, and access to internal data from an object's constructor, where synchronization is not yet needed.

Source: https://habr.com/ru/post/304176/

