
When I was learning to write multithreaded applications, I read a lot of literature and background material on the subject. But between theory and practice lies a huge abyss. I have taken plenty of bumps along the way, and my own threads still occasionally hit me over the head. For myself, I have worked out a set of rules that I try to follow strictly, and this helps me a great deal when writing multithreaded code.
Since errors related to thread synchronization are extremely difficult to debug, the most effective approach is to prevent them in the first place. To do this, different programming paradigms are used at different levels of abstraction. By the lower level of abstraction we will mean working directly with synchronization objects (critical sections, mutexes, semaphores). The upper level covers paradigms such as futures and promises, STM (software transactional memory), asynchronous message passing, and so on. The upper level of abstraction is almost always built on top of the lower one.
In this article I will share my style of writing code at the lower level of abstraction. Since I am a Delphi developer, all examples will be in Delphi, but everything below also holds for other programming languages (those that let you work with synchronization objects, of course).
Thread-safe object
The first rule is to share only thread-safe objects between threads. This is the simplest, most logical and most understandable rule. However, even here there are some subtleties. The object must be entirely thread-safe, which means that all public methods (except the constructor and destructor) must be synchronized. Constructors and destructors, in turn, should always be synchronized outside the object. One of my mistakes in the early stages of working with threads was forgetting about the synchronization of constructors and destructors. There is no problem with the constructor (we get the pointer to the object only after the constructor has completed), but you need to be careful with the destructor. Synchronization of destructors is a very slippery topic, and I cannot give any instructions on how best to implement it (I am no genius of multithreaded programming, I am still learning ;)). I myself try to carry out such synchronization through the destructor of the TThread class, but this works only for objects that exist for the whole life of the thread.
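The idea of synchronizing a destructor through the thread's own destructor can be sketched roughly as follows. This is only an illustration under assumptions: the names TSafeCounter and TWorker are hypothetical, unscoped unit names are used (newer Delphi versions may need System.Classes and so on), and the sketch covers only the case mentioned above, where the object lives exactly as long as the thread.

```pascal
program OwnedObjectDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, SyncObjs;

type
  // Hypothetical thread-safe counter: every public method takes the lock.
  TSafeCounter = class
  private
    FCS: TCriticalSection;
    FValue: Integer;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Increment;
    function Value: Integer;
  end;

  // Hypothetical worker that owns the counter for its whole lifetime.
  TWorker = class(TThread)
  private
    FCounter: TSafeCounter;
  protected
    procedure Execute; override;
  public
    constructor Create;
    destructor Destroy; override;
    property Counter: TSafeCounter read FCounter;
  end;

constructor TSafeCounter.Create;
begin
  inherited Create;
  FCS := TCriticalSection.Create;
end;

destructor TSafeCounter.Destroy;
begin
  FCS.Free;
  inherited;
end;

procedure TSafeCounter.Increment;
begin
  FCS.Enter;
  try
    Inc(FValue);
  finally
    FCS.Leave;
  end;
end;

function TSafeCounter.Value: Integer;
begin
  FCS.Enter;
  try
    Result := FValue;
  finally
    FCS.Leave;
  end;
end;

constructor TWorker.Create;
begin
  FCounter := TSafeCounter.Create; // exists before the thread starts running
  inherited Create(False);
end;

destructor TWorker.Destroy;
begin
  Terminate;
  WaitFor;       // Execute has returned, so the worker no longer touches FCounter
  FCounter.Free; // assumes other threads stopped using Counter before Free was called
  inherited;
end;

procedure TWorker.Execute;
begin
  while not Terminated do
  begin
    FCounter.Increment;
    Sleep(1);
  end;
end;

var
  W: TWorker;
begin
  W := TWorker.Create;
  Sleep(50);
  Writeln('Counter so far: ', W.Counter.Value);
  W.Free; // waits for the thread, then destroys the counter
end.
```

The sketch does not protect against a third thread that still holds a reference to Counter when W.Free is called; that part of the problem remains the caller's responsibility.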
Locks
Description
Another common problem is deadlocks. Although this is the most common problem in synchronization, there is one non-obvious rule: if a thread performs no more than one synchronization at a time, there will be no deadlocks. By synchronization I mean both locking a resource and waiting for a resource. So waiting on a mutex, locking a mutex, entering a semaphore, entering a critical section, and sending a message (SendMessage) all count as synchronization. Indeed, if thread A is waiting for a resource and has not locked any resource itself, then nobody can be waiting for thread A, which means there can be no deadlock.
Examples
Understanding and strictly observing this condition is the key to avoiding deadlocks. Let's look at an example of what I mean. Suppose we have a class:
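type
  TMyObj = class
  private
    FCS: TCriticalSection;
    FA: Integer;
    FB: Integer;
    // accessors used by the properties below
    function GetA: Integer;
    function GetB: Integer;
    procedure SetA(const Value: Integer);
    procedure SetB(const Value: Integer);
  public
    property A: Integer read GetA write SetA;
    property B: Integer read GetB write SetB;
    function DoSomething: Integer;
  end;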
Since the object must be thread-safe, I implemented properties A and B through getters and setters guarded by a critical section:
function TMyObj.GetA: Integer;
begin
  FCS.Enter;
  try
    Result := FA;
  finally
    FCS.Leave;
  end;
end;

function TMyObj.GetB: Integer;
begin
  FCS.Enter;
  try
    Result := FB;
  finally
    FCS.Leave;
  end;
end;

procedure TMyObj.SetA(const Value: Integer);
begin
  FCS.Enter;
  try
    FA := Value;
  finally
    FCS.Leave;
  end;
end;

procedure TMyObj.SetB(const Value: Integer);
begin
  FCS.Enter;
  try
    FB := Value;
  finally
    FCS.Leave;
  end;
end;
Suppose the DoSomething function works with A and B roughly like this:
function TMyObj.DoSomething: Integer;
begin
  Result := SendMessage(SomeHandle, WM_MYMESSAGE, A mod 3, B mod 4);
end;
"Hey, but A and B share a single critical section, so why enter it twice?" an inexperienced developer will say. And will immediately "optimize" this piece:
function TMyObj.DoSomething: Integer;
begin
  FCS.Enter;
  try
    Result := SendMessage(SomeHandle, WM_MYMESSAGE, FA mod 3, FB mod 4);
  finally
    FCS.Leave;
  end;
end;
And that would be a mistake. Now, if the WM_MYMESSAGE handler tries to access A or B, we get a deadlock: the handler runs in the thread that owns the window, so it blocks on FCS while our thread is blocked inside SendMessage waiting for the handler to return. Here the deadlock is obvious, because the amount of code is small and the data is simple. But it becomes far from trivial when the code is huge and a lot of connections and dependencies appear. Following the rule of working with only one synchronization at a time, the code above can be "optimized" as follows:
function TMyObj.DoSomething: Integer;
var
  k, n: Integer;
begin
  // copy the data under the lock
  FCS.Enter;
  try
    k := FA mod 3;
    n := FB mod 4;
  finally
    FCS.Leave;
  end;
  // the lock is released before the next synchronization starts
  Result := SendMessage(SomeHandle, WM_MYMESSAGE, k, n);
end;
Therefore, before starting a new synchronization, always release the other synchronization objects. Code in the spirit of:
FCS1.Enter; try
In most cases, code like this can be considered sloppy multithreaded code. I think you can already imagine how to rewrite it:
FCS1.Enter; try
This approach means that we have to copy data, which may affect performance. However, in most cases the amount of data is small, and we can afford to copy it. Think four times, and then four times more, before using an approach that avoids copying.
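The difference between holding two critical sections at once and taking them one after another can be sketched as follows. This is a hypothetical illustration, not the original snippet: the names CS1, CS2, Source, Target, CopyNested and CopySequential are assumptions made up for the example.

```pascal
program SequentialLocksDemo;

{$APPTYPE CONSOLE}

uses
  SyncObjs;

var
  // Hypothetical shared state: each value is guarded by its own critical section.
  CS1, CS2: TCriticalSection;
  Source: Integer = 41; // guarded by CS1
  Target: Integer = 0;  // guarded by CS2

// Nested version: CS2 is requested while CS1 is still held. If another thread
// takes the two locks in the opposite order, both threads can block forever.
procedure CopyNested;
begin
  CS1.Enter;
  try
    CS2.Enter;
    try
      Target := Source + 1;
    finally
      CS2.Leave;
    end;
  finally
    CS1.Leave;
  end;
end;

// Sequential version: copy under CS1, release it, and only then take CS2.
procedure CopySequential;
var
  Snapshot: Integer;
begin
  CS1.Enter;
  try
    Snapshot := Source;
  finally
    CS1.Leave;
  end;

  CS2.Enter;
  try
    Target := Snapshot + 1;
  finally
    CS2.Leave;
  end;
end;

begin
  CS1 := TCriticalSection.Create;
  CS2 := TCriticalSection.Create;
  try
    CopySequential;
    Writeln(Target); // prints 42
  finally
    CS2.Free;
    CS1.Free;
  end;
end.
```

The price of the sequential version is that Source may change between the two locked regions, so Target is computed from a value that might already be stale. Whether that is acceptable depends on the data.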
Diagnostics
This rule cannot be checked at compile time. However, it can be checked at run time. To do this, we need to store the current synchronization object for each thread. Here is an example implementation of such a diagnostic tool in Delphi.
procedure InitSyncObject;
procedure PushSyncObject(handle: NativeUInt); overload;
procedure PushSyncObject(obj: TObject); overload;
procedure PopSyncObject;

implementation

threadvar
  syncobj: NativeUInt; // object currently held by this thread (0 = none)
  synccnt: Cardinal;   // nesting depth for re-acquiring that same object

procedure InitSyncObject;
begin
  syncobj := 0;
  synccnt := 0;
end;

procedure PushSyncObject(handle: NativeUInt);
begin
  if handle = 0 then
    raise EProgrammerNotFound.Create('Invalid (zero) synchronization handle');
  // a different object while one is already held breaks the rule
  if (syncobj <> 0) and (handle <> syncobj) then
    raise EProgrammerNotFound.Create('A second synchronization object was requested');
  syncobj := handle;
  Inc(synccnt);
end;

procedure PushSyncObject(obj: TObject);
begin
  // NativeUInt is pointer-sized, so the cast also works in 64-bit builds
  PushSyncObject(NativeUInt(obj));
end;

procedure PopSyncObject;
begin
  if (syncobj = 0) or (synccnt = 0) then
    raise EProgrammerNotFound.Create('PopSyncObject without a matching PushSyncObject');
  Dec(synccnt);
  if synccnt = 0 then
    syncobj := 0;
end;
Call InitSyncObject when a new thread starts.
Before acquiring a synchronization object, call PushSyncObject; after releasing it, call PopSyncObject.
To make these functions easier to use, I recommend copying the code of the SyncObjs.pas unit into a new one, say SyncObjsDbg.pas. It contains the base class of synchronization objects:
TSynchroObject = class(TObject)
public
  procedure Acquire; virtual;
  procedure Release; virtual;
end;
In Acquire, add a call to PushSyncObject(Self), and in Release a call to PopSyncObject. Also, do not forget to wrap the WaitFor methods of THandleObject in these calls. In addition, if we use the TThread.Synchronize method, we push the TThread object (PushSyncObject) before the call and pop it (PopSyncObject) afterwards; if we use the SendMessage API or the wait function APIs, we push the handle (PushSyncObject) before the call and pop it (PopSyncObject) afterwards.
That's all. Now an attempt to acquire a second synchronization object raises an exception, and you can switch between the two units (SyncObjs / SyncObjsDbg) with conditional defines.
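As a lighter-weight variation on the same idea, the check can also be attached by subclassing instead of copying the whole unit. The sketch below swaps in that technique and is self-contained for illustration: the tracker is a compact stand-in, and the names ESyncRuleViolation, TCheckedCriticalSection, PushSync and PopSync are hypothetical.

```pascal
program SyncCheckDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, SyncObjs;

type
  ESyncRuleViolation = class(Exception); // hypothetical exception type

  // Hypothetical critical section that reports itself to the per-thread tracker.
  TCheckedCriticalSection = class(TCriticalSection)
  public
    procedure Acquire; override;
    procedure Release; override;
  end;

threadvar
  CurrentSync: TObject;  // the object this thread currently holds, or nil
  CurrentCount: Integer; // how many times it has been re-entered

procedure PushSync(Obj: TObject);
begin
  if (CurrentSync <> nil) and (CurrentSync <> Obj) then
    raise ESyncRuleViolation.Create('Second synchronization object requested');
  CurrentSync := Obj;
  Inc(CurrentCount);
end;

procedure PopSync;
begin
  if CurrentCount = 0 then
    raise ESyncRuleViolation.Create('Release without a matching acquire');
  Dec(CurrentCount);
  if CurrentCount = 0 then
    CurrentSync := nil;
end;

procedure TCheckedCriticalSection.Acquire;
begin
  PushSync(Self); // check first, so a violation is reported instead of blocking
  inherited Acquire;
end;

procedure TCheckedCriticalSection.Release;
begin
  inherited Release;
  PopSync;
end;

var
  A, B: TCheckedCriticalSection;
begin
  A := TCheckedCriticalSection.Create;
  B := TCheckedCriticalSection.Create;
  try
    A.Enter;
    try
      A.Enter; // re-entering the same object is allowed
      A.Leave;
      try
        B.Enter; // second object while A is held: the tracker raises
        B.Leave;
      except
        on E: ESyncRuleViolation do
          Writeln('Caught: ', E.Message);
      end;
    finally
      A.Leave;
    end;
  finally
    B.Free;
    A.Free;
  end;
end.
```

Running it prints "Caught: Second synchronization object requested". The subclass covers only critical sections; events, mutexes, SendMessage and TThread.Synchronize would still need their own wrappers.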
Bad code
As an example of bad code, let's take ... the TThreadList class from the Classes.pas unit:
TThreadList = class
private
  FList: TList;
  FLock: TRTLCriticalSection;
  FDuplicates: TDuplicates;
public
  constructor Create;
  destructor Destroy; override;
  procedure Add(Item: Pointer);
  procedure Clear;
  function LockList: TList;
  procedure Remove(Item: Pointer); inline;
  procedure RemoveItem(Item: Pointer; Direction: TList.TDirection);
  procedure UnlockList; inline;
  property Duplicates: TDuplicates read FDuplicates write FDuplicates;
end;
It looks like a thread-safe class with access through a critical section, so what is wrong with it? The problem is that the LockList and UnlockList methods are public. If any synchronization happens between a pair of LockList and UnlockList calls, we break the rule above. So exposing a Lock / Unlock pair as public methods is not a good idea, and such functions should be used with extreme care.
By the way, various Microsoft APIs often return Enum interfaces. Why do they do that? After all, it would be much more convenient to get the number of items through, say, a Count function, and then fetch each item by index in a loop through a GetItem function. But in that case they would also have to expose a pair of Lock / Unlock functions so that no one could change the list while you are in the loop. Besides, if between Lock and Unlock you happened to call an API function that performs its own internal synchronization, you could easily get a deadlock. That is why everything is done through Enum interfaces. When such an interface is obtained, a list of objects is built and their reference counts are incremented. This means that none of the objects in the Enum interface will be destroyed as long as the Enum interface exists, and while you are working with the Enum, everyone else still has access to the internal list, which can even be modified.
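The same idea can be sketched in plain Delphi: instead of exposing Lock / Unlock, the list hands out a copy made under the lock, and the caller iterates over the copy with no lock held. The class TSafeIntList and its Snapshot method are hypothetical names, and plain integers stand in for the reference-counted objects a real Enum interface would hold.

```pascal
program SnapshotDemo;

{$APPTYPE CONSOLE}

uses
  SyncObjs;

type
  TIntArray = array of Integer;

  // Hypothetical thread-safe list that never exposes Lock/Unlock.
  TSafeIntList = class
  private
    FCS: TCriticalSection;
    FItems: TIntArray;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Add(Value: Integer);
    function Snapshot: TIntArray;
  end;

constructor TSafeIntList.Create;
begin
  inherited Create;
  FCS := TCriticalSection.Create;
end;

destructor TSafeIntList.Destroy;
begin
  FCS.Free;
  inherited;
end;

procedure TSafeIntList.Add(Value: Integer);
begin
  FCS.Enter;
  try
    SetLength(FItems, Length(FItems) + 1);
    FItems[High(FItems)] := Value;
  finally
    FCS.Leave;
  end;
end;

function TSafeIntList.Snapshot: TIntArray;
begin
  FCS.Enter;
  try
    // a real copy, not a second reference to the same array
    Result := Copy(FItems, 0, Length(FItems));
  finally
    FCS.Leave;
  end;
end;

var
  List: TSafeIntList;
  Items: TIntArray;
  I: Integer;
begin
  List := TSafeIntList.Create;
  try
    List.Add(1);
    List.Add(2);
    Items := List.Snapshot;
    List.Add(3); // the list may change while we iterate over the snapshot
    for I := 0 to High(Items) do
      Writeln(Items[I]); // prints 1 and 2
  finally
    List.Free;
  end;
end.
```

The caller sees the list as it was at the moment of the Snapshot call, which is the same trade-off the Enum interfaces make.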
Probably enough
I pressed the preview button, saw how long the article had become, and realized that this is enough for now. In the next article I would like to talk about Delphi's TThread class and show the rules I follow when creating and working with threads.