📜 ⬆️ ⬇️

Optimistic synchronization primitives, queues and all-all-all. Tragicomedy in three acts

I warn you in advance, for those who are in the subject, there will not be much interesting. :)

I have an urgent task to implement the basic synchronization primitives ( mutex , semaphore and read / write lock ), using only the synchronous queue - the only available primitive. At the same time along the way, I will tell you how the spinlock is arranged and we will even assemble a small Frankenstein.

Part 1: All - Queue

The synchronous queue works like this: queue.get () returns the first event in the queue (number), if it does not exist, it puts the stream to sleep. queue.post (number) sends an event and wakes one of the threads waiting for an event in get ().
')

1.1 Mutex

So, to begin with, the simplest task is the mutex. As the name implies, it (mutex) provides exclusive access to the resource. The resource in this case is just a single event in the queue. I.e:

Initialization:
queue.post (0);
Capture:
queue.get (); // we will sleep if the queue is empty.
Release:
queue.post (0); // we will return the event to the place.

1.2 Semaphore

In a naive way, this scheme can be expanded to a semaphore — you just don't have to put anything in the queue in the initialization, the increase in the semaphore counter will be queue.post (0), a decrease in queue.get ().

1.3 RWLock

Now the most difficult thing is read write lock. The simplest solution can be constructed from a semaphore and mutex. The disadvantage of this option is that you must initially fix the maximum number of simultaneously reading streams. How is this technically performed? Very simple:

Initialization:
initialize the semaphore with the maximum number of clients.
Capture reading:
we capture a semaphore (the number of clients decreases by 1)
Exemption to read: release the semaphore (the number of clients increases by 1)
Capture Record:
we capture the mutex, and we capture the maximum number of clients on the semaphore (we will wait until all reading streams release the lock.) The mutex is needed to serialize locks for recording between recording streams, otherwise one stream may capture some of the counter on the semaphore, and the other part. Dedlock. Release the mutex after the semaphore is fully captured. (counter in semaphore 0)
Release:
We capture the mutex, release the semaphore (the counter is again equal to the maximum number of reading clients, release the mutex.

On two queues, you can implement a similar scheme, but without specifying the maximum number of clients:
create two queues: a read queue (read_queue) and a write queue (write_queue). We send zeros to both queues. Capturing a post looks very simple (like a mutex at the very beginning): instead of capturing we write write_queue.get () and free it with write_queue.post (0). The capture for reading looks a bit more complicated: first we read our number from the turn turn = read_queue.get (), then we look - if turn == 0, then we also capture write - write_queue.get (). If this fails, it is nothing, the following reading threads fall asleep on read_queue.get () and there will be no flight. Now we send in read_queue (turn + 1). read_queue.post (turn + 1). Write lock does not return back. When freeing such a lock, do the following: turn = read_event.get () - 1; if turn == 0, then you need to return the write lock in place. After that do read_event.post (turn);

Part 2: Spinlock

Spinlock is a great way to do most of the synchronization work in process space. This will work if the processors and / or your platform supports some hardware synchronization primitives. Typically, the semantics of such instructions are combined into several groups: Fetch-And-Add (FAA), Compare-And-Swap (CAS). There are others, but they are more exotic and I will not touch them.

The most interesting of course is the FAA and CAS. Fetch-And-Add is a processor instruction that atomically increments the value of a variable in memory by some arbitrary number and returns the previous value. CAS A, B, C atomically compares the value in memory [A] with the desired value B, and if everything is correct, sets the value in memory to number C. On x86 processors, the CAS is represented by the instruction cmpxchg8b.

Spinlock in this case is a very simple and fast thing. We, instead of calling the operating system and wasting precious processor cycles, can arrange the synchronization ourselves. The simplest spinlock does not even return time to the system — you can simply spin in a loop until the desired number appears in the memory. There are various improvements to this method, but they are outside of this introductory article. :)

2.1 Mutex

Now I will show the implementation of a mutex on these primitives with the simplest examples:
Implementing with CAS:
Initialization:
value = 1
Capture:
while (__ compare_and_swap (& value, 1, 0)! = 1) {/ * wait until the memory cell contains 1, which means that the resource is free, if 1, then we atomically write 0 * /} there
Release:
value = 1; / * just write 1 to the memory * /

A little harder on the FAA:
Two numbers must be entered: a ticket (ticket) and a turn (turn), initially both contain 0.
Capture mutex:
my_turn = __fetch_and_add (& ticket, 1); // get the number of your turn.
while (turn! = my_turn) {/ * waiting for our turn * /}
Release:
__fetch_and_add (& turn, 1); // increase the turn, the next client gets a lock.

The rest of the primitives inquisitive reader can come up with himself and even get the Nobel Prize!

Part 3: I'm tired, I want to sleep and when will it all end? Optimism.

Now collect our semaphore frankenstein. If you think about it, it is not at all necessary to present free resources in the form of real messages in a queue and to spend precious resources on it. We can help fetch-and-add.
Semaphore seizure (wait):
old_value = __fetch_and_add (& value, -1);
if here old_value is greater than 0, then we enjoy life, and we don’t call for any functions of the operating system, we have enough resources, everything is fine, otherwise we need to do queue.get () and go to sleep.

Release of the semaphore (post) should increase the counter by 1: old_value = __fetch_and_add (& value, 1) ;.
If old_value is less than zero here, it means someone slept on the semaphore and needs to be woken up by event.post (0) ;.

Voila! We got an extremely light, simple semaphore optimistically not using the operating system. A mutex is obtained from a semaphore with a value of 1, and rw-lock is obtained from a semaphore and a mutex from clause 1.3, for example.

Thank you for reading to the end! :)

Source: https://habr.com/ru/post/56578/


All Articles