Atomic operations

It became interesting how exactly atomicity of operations is achieved. Who cares - welcome under cat.

Atomic operations are operations that are performed as a whole or not at all. That is, it is an operation, during the execution of which the data read / modified by this operation cannot be changed by another operation. There are special assembler commands that indicate to the processor that the operation will be atomic. There are several types of commands that can be used to achieve atomicity: load-link and store-conditional ( LL / SC ), compare-and-swap ( CAS ), and others.

1. LL / SC

Examples of LL / SC commands:
ldl_l / stl_c and ldq_l / stq_c (Alpha), lwarx / stwcx (PowerPC), ll / sc (MIPS), and ldrex / strex (ARM version 6 and above).
Commands are given in pairs, because the first loads the current value of the operand in a register or other controlled place, the second saves the new value in the operand, if the operand in the interval between the 1st and 2nd has not changed. Together they ensure that all the commands between them work with the same version of the operand that was loaded with LL. For example, take a pair of lwarx / stwcx (PowerPC):

void atomic_incr(int * operand, int incr) { asm volatile ( "loop:\n\t" /* repeat until this succeeds */ "lwarx 6,0,%0\n\t" /* reserve the operand */ "add 6,6,%1\n\t" /* add incr to it */ "stwcx. 6,0,%0\n\t" /* put the sum back, and release it */ "bne- loop" /* start-over on failure */ : : "r" (operand), "r" (incr) : "r6" ); }

The code is taken from here.
The 6th register loads data from the operand and it (the operand) is backed up. Then in the 6th register we get the sum of the operand and incr. At the end, unload the received amount from the register into the operand and release it. If in the interval between the reservation and the release of the operand its value is changed somewhere else besides the current operation (even if the same value was assigned, which was already stored in the operand), then we get an error. This is checked by the 5th line, which starts the operation again in case of an error.

Gcc assembler parameters

Example:
: "= r" (val), "= m" (* mem)
: “0” (val), “m” (* mem)
: “Memory”, “cc”);
The first line is the output parameters.
"= r" (val) -% 0 is a constant or operand. gcc must use any register for% 0.
"= m" (* mem) -% 1 is a variable in memory that gcc should use to record the result (mem is a pointer to this variable).
The second line is the input parameters.
“0” (val) - 0 means that the same register should be used as in the output parameters and the value from the variable should be loaded into it.
“M” (* mem) - “m” means the same as “= m”, only the absence of the “=” sign says that reading is happening.
The third line is what will be changed by the instructions.
“Memory” - the memory will be changed.
"Cc" - "condition code register". This means that the processor flags will be changed.
Read more here .

2. CAS

The principle of operation is simple - we look at what value is stored in the operand. If it is what we expect, then change it to a new one.
Algorithm:

 int compare_and_swap (int* reg, int oldval, int newval) { int old_reg_val = *reg; if (old_reg_val == oldval) *reg = newval; return old_reg_val; }

CAS Extension - Double Compare And Swap ( CAS2 ). Pointers to two independent variables are given instead of one, respectively, two expected values and two new values for each variable, respectively.
But this group of teams is more “weak” than LL / SC, because the situation is possible:
initial conditions: compare_and_swap (reg, 5, 7) and * reg = 5;
1) int old_reg_val = * reg; (old_reg_val = 5)
2) another operation does * reg = 42; funny_stuff; * reg = 5;
3) if (old_reg_val == oldval) - the statement is true, although there have been two changes in the value of a variable. That is, a change has occurred, but our operation does not know about it. This situation is called the ABA problem .
There are many ways around this problem. It seemed to me an interesting method using the Double-Word Compare-And-Swap (often confused with Double Compare And Swap).
Example:

 /* double-compare-and-swap: atomically sets the two-word data at address @mem * to the two-word value of @new if value at @mem * equals @old * @mem: pointer to value * @old: old value * @new: new value * * returns: 0 on failure, non-zero on success */ static inline char DWCAS(volatile DWORD *mem, DWORD old, DWORD new) { char r = 0; unsigned long old_h = old >> SHIFT, old_l = old; unsigned long new_h = new >> SHIFT, new_l = new; __asm__ __volatile__("lock; " CMPXCHGxB " (%6);" "setz %7; " : "=a" (old_l), "=d" (old_h) : "0" (old_l), "1" (old_h), "b" (new_l), "c" (new_h), "r" (mem), "m" (r) : "cc", "memory"); return r; }

The code is taken from here.
The prefix "lock;" means that the memory access bus can only be used by the command following it.
The transmitted word contains a pointer to the memory with the required value and an ABA number, which varies with each operation (I saw it in one of the patents ).

3. Simple atomic operations.

This is usually an addition or increment.
Examples of commands: addl (i386), xaddl (x86).
Here is the implementation of the atomic addition from wikipedia :

 void atomic_add(int * operand, int incr) { asm volatile ( "lock; xaddl %1, (%0)\n" // add incr to operand : // no output : "r" (operand), "r" (incr) ); }

On some architectures, the prefix “lock;” is not needed. An example is incrementing on Itanium (taken from here ):

 void atomic_oneup(int * operand) { uint64_t res; // this MUST be 64 bits asm volatile ( "fetchadd4.rel %0=%1,%2" : "=r" (res) : "m" (*operand), "i" (1) : "memory" ); }

~~By the way, increment, decrement, + =, - = in C ++ are atomic (at least on x86 and x86_64).~~ There is no “lock;” prefix, so there is no atomicity either. Atomic operations appeared in C ++ 11.

 int a =0; ++a;

compiled into

 c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp) 83 45 fc 01 addl $0x1,-0x4(%rbp)

For decrement will be subl.

Conclusion

Q: Every schoolchild knows this! Why write about it?
A: I was wondering how it works. Represented only that there are some lock'i. I hope it was interesting to someone else.

Q: The article has a lot of errors and inaccuracies.
A: Write, I am happy to fix it.

Q: What? Already a conclusion? Where is the article itself? Where are MSI , Cache Coherence, and% more_uche_user_words%?
A: I did not dig deep. I hope that people who know these topics will write about it.

UPD: Thanks for the fixes thanks to lesha_penguin and TheShade .

Source: https://habr.com/ru/post/157163/

All Articles

Atomic operations

1. LL / SC

2. CAS

3. Simple atomic operations.

Conclusion

More articles: