Concurrency: 6 ways to live with shared state

In multithreaded programming there are many difficulties, the main of which are the work with shared state and the effective use of provided kernels. The use of cores will be discussed in the next article.

With a shared state in a multithreaded environment, there are two things that cause all the difficulties: the race condition and the appearance of changes. In a race state, threads simultaneously change state, leading to non-deterministic behavior. And the problem with visibility lies in the fact that the result of changing data in one stream may be invisible to another. The article will explain six ways to deal with these problems.
')
All the examples are in Java, but they contain comments and I hope they will be clear to programmers who are not familiar with Java. This article is an overview and is not intended to be complete. At the same time, it is filled with references that give a more detailed explanation of the terms and statements.

Shared state

I will show the work with the shared state using the example of calculating Fibonacci numbers.
The formula for the calculation is as follows:

f(0) = 0 f(1) = 1 f(n) = f(n - 1) + f(n - 2) , n >= 2

First, we define the interface:

 public interface FibonacciGenerator<T> { /** *    */ T next(); /** *     */ public T val(); }

Our classes will implement the FibonacciGenerator interface. From the formula it is clear that in order to provide the next number, it is necessary to keep the previous two - this will be a shared state for which competition will take place. Our implementation should be thread safe . That is, regardless of the simultaneous number of consumers, it will retain its invariants and stick to the contract. And so, let's get started.

Locking

The first way to make a class work correctly in a multi-threaded environment is to use locks and declare all synchronized methods. An example would be the Vector class.

 public class IntrinsicLock implements FibonacciGenerator<BigInteger> { private BigInteger curr = BigInteger.ONE; private BigInteger next = BigInteger.ONE; @Override public synchronized BigInteger next() { BigInteger result = curr; curr = next; next = result.add(next); return result; } @Override public synchronized BigInteger val() { return curr; } }

We got a class that works correctly in a multi-threaded environment, while spending a minimum of effort. First of all, we sacrifice performance. The performance of the class is the same as if it were run in a single-threaded environment. In addition, the use of locks brings problems such as deadlock , livelock , etc.

Fine-grained locking

The next way is to break our structures into parts and protect with locks only those sections in which work with a shared state takes place. An example of this approach is the ConcurrentHashMap . In it all the data is divided into several parts. During access, only the part that is currently changing is blocked. There is also an option to use more functional locks, such as: StampedLock (java 8) , ReadWriteLock . With the correct algorithm and implementation, we get a higher level of parallelism. Example using ReadWriteLock:

 public class FineGrainedLock implements FibonacciGenerator<BigInteger> { private final ReadWriteLock lock = new ReentrantReadWriteLock(); private BigInteger curr = BigInteger.ONE; private BigInteger next = BigInteger.ONE; @Override public BigInteger next() { BigInteger result; lock.writeLock().lock(); try { //     result = curr; curr = next; next = result.add(next); return result; } finally { lock.writeLock().unlock(); } } @Override public BigInteger val() { lock.readLock().lock(); try { //   write lock // `    return curr; } finally { lock.readLock().unlock(); } } }

We improved the class and achieved a higher reading capacity. This implementation has all the disadvantages of locks. In addition, from the minuses, the fact that the algorithm became more complicated (added noise, not related to the logic of operation) was added and the probability of incorrect implementation increased.

Lock-free algorithms

The use of locks causes a lot of problems with performance and correctness. There is an alternative in the form of non-blocking algorithms . Such algorithms are built on atomic operations provided by processors. An example is the get method in a ConcurrentHashMap. To write non-blocking algorithms, it makes sense to use existing non-blocking classes: ConcurrentLinkedQueue , ConcurrentHashMap, etc. We write a non-blocking implementation of our class.

 public class LockFree implements FibonacciGenerator<BigInteger> { //      private static class State { final BigInteger next; final BigInteger curr; public State(BigInteger curr, BigInteger next) { this.next = next; this.curr = curr; } } //     private final AtomicReference<State> atomicState = new AtomicReference<>( new State(BigInteger.ONE, BigInteger.ONE)); public BigInteger next() { BigInteger value = null; while (true) { //    State state = atomicState.get(); //         value = state.curr; //    State newState = new State(state.next, state.curr.add(state.next)); //         //   ,     , //      if (atomicState.compareAndSet(state, newState)) {break;} } return value; } @Override public BigInteger val() { return atomicState.get().curr; } }

One of the advantages of this approach is an increase in performance compared with blocking algorithms. And also, last but not least, getting rid of lock flaws . The downside is that it is much more difficult to invent a non-blocking algorithm.

Software Transactional Memory

An alternative to non-blocking algorithms is the use of software transactional memory . Its use is similar to the use of transactions when working with databases. The concept is pretty new (1995) and among popular languages, native support is only in Clojure. Support for STM at the library level is available in many languages, including Java. I will use the STM implemented in the Akka project.

 public class STM implements FibonacciGenerator<BigInteger> { //      //         private final Ref<BigInteger> curr = new Ref<>(BigInteger.ONE); private final Ref<BigInteger> next = new Ref<>(BigInteger.ONE); @Override public BigInteger next() { //   return new Atomic<BigInteger>() { //    //   I (https://en.wikipedia.org/wiki/ACID) @Override public BigInteger atomically() { //      BigInteger result = curr.get(); //      curr.set(next.get()); next.set(result.add(next.get())); //       // .  ,   ,   //      . //   ,     . return result; } //    }.execute(); } @Override public BigInteger val() { //    return curr.get(); } }

The code is easily understood and there is no noise created by non-blocking algorithms and the use of locks. But for this we pay lower productivity and limited applicability, because the STM shows itself well when there are more readers than writers.

Immutability

Problems with shared access to the state from the fact that it is mutable. That is, having designed a class unchangeable , you can work with it in a multithreaded environment without restrictions, since it will also be thread safe. But immutable data structures often require functional approaches and special data structures , since memory costs can be very high.

 public class Immutable { private final BigInteger next; //   public final BigInteger val; private Immutable(BigInteger next, BigInteger val) { this.next = next; this.val = val; } public Immutable next() { return new Immutable(val.add(next), next); } public static Immutable first() { return new Immutable(BigInteger.ONE, BigInteger.ONE); } }

As you noticed, the class interface has changed and this will require other ways of working with it.

Isolated mutability

The idea of isolated object variability is that access to them is always limited to one stream. Consequently, we will not have problems characteristic of multi-threaded programs. This approach uses the model of actors . Actors are entities similar to objects that have a changing state and behavior. The interaction of actors occurs through asynchronous messaging . Messages are immutable and the actor processes one message at a time. The result of processing a message is a change in behavior, state, and other actions. An example of the use of actors will be given in the next article.

Total

Each approach has its drawbacks and advantages, and you cannot give universal advice.
A combination of fine-grained locks and non-blocking algorithms, the most commonly used approach in Java. In Clojure, on the contrary, transactional memory and immutable data structures. Transactional memory, in my opinion, is a promising tool (I suggest the reader to draw an analogy of garbage collection on his own). I hope that the next time you design a system, module or class, you will recall the approaches described in this article and choose the one that suits you best.

Thanks for attention. Waiting for your comments and suggestions.

Source: https://habr.com/ru/post/216049/

All Articles