It all started with a single interview, which prompted me to write this article. Quite a large number of developers on the .Net platform do not understand basic things, although they use them on a daily basis, for example, they lock all methods using ConcurrentDictionary with a lock, although it would be possible to do with an ordinary Dictionary <>.
In science, there are 3 main ways to implement competitive data structures:
• Lock-free data structures;
• Fine-grained lock;
• Transactional memory implementation (transactional memory);
ConcurrentDictionary <TKey, TValue> is a thread-safe analogue of Dictionary <TKey, TValue>. It is based on the so-called Fine-grained lock.
general information
There are 2 main parameters for setting:
•
capacity - the initial number of elements. The default is 31.
•
concurrencyLevel - estimated number of streams per write. Default is set to = 4 * number of processors
')
I propose to consider these parameters in more detail in order to understand their true essence.
Capacity
This parameter is similar to that used in the usual “dictionary” Dictionary <TKey, TValue> is the initial size of the array for storing elements.
If we look at the sources, we will see the line:
ConcurrentDictionary<TKey, TValue>.Node[] buckets = new ConcurrentDictionary<TKey, TValue>.Node[capacity];
As you know, Dictionary is a classic hash table, so a hash value is used to store elements, which in .Net is calculated using the GetHashCode () function and is of type int. The values ​​of the hashes are exactly in the buckets above, the desired bucket is calculated:
bucketNo = (hashcode & int.MaxValue) % bucketCount;
From here it turns out the access speed - O (1).
In case of overflow, the array with the buckets is increased (newLength = oldLength * 2 + 1) and all elements are redistributed.
ConcurrencyLevel
Estimated number of threads per write. From the definition, the question immediately arises - why should a “dictionary” know about the number of threads. In fact, everything is simple. This is nothing but the number of independent locks (pool of locks), i.e. This is the maximum number of threads that can "write" to the "dictionary" at the same time. In fact, each lock is “given” about the same set of buckets, in which it ensures the synchronization of the write streams.
ConcurrentDictionary operations.
The main operations on the dictionary can be divided into 3 groups:
- completely unlockable;
- blocking one item from the lock pool;
- lock the entire dictionary;
The fully non-blocking operations include:- ContainsKey
- TryGet
- this []
- GetEnumerator - operation does not ensure data integrity (does not use snapshots), i.e. data during the operation of the function may change.
All read operations (Get / ContainsKey) have approximately the same operation algorithm:
- calculation of a key hash with GetHashCode ()
- calculation of the bucket in which our element lies
- comparing the value of the key in the bucket with the one we have
- reading the value using Volatile.Read
The operations with blocking of one element from the pool of locks include:Below is an approximate algorithm of work:
- Calculate the hash key of the new item
- Calculate the bucketNo bucket to which the item will be added, and the block numbers from the pool
bucketNo = (hashcode & int.MaxValue) % bucketCount; lockNo = bucketNo % lockCount;
- Lock bucketNo via Monitor.Enter
- Write an element using Volatile.Write
- Release Lock Monitor.Exit
The most inefficient operations that block the entire dictionary include:- Count, IsEmpty. Yes, these operations require complete dictionary locking. If you need to save the number of elements in the log file, you can use GetEnumerator and LINQ.
- Keys, Values ​​- getting a list of keys and a list of values, respectively. By the way, here you get complete data - snapshots.
- CopyTo - explicit ICollection
- Clear, ToArray
Methods AddOrUpdate, GetOrAdd
These methods are quite interesting in that they use atomic Add / Get / Update operations, but they themselves are not atomic, they do not use a lock on the entire operation. Here is the implementation of AddOrUpdate:
do { while (!TryGetValue(key, out comparisonValue)) { TValue obj = addValueFactory(key); TValue resultingValue; if (TryAddInternal(key, obj, false, true, out resultingValue)) return resultingValue; } newValue = updateValueFactory(key, comparisonValue); } while (!TryUpdate(key, newValue, comparisonValue)); return newValue;
MSDN:
Also, though not all methods are thread-safe, not all methods are specifically GetOrAdd and AddOrUpdate. The user has a lock. (This is done to prevent all threads from blocking all threads.)
Usage example:
items.AddOrUpdate(srcKey, srcValue, (key, existingVal) => { // if (srcValue != existingVal) throw new ArgumentException("...");
In the end, I want to note the complexity of the methods:
- TryAdd, TryUpdate, TryRemove, - O (1)
- Get / Contains, Item [] by key - O (1)
- ContainsValue, ToArray (), Keys, Values ​​- O (n)
Ps In this article I wanted to draw attention to the subtleties of using ConcurrentDictionary and the main points in its implementation.