📜 ⬆️ ⬇️

Change ConcurrentDictionary during iteration

Recently decided to deal with the internal device of thread-safe collections, the publication on Habré was chosen as the starting point in the study of the ConcurrentDictionary device. The principle of its work is described simply and clearly, for which a special thank you to the author.

It seemed to me that one moment in the publication was not fully covered and I decided to fill this gap.

Thread-safe collections are designed for use in a multi-threaded environment and should be able to change at any time. Accordingly, they can be changed even at the time of their search. From here I had a question, if during the iteration the collection is changed, will the iterator see these changes?

Refer to the article above:
GetEnumerator - can return old values ​​in case changes were made by another thread after calling the method and how the iterator passed this element.

Well, it is quite logical that changes to the elements that the iterator has already passed will not be taken into account when searching the collection. And what will happen if you change the element to which the iterator has not “reached” yet or if you insert a new element into the collection?
Referring to MSDN (the Russian translation of this note is not very well done, so I will also insert a note in the original language):
The enumerator returned from the dictionary is safe to use simultaneously with reading from the dictionary and writing to it, however, it does not represent a snapshot of the dictionary. Content accessible through an enumerator may contain changes made to the dictionary after calling GetEnumerator.
')
It doesn’t represent a moment-in-time snapshot of the dictionary. This is an enumerator.

I, as a person with technical education, are confused by the wording “may contain”. Those. may contain, and may not contain? Let's check:

ConcurrentDictionary<int, string> dictionary = new ConcurrentDictionary<int, string>(); dictionary.TryAdd(0, "item0"); int x = 1; foreach (var element in dictionary) { var tmp = x++; if (!dictionary.TryAdd(tmp, "item" + tmp.ToString())) { throw new Exception("   "); } Console.WriteLine(element); } 

What will be displayed in the console? Will one element or program go into an infinite loop? Neither one nor the other. The following will be displayed:

[0, item0]
[1, item1]
[2, item2]
[3, item3]
[4, item4]
[5, item5]
[6, item6]
[7, item7]
[8, item8]
[9, item9]
[10, item10]
[11, item11]
[12, item12]
[13, item13]
[14, item14]
[15, item15]
[16, item16]

There were no exceptions, respectively, the 18th element was successfully inserted into the collection, but the iterator did not see it, why?

Let's take a look at the sources of this collection, namely the implementation of the GetEnumerator method:

 public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() { Node[] buckets = m_tables.m_buckets; for (int i = 0; i < buckets.Length; i++) { // The Volatile.Read ensures that the load of the fields of 'current' doesn't move before the load from buckets[i]. Node current = Volatile.Read<Node>(ref buckets[i]); while (current != null) { yield return new KeyValuePair<TKey, TValue>(current.m_key, current.m_value); current = current.m_next; } } } 

The m_tables field is marked with the volatile keyword, so a change in the Node [] m_buckets array it contains is visible to all threads. Each element of this array represents the first element in a single-linked list and contains a link to the next element in the list. Further, it is easy to guess that as long as adding / changing elements leads to a change in the simply linked lists themselves, the iterator "sees" these changes, but the changes in the array itself are not visible to the iterator.

The m_buckets array is changed in two cases. The first is to increase the size when inserting elements, the second is to call the Clear () method (resets the size of the array to its default value).
Update:
In order to answer the question when the size of the m_buckets array increases, I will make a small remark about the internal structure of ConcurrentDictionary.
To provide locks for the operations Add / Modify / Remove an item from the collection, there is an array object [] m_locks, its default size is 4 * number of processors (with each increase of m_tables.m_buckets, the size of the array with objects for locks is doubled).
There is also an int m_budget field that defines the maximum number of elements per lock. It is calculated as follows: m_buckets.Length / m_locks.Length.
The number of items for each lock is contained in the int [] m_countPerLock field, which is incremented when a new item is added to the linked list, and decremented when the item is removed from the list.
Now back to the m_buckets array expansion condition. It increases after the condition tables.m_countPerLock [lockNo]> m_budget, i.e. when the number of elements per lock exceeds the maximum allowed number. It should be noted that this check occurs at the end of the element's insert method, and the size of the internal m_buckets collection will be changed after the current item has been inserted into it.
In my example: I have 4 processors, respectively, the size of the array is m_locks = 16, and m_budget = 31/16 = 1. As soon as we insert the 17th element, now there are 2 elements for one lock and the collection expands.
/ Update
The Update and Remove operations do not change the size of the array and, accordingly, these changes will always be visible to the iterator (of course, if we are talking about changing the element to which the iterator has not “reached” yet).

Conclusion

In spite of the fact that we now know when changes made during collection iteration are visible, and when not, you should not take this knowledge into account when programming with ConcurrentDictionary. It is best to adhere to the rule described on MSDN that the changes made may or may not be visible.

Source: https://habr.com/ru/post/242281/


All Articles