ConcurrentMap_Crude
. For brevity, we will call it the Crude map. In this post I will explain the difference between the Crude map and the Linear map from the Junction library. Linear is the simplest map in Junction, supporting both resizing and deletion.m_cells
pointer and the m_cells
integer.m_root
, which points to the Table
structure, followed by the cells themselves as a single continuous block of memory.Table
structure stores a new shared cellsRemaining
counter, with an initial value of 75% of the table size. Each time a thread tries to add a new key, it first reduces cellsRemaining
. If cellsRemaining
becomes less than zero, then the table is full and it's time to transfer data to a new one.sizeMask
and cellsRemaining
in one atomic step, simply by redirecting the m_root
pointer.assign(2, "apple")
, reduces cellsRemaining
to 0.assign(14, "peach")
and reduces cellsRemaining
to -1. Migration required.assign(2, "banana")
on the old table. Since the cell already exists for this key, the function does not reduce cellsRemaining
. She will simply replace “apple” with “banana” in the old cell.m_root
pointer, thereby m_root
the changes in stream 1.get(2)
on the new table.get(2)
return “banana”, because the key was changed only by one thread, and this was the last value recorded. Unfortunately, get(2)
returns the old value of “apple”, which is wrong. So we need a different strategy.get
at all.value
field of each cell is replaced with a special Redirect
marker.assign
function can no longer blindly change the value
field of a cell. Instead, it must perform a read-modify-write ( read-modify-write ) operation on value
to avoid rewriting the Redirect
token if it has been set. If the function finds a Redirect
token corresponding to value
, then the new table already exists and the operation must be performed on it already.Redirect
for the old cell and at the same time copy its old value to the new cell. However, we can ensure the integrity of the data by transferring each value in the loop. Linear map uses the following loop:value
immediately after the stream responsible for the migration reads it, since the Redirect
marker has not yet been set. In this case, when the stream transferring the data tries to set the Redirect
marker using CAS , CAS returns an error, and the stream tries to perform the operation again, already with a new value. As long as the value
in the source table changes, the data transfer stream will continue to try, but sooner or later the transfer will be successful. This approach allows multithreaded get calls to safely find values in a new table, while parallel assign
calls cannot modify a new table until the migration is complete (Cliff Click hash map has no such limit, so there are a few more steps in its migration cycle).get
calls do not read from the new table until all data has been transferred. Therefore, in the latest version of the cycle is not necessary; Migration could be accomplished by performing an atomic exchange of value
in the source table and then simply saving the value
in the new table (I realized this when I wrote this post). Now, if the get
call bumps into Redirect
, it will help complete the migration. Perhaps if he immediately read the value from the new table instead, our solution would be more scalable. This is a topic for further research.Table
structure that I haven't mentioned yet. One of them is a jobCoordinator
. During data migration, a jobCoordinator
points to a TableMigration
object that reflects the migration process. A new table is stored in it until it is shared with m_root
. I will not go into details, but the jobCoordinator
allows several threads to participate in the migration process.TableMigration
object. That is why each Table
object has a mutex (Cliff's Click clipping is also different here. It optimistically allows the flow of threads to create new tables).erase
function, because it is simple: it simply replaces the cell value
with a special NullValue
value, which we have already used to initialize it. The hash
field, however, remains unchanged. This means that, sooner or later, the table may become overflowed with deleted cells, but when moving to a new table, they will be cleared. There are also a few subtleties regarding the choice of the dimension of the new table, but for now I’ll omit these details.Oh, and come to work with us? :)wunderfund.io is a young foundation that deals with high-frequency algorithmic trading . High-frequency trading is a continuous competition of the best programmers and mathematicians of the whole world. By joining us, you will become part of this fascinating fight.
We offer interesting and challenging data analysis and low latency tasks for enthusiastic researchers and programmers. Flexible schedule and no bureaucracy, decisions are quickly made and implemented.
Join our team: wunderfund.io
Source: https://habr.com/ru/post/324060/
All Articles