📜 ⬆️ ⬇️

Swift dictionary implementation

image

Swift dictionary is a container that stores multiple values ​​of the same type. Each value is associated with a unique key that acts as an identifier for this value within the dictionary. Unlike elements in an array, elements in the dictionary do not have a specific order. Use a dictionary when you need to search for values ​​based on their identifier, just like in the real world a dictionary is used to look up the definition of a specific word. (note)

Swift dictionary:
')

There are several ways to store data records (keys, values), one of which involves open addressing by means of linear probing required to run the Swift dictionary.

Consider an example of a dictionary designed for 8 elements: it provides a maximum of 7 entries (keys, values) and at least one empty space (the so-called space) in the dictionary buffer, thanks to which a sort of blocking of the search by selections / inserts (retrivals / insertions).

In memory, it will look like this:

image

Permanent memory is given by:

size = capacity * (sizeof(Bitmap) + sizeof(Keys) + sizeof(Values))

If we systematize the data logically, we get something like:

image

Each column forms a memory segment in which three parameters are fixed: bitmap value, key and value.

The bitmap value in the segment indicates whether the key and the corresponding value are valid and determined. If the data is invalid / uninitialized, the segment is called a space.

_HashedContainerStorageHeader (struct)
image

This structure acts as a header for the future memory buffer. It contains the capacity and information about the volume, where the capacity is the actual capacity of the allocated memory, and the information about the volume is the number of actual elements currently in the buffer.

maxLoadFactorInverse is a coefficient that allows the buffer to be expanded if necessary.

_NativeDictionaryStorageImpl<Key, Value> (class)


Subclass of ManagedBuffer<_HashedContainerStorageHeader, UInt8>

The main task of this class is to allocate memory that is optimal for the dictionary and create pointers to bitmap, keys and values, which will then need to be processed. For example:

 internal var _values: UnsafeMutablePointer<Value> { @warn_unused_result get { let start = UInt(Builtin.ptrtoint_Word(_keys._rawValue)) &+ UInt(_capacity) &* UInt(strideof(Key.self)) let alignment = UInt(alignof(Value)) let alignMask = alignment &- UInt(1) return UnsafeMutablePointer<Value>( bitPattern:(start &+ alignMask) & ~alignMask) } } 

Since the memory Bitmap, keys and values ​​are adjacent, the values ​​for pointers will be:

 internal class func create(capacity: Int) -> StorageImpl { let requiredCapacity = bytesForBitMap(capacity) + bytesForKeys(capacity) + bytesForValues(capacity) let r = super.create(requiredCapacity) { _ in return _HashedContainerStorageHeader(capacity: capacity) } let storage = r as! StorageImpl let initializedEntries = _BitMap( storage: storage._initializedHashtableEntriesBitMapStorage, bitCount: capacity) initializedEntries.initializeToZero() return storage } 

In the process of creating a buffer, all entries in bitmap are equated to zero, that is, all memory segments, in fact, are spaces. It is also worth noting that for such a method of storing information, the general key type of this class does not necessarily have to be Hashable.

// : , ,
// (: Hashable) -
// , Hashable.


image

_NativeDictionaryStorage<Key : Hashable, Value> ()


This structure captures the stored information in the buffer and selects a function that converts the adjacent allocated memory into an array of a logical segment. Also note that the end of the segment array is associated with the beginning of the array, thereby forming a logical ring. And that means if you are at the end of the segment array and are requesting the next function, in the end, you will be redirected to the beginning of the array.

In order to insert or set a value using a key, it is necessary to find out which segment this key belongs to. Since the key is Hashable, the hashValue function is suitable, the result of which is Int. True, the specified hashValue may exceed the capacity of the dictionary buffer, and therefore, you may have to shrink it from zero to the maximum value of the capacity:

 @warn_unused_result internal func _bucket(k: Key) -> Int { return _squeezeHashValue(k.hashValue, 0..<capacity) } 

We can move between segments by accessing the _next and _prev . Capacity is always a number that can be expressed in binary. If we are at the end of the segment array, if desired, the rollback to the initial position is specified via the _next call.

 internal var _bucketMask: Int { // The capacity is not negative, therefore subtracting 1 will not overflow. return capacity &- 1 } @warn_unused_result internal func _next(bucket: Int) -> Int { // Bucket is within 0 and capacity. Therefore adding 1 does not overflow. return (bucket &+ 1) & _bucketMask } @warn_unused_result internal func _prev(bucket: Int) -> Int { // Bucket is not negative. Therefore subtracting 1 does not overflow. return (bucket &- 1) & _bucketMask } 

To add a new item, we need to find out which segment it belongs to and then call a function that inserts the corresponding key and value into the selected segment, as well as setting the actual parameters for writing bitmap.

 @_transparent internal func initializeKey(k: Key, value v: Value, at i: Int) { _sanityCheck(!isInitializedEntry(i)) (keys + i).initialize(k) (values + i).initialize(v) initializedEntries[i] = true _fixLifetime(self) } 


_find function




 ///        . /// ///   ,      /// . @warn_unused_result internal func _find(key: Key, _ startBucket: Int) -> (pos: Index, found: Bool) { var bucket = startBucket // The invariant guarantees there's always a hole, so we just loop // until we find one while true { let isHole = !isInitializedEntry(bucket) if isHole { return (Index(nativeStorage: self, offset: bucket), false) } if keyAt(bucket) == key { return (Index(nativeStorage: self, offset: bucket), true) } bucket = _next(bucket) } } 



That is why, when analyzing the values ​​of the hash, Hashable acts as an almost ideal function that guarantees high performance.



Visually we get, approximately, the following:

image

_NativeDictionaryStorageOwner (class)


This class is designed to track the correct counting of footnotes for copies made during the recording process. Since the data is saved using the Dictionary and DictionaryIndex options, the link count will assign the entered information to 2, but if the dictionary belongs to the above class and the number of links for it is monitored, you have nothing to worry about.

/// This class is the launch ID COW. is he
/// is needed solely to separate unique reference counters for:
/// - `Dictionary` and` NSDictionary`,
/// - `DictionaryIndex`.
///
/// This is important because the uniqueness check for COW is carried out exclusively
/// to identify the counters of the first type.

So it does not look very nice:

image

_VariantDictionaryStorage<Key : Hashable, Value> ( )


This enumerated type works with two Native / Cocoa statements depending on the data stored in them.

 case Native(_NativeDictionaryStorageOwner<Key, Value>) case Cocoa(_CocoaDictionaryStorage) 

Main characteristics:



 /// -  idealBucket:  ,    . /// -  offset:  ,   . ///    . internal mutating func nativeDeleteImpl( nativeStorage: NativeStorage, idealBucket: Int, offset: Int ) { _sanityCheck( nativeStorage.isInitializedEntry(offset), "expected initialized entry") //   nativeStorage.destroyEntryAt(offset) nativeStorage.count -= 1 //       , //         . var hole = offset //      var start = idealBucket while nativeStorage.isInitializedEntry(nativeStorage._prev(start)) { start = nativeStorage._prev(start) } //      var lastInChain = hole var b = nativeStorage._next(lastInChain) while nativeStorage.isInitializedEntry(b) { lastInChain = b b = nativeStorage._next(b) } //  ,      ,     , //        . while hole != lastInChain { //     ,        // ,     . var b = lastInChain while b != hole { let idealBucket = nativeStorage._bucket(nativeStorage.keyAt(b)) //      ?   //         [, ] //    let c0 = idealBucket >= start let c1 = idealBucket <= hole if start <= hole ? (c0 && c1) : (c0 || c1) { break // Found it } b = nativeStorage._prev(b) } if b == hole { // No out-of-place elements found; we're done adjusting break } //       nativeStorage.moveInitializeFrom(nativeStorage, at: b, toEntryAt: hole) hole = b } } 


image

And this is what our dictionary looks like.


(just one type of listed data)
image

Source: https://habr.com/ru/post/275911/


All Articles