Greetings to you!
I keep trying to visualize data structures in Java. In previous episodes, we’ve already reviewed
ArrayList and
LinkedList , but today we’ll look at HashMap.

')
HashMap - is based on hash tables, implements the Map interface (which means storing data in the form of key / value pairs). Keys and values can be of any type, including null. This implementation does not guarantee the order of the elements over time. Collision resolution is performed using the chaining method .Object creation
Map<String, String> hashmap = new HashMap<String, String>();
Footprint{Objects=2, References=20, Primitives=[int x 3, float]}
Object size: 120 bytes
The newly appeared hashmap object contains a number of properties:
- table - An array of type Entry [] , which is a repository of references to lists (chains) of values;
- loadFactor - load factor. The default value of 0.75 is a good compromise between the access time and the amount of stored data;
- threshold - The limit on the number of elements that, when reached, will double the size of the hash table. Calculated by the formula (capacity * loadFactor) ;
- size - The number of HashMap elements;
In the constructor, the validity of the passed parameters is checked and the values are set to the appropriate class properties. In short, nothing unusual.

You can specify your capacity and load factor using the
HashMap (capacity) and
HashMap (capacity, loadFactor) constructors . The maximum capacity you can set is half the maximum
int value (1073741824).
Adding items
hashmap.put("0", "zero");
Footprint{Objects=7, References=25, Primitives=[int x 10, char x 5, float]}
Object size: 232 bytes
When adding an item, the sequence of steps is as follows:
- First, the key is checked for null equality. If this check returns true, the putForNullKey (value) method will be called ( we will consider the option with adding a null key a little later ).
- Next, a hash is generated based on the key. For generation, the hash (hashCode) method is used, to which key.hashCode () is passed .
static int hash(int h) { h ^= (h >>> 20) ^ (h >>> 12); return h ^ (h >>> 7) ^ (h >>> 4); }
A comment from the source explains which results to expect - the hash (key) method ensures that the resulting hash codes will have only a limited number of collisions (approximately 8, with the default load factor).
In my case, for the key with the value “0” the hashCode () method returned the value 48, as a result:
h ^ (h >>> 20) ^ (h >>> 12) = 48 h ^ (h >>> 7) ^ (h >>> 4) = 51
- Using the indexFor (hash, tableLength) method , the position in the array where the element will be placed is determined.
static int indexFor(int h, int length) { return h & (length - 1); }
With the hash value of 51 and the size of table 16 , we get the index in the array:
h & (length - 1) = 3
- Now, knowing the index in the array, we get a list (chain) of elements attached to this cell. The hash and key of the new item are compared in turn with the hashes and keys of the items from the list and, if these parameters match, the value of the item is overwritten.
if (e.hash == hash && (e.key == key || key.equals(e.key))) { V oldValue = e.value; e.value = value; return oldValue; }
- If the previous step did not reveal matches, the addEntry method (hash, key, value, index) will be called to add a new item.
void addEntry(int hash, K key, V value, int index) { Entry<K, V> e = table[index]; table[index] = new Entry<K, V>(hash, key, value, e); ... }

To demonstrate how the HashMap is populated, let's add some more elements.
hashmap.put("key", "one");
Footprint{Objects=12, References=30, Primitives=[int x 17, char x 11, float]}
Object size: 352 bytes
- Skipped, key is not null
- '' key ''. hashCode () = 106079
h ^ (h >>> 20) ^ (h >>> 12) = 106054 h ^ (h >>> 7) ^ (h >>> 4) = 99486
- Determining the position in the array
h & (length - 1) = 14
- No items found
- Add item

hashmap.put(null, null);
Footprint{Objects=13, References=33, Primitives=[int x 18, char x 11, float]}
Object size: 376 bytes
As mentioned above, if you added null as a key when adding an item, the actions will be different. The
putForNullKey (value) method will be called, inside which there is no call to the
hash () and
indexFor () methods (because all items with null keys are always placed in
table [0] ), but there are such actions:
- All chaining elements bound to table [0] are alternately scanned for the element with the null key. If such an element in the chain exists, its value is overwritten.
- If the element with the null key was not found, the already familiar addEntry () method will be called.
addEntry(0, null, value, 0);

hashmap.put("idx", "two");
Footprint{Objects=18, References=38, Primitives=[int x 25, char x 17, float]}
Object size: 496 bytes
Now consider the case when a collision occurs when adding an element.
- Skipped, key is not null
- '' idx ''. hashCode () = 104125
h ^ (h >>> 20) ^ (h >>> 12) = 104100 h ^ (h >>> 7) ^ (h >>> 4) = 101603
- Determining the position in the array
h & (length - 1) = 3
- No items found
- Add item

Resize and Transfer
When the
table [] array is filled to the limit value, its size is doubled and the elements are redistributed. As you can see for yourself, there is nothing difficult in the
resize (capacity) and
transfer (newTable) methods .
void resize(int newCapacity) { if (table.length == MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return; } Entry[] newTable = new Entry[newCapacity]; transfer(newTable); table = newTable; threshold = (int)(newCapacity * loadFactor); }
The
transfer () method
enumerates all elements of the current storage, recalculates their indices (taking into account the new size) and redistributes the elements over the new array.
If
we add, say, 15 more elements to the original
hashmap , then the size will be increased and the distribution of elements will change.

Deleting items
HashMap has the same “problem” as ArrayList - when deleting elements, the size of the array
table [] does not decrease. And if the
trimToSize () method is provided in the ArrayList, then there are no such methods in the HashMap (although, as one colleague said, “
Or maybe it is not necessary? ”).
A small test to demonstrate what is written above. The source object is 496 bytes. Add, for example, 150 elements.
Footprint{Objects=768, References=1028, Primitives=[int x 1075, char x 2201, float]}
Object size: 21064 bytes
Now remove the same 150 items, and measure again.
Footprint{Objects=18, References=278, Primitives=[int x 25, char x 17, float]}
Object size: 1456 bytes
As you can see, the size did not even come close to the original. If there is a desire / need to correct the situation, you can, for example, use the
HashMap (Map) designer.
hashmap = new HashMap<String, String>(hashmap);
Footprint{Objects=18, References=38, Primitives=[int x 25, char x 17, float]}
Object size: 496 bytes
Iterators
HashMap has built-in iterators such that you can get a list of all
keySet () keys, all
values () values, or all key / value
entrySet () pairs. Below are some options for iterating over elements:
It is worth remembering that if during the work of the HashMap iterator was changed (without using the iterator’s own methods), then the result of iterating over the elements will be unpredictable.
Results
- Adding an element is performed in O (1) time, because new elements are inserted at the beginning of the chain;
- The operations of obtaining and deleting an element can be performed in O (1) time, if the hash function evenly distributes the elements and there are no collisions. The mean running time will be Θ (1 + α), where α is the load factor. In the worst case, the execution time can be Θ (n) (all elements in one chain);
- Keys and values can be of any type, including null. For storage of primitive types, the corresponding wrapper classes are used;
- Not synchronized.
Links
HashMap sourceHashMap source from JDK7
Sources JDK
OpenJDK & trade 6 Source Release - Build b23Measurement tools -
memory-measurer and
Guava (Google Core Libraries).