Original website: Java--HashMap principle -- capacity expansion mechanism_ CSDN blog
brief introduction
This article introduces how to expand the capacity of Java HashMap.
When to expand capacity
HashMap is lazy loading. After the HashMap object is constructed, if put is not used to insert elements, HashMap will not initialize or expand the table.
When calling the put method for the first time, HashMap will find that table is empty and then call resize method to initialize it.
When the put method is not called for the first time, if HashMap finds that the size (array size) is greater than the threshold (the size of the current array multiplied by the value of the loading factor), it will call the resize method to expand the capacity.
The array cannot be expanded automatically, so we can only change a larger array to fill the previous elements and the new elements to be added.
resize() overview
- Judge whether the capacity of the old array before capacity expansion has reached the maximum (2 ^ 30)
- If it is reached, the threshold will be changed to the maximum value of Integer (2 ^ 31 - 1), and the capacity will not be expanded in the future.
- If not, modify the size of the array to twice the original size
- Create a new array with the new array size (node < K, V > [])
- Transfer the data to a new array (node < K, V > [])
- Not all nodes have to change positions.
- For example, the size of the original array is 16 and 32 After capacity expansion. If there are two data with hash values of 1 and 17, their remainder of 16 is 1, which is in the same bucket; After capacity expansion, the surplus of 1 pair of 32 is still 1, while the surplus of 17 pair of 32 is 17, which needs to be changed.
- The corresponding code is: if ((e.hash & oldcap) = = 0). If it is true, there is no need to change the location.
- For example, the size of the original array is 16 and 32 After capacity expansion. If there are two data with hash values of 1 and 17, their remainder of 16 is 1, which is in the same bucket; After capacity expansion, the surplus of 1 pair of 32 is still 1, while the surplus of 17 pair of 32 is 17, which needs to be changed.
- Not all nodes have to change positions.
- Returns a new node < K, V > [] array
class | Initial capacity | Maximum capacity | Multiple during capacity expansion | Loading factor | Bottom implementation |
HashMap | 2^4 | 2^30 | n * 2 | 0.75 | Map.Entry |
HashSet | 2^4 | 2^30 | n * 2 | 0.75 | HashMap<E,Object> |
HashTable | 11 | Integer.MAX_VALUE - 8 | n*2 + 1 | 0.75 | Hashtable.Entry |
In HashMap, the length and size of the hash bucket array table must be the nth power of 2 (non prime), which is an unconventional design. The conventional design is to design the bucket size as prime. Relatively speaking, the probability of conflict caused by prime numbers is less than that of non prime numbers. The specific proof can be referred to http://blog.csdn.net/liuqiyao_01/article/details/14475159 , the initial bucket size of Hashtable is 11, which is an application in which the bucket size is designed as a prime number (it cannot be guaranteed to be a prime number after Hashtable capacity expansion).
HashMap adopts this unconventional design, which is mainly to optimize the modulus and capacity expansion. At the same time, in order to reduce the conflict, HashMap also adds the process of high-order participation in the operation when locating the index position of hash bucket.
Source code
HashMap#resize()
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4 static final int MAXIMUM_CAPACITY = 1 << 30; static final float DEFAULT_LOAD_FACTOR = 0.75f; transient Node<K,V>[] table; int threshold; final float loadFactor; final Node<K,V>[] resize() { Node<K,V>[] oldTab = table; int oldCap = (oldTab == null) ? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0; if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; // double threshold } else if (oldThr > 0) // initial capacity was placed in threshold newCap = oldThr; else { // zero initial threshold signifies using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } if (newThr == 0) { float ft = (float)newCap * loadFactor; newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); } threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab; if (oldTab != null) { for (int j = 0; j < oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) != null) { oldTab[j] = null; if (e.next == null) newTab[e.hash & (newCap - 1)] = e; else if (e instanceof TreeNode) // Red black tree ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); else { // preserve order Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { // Key point 1: judge whether the node needs to change its position in the array after resizing if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) != null); // Key point 2: split and reorganize the linked list in a node into two linked lists: one needs to change the position, and the other does not need to change the position if (loTail != null) { loTail.next = null; newTab[j] = loHead; } if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } return newTab; }
Other websites
HashMap's capacity expansion mechanism -- resize()_ Data structure and algorithm_ Pan Jiannan's blog - CSDN blog
HashMap expansion mechanism - brief book
HashMap's capacity expansion mechanism ------ resize()_java_IM_MESSI blog - CSDN blog
hashMap capacity expansion mechanism_ java_mengyue000 blog - CSDN blog