HashMap underlying principle implementation source code analysis


***

summary

HashMap is implemented based on the Map interface. Elements are stored in the form of key value pairs, and null values are allowed to be added. Because keys are not allowed to be repeated, only one key can be null. In addition, HashMap cannot guarantee the order in which elements are placed. It is disordered and cannot be the same as the order in which they are placed. HashMap is thread unsafe

Storage structure of HashMap

JDK 7 and previous versions: HashMap is an array + linked list structure (i.e. chain address method), and each node is an Entry [] object
After the release of JDK version 8: HashMap is implemented by array + linked list + red black tree. Each Node is a Node object

Important constants in HashMap source code

DEFAULT_ INITIAL_ Capability: default capacity of HashMap, 16
MAXIMUM_ Capability: maximum supported capacity of HashMap, 2 ^ 30
DEFAULT_LOAD_FACTOR: default load factor for HashMap
TREEIFY_THRESHOLD: if the length of the linked list in the Bucket is greater than the default value, it will be converted into a red black tree
UNTREEIFY_THRESHOLD: if the Node stored in the red black tree in the Bucket is less than the default value, it is converted into a linked list
MIN_ TREEIFY_ Capability: the minimum hash table capacity when the nodes in the bucket are trealized. (when the number of nodes in the bucket is so large that the tree needs to be reddened and black, if the capacity of the hash table is less than min_tree_capability, the resize expansion operation should be performed at this time. The value of min_tree_capability is at least 4 times that of tree_threshold.)
table: an array of storage elements, always to the nth power of 2
entrySet: a set that stores concrete elements
size: the number of key value pairs stored in the HashMap
modCount: the number of HashMap expansion and structure changes.
threshold: critical value of expansion, = capacity * filling factor
loadFactor: fill factor

//The default initialization capacity is 16, which must be the nth power of 2
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; //  16

//The maximum capacity is 2 ^ 30, a large number
static final int MAXIMUM_CAPACITY = 1 << 30;

//The default loading factor is 0.75, which is the value obtained by multiplying the array capacity. It is used to indicate how many elements need to be expanded when the number of elements reaches.
//Why set the value of 0.75? It is simply a trade-off between time and space.
//If it is less than 0.5, when the array length reaches half the size, it needs to be expanded, and the space utilization rate is greatly reduced,
//If it is greater than 0.5, such as 1, it will increase the probability of hash conflict and affect the query efficiency.
static final float DEFAULT_LOAD_FACTOR = 0.75f;

//Just mentioned that when the length of the linked list is too long, there will be a threshold. If it exceeds this threshold 8, it will be transformed into a red black tree
static final int TREEIFY_THRESHOLD = 8;

//When the number of elements on the red black tree is reduced to 6, it degenerates into a linked list
static final int UNTREEIFY_THRESHOLD = 6;

//To convert a linked list into a red black tree, in addition to the threshold limit, there is another limit. Only when the array capacity reaches at least 64 can it be trealized.
//This is to avoid the conflict between array expansion and treelization threshold.
static final int MIN_TREEIFY_CAPACITY = 64;

//Array for storing all nodes, main array
transient Node<K,V>[] table;

//Store all key value pairs
transient Set<Map.Entry<K,V>> entrySet;

//The number of actual key value pairs in the map, that is, the number of elements in the array
transient int size;

//Array expansion threshold
int threshold;

//Loading factor
final float loadFactor;

Inheritance relationship

public class HashMap<K,V>extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable

In fact, there is a little conscious that the inheritance relationship here is a little redundant. The Map interface is implemented in the parent class AbstractMap of HashMap. As a result, the Map interface is implemented again in HashMap. Repeat. You can talk to the interviewer about this

constructor

//Default parameterless construction, specifying a default load factor
public HashMap() {
	this.loadFactor = DEFAULT_LOAD_FACTOR; 
}

//The parametric structure of capacity can be specified, but it should be noted that the current capacity we specify is not necessarily the actual capacity
public HashMap(int initialCapacity) {
	//The default load factor is also used
	this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

//The capacity and loading factor can be specified, but the author does not recommend manually specifying a loading factor other than 0.75
public HashMap(int initialCapacity, float loadFactor) {
	if (initialCapacity < 0)
		throw new IllegalArgumentException("Illegal initial capacity: " +
										   initialCapacity);
	if (initialCapacity > MAXIMUM_CAPACITY)
		initialCapacity = MAXIMUM_CAPACITY;
	if (loadFactor <= 0 || Float.isNaN(loadFactor))
		throw new IllegalArgumentException("Illegal load factor: " +
										   loadFactor);
	this.loadFactor = loadFactor;
	//Here is to change the capacity we specify to a value greater than its minimum power of 2. If the capacity passed is 28, 32 will be returned
	//Note here that it is reasonable to assign the returned value to capacity, that is, to ensure that the array capacity is always the nth power of 2
	this.threshold = tableSizeFor(initialCapacity);
}

//You can pass in an existing map
public HashMap(Map<? extends K, ? extends V> m) {
	this.loadFactor = DEFAULT_LOAD_FACTOR;
	putMapEntries(m, false);
}

HashMap loading factor, load factor, why is the loading factor 0.75

The loading factor is set to 1: the space utilization is greatly satisfied. It is easy to collide and generate linked lists, resulting in low query efficiency
The loading factor is set to 0.5: low probability of collision, low probability of capacity expansion, low probability of generating linked list, high query efficiency and low space utilization

Why must the length of HashMap be 2^n

  1. H & (length-1) equivalent h%length operation. The premise of equivalence is that length must be an integer multiple of 2
  2. Prevent hash conflict and location conflict

Differences between HashMap JDK7 and JDK8

1. HashMap = new hashmap() in jdk8// By default, arrays with a length of 16 are not created first
2. When map.put() is called for the first time, an array with a length of 16 is created. JDK7 is created directly
3.JDK8 array is of Node type, which is called Entry type in jdk7
4. When forming the linked list structure, the newly added key value pair is at the tail of the linked list (seven up and eight down). JDK7 is the head interpolation method and JDK8 is the tail interpolation method
5. When the length of the linked list at the specified index position of the array is > 8 and the length of the array in the map is > 64, all key value pairs at this index position are stored using a red black tree.

Keywords: Java Interview Singleton pattern

Added by zman on Sat, 20 Nov 2021 08:45:13 +0200