Analyze and understand jdk1 put() method of HashMap, version 8 Java collection

HashMap is the most commonly used key value collection in java. It is necessary to understand its principle and underlying code. Today, record the of HashMap Research and analysis of put() method (element addition method);

First, let's talk about the results of personal research and analysis:

  1. During instance initialization, HashMap does not initialize the container for storing elements (red black tree of index group linked list in version 1.8 and array plus linked list in version 1.7), but assigns values to relevant attributes according to the transmission parameters. The real initialization of the container is implemented when the put() method is called. Initialization parameter passing explanation: initialCapacity is related to the initial value of the container. If no parameter is specified, the default value of the initial value of the container is default_ INITIAL_ Capability = 16. If the parameter is transferred, the value greater than or equal to the power of the minimum 2 of the parameter value is used. For example, if the value is 5, the container initialization value is the power of 2, that is, 8. In addition, the size of the external value is limited and cannot be greater than MAXIMUM_CAPACITY = 1 < < 30. If the current value is exceeded, change the value of initialCapacity to maximum_ The value of capability. loadFactor is related to capacity expansion. If no transfer parameter is specified, the default value is DEFAULT_LOAD_FACTOR=0.75f.
  2. Through the put() method, it is found that the data structure stored in the HashMap element is an array. In case of hash conflict, it will become an array linked list. By default, when the length of the linked list is greater than or equal to 8, the linked list will be turned into a red black tree. Verified jdk1 8 the HashMap element storage structure is an array plus linked list + red black tree.
  3. The put() method has a return value. The return value returns different values according to different situations. When the key of the added element does not exist in the container, the element is returned as empty. When the key of the added element exists in the container, the element in the container is replaced and the original old value element in the container is returned

From new HashMap to call put() method to analyze the underlying source code and verify the above conclusion:

       //Instantiate a HashMap. In the IDEA editor, click the HashMap with Ctrl + left mouse button to view the source code
        HashMap<String,String> s=new HashMap();
        //Add two elements with the same key and different value
        Object put = s.put("key key","First value" );
        Object put1 = s.put("key key","Second value");
        System.out.println(put);
        System.out.println(put1);
        //The output result put = null. The key and value of put1 are "key" and "first value" respectively

Analyze the relevant source code of HashMap constructor:

 //Parameter structure, and input the initialization capacity value and loading factor ****** 1 respectively
 public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);//Sets the initial container size based on the passed in constant value
  }

//Parameter structure incoming initialization capacity *********** 2     
  public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);//The 1 construct is called internally, and the load factor uses the default value
  }
//Null parameter construction ********** 3 all default values
  public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
//A parameterized construct is passed into another map set (for the time being, the current constructor is analyzed more, and the principle is the same)
  public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;//The load factor uses the default value and initializes the capacity value (m.size()/loadFactor)+1.0f 
        putMapEntries(m, false);//Put all the other map collection elements into the current map
    }

Through the above source code, we can find that when the hashMap instance is initialized, the size of the container is not initialized, but the relevant attributes are assigned. Moreover, the initialization capacity value we passed in is not directly assigned to the initialization capacity related attribute threshold. Instead, we call a method called tableSizeFor(int cap) to pass in the initialCapacity value for processing, and then return a value to threshold Looking at the source code of this method, we will find that the value greater than or equal to the power of the minimum 2 of the parameter initialCapacity is obtained through multiple displacements and bitwise or operations.

The source code of tableSizeFor(int cap) is as follows:

   /**
     * Returns a power of two size for the given target capacity(This is an admirable algorithm to obtain the least quadratic number of parameters
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;//Subtracting 1 here is to prevent cap itself from being the power of 2. For example, if cap is 4, the following displacement operation will get the wrong value of 8 This is wrong. Because 4 itself is the power of 2
        n |= n >>> 1;//This code means to shift the unsigned right by one bit, and then perform bitwise OR operation with the original to get a new n value. For example, if the original n is 3 and the binary representation of the computer is 0000 0011, moving one bit to the right will become 0000 0001. This is the two bits or the result is 0000 0011. When converted to decimal, it is 3
        n |= n >>> 2;//The same is true below
        n |= n >>> 4;//The same is true below
        n |= n >>> 8;//The same is true below
        n |= n >>> 16;//The same is true below
        //Use the ternary operator to judge that when the obtained n is less than 0, it will return 1. If it is greater than the maximum default value, it will return the maximum value. If not, it will return n + 1 (that is, the least quadratic number greater than or equal to cap, that is, the least quadratic number greater than or equal to the value of initialCapacity you initialized)
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

Analyze the source code of the put() method:

Click on the source code of put () method, and we will find that only putVal() method and hash(key) method are called internally, and the return value of concurrent putVal() method is returned. II. hash(key), as the name suggests, only hash the key, that is, the real method to add elements is the putVal() method.

 public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

So let's simply analyze the hash(key) method, and focus on the putVal() method

//When the key is null and returns 0, if the key is not empty, the value returned by the hash method of the key is the real hashcode of the key. The advantage of this is that it will make the obtained subscripts more hash and help reduce hash conflicts
static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
/**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
//Here, assign table to tab and judge whether it is empty. If tab is not empty, assign the length of tab to n and judge Here table refers to a container The purpose here is to see if the put method is called for the first time. If so, call the resize () method to resize the container to complete the initialization of the container (at this time, the container wants to be a one-dimensional array) and obtain the length of the elements in the container and assign it to n. The resize () method will be described later.
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
//Obtain the maximum subscript of the current tab array and the current incoming element hash(key) for bitwise sum operation. Judge whether the i of tab is the existence of an element (the position where the new element should be stored is obtained through the sum operation of (n - 1) & hash). If there is no element in this position, a new node will pass in the current element, otherwise, the next step
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
//Judge whether the hash value of p element, that is, tab [i = (n - 1) & hash] element, the hash value of the currently inserted element and their keys are consistent. If they are consistent, assign the value of p to e for later element replacement (solve the case where the key is the same but the value is different)
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
//It is used to deal with the situation that a red black tree has been generated
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
//If the hash value of the p element, that is, the tab [i = (n - 1) & hash] element, is inconsistent with the hash value of the currently inserted element and their key s, a linked list form is generated.
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {//The next element of p element is assigned to e, and judge whether it is empty
                        p.next = newNode(hash, key, value, null);//If it is empty, you can directly insert the current inserted element into the node
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st / / here is the judgment. If the number of circular lookups is greater than the threshold of turning to red black tree (8), start turning the linked list into red black tree
                            treeifyBin(tab, hash);
                        break;//Jump out
                    }
//Here is to judge whether the hash value of p.next and the hash value of the currently inserted element and their key s are consistent if p.next is not equal to e when it is empty. If they are consistent, jump out of the current cycle and process e later
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;//Jump out
                    p = e;//Point p to the next node of p 
                }
            }
//This is the processing method when there is a key in the container and the key hash value of the element to be inserted is the same as the value
            if (e != null) { // existing mapping for key
                V oldValue = e.value;//Copy the old value in the container to oldValue
                if (!onlyIfAbsent || oldValue == null)//onlyIfAbsent is a flag. The default value is false. Generally, no changes are made
                    e.value = value;//Assign the current element value to e.value
                afterNodeAccess(e);//This code doesn't make sense here. The method body is empty
                return oldValue; //The put method returns the old value after completion
            }
        }
        ++modCount;//This is equivalent to version verification, which is generally used for fast failure collection class elements
        if (++size > threshold)//Used for capacity expansion. The number of elements in the current container is greater than the initial capacity value threshold
            resize();//Call the resize method to resize the container
        afterNodeInsertion(evict);//This code doesn't make sense here. The method body is empty
        return null;//The value returned when a new key is created

    }

Keywords: Java Algorithm data structure HashMap

Added by a1amattyj on Wed, 19 Jan 2022 00:01:02 +0200