Analysis of the core source code of TreeMap and LinkedHashMap

Before digging into TreeMap, let's review the two sorting approaches we use in daily work; they are the groundwork for what follows. The code for both is:
```java
public class TreeMapDemo {

    @Data
    // DTO used as the object to be sorted
    class DTO implements Comparable<DTO> {
        private Integer id;

        public DTO(Integer id) {
            this.id = id;
        }

        @Override
        public int compareTo(DTO o) {
            // Default sort: ascending
            return id - o.getId();
        }
    }

    @Test
    public void testTwoComparable() {
        // First sort: ascending, via the compareTo method of Comparable
        List<DTO> list = new ArrayList<>();
        for (int i = 5; i > 0; i--) {
            list.add(new DTO(i));
        }
        Collections.sort(list);
        log.info(JSON.toJSONString(list));

        // Second sort: descending, via an external Comparator
        Comparator<DTO> comparator = (o1, o2) -> o2.getId() - o1.getId();
        List<DTO> list2 = new ArrayList<>();
        for (int i = 5; i > 0; i--) {
            list2.add(new DTO(i));
        }
        Collections.sort(list2, comparator);
        log.info(JSON.toJSONString(list2));
    }
}
```
The results of the first sort output from small to large are: [{id: 1}, {id: 2}, {id: 3}, {id: 4}, {id: 5}];
The result of the second output is just the opposite. The result is: [{id: 5}, {id: 4}, {id: 3}, {id: 2}, {id: 1}].
The two approaches above sort through Comparable and Comparator respectively, and TreeMap uses the same principle to sort its keys. Let's take a look.
Overall TreeMap architecture
The underlying data structure of TreeMap is a red-black tree, the same structure HashMap uses for its tree bins.

The difference is that TreeMap exploits the red-black tree's left-smaller, right-larger property and sorts by key, so each element is inserted at the tree position matching its key's order and the size relationship between keys is maintained. It suits scenarios where keys need to stay sorted.

Because the underlying structure is a balanced red-black tree, containsKey, get, put, remove and similar methods all run in O(log n) time.
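To make the sorted-key behavior concrete, here is a small sketch I've added (not from the original article) showing that a TreeMap iterates its keys in sorted order regardless of insertion order, and that an external comparator takes priority over the keys' natural ordering:

```java
import java.util.Comparator;
import java.util.TreeMap;

public class TreeMapOrderDemo {
    public static void main(String[] args) {
        // Natural ordering: keys come back sorted ascending
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(3, "c");
        map.put(1, "a");
        map.put(2, "b");
        System.out.println(map.keySet()); // prints [1, 2, 3]

        // External comparator: reverse order wins over Comparable
        TreeMap<Integer, String> desc = new TreeMap<>(Comparator.reverseOrder());
        desc.putAll(map);
        System.out.println(desc.keySet()); // prints [3, 2, 1]
    }
}
```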
Attributes
```java
// Comparator: if an external comparator is supplied it is preferred;
// otherwise the Comparable#compareTo implemented by the key is used
private final Comparator<? super K> comparator;

// Root node of the red-black tree
private transient Entry<K,V> root;

// Number of elements in the red-black tree
private transient int size = 0;

// Modification count, used for fail-fast during iteration
private transient int modCount = 0;

// Red-black tree node
static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left;
    Entry<K,V> right;
    Entry<K,V> parent;
    boolean color = BLACK;

    Entry(K key, V value, Entry<K,V> parent) {
        this.key = key;
        this.value = value;
        this.parent = parent;
    }
    ......
}
```
Adding a node

Let's walk through how TreeMap adds a new node:
```java
public V put(K key, V value) {
    Entry<K,V> t = root;
    // The tree is empty: create the root node directly
    if (t == null) {
        // compare rejects a null key
        compare(key, key); // type (and possibly null) check
        root = new Entry<>(key, value, null);
        size = 1;
        modCount++;
        return null;
    }
    int cmp;
    Entry<K,V> parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        // Loop down the tree to find the node the key should hang under
        do {
            // At the end of each iteration, parent is the node last compared
            parent = t;
            // Compare keys with the external comparator
            cmp = cpr.compare(key, t.key);
            // Key is smaller: go left, since smaller values sit on the left
            if (cmp < 0)
                t = t.left;
            // Key is larger: go right, since larger values sit on the right
            else if (cmp > 0)
                t = t.right;
            // Keys compare equal: overwrite the existing value
            else
                return t.setValue(value);
        // t == null means a leaf position has been reached
        } while (t != null);
    } else {
        if (key == null)
            throw new NullPointerException();
        @SuppressWarnings("unchecked")
        Comparable<? super K> k = (Comparable<? super K>) key;
        do {
            parent = t;
            cmp = k.compareTo(t.key);
            // cmp < 0: the key belongs to the left of the current node
            if (cmp < 0)
                t = t.left;
            // cmp > 0: the key belongs to the right of the current node
            else if (cmp > 0)
                t = t.right;
            else
                return t.setValue(value);
        } while (t != null);
    }
    // Attach the new node under parent, then rebalance the tree
    Entry<K,V> e = new Entry<>(key, value, parent);
    if (cmp < 0)
        parent.left = e;
    else
        parent.right = e;
    fixAfterInsertion(e);
    size++;
    modCount++;
    return null;
}
```
From the source code, we can see:
- When adding a node, TreeMap exploits the left-smaller, right-larger property of the red-black tree, walking down from the root until it reaches a null position, which means a leaf has been found;
- If the key is found to already exist during the search, its value is simply overwritten;
- TreeMap rejects null keys (unless an external comparator explicitly supports them).
TreeMap itself is relatively simple; its red-black tree handling is similar to HashMap's. The essence is comparing keys through compare and then using the left-smaller, right-larger property of the red-black tree to find each key's position, thereby maintaining the sorted order of keys.
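One consequence of "equal compare result means overwrite" is worth illustrating (my own sketch, not from the original): TreeMap treats two keys as the same whenever compare returns 0, even if equals() would distinguish them. With a case-insensitive comparator, for example, "KEY" and "key" collide:

```java
import java.util.TreeMap;

public class TreeMapCompareDemo {
    public static void main(String[] args) {
        // String.CASE_INSENSITIVE_ORDER makes "KEY" and "key" compare as
        // equal, so the second put overwrites the first value instead of
        // inserting a new node
        TreeMap<String, Integer> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.put("KEY", 1);
        map.put("key", 2);
        System.out.println(map.size());     // prints 1
        System.out.println(map.get("KEY")); // prints 2
    }
}
```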
LinkedHashMap overall architecture
HashMap is unordered and TreeMap can sort by key, but is there a Map that maintains insertion order? Next, let's look at LinkedHashMap.
LinkedHashMap itself inherits from HashMap, so it has all the features of HashMap. On this basis, it also provides two major features:
- Access in insertion order;
- Support for evicting the least recently accessed key first, so that keys not accessed for a long time can be removed automatically.
LinkedHashMap linked list structure
Let's look at the new attributes LinkedHashMap adds to implement its linked-list structure:
```java
// Extends HashMap.Node, adding before and after pointers to each element
static class Entry<K,V> extends HashMap.Node<K,V> {
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}

// Head node of the linked list
transient LinkedHashMap.Entry<K,V> head;

// Tail node of the linked list
transient LinkedHashMap.Entry<K,V> tail;

// Controls the two access modes; defaults to false
// true: access order — frequently accessed keys are moved to the tail
// false: insertion order
final boolean accessOrder;
From these new attributes we can see that LinkedHashMap's data structure resembles a LinkedList whose elements have been replaced by HashMap Nodes, a combination of the two. It is precisely these additions that chain the Map's elements into a linked list, and since the linked list preserves order, the insertion order of elements can be maintained.
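The effect of this structure is easy to observe from the outside. A small sketch (added by me, not part of the original text): iterating a LinkedHashMap yields keys in exactly the order they were inserted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InsertionOrderDemo {
    public static void main(String[] args) {
        // Default accessOrder = false: iteration follows insertion order
        Map<String, Integer> linked = new LinkedHashMap<>();
        linked.put("banana", 2);
        linked.put("apple", 1);
        linked.put("cherry", 3);
        System.out.println(linked.keySet()); // prints [banana, apple, cherry]
    }
}
```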
How to add in order
When LinkedHashMap is initialized, accessOrder defaults to false, meaning access follows insertion order. Insertion uses the put method of the parent class HashMap, but LinkedHashMap overrides the newNode/newTreeNode and afterNodeAccess callbacks that put invokes.

The newNode/newTreeNode methods append each new node to the tail of the linked list, so insertion order is preserved. Take the newNode source as an example:
```java
// Create a new node and append it to the tail of the linked list
Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    // New node
    LinkedHashMap.Entry<K,V> p =
        new LinkedHashMap.Entry<K,V>(hash, key, value, e);
    // Append to the tail of the linked list
    linkNodeLast(p);
    return p;
}

// link at the end of list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
    LinkedHashMap.Entry<K,V> last = tail;
    // The new node becomes the tail node
    tail = p;
    // If last is null the list was empty, so head and tail are the same node
    if (last == null)
        head = p;
    // Otherwise link the new node after the previous tail
    else {
        p.before = last;
        last.after = p;
    }
}
```
By keeping head and tail nodes and giving each node before and after pointers, LinkedHashMap appends every new node to the tail, so by the time a node has been added, the linked list already reflects the insertion order.
Sequential access
LinkedHashMap only exposes one-way iteration, from head to tail in insertion order; although each node also carries a before pointer, it cannot be traversed in both directions the way LinkedList can.

We mainly access it through an iterator. When the iterator is initialized, it starts from the head node by default, and during iteration it keeps following the after pointer of the current node.

Map provides iteration over keys, values and entries (nodes). To iterate entries we can use LinkedHashMap.entrySet().iterator(), which directly returns a LinkedHashIterator. LinkedHashIterator is the iterator; calling its nextNode method yields the next node. The iterator source is as follows:
```java
// By default, iteration starts from the head node
LinkedHashIterator() {
    // The head node is the first node visited
    next = head;
    expectedModCount = modCount;
    current = null;
}

final LinkedHashMap.Entry<K,V> nextNode() {
    LinkedHashMap.Entry<K,V> e = next;
    // Fail-fast check
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    if (e == null)
        throw new NoSuchElementException();
    current = e;
    // The after pointer of the linked list yields the next node to visit
    next = e.after;
    return e;
}
```
When adding new nodes, we have maintained the insertion order between elements, so iterative access is very simple. We only need to continuously access the next node of the current node.
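The modCount check in nextNode above is the usual fail-fast guard, and it is easy to observe (a sketch I've added, not from the original article): structurally modifying the map in the middle of iteration makes the next call to the iterator throw ConcurrentModificationException.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<Integer, Integer> map = new LinkedHashMap<>();
        map.put(1, 1);
        map.put(2, 2);
        try {
            for (Map.Entry<Integer, Integer> e : map.entrySet()) {
                // Inserting a new key bumps modCount, so the iterator's next
                // call sees modCount != expectedModCount and fails fast
                map.put(3, 3);
            }
        } catch (ConcurrentModificationException ex) {
            System.out.println("fail-fast triggered");
        }
    }
}
```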
Access least delete policy
This strategy is also called LRU (Least Recently Used). Roughly, frequently accessed elements are moved to the tail of the queue, so rarely accessed data naturally drifts toward the head. We can then set a deletion policy, for example deleting the head node whenever the number of elements in the Map exceeds a threshold (3 in the following demo):
```java
public void testAccessOrder() {
    // accessOrder = true switches this LinkedHashMap to access order
    LinkedHashMap<Integer, Integer> map =
            new LinkedHashMap<Integer, Integer>(4, 0.75f, true) {
        {
            put(10, 10);
            put(9, 9);
            put(20, 20);
            put(1, 1);
        }

        @Override
        // Override the deletion policy: once there are more than 3 nodes,
        // the head node is deleted
        protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
            return size() > 3;
        }
    };
    log.info("initialization: {}", JSON.toJSONString(map));
    Assert.assertNotNull(map.get(9));
    log.info("map.get(9): {}", JSON.toJSONString(map));
    Assert.assertNotNull(map.get(20));
    log.info("map.get(20): {}", JSON.toJSONString(map));
}
```
The printed results are as follows:
```
initialization: {9:9,20:20,1:1}
map.get(9): {20:20,1:1,9:9}
map.get(20): {1:1,9:9,20:20}
```
As you can see, we put four elements in during initialization, but only three remain; element 10 is missing. This is because we overrode removeEldestEntry so that when the map holds more than 3 elements, the head element is deleted. When put(1, 1) executed, the 10 at the head of the queue was removed, showing that the head node is deleted automatically once the deletion policy we set is triggered.

When we call map.get(9), element 9 moves to the tail of the queue, and when we call map.get(20), element 20 moves to the tail. This reflects how frequently accessed nodes are moved to the end of the queue.
This example well illustrates the least access deletion strategy. Next, let's look at the principle.
Moving elements to the tail of the queue

Let's first look at why an element is moved to the tail of the queue on get:
```java
public V get(Object key) {
    Node<K,V> e;
    // Delegate to HashMap's getNode
    if ((e = getNode(hash(key), key)) == null)
        return null;
    // If access order (the LRU-style mode) is enabled,
    // move the accessed node to the tail of the queue
    if (accessOrder)
        afterNodeAccess(e);
    return e.value;
}
```
From the source above we can see that the afterNodeAccess method moves the accessed node to the tail of the queue. This happens not only in get: getOrDefault, compute, computeIfAbsent, computeIfPresent and merge do the same. By constantly moving frequently accessed nodes to the tail, the nodes near the head are naturally the least accessed elements.
Delete policy
In the demo above, we saw the head element deleted while executing put. LinkedHashMap does not implement put itself; it relies on HashMap's put, but overrides the afterNodeInsertion callback that put invokes, and that method performs the deletion. Let's look at the source:
```java
// Possibly delete the least-accessed element; called from HashMap's put
void afterNodeInsertion(boolean evict) {
    // Grab the head node
    LinkedHashMap.Entry<K,V> first;
    // removeEldestEntry controls the deletion policy: if eviction is
    // allowed, the queue is non-empty, and the policy permits it,
    // delete the head node
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        K key = first.key;
        // removeNode deletes the head node
        removeNode(hash(key), key, null, false, true);
    }
}
```
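Putting accessOrder and removeEldestEntry together, a bounded LRU cache can be sketched on top of LinkedHashMap. This is my own illustration (the LruCache class name and capacity field are not from the JDK or the original article):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache: an access-ordered LinkedHashMap that evicts the
// head (least recently used) entry once capacity is exceeded
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        // accessOrder = true: get/put move the touched entry to the tail
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by afterNodeInsertion after every put; returning true
        // makes it delete the head node
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<Integer, String> cache = new LruCache<>(2);
        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);        // touch 1 so it moves to the tail
        cache.put(3, "c");   // evicts 2, the least recently used entry
        System.out.println(cache.keySet()); // prints [1, 3]
    }
}
```

Note that eviction only happens on put (via afterNodeInsertion); a get can reorder entries but never removes one.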