Understanding the source code of ConcurrentHashMap

JDK1. Version 8 of ConcurrentHashMap

1. Why is thread safe?

Concurrent HashMap uses CAS and synchronized to ensure the security of collection concurrency control, that is, the key to thread safety.

Version 1.1.7 thread safety (ReentrantLock)

The underlying data structure of version 1.7 is array + HashEntry + linked list (HashEntry node), that is, the HashMap array is divided into several segments and placed in each Segment node. When inserting data, the hash value is calculated. This hash value is used to determine which Segment array node to insert. In order to ensure thread safety, just lock this node with ReentrantLock, and other nodes of Segment array can still insert data. This improves concurrent access Ask about efficiency.

1.2 version 1.8 thread safety (CAS+Synchronized)

The underlying database structure of version 1.8 is Node array + linked list + red black tree. Segment is no longer used to split the array. In order to ensure thread safety, version 1.8 locks the nodes of each array of HashMap, so that when inserting data, other nodes can still insert data except that the inserted Node is inaccessible. The efficiency of concurrent access is much higher than that before 1.7.
When the inserted node is empty, CAS optimistic locking technology is used to ensure thread safety. When the inserted node is not empty, the Synchronized keyword is used to lock the node.

1.2.1 CAS introduction

The full name of CAS is Compare And Swap,.
CAS appears during the insert operation in Concurrent Hash Map:

public V put(K key, V value) {
       return putVal(key, value, false);
   }

   /** Implementation for put and putIfAbsent */
   final V putVal(K key, V value, boolean onlyIfAbsent) {
       if (key == null || value == null) throw new NullPointerException();
       int hash = spread(key.hashCode());
       int binCount = 0;
       //tab entire HashMap array
       for (Node<K,V>[] tab = table;;) {
           Node<K,V> f; int n, i, fh;
           //If tab is empty, initialize HashMap and allocate initial memory 16
           if (tab == null || (n = tab.length) == 0)
               tab = initTable();
           //When the CASl optimistic lock appears, judge whether the inserted node f is empty. If it is empty, use casTableAt to judge the thread safety of the node.
           else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
               if (casTabAt(tab, i, null,
                            new Node<K,V>(hash, key, value, null)))
                   break;                   // no lock when adding to empty bin
           }
           else if ((fh = f.hash) == MOVED)
               tab = helpTransfer(tab, f);
           else {
               V oldVal = null;
           //If the inserted node is not empty, use synchronized to lock f this inserted node
               synchronized (f) {
                   if (tabAt(tab, i) == f) {
                       if (fh >= 0) {
                           binCount = 1;
                           for (Node<K,V> e = f;; ++binCount) {
                               K ek;
                               if (e.hash == hash &&
                                   ((ek = e.key) == key ||
                                    (ek != null && key.equals(ek)))) {
                                   oldVal = e.val;
                                   if (!onlyIfAbsent)
                                       e.val = value;
                                   break;
                               }
                               Node<K,V> pred = e;
                               if ((e = e.next) == null) {
                                   pred.next = new Node<K,V>(hash, key,
                                                             value, null);
                                   break;
                               }
                           }
                       }
                       else if (f instanceof TreeBin) {
                           Node<K,V> p;
                           binCount = 2;
                           if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                          value)) != null) {
                               oldVal = p.val;
                               if (!onlyIfAbsent)
                                   p.val = value;
                           }
                       }
                   }
               }
               if (binCount != 0) {
                   if (binCount >= TREEIFY_THRESHOLD)
                       treeifyBin(tab, i);
                   if (oldVal != null)
                       return oldVal;
                   break;
               }
           }
       }
       addCount(1L, binCount);
       return null;
   }

CAS optimistic lock is in the casTabAt method. Let's take a look at the casTab source code:

    static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                        Node<K,V> c, Node<K,V> v) {
        return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);

We are getting closer to the truth. What is U? It is a class called sun misc. Unsafe. It's all called unsafe. It's definitely unsafe, because its internal methods can directly operate memory like C pointers. The operation of CAS depends on the methods of unsafe class. CAS related operations in Usafe can be implemented in the following three ways:

    public final native boolean compareAndSwapObject(Object var1, long var2, Object var4, Object var5);

    public final native boolean compareAndSwapInt(Object var1, long var2, int var4, int var5);

    public final native boolean compareAndSwapLong(Object var1, long var2, long var4, long var6);

var1 gets the memory address of the object. Var2 is the offset. Get the memory address of the old value through var1+var2 to get the old value. Compare the old address with var4. If it is the same, assign the old value to var5 and return true. Otherwise, return false. The operation is atomic.

1.2.2 ReentrantLock

Compared with synchronized, ReentrantLock provides three advanced functions: wait interruptible (lock. Lockinterruptible()), fair lock (new ReentrantLock(true)), and multiple conditions bound. ReentrantLock ensures the atomicity and volatility of thread operation by using CAS optional mechanism (when modifying a variable modified by volatile, the keyword will force the modified value to be written to the main memory immediately, invalidate other caches, and other threads can only read data from the main memory again. At the same time, volatile prohibits instruction rearrangement.) it ensures data visibility, and has realized the function of locking.

1.2.2 Synchronized

Since the introduction of biased lock and lightweight lock (spin lock) into synchronized, the performance of synchronized and ReentrantLock is almost the same. When both methods are available, the official recommends using synchronized.
Synchronized is a non fair lock and cannot be interrupted. Unless an exception occurs or the normal execution is completed, the lock occupation can be automatically released. The implementation of synchronized involves lock upgrading, including no lock, bias lock, spin lock and applying for heavyweight lock from OS.

Keywords: Java

Added by nigaki on Sun, 02 Jan 2022 06:42:16 +0200