CopyOnWriteArrayList implementation principle and source code analysis

1. CopyOnWrite container (concurrent container)

Copy on write, referred to as COW, is an optimization strategy used in program design.
The basic idea is that from the beginning, everyone is sharing the same content. When someone wants to modify the content, he will really Copy the content to form a new content and then change it. This is a delayed laziness strategy.
From jdk1 5 starts Java and provides two concurrent containers implemented by CopyOnWrite mechanism in the contract. They are CopyOnWriteArrayList and CopyOnWriteArraySet.

The CopyOnWrite container is the container that is copied when writing.
The popular understanding is that when we add elements to a container, we do not directly add them to the current container, but first Copy the current container, Copy a new container, then add elements to the new container, and then point the reference of the original container to the new container after adding elements.
The advantage of this is that we can read the CopyOnWrite container concurrently without locking, because the current container will not add any elements.
In addition, the idea of using the container to separate reading and writing, and the idea of using the container to solve the conflict between reading and writing is different.

2. CopyOnWriteArrayList data structure

public class CopyOnWriteArrayList<E>
            implements List<E>, RandomAccess, Cloneable, java.io.Serializable {}

CopyOnWriteArrayList implements the List interface, which defines the basic operations on the List;

  • At the same time, RandomAccess interface is implemented, which means that the array can be accessed randomly (the array has the characteristics of random access);
  • At the same time, the clonable interface is implemented, which means that it can be cloned;
  • At the same time, it also implements the Serializable interface, which indicates that it can be serialized.
  • The bottom layer of CopyOnWriteArrayList uses arrays to store elements.

2. CopyOnWriteArrayList Add method

The CopyOnWriteArrayList container is collections The alternative to synchronized list is a thread safe variant of ArrayList.
Basic principle:
During initialization, there is only one container. Very often, for a period of time, when the data and quantity of this container have not changed, everyone (multiple threads) reads (assuming that only reading operations occur during this period) the data in the same container, so the data we read is unique, consistent and safe, But then someone added a data to it. At this time, the principle of the bottom implementation of CopyOnWriteArrayList is to copy out a container (which can be referred to as copy), then add the new data to the new container, and finally assign the reference address of the new container to the address of the old container before. However, during the period of adding this data, If other threads want to read data, they still read the data in the old container.

The implementation of the add method in CopyOnWriteArrayList (adding elements to CopyOnWriteArrayList) can be found that locking is required when adding, otherwise N copies will be copied out when writing by multiple threads.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 public boolean add(E e) {         final ReentrantLock lock = this.lock;         lock.lock();         try {             Object[] elements = getArray();             int len = elements.length;             // Copy out the new array             Object[] newElements = Arrays.copyOf(elements, len + 1);             // Add new element to new array            newElements[len] = e;            // Point the original array reference to the new array             setArray(newElements);             return true;         } finally {             lock.unlock();         }     }           /**      * Sets the array.      */     final void setArray(Object[] a) {         array = a;     }

  

There is no need to lock when reading. If multiple threads are adding data to CopyOnWriteArrayList when reading, the old data will still be read, because the old CopyOnWriteArrayList will not be locked when writing.

public E get(int index) {
        return get(getArray(), index);
    }

3. remove method

public E remove(int index) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;
            E oldValue = get(elements, index);
            int numMoved = len - index - 1;
            if (numMoved == 0)
                setArray(Arrays.copyOf(elements, len - 1));
            else {
                Object[] newElements = new Object[len - 1];
                System.arraycopy(elements, 0, newElements, 0, index);
                System.arraycopy(elements, index + 1, newElements, index,
                                 numMoved);
                setArray(newElements);
            }
            return oldValue;
        } finally {
            lock.unlock();
        }
    }

To delete an element, it is very simple to judge whether the element to be deleted is the last one. If the last one is directly in the replica array, the copy length can be the length-1 of the old array;
However, if it is not the last element, first copy the elements before index of the old array to the new array, then copy the elements after index in the old array to the array, and finally copy the new array to the reference of the old array. Finally, release the lock in the finally statement block.

4. set method

public E set(int index, E element) {
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            E oldValue = get(elements, index);
 
            if (oldValue != element) {
                int len = elements.length;
                Object[] newElements = Arrays.copyOf(elements, len);
                newElements[index] = element;
                setArray(newElements);
            } else {
                // Not quite a no-op; ensures volatile write semantics
                setArray(elements);
            }
            return oldValue;
        } finally {
            lock.unlock();
        }
    }

5. CopyOnWriteArrayList initialization (construction method)

 /**
     * Sets the array.Point the old array to the new array
     */
    final void setArray(Object[] a) {
        array = a;
    }
 
    /**
     * Creates an empty list.Constructor
     */
    public CopyOnWriteArrayList() {
        setArray(new Object[0]);
    }
 
    public CopyOnWriteArrayList(Collection<? extends E> c) {
        Object[] elements = c.toArray();
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elements.getClass() != Object[].class)
            elements = Arrays.copyOf(elements, elements.length, Object[].class);
        setArray(elements);
    }
    /**
     * Creates a list holding a copy of the given array.
     *
     * @param toCopyIn the array (a copy of this array is used as the
     *        internal array)
     * @throws NullPointerException if the specified array is null
     */
    public CopyOnWriteArrayList(E[] toCopyIn) {
        setArray(Arrays.copyOf(toCopyIn, toCopyIn.length, Object[].class));
    }

No matter which construction method we use to create a CopyOnWriteArrayList Object, we will create an array of Object type and assign it to the member array.

6. copyOf function

This function is used to copy the specified array, intercept or fill with null if necessary, so that the copy has the specified length.

    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        @SuppressWarnings("unchecked")
        // Determine the type of copy (convert newType to Object type and Object[].class to Object type, and judge whether they are equal. If they are equal, an Object array of specified length will be generated
        // Otherwise, a new type array of the specified length is generated)
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        // Copy the original array from the subscript 0 with the copy length (the smaller of original.length and newLength) to the copy array (also from the subscript 0)
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

7. Application scenario of CopyOnWrite

CopyOnWrite concurrency container is used for concurrency scenarios with more reads and less writes.
For example, the access and update scenarios of white list, blacklist and commodity categories. If we have a search website, users enter keywords in the search box of the website, but some keywords are not allowed to be searched.
These keywords that cannot be searched will be placed in a blacklist, which will be updated every night. When users search, they will check whether the current keyword is in the blacklist. If it is, they will be prompted that they cannot search.


8. Disadvantages of CopyOnWrite

CopyOnWrite container has many advantages (solving the concurrency problem of multithreading in development), but there are also two problems, namely memory occupation and data consistency.
1. Memory occupation.
Because of CopyOnWrite's copy on write mechanism, when writing, the memory of two objects will be stationed in the memory at the same time, Old objects and newly written objects (Note: when copying, only the references in the container are copied, but when writing, new objects will be created and added to the new container, while the objects in the old container are still in use, so there are two copies of object memory).
If these objects occupy a large amount of memory, such as about 200M, then writing 100M data into them will occupy 300M of memory, which is likely to cause frequent Yong GC and Full GC at this time.

To solve the problem of memory occupation, you can reduce the memory consumption of large objects by compressing the elements in the container. For example, if the elements are all hexadecimal numbers, you can consider compressing them into hexadecimal 36 or hexadecimal 64.
Or instead of using the CopyOnWrite container, use other concurrent containers, such as ConcurrentHashMap.

2. Data consistency.
CopyOnWrite container can only guarantee the final consistency of data, but can not guarantee the real-time consistency of data. So if you want to write data that can be read immediately, please do not use the CopyOnWrite container.


9. Summary:

    • 1.CopyOnWriteArrayList is suitable for scenarios with more reading and less writing
    • 2. The concurrent modificationexception will not be thrown when the container object is operated concurrently, and the returned element is consistent with the element when the iterator is created
    • 3. The replication of container objects requires some overhead. If the objects occupy too much memory, it may cause frequent young GC and Full GC
    • 4.CopyOnWriteArrayList cannot guarantee the real-time consistency of data, but only the final consistency
    • 5. When the List object needs to be operated concurrently, CopyOnWriteArrayList is preferred
    • 6. With the increase of elements in CopyOnWriteArrayList, the modification cost of CopyOnWriteArrayList will become more and more expensive. Therefore, CopyOnWriteArrayList is suitable for concurrent scenarios where read operations are much more than modification operations.

Keywords: Java

Added by eyekkon on Sat, 05 Feb 2022 20:05:01 +0200