Java multithreaded programming -- a preliminary study on thread unsafe collection

In Java, some collections are thread unsafe, such as ArrayList, HashSet, TreeSet, HashMap, etc. In this paper, we take ArrayList as an example to explore the phenomenon and causes of thread insecurity.

1, How ArrayList appends elements

First, we extract the source code of the add method in ArrayList:

/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

The above program first checks whether the array elementData has enough space to store size+1 elements, then places a new element e at the end of the stored data, and then increases the number of existing elements in the array by one. The first step is to check whether there is enough space as follows:

private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }

    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

For the above program to check whether the array capacity is sufficient, first check whether the array is empty. If it is empty, the maximum between the space required for the allocation application and the default initial capacity; If it is not an empty array, if the space required for the application does not exceed the space opened by the array, no operation will be carried out, otherwise the array capacity will be expanded. The idea of array expansion is to open up a new array with a size of 1.5 times the size of the current array, and then copy the data from the old array to the new array.

2, Embodiment of unsafe ArrayList thread

Now let's create two threads and add elements to an ArrayList list to analyze the data inconsistency. The specific procedures are as follows:

import java.util.ArrayList;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        ArrayList<Integer> list = new ArrayList<>();

        new ManipulateArrayList(list).start();
        new ManipulateArrayList(list).start();

        Thread.sleep(10000);

        for (int i = 0; i < list.size(); i++) {
            System.out.println("list[" + i + "]: " + list.get(i));
        }
    }
}

class ManipulateArrayList extends Thread {
    private ArrayList<Integer> list;

    public ManipulateArrayList(ArrayList<Integer> list) {
        this.list = list;
    }

    @Override
    public void run() {
        for (int i = 0; i < 500; i++) {
            list.add(i);
        }
    }
}

Let's analyze the possible errors of the program one by one.

(1) Array out of bounds exception

Exception in thread "Thread-1" java.lang.ArrayIndexOutOfBoundsException: 244
	at java.util.ArrayList.add(ArrayList.java:459)
	at ManipulateArrayList.run(Main.java:29)

1. The number of stored elements in the current array is 243.

2. Thread 1 first executes ensureCapacityInternal(size + 1) to check whether the array size is sufficient. The read size value is 243.

3. Thread 2 starts to execute the ensureCapacityInternal method to judge the capacity, and the read size value is 243.

4. Thread 1 found that the space opened up by the array is 244, which is enough to store a new element, so there is no need to expand the capacity.

5. Thread 2 found that the space opened up by the array is 244, which is enough to add a new element based on the existing 243 elements, so it did not expand the capacity.

6. Thread 1 executes elementData[size++] = e statement, that is, place a new element at the position with subscript 243, and then add one to size to 244.

7. Thread 2 also executes the elementData[size++] = e statement. At this time, the read size is 244, so thread 2 attempts to store element E at the position with subscript 244. However, the space opened up by the elementData array is only 244, which is not enough to put down the 245th element, so the program throws an exception: the element cannot be inserted at the position with the subscript 244, because the array is out of bounds!

(2) The element value is empty and the value is overwritten

......
list[103]: 103
list[104]: 104
list[105]: 0
list[106]: 105
list[107]: 106
list[108]: 1
list[109]: null
list[110]: 2
list[111]: 3
......

As you can see, the element with subscript 109 in the array is null. The reason for this is that the add method of ArrayList is not an atomic operation. The statement elementData[size++] = e actually contains two steps:

1. elementData[size] = e

2. size++

When two threads execute alternately, the following problems may occur:

1. Thread 1 reads that the size is 108, so it adds the E element (i.e. 108) to the position with the subscript 108.

2. Thread 2 also reads that the size is 108, so it also places the value it wants to assign (i.e. 1) at the position of subscript 108.

3. Thread 1 executes the size + + operation, and the size becomes 109.

4. Thread 2 executes the size + + operation, and the size becomes 110.

5. Thread 2 reads that the size is 110, so it adds the E element (i.e. 2) to the position with subscript 110.

It is not difficult to see that the position with array subscript 109 is not assigned, so the element is still left blank. In addition, the position of array subscript 108 is assigned twice, and the element written by thread 1 is overwritten by thread 2, so the writing of thread 1 is lost.

3, Thread safe CopyOnWriteArrayList collection

In Java util. A thread safe CopyOnWriteArrayList collection is provided under concurrent. We modified the above program to use the CopyOnWriteArrayList set.

import java.util.concurrent.CopyOnWriteArrayList;

public class Main {
    public static void main(String[] args) throws InterruptedException {
        CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>();

        new ManipulateArrayList(list).start();
        new ManipulateArrayList(list).start();

        Thread.sleep(10000);

        for (int i = 0; i < list.size(); i++) {
            System.out.println("list[" + i + "]: " + list.get(i));
        }
    }
}

class ManipulateArrayList extends Thread {
    private CopyOnWriteArrayList<Integer> list;

    public ManipulateArrayList(CopyOnWriteArrayList<Integer> list) {
        this.list = list;
    }

    @Override
    public void run() {
        for (int i = 0; i < 500; i++) {
            list.add(i);
        }
    }
}

At this time, the program will no longer have various data inconsistencies. The running results of the program are as follows:

list[0]: 0
list[1]: 1
list[2]: 2
list[3]: 3
......
list[254]: 250
list[255]: 251
list[256]: 4
list[257]: 5
......
list[319]: 66
list[320]: 67
list[321]: 253
list[322]: 254
......
list[945]: 446
list[946]: 499
list[947]: 447
......
list[998]: 498
list[999]: 499

Keywords: Java

Added by mdell on Sun, 16 Jan 2022 06:06:11 +0200