ArrayList source code analysis

ArrayList inherits the following architecture:

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable

It mainly inherits the AbstractList class and implements the List interface, and implements the Cloneable and Serializable interfaces, so that ArrayList has the functions of cloning and serialization.

Constructor:

/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

Of which:

private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {}; 

transient Object[] elementData; 

The bottom layer is array implementation. The keyword transient tells us that this element should be ignored when serializing this object, but ArrayList supports serialization. In this case, why can serialization succeed?

The reason is that ArrayList is rewritten. When ObjectOutputStream serializes an object, the following method will be called:

/**
 * Save the state of the <tt>ArrayList</tt> instance to a stream (that
 * is, serialize it).
 *
 * @serialData The length of the array backing the <tt>ArrayList</tt>
 *             instance is emitted (int), followed by all of its elements
 *             (each an <tt>Object</tt>) in the proper order.
 */
private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);

    // Write out all elements in the proper order.
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }

    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}

Why not automatically serialize the data elements in the ArrayList into the media when serializing? The answer is: ArrayList will open up extra space to save data, and serialization and deserialization of these spaces without data will consume more resources. Therefore, the array of ArrayList is declared as transient and tell the virtual machine to leave it alone, I handle it myself, and then implement the write/readObject method myself, just serializing the stored data.

//When the specified ArrayList capacity is 0, the empty array is returned
private static final Object[] EMPTY_ELEMENTDATA = {};
//And empty_ The difference of elementdata is that the array is returned by default, while the latter is returned when the user specifies the capacity as 0
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

Construction method

1. Parameterless constructor

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

An empty list with an initial capacity of 10 is constructed by default

2. Constructor with specified capacity as parameter

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    }
}

Normally, the underlying array object is instantiated as the capacity according to the parameter size. When the parameter is less than 0, an exception is thrown and equal to 0, empty is used_ elementData to initialize the underlying array elementData.

3. Constructor with collection as parameter

public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA;
    }
}

Convert the container Collection into an array, assign it to the array elementData, and check whether it is converted into an Object [] Object. If empty, use EMPTY_ELEMENTDATA is assigned to elementData;

Add

1.add(E e)

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
 private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

It can be seen that the function ensureCapacityInternal is used to expand the capacity if elementdata = = defaultcapability_ EMPTY_ Elementdata uses the default capacity of 10 to open up space, so the parameterless constructor is to construct an array object with a length of 10 by default. The time complexity is O(1)

private void ensureExplicitCapacity(int minCapacity) {
    //Record modification times
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
    	//If the newCapacity expansion is too small. Then it should be expanded to the required space size minCapacity at least
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
    	//The expansion of newCapacity is too large. If it is too large, use integer MAX_ Value instead
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

It can be seen that the default capacity expansion is 1.5 times, but it may not meet the demand. If the 1.5 times is too small, use it

Use minCapacity to expand the capacity. If 1.5 times is too large, use integer MAX_ Value to expand.

The last statement in the above grow th function elementdata = arrays copyOf(elementData, newCapacity); The function of is to copy and expand the elements in the original array to a new array with the size of newcapacity, and return this new array. Arrays. The source code of copyof (elementdata, newcapacity) is as follows:

public static <T> T[] copyOf(T[] original, int newLength) {
    return (T[]) copyOf(original, newLength, original.getClass());
}

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        @SuppressWarnings("unchecked")
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

/**
*src Array to copy
*srcPos: The starting position of the array to be copied
*dest target array
*destPos Start position of target array
*length How many elements do you want to copy
*native The method is implemented by other languages, generally (C or C + +)
*/
public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

2.add(int index,E element)

public void add(int index, E element) {
    //Position validity check
    rangeCheckForAdd(index);
	//Add modification times and judge whether the array length needs to be expanded,
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    //Copy all elements of the array from index to the position of index+1 with the length of size index.
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}

3.addAll(Collection)

public boolean addAll(Collection<? extends E> c) {
    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount
     //Copy all elements in a to the last element of the array elementData
    System.arraycopy(a, 0, elementData, size, numNew);
    size += numNew;
    return numNew != 0;
}

4.addAll(int index, Collection)

public boolean addAll(int index, Collection<? extends E> c) {
    rangeCheckForAdd(index);

    Object[] a = c.toArray();
    int numNew = a.length;
    ensureCapacityInternal(size + numNew);  // Increments modCount

    int numMoved = size - index;
    if (numMoved > 0)
        System.arraycopy(elementData, index, elementData, index + numNew,
                         numMoved);

    System.arraycopy(a, 0, elementData, index, numNew);
    size += numNew;
    return numNew != 0;
}

get(int index)

public E get(int index) {
    rangeCheck(index);
    checkForComodification();
    return ArrayList.this.elementData(offset + index);
}

Checkforconfirmation() is mainly used to implement the fail fast mechanism, such as:

There are two threads (thread A and thread b). Thread A is responsible for traversing the list and thread B modifies the list. When thread A traverses the list process (at this time, expectedModCount = modCount=N), the thread starts, and thread B adds an element, which is the change of the value of modCount (modCount + 1 = N + 1). When thread A continues to traverse and execute the next method, it notifies the checkforconfirmation method that expectedModCount = N and modCount = N + 1 are not equal. At this time, it throws A ConcurrentModificationException exception, resulting in A fail fast mechanism.

set(int index, E element)

Change the size of the value of a location element in the ArrayList object and return the previous element:

public E set(int index, E element) {
    rangeCheck(index);

    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}

remove

Remove the removed elements in ArrayList, including the overloading of two functions. One is to remove the elements at the specified location, and the other is to remove the elements with the specified value.

1.remove(int index)

public E remove(int index) {
    rangeCheck(index);

    modCount++;
    E oldValue = elementData(index);

    int numMoved = size - index - 1;
    //Self copy (that is, copy the elements of the array from the beginning to the end of index+1 to the beginning of index)
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work

    return oldValue;
}

2.remove(Object o)

public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

It can be seen from the source code that whether the specified object o is null, the location of the first equivalent element is found in ArrayList, and then fastRemove(index) is used to remove it. If the location of the specified object o is not found, false is returned, indicating that the removal was not successful.

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}

It is found that there is basically no difference between fastRemove(int index) and remove(int index). The only difference is that fastRemove does not check the validity of index and does not return the removed old value. Remove (Object o) itself gives the value to be removed.

clear

public void clear() {
    modCount++;

    // clear to let GC do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;

    size = 0;
}

Just set all elements in the array to null, which is convenient for garbage collection.

sublist

private class SubList extends AbstractList<E> implements RandomAccess {
    private final AbstractList<E> parent;
    private final int parentOffset;
    private final int offset;
    int size;

    SubList(AbstractList<E> parent,
            int offset, int fromIndex, int toIndex) {
        this.parent = parent;
        this.parentOffset = fromIndex;
        this.offset = offset + fromIndex;
        this.size = toIndex - fromIndex;
        this.modCount = ArrayList.this.modCount;
   }
    ....
}

Through the construction method, it can be seen that the sublist has a reference to the parent ArrayList, that is, the reference to the original ArrayList, and the fromIndex and size relative to the original array are saved. Next, focus on the add, set and remove methods related to sublist.

When we call the add method on a SubList object, we actually call the add method of AbstractList, and then the SubList add method. The source code is as follows:

public void add(int index, E e) {
    rangeCheckForAdd(index);
    checkForComodification();
    parent.add(parentOffset + index, e);
    this.modCount = parent.modCount;
    this.size++;
}

Through the source code of sublist, you can clearly see that each time you add a sublist, the bottom layer adds an element to the ArrayList of the parent class and adds its own size+1. Note that for the ArrayList of the parent class, this element is not added at the end, but after the toIndex of the child class to ensure continuity.

The source code of SubList set method is as follows:

public E set(int index, E e) {
    rangeCheck(index);
    checkForComodification();
    E oldValue = ArrayList.this.elementData(offset + index);
    ArrayList.this.elementData[offset + index] = e;
    return oldValue;
}

The source code of SubList remove method is as follows:

public E remove(int index) {
    rangeCheck(index);
    checkForComodification();
    E result = parent.remove(parentOffset + index);
    this.modCount = parent.modCount;
    this.size--;
    return result;
}

From the source code, we can see that the operation of sublist is the operation of the original ArrayList array, so we should be careful when using sublist in the List, because the sublist class itself is a view of the original List, and changing the sublist will change the original List. The following methods can be used to avoid:

List<Integer> listNew  = new ArrayList<>();
listNew.addAll(list.subList(0, list.size() - 1));

Iterator

Iterator contains four methods: next(), hasNext(), remove(), foreachremaining (consumer <? Super E > action)

1. Iterator (fast failure and security failure)

1.1 fail fast

When traversing a collection object with an iterator, if the contents of the collection object are modified during traversal (add, remove, replaceAll,trimToSize these operations will modify the modCount value), a Concurrent Modification Exception will be thrown, but the set operation will not modify the modCount value.

Principle: the iterator directly accesses the contents of the collection during traversal, and uses a modCount variable during traversal. If the content of the collection changes during traversal, the value of modCount will be changed. Whenever the iterator uses hashNext()/next() to traverse the next element, it will detect whether the modCount variable is the expectedmodCount value. If so, it will return the traversal; Otherwise, an exception is thrown and the traversal is terminated.

Note: the throw condition of the exception here is that modCount is detected= expectedmodCount this condition. If the modified modCount value is just set to the expectedmodCount value when the collection changes, the exception will not be thrown. Therefore, you cannot program concurrent operations depending on whether this exception is thrown or not. This exception is only recommended to detect concurrent modification bug s.

Scenario: Java The collection classes under util package fail quickly and cannot be modified concurrently under multithreading (modified during iteration). Example

ArrayList list = new ArrayList();
        list.add("ee");
        list.add("ee");
        list.add("eee");

        Iterator it = list.listIterator();
        list.add("It will be modified here list Medium modCount");
        try{
            it.next(); //This will result in an error because the value of modCount in the parent list is different from the value of expicedmodcount in the iterator
        }catch (Exception e){
            System.out.print("error2");
        }

2. fail safe

The collection container with security failure mechanism is not directly accessed on the collection content during traversal, but copies the original collection content first and traverses the copied collection.

Principle: since the copy of the original set is traversed during iteration, the modifications made to the original set during traversal cannot be detected by the iterator, so the Concurrent Modification Exception will not be triggered. Weak consistency is used for security failure.

Disadvantages: the advantage of copying content is to avoid the Concurrent Modification Exception, but similarly, the iterator cannot access the modified content, that is, the iterator traverses the set copy obtained at the moment when it starts traversing, and the iterator does not know the modification of the original set during traversal.

Scenario: Java util. The containers under the concurrent package are all safe failures and can be used and modified concurrently under multithreading. example:

ConcurrentHashMap concurrentHashMap = new ConcurrentHashMap();
    concurrentHashMap.put("Not only Java-1", 1);
    concurrentHashMap.put("Not only Java-2", 2);
    concurrentHashMap.put("Not only Java-3", 3);

    Set set = concurrentHashMap.entrySet();
    Iterator iterator = set.iterator();

    while (iterator.hasNext()) {
        System.out.println(iterator.next());
        concurrentHashMap.put("The next cycle is executed normally", 4);
    }
    System.out.println("Program end");

The operation effect is as follows. It does not throw exceptions, and the program executes normally,

Not only Java-1=1
 Not only Java-2=2
 Not only Java-3=3
 Program end

Through the ConcurrentModificationException exception, you can fail quickly. Generally, it is not allowed to modify when reading data. For example, one thread is usually not allowed to modify the collection while another thread is traversing it. Generally speaking, in this case, the result of iteration is undefined. Some iterator implementations, including those of all common collection implementations provided by JRE, may choose to throw this exception when this behavior is detected. Iterators that do this are called fault fast iterators because they fail quickly and cleanly, but risk arbitrary uncertain behavior at an uncertain time in the future.

Note that this exception does not always indicate that the object has been modified concurrently by different threads. If a single thread issues a method call sequence that violates the object contract, the object may throw this exception. For example, if a thread modifies a collection directly when iterating over it using a fast fail iterator, the iterator will throw this exception.

Finally, fast failure and security failure are for iterators. Java. Java is recommended for concurrent environments util. Container class under concurrent package, unless there is no modification operation.

2.iterator()

Returns iterators for the elements in the list in the correct order.

public Iterator<E> iterator() {
    return listIterator();
}
 public ListIterator<E> listIterator() {
        return listIterator(0);
    }

3.listIterator(int index)

Returns the list iterator of the elements in the list (in the correct order) starting at the specified position in the list

public ListIterator<E> listIterator(int index) {
        if (index < 0 || index > size)
            throw new IndexOutOfBoundsException("Index: "+index);
        return new ListItr(index);
    }

Analysis of the problem of removing ArrayList in foreach

When using foreach loop, ArrayList cannot add or remove elements, otherwise it may throw an exception. Let's analyze the specific implementation

There is the following code:

public class TestForEachList extends BaseTests {

    @Test
    public void testForeach() {
        List<String> list = new ArrayList<>();
        list.add("1");
        list.add("2");
        list.add("3");

        for (String s : list) {
        }
    }

}

Open the class file directly with idea and decompile it automatically. The following contents are obtained:

public class TestForEachList extends BaseTests {
    public TestForEachList() {
    }

    @Test
    public void testForeach() {
        List<String> list = new ArrayList();
        list.add("1");
        list.add("2");
        list.add("3");

        String var3;
        for(Iterator var2 = list.iterator(); var2.hasNext(); var3 = (String)var2.next()) {
            ;
        }

    }
}

It can be seen that the real implementation of foreach is as follows:

for(Iterator var2 = list.iterator(); var2.hasNext(); var3 = (String)var2.next()) {
            ;
        }

Next, let's see how these three methods are implemented:

iterator

The iterator implementation of ArrayList is as follows:

public Iterator<E> iterator() {
    return new Itr();
}

private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;
    //Omit partial implementation
}

Itr is an internal class in ArrayList, so list The function of iterator () is to return an Itr object assigned to var2, and then call var2 hasNext(),var2.next() is the concrete implementation of Itr.

expectedModCount is also mentioned here. This variable record is assigned as modCount. modCount is a field of AbstractList, the parent class of ArrayList. This field means the number of changes in the list structure. Usually, modCount + + will be triggered if the number of elements is changed, such as add or remove.

Let's continue with ITR hasNext()``var2. Implementation of next().

itr.hasNext and ITR Next implementation

public boolean hasNext() {
            return cursor != size;
        }

If the current index is not equal to size, it indicates that the iteration has not been completed. The size here is the field of the external class ArrayList, indicating the number of elements.

Looking at the next implementation:

public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }

The first step of the next method is checkforconfirmation (). What does it do? If modCount= expectedModCount throws the exception ConcurrentModificationException. What is modCount? The number of element changes of the external class ArrayList; What is expectedModCount? The number of element changes of the external class when initializing the internal class Itr.

Therefore, if an add or remove operation is performed in the foreach, the program exception ConcurrentModificationException will be caused. Here are two examples:

@Test(expected = ConcurrentModificationException.class)
    public void testListForeachRemoveThrow() {
        List<String> list = new ArrayList<>();
        list.add("1");
        list.add("2");
        list.add("3");

        for (String s : list) {
            list.remove(s);
        }
    }

    @Test(expected = ConcurrentModificationException.class)
    public void testListForeachAddThrow() {
        List<String> list = new ArrayList<>();
        list.add("1");
        list.add("2");
        list.add("3");

        for (String s : list) {
            list.add(s);
        }
    }

When the unit test runs, they throw ConcurrentModificationException.

The code after checkForComodification() is relatively simple and will not be analyzed here.

Special of the penultimate element

Here, let's go through the general process:

  1. Get the Itr object and assign it to var2
  2. Judge hasNext, that is, judge cursor= Size, if the subscript of the current iteration element is not equal to the number of list s, return true to continue the iteration; Otherwise, exit the cycle
  3. next fetch iteration element
    1. Checkforconfirmation(), judge modcount= Expectedmodcount: the number of element changes is not equal to the number of element changes when initializing the internal class Itr, that is, the ConcurrentModificationException is thrown if it is modified during the iteration.
    2. If the check passes cursor++

Consider the following: what happens when the penultimate element is remove d? The code is as follows:

@Test
public void testListForeachRemoveBack2NotThrow() {
    List<String> list = new ArrayList<>();
    list.add("1");
    list.add("2");
    list.add("3");

    for (String s : list) {
        System.out.println(s);
        if ("2".equals(s)) {
            list.remove(s);
        }
    }
}

Guess if you throw an exception? The answer is No. Output is:

1
2

It is found that 3 is missing and there is no output. Analyze it

After the penultimate element "2" is removed, the size-1 of the list changes to 2. At this time, cur in itr adds 1 after taking out the element "2" in the next method, and the value changes to 2. As a result, when judging hasNext next, cursor==size, hasNext returns false, and finally the last element is not output.

How to avoid the pit;

1. Method 1, or fori, just move the position forward and subtract it back. After remove, i –:

@Test
    public void testListForiRight() {
        List<String> list = new ArrayList<>();
        list.add("1");
        list.add("2");
        list.add("3");

        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
            list.remove(i);
            i--;  //Just move the position forward and reduce it back
        }
    }

2. Method 2: instead of the remove method of ArrayList, use the remove method defined by Itr. The code is as follows:

@Test
    public void testIteratorRemove() {
        List<String> list = new ArrayList<>();
        list.add("1");
        list.add("2");
        list.add("3");

        Iterator<String> itr = list.iterator();
        while (itr.hasNext()) {
            String s = itr.next();
            System.out.println(s);
            itr.remove();
        }
    }

Why doesn't the remove defined by itr report an error? Look at the source code:

public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            //There are still checks. Is the quantity changed
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                //But after the change, the value is reassigned and equal again
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

There is still checkForComodification() verification, but we see that it is reassigned later, so it is equal again.

Keywords: Java Container source code

Added by maest on Fri, 28 Jan 2022 01:29:47 +0200