JAVA Common Collection Source Parsing Series-ArrayList Source Parsing (based on JDK8)

The article was originally created by the author. If it is reproduced, please indicate the source. If it is the same, it is the same. ~ (who care!)

1. Write before

This is the first part of the Source Analysis Program. The blogger is going to pass through some of the commonly used collection sources, such as ArrayList, HashMap and their corresponding thread security implementations. This article is a summary of his own related learning, recording the learning results, and hoping to provide some help to the friends who are close to him.

Of course, there are some faults with limited abilities. I would like to find friends who can point them out. Thank you so as not to mislead everyone!

2. Strong and stable past source code

First, the member variable definition and construction method within the source code:

 1 /**
 2      * Default initial capacity.
 3     (Default Initialization Length) ps: Actually "lazy init", as detailed below
 4      */
 5     private static final int DEFAULT_CAPACITY = 10;
 6 
 7     /**
 8      * Shared empty array instance used for empty instances.
 9     (Share empty arrays for efficiency)
10      */
11     private static final Object[] EMPTY_ELEMENTDATA = {};
12 
13     /**
14      * Shared empty array instance used for default sized empty instances. We
15      * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
16      * first element is added.
17     (Unlike EMPTY_ELEMENTDATA, when using the default construction method, this empty array is used by default, and lazy init is implemented together with DEFAULT_CAPACITY
18      */
19     private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
20 
21     /**
22      * The array buffer into which the elements of the ArrayList are stored.
23      * The capacity of the ArrayList is the length of this array buffer. Any
24      * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
25      * will be expanded to DEFAULT_CAPACITY when the first element is added.
26     (Where the collection data really resides, so for ArrayList we understand it as an array that provides a set of efficient ways to manipulate it.Note the second half of the comment: When the first element of the collection is added, expand the empty collection DEFAULTCAPACITY_EMPTY_ELEMENTDATA to a collection of DEFAULT_CAPACITY size, which is lazy init, which allocates memory space when used, in order to prevent space waste.)ps:transient indicates that this variable does not participate in serialization
27      */
28     transient Object[] elementData; // non-private to simplify nested class access
29 
30     /**
31      * The size of the ArrayList (the number of elements it contains).
32      *
33      * @serial
34     (Array size)
35      */
36     private int size;
Parameter Item
 1 /**
 2      * Constructs an empty list with the specified initial capacity.
 3      *
 4      * @param  initialCapacity  the initial capacity of the list
 5      * @throws IllegalArgumentException if the specified initial capacity
 6      *         is negative
 7 (Nothing to say, initialization parameter >0 creates the length array, =0 uses a shared empty array, <0 errors)
 8      */
 9     public ArrayList(int initialCapacity) {
10         if (initialCapacity > 0) {
11             this.elementData = new Object[initialCapacity];
12         } else if (initialCapacity == 0) {
13             this.elementData = EMPTY_ELEMENTDATA;
14         } else {
15             throw new IllegalArgumentException("Illegal Capacity: "+
16                                                initialCapacity);
17         }
18     }
19 
20     /**
21      * Constructs an empty list with an initial capacity of ten.
22 (using default empty array, lazy init, expanding array to default capacity DEFAULT_CAPACITY when adding elements)
23      */
24     public ArrayList() {
25         this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
26     }
27 
28     /**
29      * Constructs a list containing the elements of the specified
30      * collection, in the order they are returned by the collection's
31      * iterator.
32      *
33      * @param c the collection whose elements are to be placed into this list
34      * @throws NullPointerException if the specified collection is null
35      */
36     public ArrayList(Collection<? extends E> c) {
37         elementData = c.toArray();
38         if ((size = elementData.length) != 0) {
39             // c.toArray might (incorrectly) not return Object[] (see 6260652)
40 (This is exactly a bug, as you can see from the link below that it will be fixed in jdk9 and the cause of the bug will be explained later.Attach the link http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6260652)
41             if (elementData.getClass() != Object[].class)
42                 elementData = Arrays.copyOf(elementData, size, Object[].class);
43         } else {
44             // replace with empty array.
45             this.elementData = EMPTY_ELEMENTDATA;
46         }
47     }
Construction method

Advanced analysis, set common operations, start with a simple:

 1 /**
 2 The idea is simple:
 3 1.Determining if an index is out of bounds is an exception
 4 2.Remove the element of index directly from the corresponding position of the array elementData and force it to E
 5 **/
 6 public E get(int index) {
 7         rangeCheck(index);
 8 
 9         return elementData(index);
10     }
11 private void rangeCheck(int index) {
12         if (index >= size)
13             throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
14     }
15  E elementData(int index) {
16         return (E) elementData[index];
17     }
Get individual collection elements bitwise
 1 /**
 2 It's also simple:
 3 1.Determining if an index is out of bounds is an exception
 4 2.Take the value at index as the original value
 5 3.Set the location value to the modified value
 6 4.return Original value
 7 **/
 8 public E set(int index, E element) {
 9         rangeCheck(index);
10 
11         E oldValue = elementData(index);
12         elementData[index] = element;
13         return oldValue;
14     }
Modify individual collection elements bit by bit

As mentioned earlier, the bitwise fetching of individual elements is simple, and this is all you can expect if you know that arrayLists store data inside arrayLists.

Continue below, adding and deleting collections. Friends who have not seen the source code can think about the main difference between adding and deleting collections and the above two operations.Why is it more complex?

What about adding elements because they may not be in enough positions?If the location is not enough, add a location, also known as "expansion"!Let's think about it ourselves and compare it with the author's:

1. As mentioned above, the default construction method uses the lazy_init pattern, which is init only when elements are added.The first step is to determine if an init is needed and an init is needed.

2. Determine if expansion is required (which is, is the current array empty?)You need to expand (create a new array with a larger capacity), but the expansion is not infinite, error if you exceed the limit (Integer.Max_Value), and copy the original data to the new array, otherwise do nothing

3. Add elements at specified locations and size+.

 1 /**
 2 See how the author thinks and compare it with his own:
 3 1.Determine if lazy init is required, init if necessary, then execute step 2
 4 2.Determine if expansion is required, and if necessary, proceed to step 3, otherwise proceed to step 5
 5 3.Expansion, new array length = current array length * 1.5, and if the expanded length satisfies the target capacity or not, the new array length = the target capacity.Next, determine if the new array length exceeds the threshold MAX_ARRAY_SIZE, then execute step 4 if it exceeds
 6 4.Target Array Length=If Target Array Length>MAX_ARRAY_SIZE?Integer.MAX_VALUE:MAX_ARRAY_SIZE
 7 5.After expansion, execute data copy, from the original array copy data to the new array
 8 6.Add elements at specified locations and increase length
 9 **/
10 public boolean add(E e) {
11         ensureCapacityInternal(size + 1);  // Increments modCount!!
12         elementData[size++] = e;
13         return true;
14     }
15 private void ensureCapacityInternal(int minCapacity) {
16         if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
17             minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
18         }
19 
20         ensureExplicitCapacity(minCapacity);
21     }
22 private void ensureExplicitCapacity(int minCapacity) {
23         modCount++;
24 
25         // overflow-conscious code
26         if (minCapacity - elementData.length > 0)
27             grow(minCapacity);
28     }
29 private void grow(int minCapacity) {
30         // overflow-conscious code
31         int oldCapacity = elementData.length;
32         int newCapacity = oldCapacity + (oldCapacity >> 1);
33         if (newCapacity - minCapacity < 0)
34             newCapacity = minCapacity;
35         if (newCapacity - MAX_ARRAY_SIZE > 0)
36             newCapacity = hugeCapacity(minCapacity);
37         // minCapacity is usually close to size, so this is a win:
38         elementData = Arrays.copyOf(elementData, newCapacity);
39     }
40 private static int hugeCapacity(int minCapacity) {
41         if (minCapacity < 0) // overflow
42             throw new OutOfMemoryError();
43         return (minCapacity > MAX_ARRAY_SIZE) ?
44             Integer.MAX_VALUE :
45             MAX_ARRAY_SIZE;
46     }
Single Add Collection Element

After comparison, the general idea is OK, but why should the author add the capacity cap processing logic below?Friends can think about whether the maximum capacity of an ArrayList is Integer.MAX_VALUE or MAX_ARRAY_SIZE?

The answer is: Integer.MAX_VALUE, still it, is not MAX_ARRAY_SIZE defined by the author in the source code, which raises the question of the meaning of the existence of-MAX_ARRAY_SIZE.

I understand that the existence of this upper limit reduces the probability of memory overflow.First of all, the comment mentions "Some VMs reserve some header words in an array", which means that some virtual machines keep the header information in the array. There is no doubt that it takes up space to save the information. That is, my actual array space may be insufficient for Integer.MAX_VALUE. The author should think like this (I guess ~""I can basically make sure that the actual space is not less than MAX_ARRAY_SIZE (probably the result of an analysis of the implementation of mainstream virtual machines), so set a threshold to ensure that no memory overflow occurs, but if this capacity is not sufficient, you'll be given the remaining 8 at high risk, as to whether it willOh my God, I've done my best anyway!"

if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
 1 /**
 2 Simple, one stroke, delete by location
 3 Ideas:
 4 1.Cross-border detection
 5 2.Remove the element from its current position
 6 3.All elements after the current position move forward one position as a whole
 7 4.The last position is empty, size--, returns the deleted element value.
 8 **/
 9 public E remove(int index) {
10         rangeCheck(index);
11 
12         modCount++;
13         E oldValue = elementData(index);
14 
15         int numMoved = size - index - 1;
16         if (numMoved > 0)
17             System.arraycopy(elementData, index+1, elementData, index,
18                              numMoved);
19         elementData[--size] = null; // clear to let GC do its work
20 
21         return oldValue;
22     }
23 /**
24 The idea is simple, delete by element (note that only delete the first matching value from scratch), traverse - > match - > delete
25 It is worth noting that special handling of null is done separately, comparing by address
26 **/
27 public boolean remove(Object o) {
28         if (o == null) {
29             for (int index = 0; index < size; index++)
30                 if (elementData[index] == null) {
31                     fastRemove(index);
32                     return true;
33                 }
34         } else {
35             for (int index = 0; index < size; index++)
36                 if (o.equals(elementData[index])) {
37                     fastRemove(index);
38                     return true;
39                 }
40         }
41         return false;
42     }
43 
44 Single Delete Collection Element
Single Delete Element
 1 public boolean removeAll(Collection<?> c) {
 2         Objects.requireNonNull(c);
 3         return batchRemove(c, false);
 4     }
 5 
 6     private boolean batchRemove(Collection<?> c, boolean complement) {
 7         final Object[] elementData = this.elementData;
 8         int r = 0, w = 0;
 9         boolean modified = false;
10         try {
11 //Traverse all elements of the current collection
12             for (; r < size; r++)
13                 if (c.contains(elementData[r]) == complement)
14 //If the specified collection does not contain the element (that is, it should not be deleted and needs to be preserved), move the current element to the head w Location (the original header element is deleted directly because it does not meet the criteria, so overwriting here deletes it), and w Move Tag to Next Bit
15                     elementData[w++] = elementData[r];
16         } finally {
17             // Preserve behavioral compatibility with AbstractCollection,
18             // even if c.contains() throws.
19 //Here are two cases:
20 //No exception r==size Will not enter this if
21 //If there are exceptions, then all elements that can be compared in the future due to the exceptions are grouped together copy reach w Position and place the w Marker Shift size - r(Quantity that can be understood as not yet compared)
22             if (r != size) {
23                 System.arraycopy(elementData, r,
24                                  elementData, w,
25                                  size - r);
26                 w += size - r;
27             }
28 //It's easy to understand here. w All that follows a location that was previously compared or abnormal and needs to be preserved should be deleted.
29             if (w != size) {
30                 // clear to let GC do its work
31                 for (int i = w; i < size; i++)
32                     elementData[i] = null;
33                 modCount += size - w;
34                 size = w;
35                 modified = true;
36             }
37         }
38         return modified;
39     }
Bulk Delete
 1 /**
 2 It looks simpler, but after careful analysis, you will find that it is still somewhat deep and interesting to see the implementation of defaultReadObject, which is ignored for the moment, and you will have the opportunity to analyze the serialization and deserialization of java in detail in the future.
 3 Ideas for implementation (serialization is an example, deserialization is the same)
 4 1.Call default serialization
 5 2.Write size
 6 3.Traverse the array, writing all collection elements
 7 **/
 8 private void writeObject(java.io.ObjectOutputStream s)
 9         throws java.io.IOException{
10         // Write out element count, and any hidden stuff
11         int expectedModCount = modCount;
12         s.defaultWriteObject();
13 
14         // Write out size as capacity for behavioural compatibility with clone()
15         s.writeInt(size);
16 
17         // Write out all elements in the proper order.
18         for (int i=0; i<size; i++) {
19             s.writeObject(elementData[i]);
20         }
21 
22         if (modCount != expectedModCount) {
23             throw new ConcurrentModificationException();
24         }
25     }
26 
27     /**
28      * Reconstitute the <tt>ArrayList</tt> instance from a stream (that is,
29      * deserialize it).
30      */
31     private void readObject(java.io.ObjectInputStream s)
32         throws java.io.IOException, ClassNotFoundException {
33         elementData = EMPTY_ELEMENTDATA;
34 
35         // Read in size, and any hidden stuff
36         s.defaultReadObject();
37 
38         // Read in capacity
39         s.readInt(); // ignored
40 
41         if (size > 0) {
42             // be like clone(), allocate array based upon size not capacity
43             ensureCapacityInternal(size);
44 
45             Object[] a = elementData;
46             // Read in all elements in the proper order.
47             for (int i=0; i<size; i++) {
48                 a[i] = s.readObject();
49             }
50         }
51     }
Serializable and Deserialize
/**
iterator
1.Learn about the fast failure mechanism
2.Focus on the implementation of forEachRemaining to extend the lambda expression
**/
public Iterator<E> iterator() {
        return new Itr();
    }

    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;

        public boolean hasNext() {
            return cursor != size;
        }

        @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        @Override
        @SuppressWarnings("unchecked")
        public void forEachRemaining(Consumer<? super E> consumer) {
            Objects.requireNonNull(consumer);
            final int size = ArrayList.this.size;
            int i = cursor;
            if (i >= size) {
                return;
            }
            final Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length) {
                throw new ConcurrentModificationException();
            }
            while (i != size && modCount == expectedModCount) {
                consumer.accept((E) elementData[i++]);
            }
            // update once at end of iteration to reduce heap write traffic
            cursor = i;
            lastRet = i - 1;
            checkForComodification();
        }

        final void checkForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }
iterator

3. ArrayList Use Recommendations

1. When you need an Array List with a fixed length, this length initialization is recommended for the reason that it prevents frequent expansion, which is relatively performance-intensive (note that this is relative to regular add-delete, change-check, and so on).

2. Use clone and toArray with caution. Data modification operations affect each other for source and method return objects.They eventually executed System.arrayCopy(), which is a "shallow copy": only the references are copied, and all modifications to one of the references are passed on to the other.Copies of java should all be "shallow copies". The tests are as follows:

/**
Test the code, do it anywhere, for reference only.
**/
class TestObj {

            TestObj(Integer i, String s) {
                this.i = i;
                this.s = s;
            }

            Integer i;

            String s;

            public String toString() {

                return "i=" + i + "," + "s=" + s;
            }
        }
        ArrayList<TestObj> arrayList = new ArrayList<>();

        int size = 10;


        for (Integer i = 0; i < size; i++) {

            arrayList.add(new TestObj(i, "test" + i));

        }


        ArrayList<TestObj> cloneArrayList = (ArrayList<TestObj>) arrayList.clone();

        cloneArrayList.add(new TestObj(101, "test" + 101));

        System.out.println("arrayList size: " + arrayList.size());
        System.out.println("cloneArrayList size: " + cloneArrayList.size());


        System.out.println("arrayList index 0: " + arrayList.get(0).toString());
        System.out.println("cloneArrayList index 0: " + cloneArrayList.get(0).toString());

        //modify cloneArrayList index 0
        TestObj testInteger = cloneArrayList.get(0);
        testInteger.i = 1000111;

        System.out.println("modify cloneArrayList index=0 After the object, ps: I did not modify it arrayList Oh");


        System.out.println("arrayList index 0: " + arrayList.get(0).toString());
        System.out.println("cloneArrayList index 0: " + cloneArrayList.get(0).toString());
//Test results:
arrayList size: 10
cloneArrayList size: 11
arrayList index 0: i=0,s=test0
cloneArrayList index 0: i=0,s=test0
//Modify cloneArrayList index=0 After the object, ps: I did not modify it arrayList Oh
arrayList index 0: i=1000111,s=test0
cloneArrayList index 0: i=1000111,s=test0
Test clone

3. Use sublists with caution, and after further analysis you will find that the underlying reasons are similar to those above and are the same references, so data modification operations can affect each other, as in the following examples:

 1 public static void testSubList() {
 2 
 3         Integer index = 10;
 4 
 5         List<Integer> myList = new ArrayList<>(index);
 6 
 7         for (int i = 0; i < index; i++) {
 8             myList.add(i);
 9         }
10 
11         List<Integer> mySubList = myList.subList(3, 5);
12 
13 
14 System.out.println ("Print myList:");
15         myList.forEach(System.out::println);
16 
17 System.out.println ("Add an element to mySubList, notice I'm not doing anything with myList");
18         mySubList.add(100);
19 
20 
21 System.out.println ("Print mySubList:");
22         mySubList.forEach(System.out::println);
23 
24 System.out.println ("Print myList again:");
25         myList.forEach(System.out::println);
26 
27 
28     }
29 Run results:
30 Print myList:
31 0
32 1
33 2
34 3
35 4
36 5
37 6
38 7
39 8
40 9
 41 add an element to mySubList, notice that I haven't done anything with myList
 42 Print mySubList:
43 3
44 4
45 100
 46 Print myList again:
47 0
48 1
49 2
50 3
51 4
52 100
53 5
54 6
55 7
56 8
57 9
Test subList

4. Careful friends may find that arrayList does not provide a way to construct arrayLists, but we know that array->arrayList is a common requirement, so what?There is more than one way, just choose what you like, as follows:

1 int[] ints = {1, 2, 3, 4};
2         List<Integer> myIntListByStream = Arrays.stream(ints).boxed().collect(Collectors.toList());
3 
4 
5         Integer[] integers = {1, 2, 3, 4};
6         List<Integer> myListByAsList = Arrays.asList(integers);   //ps: The resulting array is not modifiable,Modifications here refer to all changes List Behavior of data
7         List<Integer> myListByNewArrayListAsList = new ArrayList<>(Arrays.asList(integers));
8         List<Integer> myListByStream = Arrays.stream(integers).collect(Collectors.toList());
arrayToList

4. Source-related extensions (something I can think of that I can think about in depth)

1. fail fast mechanism:

We know that ArrayList is not a thread-safe collection class.This means that errors are prone to occur during concurrent operations.Follow the general thinking, find the result is wrong, throw an exception.However, in practical applications, we are not satisfied with this. Is there a detection mechanism that checks before concurrent operations and tells us in advance about possible errors?As you might expect, this is the quick failure mechanism we mentioned.How does it work?It's really simple. We define a counter, modCount, that records the number of changes in the structure of all collections, such as modCount+=2 for adding two elements and modCount+=3 for deleting three elements. This counter can only increase and it can't decrease. Then we do something that needs to fail quickly (For example, iterators, serializations, and so on), record the current modCount as expectedModCount before execution, and determine if expectedModCount and modCount are equal at the appropriate time to determine if there has been any change in the collection structure during this period.

2.lambda expression: Reading the source code, we found that many lambda supporting methods have been added to the ArrayList. As one of the highlights of JDK8, it should be skilled in using lambda, and then I find it interesting to analyze the implementation mechanism of lambda in detail.

Implementation of common JNI methods in 3.jdk: When we analyzed the clone method above, we only analyzed the Object's native methods in the end. I think it is interesting to have a chance to see the implementation of some commonly used native methods, which need to be studied.

5. Summarize

To be honest, it took more time to analyze ArrayList than I expected, and I thought I understood it thoroughly at one time, but I reviewed almost half of the source code on the way to this article (it's a sad story), but anyway, it was a good start, the next oneParse hashMap.

Keywords: Java Lambda less JDK

Added by D1proball on Sat, 06 Jul 2019 20:31:55 +0300