Debugging the underlying Java code: the correct way to de-duplicate data in a List, and a comparison of correct and incorrect uses of Java List remove

Preface

List implementations are mainly built on two kinds of data structure: arrays and linked lists.

Array-based lists, all of which implement the RandomAccess interface

Array-based implementations include ArrayList, Vector, Stack and CopyOnWriteArrayList. ArrayList and Vector extend AbstractList, and Stack extends Vector; CopyOnWriteArrayList implements List directly. AbstractList implements part of the List interface and provides traversal of the structure, which is the Iterator part of the Iterator design pattern mentioned above. Stack implements a LIFO stack, and Vector provides thread safety through synchronized methods. CopyOnWriteArrayList is a thread-safe variant: reads take no lock and therefore perform better, while writes take a lock, copy the underlying array, modify the copy, and then publish the copy as the new underlying array. This allows the list to be modified during traversal without throwing ConcurrentModificationException.

Linked lists, which do not implement the RandomAccess interface

The underlying LinkedList uses the Entry class to store data elements (the class is named Node in newer JDK versions). The linked list is joined hand in hand: from any element you can reach both the previous element and the next element. The traversal mode of a linked list differs from that of an array. The iterator's constructor initializes the value of next using the check if (index < (size >> 1)). This introduces the >> and >>> operators: >> is the signed right shift and >>> is the unsigned right shift.

For example, 3 >> 1 moves binary 0011 one bit to the right, giving 0001, so the result is 1. 8 >>> 3 moves 8 three bits to the right, which equals 8 divided by 2 to the third power, so the result is also 1.

The check if (index < (size >> 1)) decides whether the element is in the first half or the second half of the list. size >> 1 is equivalent to size / 2 for a non-negative size. It locates the indexed element faster, because the search starts from whichever end is closer instead of always traversing from the head. This is the advantage of the iterator design pattern: it hides the details of the internal implementation and provides a unified iteration interface for different data structures.
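
The shift arithmetic above can be checked with a small sketch. The class name ShiftDemo and its methods are hypothetical and only mirror the check described in the text; this is not the JDK source.

```java
// Illustrative only: ShiftDemo and its methods are hypothetical names.
class ShiftDemo {
    // Signed right shift by one bit halves a non-negative int, like size / 2.
    static int half(int size) {
        return size >> 1;
    }

    // Mirrors the check `index < (size >> 1)`:
    // true means the element sits in the first half, so start from the head.
    static boolean startFromHead(int index, int size) {
        return index < (size >> 1);
    }

    public static void main(String[] args) {
        System.out.println(3 >> 1);     // 0011 -> 0001, prints 1
        System.out.println(8 >>> 3);    // 8 / 2^3, prints 1
        System.out.println(-8 >> 1);    // sign bit is kept, prints -4
        System.out.println(-8 >>> 28);  // zeros are shifted in, prints 15
        System.out.println(startFromHead(2, 10)); // prints true
        System.out.println(startFromHead(7, 10)); // prints false
    }
}
```

The two operators only differ for negative values, which is why either one works for a list size.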

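Main method for preparing the test list

The snippets below are methods of a single class. They assume imports from java.util and java.util.concurrent, plus a StopWatch class that offers start(), stop() and getTotalTimeMillis(), as Spring's org.springframework.util.StopWatch does.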

    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<String>();
        for (int i = 0; i < 10000; i++) {
            list.add("shanghai");
            list.add("shanghai");
            list.add("guangzhou");
            list.add("shenzhen");
            list.add("shanghai");
            list.add("hangzhou");
            list.add("shanghai");
            list.add("beijing");
            list.add("shanghai");
            list.add("shanghai");
            list.add("guangzhou");
            list.add("shenzhen");
            list.add("shanghai");
            list.add("hangzhou");
            list.add("shanghai");
        }

//        print(list);
        System.out.println("\n");
        removeSetRepeat(list); // 9ms  25ms
//        removeStrRepeat(list);  //19ms  78ms
        System.out.println("\n");
//        removeRowRepeat(list);  //9ms  24ms
    }

    private static void print(List<String> list) {
        System.out.print("item:");
        for (String item : list) {
            System.out.print(item+": ");
        }
    }

Comparing de-duplication approaches in a for loop

De-duplication by StringBuilder concatenation

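    /*
     * StringBuilder de-duplication. The loop runs from the end so that
     * removing index i does not shift the elements still to be visited.
     * The last occurrence of each value is the one that is kept.
     */
    public static void removeStrRepeat(List<String> list){
        StopWatch sw = new StopWatch();
        sw.start();
        // "x" delimits the values seen so far: x<value>x<value>x...
        StringBuilder stringBuilder = new StringBuilder("x");
        for(int i = list.size() - 1; i >= 0; i--){
            String item = list.get(i);
            // Search for the delimited value: a bare contains(item) would also
            // match substrings of longer values. Values that themselves
            // contain "x" can still collide.
            if(item == null || stringBuilder.indexOf("x" + item + "x") >= 0){
                list.remove(i);
                continue;
            }
            stringBuilder.append(item).append("x");
        }
        print(list);
        sw.stop();
        System.out.println("----: " + sw.getTotalTimeMillis()+"ms");
    }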
StringBuilder de-duplication: 140,000 items take 19 ms; 1.4 million items take 78 ms.

ArrayList de-duplication

    /*
     * ArrayList de-duplication: 140,000 items take 9 ms and 1.4 million items take 24 ms
     */
    public static void removeRowRepeat(List<String> list){
        StopWatch sw = new StopWatch();
        sw.start();
        List<String> rowx = new ArrayList<>();
        for(int i = list.size() - 1; i >= 0; i--){
            String item = list.get(i);
            if(item == null || rowx.contains(item)){
                list.remove(i);
                continue;
            }
            rowx.add(item);
        }
        print(list);
        sw.stop();
        System.out.println("----: " + sw.getTotalTimeMillis()+"ms");
    }
ArrayList de-duplication: 140,000 items take 9 ms; 1.4 million items take 24 ms.

Set de-duplication

    /*
     * HashSet de-duplication: 140,000 items take 9 ms and 1.4 million items take 25 ms.
     * HashSet does not preserve insertion order, and the original list is left unchanged.
     */
    public static void removeSetRepeat(List<String> list){
        StopWatch sw = new StopWatch();
        sw.start();
        Set<String> set = new HashSet<>(list);
        print(new ArrayList<>(set));
        sw.stop();
        System.out.println("----: " + sw.getTotalTimeMillis()+"ms");
    }
Set de-duplication: 140,000 items take 9 ms; 1.4 million items take 25 ms, which is about the same as the ArrayList for-loop de-duplication.
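
If the original order of the elements matters, one possible variation is LinkedHashSet, which keeps the first occurrence of each value in insertion order. This is a swapped-in technique, not the method measured above, and the class name OrderedDedup is hypothetical; no timing is claimed for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Illustrative only: OrderedDedup is a hypothetical helper class.
class OrderedDedup {
    // Returns a new list without duplicates; the input list is not modified.
    // LinkedHashSet remembers insertion order, so first occurrences win.
    static List<String> dedupKeepOrder(List<String> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList(
                "shanghai", "shanghai", "guangzhou", "shenzhen", "shanghai");
        // prints [shanghai, guangzhou, shenzhen]
        System.out.println(dedupKeepOrder(cities));
    }
}
```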

Removing elements in a for loop bounded by size

Incorrect cases

IndexOutOfBoundsException: index out of bounds

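    public static void remove11(List<String> list, String target) {
        // Bug: size is read only once, but the list shrinks on every removal,
        // so list.get(i) eventually throws IndexOutOfBoundsException.
        int size = list.size();
        for (int i = 0; i < size; i++) {
            String item = list.get(i);
            if (target.equals(item)) {
                list.remove(item);
            }
        }
        print(list);
    }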

Index starts from 0: remove cannot delete every occurrence of the target keyword shanghai, because the index and the size move toward each other and collide

Because the index keeps increasing while each deletion decreases the size of the list, the remaining elements shift left and list.get(i) may skip one of them.
    /*
     * Incorrect: the index starts from 0 and remove cannot delete every occurrence of the target keyword shanghai.
     * The index keeps increasing while each deletion shrinks the list, so list.get(i) may skip an element.
     */
    public static void remove12(List<String> list, String target) {
        //item:beijing: shanghai: shanghai: guangzhou: shenzhen: hangzhou: 
        for (int i = 0; i < list.size(); i++) {
            String item = list.get(i);
            System.out.println("----: " + item);
            if (target.equals(item)) {
                list.remove(item);
            }
        }
        print(list);
    }

Correct case

Iterate in reverse: list.remove works without error

The size shrinks from the end and the index also decreases from its maximum value, so the two move in the same direction and list.remove causes no errors.
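    public static void remove13(List<String> list, String target) {
        int size = list.size();
        // Walking down from the last index: a removal only shifts
        // elements that have already been visited.
        for (int i = size - 1; i >= 0; i--) {
            String item = list.get(i);
            if (target.equals(item)) {
                // Remove by index; remove(item) would search from the head
                // for the first match on every call.
                list.remove(i);
            }
        }
        print(list);
    }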

    // Same as remove13, but without caching size in a local variable.
    public static void remove14(List<String> list, String target){
        for(int i = list.size() - 1; i >= 0; i--){
            String item = list.get(i);
            if(target.equals(item)){
                list.remove(i);
            }
        }
        print(list);
    }
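
The difference between the forward and the reverse loop can be seen on a four-element list. This is a minimal sketch; the class IndexRemovalDemo and its method names are hypothetical, and both methods work on a copy so that the input stays untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: IndexRemovalDemo is a hypothetical class.
class IndexRemovalDemo {
    // Forward loop: after remove(i) the next element slides into slot i,
    // but i++ steps past it, so the second of two adjacent matches survives.
    static List<String> removeForward(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        for (int i = 0; i < list.size(); i++) {
            if (target.equals(list.get(i))) {
                list.remove(i);
            }
        }
        return list;
    }

    // Reverse loop: a removal only shifts elements that were already visited.
    static List<String> removeBackward(List<String> source, String target) {
        List<String> list = new ArrayList<>(source);
        for (int i = list.size() - 1; i >= 0; i--) {
            if (target.equals(list.get(i))) {
                list.remove(i);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> cities = Arrays.asList("beijing", "shanghai", "shanghai", "guangzhou");
        // prints [beijing, shanghai, guangzhou]
        System.out.println(removeForward(cities, "shanghai"));
        // prints [beijing, guangzhou]
        System.out.println(removeBackward(cities, "shanghai"));
    }
}
```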

Traversing the list with foreach: for (String item : list)

Incorrect: calling remove inside for (String item : list) causes ConcurrentModificationException

When the list is traversed with foreach and list.remove is called, the modCount value of the list object changes, while the expectedModCount value held by the list's iterator does not. Therefore a ConcurrentModificationException is thrown.

    /*
     * Incorrect: removing through the list during foreach changes the list's modCount, but the iterator's expectedModCount is not updated, so a ConcurrentModificationException is thrown.
     */
    public static void remove21(List<String> list, String target){
        for(String item : list){
            System.out.print(item+": ");
            if(target.equals(item)){
                list.remove(item);
            }
        }
        print(list);
    }
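
The failure can be reproduced in isolation with the sketch below. The class ForeachRemoveDemo is hypothetical. The second call shows a known quirk of OpenJDK's ArrayList: when the removed element is the second to last, hasNext() already returns false, so the loop ends silently and the last element is never visited. The check is best effort and should not be relied on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Illustrative only: ForeachRemoveDemo is a hypothetical class.
class ForeachRemoveDemo {
    // Returns true if removing inside foreach tripped the fail-fast check.
    static boolean failsFast(List<String> list, String target) {
        try {
            for (String item : list) {
                if (target.equals(item)) {
                    list.remove(item); // changes modCount behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> four = new ArrayList<>(
                Arrays.asList("beijing", "shanghai", "guangzhou", "shenzhen"));
        System.out.println(failsFast(four, "shanghai"));  // prints true

        // The match is the second-to-last element: no exception is thrown.
        List<String> three = new ArrayList<>(
                Arrays.asList("beijing", "shanghai", "guangzhou"));
        System.out.println(failsFast(three, "shanghai")); // prints false
    }
}
```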

Correct: CopyOnWriteArrayList solves the concurrent modification problem of List

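    public static void remove22(ArrayList<String> list, String target) {
        // The removal happens on a copy; the list passed in is not modified.
        final CopyOnWriteArrayList<String> cowList = new CopyOnWriteArrayList<String>(list);
        // The foreach iterator reads a snapshot of the array,
        // so removing from cowList during the loop is safe.
        for (String item : cowList) {
            if (target.equals(item)) {
                cowList.remove(item);
            }
        }
        print(cowList);
    }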

Each CopyOnWriteArrayList object holds an array object that stores the actual elements. An exclusive lock (a ReentrantLock in JDK 8; the mechanism differs in some later JDK versions) ensures that only one thread modifies the array at a time.

CopyOnWriteArrayList is the only concurrent List in the java.util.concurrent package. It is a thread-safe counterpart of ArrayList. Modifications are carried out on a copy of the underlying array (a snapshot), that is, a copy-on-write strategy is used: add, set and remove each copy the array and operate on the new copy.

Principle: every time an element is added, a new array is copied. With multiple threads, add/set/remove operate on the new array while get reads the old array, so get needs no lock, but add/set/remove still need an exclusive lock. This gives the so-called read-write separation.

The copy operation of CopyOnWriteArrayList causes the following problems:

  1. Memory usage: every structural modification copies the container, so the old array and the new array coexist and occupy two copies of memory. If the array is large, this causes frequent garbage collection and reduces performance;
  2. CopyOnWrite guarantees only the eventual consistency of the data, not real-time consistency.

For example, while one thread is modifying the container and another thread is reading it, the two threads are actually working on two different arrays, so the reading thread sees the old data.

Therefore, CopyOnWriteArrayList is better suited to scenarios where reads far outnumber writes.
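
The snapshot behaviour can be observed even in a single thread: an iterator created before a write keeps reading the old array. The sketch below is illustrative, and the class SnapshotDemo and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only: SnapshotDemo is a hypothetical class.
class SnapshotDemo {
    // Collects what an iterator created BEFORE a write still sees afterwards.
    static List<String> readOldSnapshot(CopyOnWriteArrayList<String> cow, String added) {
        Iterator<String> it = cow.iterator(); // bound to the current array
        cow.add(added);                       // the write goes to a fresh copy
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> cow =
                new CopyOnWriteArrayList<>(Arrays.asList("beijing", "shanghai"));
        // prints [beijing, shanghai]: the iterator does not see "hangzhou"
        System.out.println(readOldSnapshot(cow, "hangzhou"));
        // prints [beijing, shanghai, hangzhou]: the list itself does
        System.out.println(cow);
    }
}
```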

Traversal with list.iterator()

Incorrect case: calling list.remove while iterating with list.iterator() throws ConcurrentModificationException

    /*
     * Incorrect case: calling list.remove while iterating with list.iterator() throws ConcurrentModificationException
     */
    public static void remove31(List<String> list, String target){
        Iterator<String> iter = list.iterator();
        while (iter.hasNext()) {
            String item = iter.next();
            if (item.equals(target)) {
                list.remove(item);
            }
        }
        print(list);
    }

This throws java.util.ConcurrentModificationException. The foreach loop is actually shorthand for the Iterable, hasNext and next calls, so the analysis starts from list.iterator(): tracing the iterator() method shows that it returns an Itr iterator object.

Correct case: delete through the inner class Itr returned by list.iterator()

    /*
     * Correct: delete through the inner class Itr returned by list.iterator()
     */
    public static void remove32(List<String> list, String target){
        Iterator<String> iter = list.iterator();
        while (iter.hasNext()) {
            String item = iter.next();
            if (item.equals(target)) {
                iter.remove();
            }
        }
        print(list);
    }
Deleting through the inner class Itr returned by list.iterator() still changes the modCount of the ArrayList, but Itr.remove() then sets expectedModCount to the new modCount, so modCount != expectedModCount is always false.
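
The interplay of modCount and expectedModCount can be modelled in a few lines. The sketch below is a simplified model written for illustration; TinyList is a hypothetical class with a fixed capacity, not the JDK source, and it omits bounds checks on add.

```java
import java.util.ConcurrentModificationException;
import java.util.NoSuchElementException;

// Simplified model for illustration; TinyList is hypothetical, not the JDK source.
class TinyList {
    private final String[] data = new String[16]; // fixed capacity for brevity
    private int size = 0;
    private int modCount = 0; // counts structural changes

    void add(String s) {
        data[size++] = s;
        modCount++;
    }

    void removeAt(int index) {
        System.arraycopy(data, index + 1, data, index, size - index - 1);
        data[--size] = null;
        modCount++;
    }

    int size() {
        return size;
    }

    class Itr {
        private int cursor = 0;
        private int lastRet = -1;
        private int expectedModCount = modCount; // snapshot at creation

        boolean hasNext() {
            return cursor != size;
        }

        String next() {
            // Fail fast if the list changed without going through this iterator.
            if (modCount != expectedModCount) throw new ConcurrentModificationException();
            if (cursor >= size) throw new NoSuchElementException();
            lastRet = cursor;
            return data[cursor++];
        }

        void remove() {
            if (lastRet < 0) throw new IllegalStateException();
            removeAt(lastRet);           // modCount changes here
            cursor = lastRet;            // step back onto the shifted element
            lastRet = -1;
            expectedModCount = modCount; // resynchronize, so next() stays happy
        }
    }

    public static void main(String[] args) {
        TinyList list = new TinyList();
        list.add("beijing");
        list.add("shanghai");
        list.add("guangzhou");
        TinyList.Itr it = list.new Itr();
        while (it.hasNext()) {
            if ("shanghai".equals(it.next())) {
                it.remove();
            }
        }
        System.out.println(list.size()); // prints 2
    }
}
```

Removing through the iterator keeps the two counters equal, while a direct removeAt call on the list leaves the iterator's snapshot stale, which is what the next call to next() detects.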

 

Keywords: Java

Added by witty on Fri, 11 Feb 2022 15:46:24 +0200