Java cultivation Guide: high frequency source code analysis reading notes - implementation collection class of Java data structure

1, Arrays tool class

From Java util. Arrays, various methods used to process arrays.

1.1 List asList(T... a)

It is used to return a fixed size List supported by a custom array. Although a List is returned here, it is an internal class in Arrays. The main methods here are:

There are no add and remove methods, so it does not support add and remove methods. It is a fixed length list, which does not support addition and deletion, but can be modified

Note: because parameters are generic, basic data types cannot be used as parameters, but arrays of basic data types can be used.

1.2 void sort(int[] a)

It has a series of overloaded methods, including overloaded methods of various basic data types and Object types (which implement the Comparable interface)

1.3 int binarySearch(int[] a, int key)

binarySearch can only search in an ordered array. If an element is found, it returns the array subscript. If there is no element, it returns a negative number.

1.4 int[] copyOf(int[] original, int newLength)

Call system_ arraycopy_ Implement array copy, copy the original array into a new array, and the array length is the set length.

1.5 boolean equals(int[] a, int[] a2)

Compare whether each element at the corresponding position in the two arrays is equal.

1.6 boolean deepEquals(Object[] a1, Object[] a2)

It is used to compare whether the elements of two arrays are equal. It is mainly used to compare multidimensional arrays and nested arrays at any level.

1.7 void fill(int[] a, int val)

There are many methods to assign values to overloaded arrays

2, ArrayList class

ArrayList is a dynamic array. It is implemented by an array. It supports random access. The elements are ordered and can be repeated.

RandomAccess interface is a tag interface, which is generally used for List implementation to indicate that they support fast (O(1)) random access.
Important attributes:

     * Default initial capacity.
    private static final int DEFAULT_CAPACITY = 10;

     * Shared empty array instance used for empty instances.
    private static final Object[] EMPTY_ELEMENTDATA = {};

     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     Used to determine whether to extend to default when adding an element for the first time_ CAPACITY
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
    transient Object[] elementData; // non-private to simplify nested class access

     * The size of the ArrayList (the number of elements it contains).
     * @serial
    private int size;

     * The number of times this list has been <i>structurally modified</i>.
     * Structural modifications are those that change the size of the
     * list, or otherwise perturb it in such a fashion that iterations in
     * progress may yield incorrect results.
     * <p>This field is used by the iterator and list iterator implementation
     * returned by the {@code iterator} and {@code listIterator} methods.
     * If the value of this field changes unexpectedly, the iterator (or list
     * iterator) will throw a {@code ConcurrentModificationException} in
     * response to the {@code next}, {@code remove}, {@code previous},
     * {@code set} or {@code add} operations.  This provides
     * <i>fail-fast</i> behavior, rather than non-deterministic behavior in
     * the face of concurrent modification during iteration.
     * <p><b>Use of this field by subclasses is optional.</b> If a subclass
     * wishes to provide fail-fast iterators (and list iterators), then it
     * merely has to increment this field in its {@code add(int, E)} and
     * {@code remove(int)} methods (and any other methods that it overrides
     * that result in structural modifications to the list).  A single call to
     * {@code add(int, E)} or {@code remove(int)} must add no more than
     * one to this field, or the iterators (and list iterators) will throw
     * bogus {@code ConcurrentModificationExceptions}.  If an implementation
     * does not wish to provide fail-fast iterators, this field may be
     * ignored.
     	Provide quick failure mode
    protected transient int modCount = 0;

The default initial capacity of ArrayList is 0 instead of 10. 10 is the default set when adding elements for the first time, that is, defaultcapability_ EMPTY_ELEMENTDATA, then it will increase to 10 for the first time. If EMPTY_ELEMENTDATA, then it will be expanded by 1.5 times on the original basis each time.

2.1 boolean add(E e)

The overload method also has void add(int index, E element)
Via arrays Cooy () expands the dynamic array. First, the collection size will be determined through the judgment of ensureCapacityInternal. Among them, the logic has the above-mentioned default capability_ EMPTY_ Elementdata's special judgment on it. Each modification will add 1 to the modCount. This is to ensure that the ArrayList iterator is used. If it is modified in concurrent operations, it can fail quickly (similar to optimistic locking)
Only when the array element exceeds the current length can it be expanded. The maximum expansion can be 0x7fffff and the maximum value of int.
null values can be added.
Modify modCount

2.2 E remove(int index)

Overload methods include: boolean remove(Object o) to remove the elements found in the * * first query * *, Boolean removeAll (collection <? > C) to remove the relevant elements in all collections (not the first), and Boolean removeif (predict <? Super E > filter) to remove the qualified elements.

In the remove method, it mainly applies system_ arraycopy_ Remove the element.
Modify modCount

2.3 E set(int index, E element)

Sets the element of an element. modCount will not be modified.

2.4 E get(int index)

Returns the element at that location

2.5 int indexOf(Object o)

Returns the subscript position of the element for the first time. There is no return_- 1_
int lastIndexOf(Object o) the subscript position of the first occurrence from back to front.

2.6 Iterator iterator()

Return an Itr internal class, which belongs to the implementation class of the Iterator interface. Internally, check the expectedModCount and modCount through the checkForComodification() method to realize the optimistic locking mechanism.

ArrayList also provides an internal method, listiterator < E > listiterator(), which is used to generate the internal class of ListItr. It is a subclass of Itr. Compared with the internal class of Itr, it provides the previous method and add method of forward traversal.

2.7 traversal mode

In addition to the Iterator above, there are three traversal methods, namely
for (int i = 0; i < a.size(); i++) { a.get(i); } This kind of direct for does not check modCount
Void foreach (consumer <? Super E > action) is performed through lamda expression, which will also check modCount
for (Integer s : a) { System._out_.println(s); } This is a syntax sugar of JDK. If it is decompiled, it is still traversed by Iterator.

2.8 List subList(int fromIndex, int toIndex)

It is used to generate a sublist class. It is an internal class and can be regarded as the view of the original collection. If the sublist class is modified and added, the original array will also be operated accordingly.

2.9 int size()

Returns the number of sets, not the number of arrays.

2.10 void trimToSize()

This method is used to recover excess memory, that is, once the set is no longer added to the redundant elements, the method is called to adjust the size of the array to the size of the collection element.

3, LinkedList class

A collection class implemented through a double linked list. Because LinkedList implements Deque, LinkedList can be directly used as a declaration method of Deque and Queue, but ArrayList can't. only ArrayDeque can be used.

Important attributes:

    transient int size = 0;

     * Pointer to first node.
     * Invariant: (first == null && last == null) ||
     *            (first.prev == null && first.item != null)
    transient Node<E> first;

     * Pointer to last node.
     * Invariant: (first == null && last == null) ||
     *            ( == null && last.item != null)
    transient Node<E> last;

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
   = next;
            this.prev = prev;

The linked list does not need to initialize the length of the set in advance.

3.1 boolean add(E e)

Overload methods include: void addlast (E, e) is consistent with add, void addfirst (E, e), void add(int index, E element), Boolean addall (collection <? Extensions E > C), Boolean addall (int index, collection <? Extensions E > C). Insert the collection at the specified position

Insert element

3.2 E remove()

Overload methods include: E removeFirst() is consistent with the default, E remove(int index), boolean remove(Object o), boolean removeFirstOccurrence(Object o) is consistent with the previous remove(Object o), E removeLast(), boolean removeLastOccurrence(Object o), Boolean removeAll (collection <? > C), Boolean removeif (predict <? Super E > filter)
Removing Elements

3.3 E set(int index, E element)

Replace the element at the specified location

3.4 E get(int index)

Overload methods: E getFirst(), E getLast()
Get element

3.5 traversal mode

Linked has the same traversal method as ArrayList. However, due to its internal data results, if an ordinary for loop is used, it will lead to more time consumption, because its get time complexity is O(N).
You can use an iterator, which records the previous Node, so you don't need to traverse from scratch when accessing the next Node.
Others are similar to ArrayList.

4, HashMap class

HashMap uses the chain address method to solve hash conflicts in jdk1 In 7, HashMap is composed of array + linked list, but in jdk1 In 8, HashMap is composed of array + linked list + red black tree.

	// The default minimum capacity is 16
	static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     	Default maximum capacity
    static final int MAXIMUM_CAPACITY = 1 << 30;

     * The load factor used when none specified in constructor.
     Load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     When the node on the bucket is greater than this value, it is converted into a red black tree
    static final int TREEIFY_THRESHOLD = 8;

     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     When the number of Bucket nodes is less than this value, it is converted to a linked list
    static final int UNTREEIFY_THRESHOLD = 6;

     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     When the number of elements in the collection is less than this value, even if the number in the above bucket exceeds the threshold, it will not be converted into a tree, but will be expanded.
    static final int MIN_TREEIFY_CAPACITY = 64;

     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     Save actual data
    transient Node<K,V>[] table;

     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
    transient Set<Map.Entry<K,V>> entrySet;

     * The number of key-value mappings contained in this map.
    transient int size;

     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
    transient int modCount;

     * The next size value at which to resize (capacity * load factor).
     * @serial
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
	// The threshold of capacity expansion is generally capacity * load factor. Each capacity expansion is always twice the previous one, that is, the idempotent of 2
    int threshold;

     * The load factor for the hash table.
     * Load factor, the default value is 0.75, which can be greater than 1 because of the linked table
     * @serial
    final float loadFactor;


    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);

4.1 V put(K key, V value)

Similar methods: void putall (map <? Extensions K,? Extensions V > m), V putIfAbsent(K key, V value)
In this method, if the size is greater than the threshold, expand the capacity and call the resize() method. Other logic is based on whether the bucket where the hash value is located has a value. If not, it is directly added to the node; If so, judge whether the location should be a linked list or a red black tree, and add the corresponding operation.
Note: if it is a linked list, when the quantity is greater than or equal to the threshold_ TREEIFY_THRESHOLD_, Then it is transformed into red black tree; Red and black trees also have corresponding operations to convert to linked lists.

The methods afternodeaccess (node < K, V > P) and afterNodeInsertion(boolean evict) are not implemented in HashMap, but are handled in LinkedHashMap.

4.2 Node<K,V>[] resize()

In this method, capacity expansion refers to the size adjustment of the attribute table array, which is similar to the dynamic capacity expansion of ArrayList, but there are more operations on linked list and red black tree. Here, the threshold attribute during capacity expansion is threshold.
In the Map, the maximum expandable capacity is still 0x7fffff. Of course, this does not mean that the maximum size is the value, but that there are only so many value buckets at most.
In jdk1 In 8, each expansion is twice the direct expansion, so the original element is either in the original position or in the original position + the length of the original array.

4.3 V remove(Object key)

Similar method: Boolean remove (object key, object value) when the key values are equal at the same time, remove
To remove elements from Map, generally find the location of the bucket (through hash) and then judge. If it is a linked list, traverse the linked list and delete the elements to be deleted. If it is a red black tree, traverse the tree and delete the elements to adjust the balance. When the red black tree is less than the threshold untreeify_ When threshold, it is converted into a linked list.
For afternoderemoval (node < K, V > P), LinkedHashMap is also used for processing

4.4 V get(Object key)

Similar methods: V getOrDefault(Object key, V defaultValue), boolean containsKey(Object key), boolean containsValue(Object value) find the time complexity of each value, O(N) through traversal
Find the element according to the key and view the first node of the bucket. If yes, it will return. If not, it will traverse the linked list or red black tree. Does not exist. Return null.

4.5 traversal method

The set of keys or values can be obtained through set < k > keyset() and collection < V > values().
It is not recommended to use keySet to get every value after obtaining the set of keys.
You can use set < map Entry < K, V > > entryset() to directly obtain the set of key and value.
Iterators can be used in the above three methods of obtaining collections.
Note: the change of modCount

5, LinkedHashMap class

LinkedHashMap is a set based on HashMap, which has all the characteristics of HashMap set. Except that HashMap is disordered, LinkedHashMap is orderly. This is because LinkedHashMap maintains a two-way linked list with all data on the basis of HashMap, which saves the sequence of element iteration.

Entry node:

     * HashMap.Node subclass for normal LinkedHashMap entries.
    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);

It is a two-way linked list.

     * The head (eldest) of the doubly linked list.
    transient LinkedHashMap.Entry<K,V> head;

     * The tail (youngest) of the doubly linked list.
    transient LinkedHashMap.Entry<K,V> tail;

     * The iteration ordering method for this linked hash map: <tt>true</tt>
     * for access-order, <tt>false</tt> for insertion-order.
     * @serial
    final boolean accessOrder;

In addition to the attributes inherited from HashMap, a two-way linked list is maintained separately, which are the necessary attributes and data structures of the two-way linked list.

5.1 adding functions

There is no put method in LinkedHashMap. Instead, it directly calls the put method of HashMap, but rewrites the methods of afternodeaccess (node < K, V > P) and afterNodeInsertion(boolean evict).
Of course, when adding a new element, the new element is also added to the end of the linked list. Through the method linkNodeLast()
afterNodeAccess takes effect when accessOrder is true (and the current node is not the tail node), that is, it is sorted according to the access order. In the specific implementation, the node is moved from other positions to the tail.
afterNodeInsertion mainly deals with whether to remove the elements of the first node after adding. This feature can be used to implement the LRU algorithm, in which the Boolean removeeldestentry (map. Entry < K, V > eldest) method needs to be rewritten.

5.2 deleting elements

LinkedHashMap directly calls the remove method of HashMap. Just override the afterNodeRemoval method,
In this method, the function of removing an element in the double linked list is added. Relatively simple

5.3 traversal elements

LinkedHashMap is traversed according to the double linked list, which is ordered. The method entrySet returns the internal class of linkedenryset.

6, TreeMap class

TreeMap is an ordered k-v set related to Tree and Map set, which is realized by red black Tree as the bottom layer
It not only inherits AbstractMap, but also implements the interface NavigableMap, indicating that it supports a series of navigation methods to obtain the specified set, such as obtaining the set smaller than the specified key.

Because it is an ordered set, constructors can be directly passed in the construction method, such as public treemap (comparator <? Super k > comparator). Of course, the comparator Comparable can also be implemented in the class of key
The order here is to sort the elements according to the key. The LinkedHashMap above only ensures an insertion / access order, not according to the key.

6.1 V put(K key, V value)

Similar methods: void putall (map <? Extensions K,? Extensions V > map), V putIfAbsent(K key, V value)
If the comparator method is not implemented, the key is not allowed to be null

6.2 V remove(Object key)

Similar method: boolean remove(Object key, Object value)
Removing Elements

Note: This article is the reading notes of "Java Practice Guide: high frequency source code analysis"

Keywords: Java

Added by liquefi on Mon, 21 Feb 2022 09:08:42 +0200