Preface

When HashMap comes up in conversation, many developers realize they don't actually understand the basics. This article walks through them so you won't be caught off guard in an interview.

**Introduction to HashMap

HashMap Initialization

HashMap Expansion Mechanism

HashMap Data Structure

HashMap Hash Collision Resolution

Simple Use of HashMap**

1. Introduction to HashMap

HashMap is a hash-table-based implementation of the Map interface. It provides all of the optional map operations and permits null values and the null key. It does not guarantee the order of its mappings; in particular, it does not guarantee that the order will remain constant over time. It assumes the hash function disperses elements properly among the buckets, which gives constant-time performance for the basic operations (get and put). Note that HashMap is NOT thread-safe, while Hashtable is thread-safe. HashMap extends the AbstractMap class and implements the Map, Cloneable, and Serializable interfaces.

```java
public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable
```
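Since HashMap is not thread-safe, concurrent use needs external synchronization. A minimal sketch using the standard `Collections.synchronizedMap` wrapper (one common option; `ConcurrentHashMap` is another):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SyncDemo {
    public static void main(String[] args) {
        // HashMap itself is not thread-safe; wrap it for concurrent access
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());
        map.put("k", 1);
        System.out.println(map.get("k")); // 1
    }
}
```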

2. HashMap Initialization

Important parameter description

```java
// Default initial capacity
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
// Maximum capacity
static final int MAXIMUM_CAPACITY = 1 << 30;
// Default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
```

A HashMap records two things when it is initialized: an initial capacity (the number of buckets) of 16 and a load factor of 0.75.

```java
public HashMap() {
    // Initial bucket count: DEFAULT_INITIAL_CAPACITY = 16
    // Load factor: DEFAULT_LOAD_FACTOR = 0.75f
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    // Cap the capacity at the maximum value
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    // Store the load factor (default 0.75)
    this.loadFactor = loadFactor;
    // The table is allocated lazily; until then, threshold holds the initial capacity (16)
    threshold = initialCapacity;
    init();
}
```

3. HashMap Expansion Mechanism

The load factor can be understood as a saturation level. The larger the load factor, the less space the map wastes, but the slower queries become; the smaller the load factor, the more space is consumed, but query efficiency improves. This trade-off is determined by the data structure, described below.

The effective capacity of a HashMap is loadFactor × capacity, which by default is 16 × 0.75 = 12. This is important and affects efficiency: once the number of entries stored in the HashMap exceeds this threshold, the HashMap must resize (double its capacity and redistribute all entries).
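The threshold arithmetic above can be checked directly (the constants are copied from the source fields shown earlier):

```java
public class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;   // 16 buckets
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    public static void main(String[] args) {
        // The effective capacity before the first resize
        int threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        System.out.println(threshold); // 12: the 13th entry triggers a resize to 32 buckets
    }
}
```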

```java
private void inflateTable(int toSize) {
    // The smallest power of 2 that is >= toSize
    int capacity = roundUpToPowerOf2(toSize);
    // The effective maximum capacity; expansion is required once size exceeds it
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    // The Entry array that holds the put data
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}

// The smallest power of 2 that is >= number; e.g. number 3 returns 4, number 4 returns 4
private static int roundUpToPowerOf2(int number) {
    return number >= MAXIMUM_CAPACITY
            ? MAXIMUM_CAPACITY
            : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
}

// Add an element, expanding first if required
void addEntry(int hash, K key, V value, int bucketIndex) {
    // size is the current entry count, threshold the effective maximum capacity
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length); // expansion
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}

// Expansion
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    // If the old array is already at the maximum capacity, no further expansion is allowed
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    // Allocate the new array and move the old entries into it
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

// Recompute each entry's position in the new array and relink it there
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
```

Because allocating a new array and recomputing every element's position has a significant performance cost, it is best to estimate the number of elements, pass that size when constructing the HashMap, and ideally make it a power of two, so the space can be used more effectively.
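A minimal sketch of this pre-sizing advice (the expected element count of 1000 is an assumption for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class PreSizeDemo {
    public static void main(String[] args) {
        int expectedSize = 1000; // assumed element count, for illustration only
        // Choose a capacity so that expectedSize stays below capacity * 0.75;
        // HashMap rounds it up to a power of 2, and no resize occurs while filling
        int capacity = (int) (expectedSize / 0.75f) + 1;
        Map<Integer, String> map = new HashMap<>(capacity);
        for (int i = 0; i < expectedSize; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```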

4. HashMap Data Structure

A HashMap is a hash bucket structure (an array of linked lists) that stores key-value mappings. It combines the two data structures, inheriting the array's fast indexed lookup and the linked list's cheap insertion and removal for queries and modifications. (Picture the array laid out horizontally, with a linked list hanging vertically from each bucket.)

5. HashMap Hash Collision Resolution

Because of hash collisions, different keys may produce the same hash value, and HashMap resolves this with a linked-list structure. Normally, when we put a key-value pair, the HashMap computes the bucket position from the key's hash value, then checks whether there is already data at that position. If there is, it checks whether the hash and key are equal; if they are, the new value replaces the old one and the old value is returned. (This is not a collision; it is the same key.)

```java
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    // Compute the hash
    int hash = hash(key);
    // Compute the position in the bucket array
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        // Same hash and same key: replace the old value and return it
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    // Slot i is empty; or occupied with a different hash; or same hash, different key
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
```

If slot i is empty, or occupied with a different hash, or occupied with the same hash (a hash collision) but a different key, the following path is taken:

In the source below, bucketIndex is the new entry's position in the array. If there is no data at that position, the new entry is stored with a null next pointer (e is null). If there is already data, the new entry is stored at that position and its next pointer links to the old entry (e), i.e. head insertion.

```java
void createEntry(int hash, K key, V value, int bucketIndex) {
    // The entry currently at this position, possibly null
    Entry<K,V> e = table[bucketIndex];
    // Store the new entry at the head, linking the old entry (e) behind it
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}
```
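A quick sketch of chaining in action (the Key class is invented for illustration; its hashCode deliberately forces every instance into the same bucket). Both entries survive and remain retrievable:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CollisionDemo {
    // A deliberately bad key: every instance hashes to the same bucket
    static class Key {
        final String name;
        Key(String name) { this.name = name; }
        @Override public int hashCode() { return 42; } // force collisions
        @Override public boolean equals(Object o) {
            return o instanceof Key && Objects.equals(name, ((Key) o).name);
        }
    }

    public static void main(String[] args) {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("a"), "first");
        map.put(new Key("b"), "second"); // same hash, different key: chained, not replaced
        System.out.println(map.get(new Key("a"))); // first
        System.out.println(map.get(new Key("b"))); // second
        System.out.println(map.size()); // 2
    }
}
```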

Will collisions occur even if the hash values are different? The answer is yes.

First, let's look at how HashMap computes an entry's position in the bucket array:

```java
static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length - 1);
}
```

A refresher on the bitwise AND (&) operation: the two operands are compared bit by bit in binary.

The rules are: 0&0=0; 0&1=0; 1&0=0; 1&1=1.

That is, a result bit is 1 only when both input bits are 1; otherwise it is 0.

**Examples:

1&(16-1)=1

17&(16-1)=1**
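The two examples above can be verified directly (indexFor is reproduced from the source shown earlier):

```java
public class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // With 16 buckets, hashes 1 and 17 land in the same slot:
        System.out.println(indexFor(1, 16));  // 1
        System.out.println(indexFor(17, 16)); // 1
        // 17 in binary is 10001; 15 is 01111; 10001 & 01111 = 00001
    }
}
```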

So we can see that with a fixed bucket count, a larger load factor means a larger effective capacity: more data is stored before a resize, more collisions occur, the linked lists in the HashMap grow, and query efficiency drops, but space utilization rises.

Conversely, a smaller load factor means a smaller effective capacity: less data can be stored before a resize, collisions are reduced, the linked lists shrink, and query efficiency rises, but space utilization drops.

6. Simple Use of HashMap
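A minimal usage sketch covering the behaviors discussed above (the keys and values are invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapUsage {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();

        // put: inserts or replaces; returns the previous value, or null
        ages.put("alice", 30);
        ages.put("bob", 25);
        Integer previous = ages.put("alice", 31); // same key: replaced
        System.out.println(previous); // 30

        // get: returns null when the key is absent
        System.out.println(ages.get("alice")); // 31
        System.out.println(ages.get("carol")); // null

        // null keys and values are permitted
        ages.put(null, 0);
        System.out.println(ages.containsKey(null)); // true

        // remove by key
        ages.remove("bob");
        System.out.println(ages.size()); // 2 ("alice" and the null key)

        // iteration order is not guaranteed
        for (Map.Entry<String, Integer> entry : ages.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```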
