HashMap Implementation Principle Learning

HashMap source from: android-25/java/util/HashMap

1. Construction methods

static final int MAXIMUM_CAPACITY = 1 << 30;
static final int DEFAULT_INITIAL_CAPACITY = 4;
static final float DEFAULT_LOAD_FACTOR = 0.75f;

public HashMap() {
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
// The default parameter is 4,0.75f
public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        // 4 < MAXIMUM_CAPACITY
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY;
        }
        // 4 = DEFAULT_INITIAL_CAPACITY
        else if (initialCapacity < DEFAULT_INITIAL_CAPACITY) {
            initialCapacity = DEFAULT_INITIAL_CAPACITY;
        }
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        threshold = initialCapacity;
        init();
}

ps:
init() is an empty method; the construction method only makes a simple restriction on the HashMap array capacity field, max. MAXIMUM_CAPACITY and min. DEFAULT_INITIAL_CAPACITY

2. Add element put(K key, V value)

Conflicts occur when adding data.
Java resolves conflicts in the form of arrays + chained lists.The effect is as follows:

  • HashMap has an array of table s with a default length of 16, which is doubled when the capacity of the array reaches 0.75 times the default length.
  • Each item of the table array has the following data structure:
static class HashMapEntry<K,V> implements Map.Entry<K,V> {
    final K key; // key
    V value;     // value
    HashMapEntry<K,V> next; // Next item in the list of chains
    int hash;    // hash value of key
}

See below by following the source code:

table array initialization

Before introducing the put(K key, V value) method, a brief introduction to table array initialization is given

// Add key value
public V put(K key, V value) {
    // If the table list is null, it has been initialized with the inflateTable method
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    ...
    return null;
}
// Initialize table array
private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        // This calculates the capacity of an array of n-powers of 2, defaulting to 4-powers of 2, which is 16
        int capacity = roundUpToPowerOf2(toSize);
        // Calculates 0.75 times the capacity of an array, which needs to be expanded when it exceeds 0.75 times the capacity
        float thresholdFloat = capacity * loadFactor;
        if (thresholdFloat > MAXIMUM_CAPACITY + 1) {
            thresholdFloat = MAXIMUM_CAPACITY + 1;
        }
        // 0.75 times array capacity
        threshold = (int) thresholdFloat;
        // Initialize the array with a default capacity of 16
        table = new HashMapEntry[capacity];
}

ps:
Here, by default, an array of table s with an array capacity of 16 is initialized, and the question of why roundUpToPowerOf2(toSize) is the nth power of 2 is discussed below.

put(K key, V value)

// Add key value
public V put(K key, V value) {
        // If the table list is null, it has been initialized with the inflateTable method
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // If key is null, add value with key as null
        if (key == null)
            return putForNullKey(value);
        // Get hash value from key
        // The Jenkins hash algorithm can be found in the following links:
        // https://en.wikipedia.org/wiki/Jenkins_hash_function
        int hash = sun.misc.Hashing.singleWordWangJenkinsHash(key);
        //  H &(length-1) redundancy is too consuming performance, the same effect is achieved here by bitwise operations
        // Gets the index of the key in the table array
        int i = indexFor(hash, table.length);
        // Chain table corresponding to loop table[i]
        // If hash values are the same & key is the same, replace the corresponding value and return the old value
        // Note: This is just a chain table looping table[i], no looping is done on the table array
        for (HashMapEntry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // If hash values are the same & key is the same, replace the corresponding value and return the old value
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        // In the following two cases, you will need to look at it through the createEntry method
        // hash Same & Key Different
        // hash is different & key is different 
        addEntry(hash, key, value, i);
        return null;
}

ps:
When adding data, this is described above, "If hash values are the same & key is the same, the corresponding value is replaced and the old value is returned", but for "hash is the same & key is different" and "hash is different & key is different", it needs to be explained in createEntry


void addEntry(int hash, K key, V value, int bucketIndex) {
        // When the occupancy of an array is 0.75 times the length of the array, it needs to be expanded to twice the capacity of the original
        // Array expansion first creates an array twice the length of the original array, then assigns the old array data to the corresponding item item of the new array
        // Array expansion code, not explained here
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? sun.misc.Hashing.singleWordWangJenkinsHash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }
        // hash Same & Key Different
        // hash is different & key is different 
        createEntry(hash, key, value, bucketIndex);
}
// hash Same & Key Different
// hash is different & key is different 
void createEntry(int hash, K key, V value, int bucketIndex) {
        // Remove the original value of the table[bucketIndex] array, possibly null, possibly HashMapEntry
        // If null, place the value directly in the table[bucketIndex] location ok ay
        // If not null, place the new array in the table[bucketIndex] position and the old array in the next field of the new data chain table
        // This is how hash conflicts are resolved, and you can see that, indeed, they are resolved in the way of arrays + chained lists, as shown above
        HashMapEntry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new HashMapEntry<>(hash, key, value, e);
        size++;
}

ps:
Using the createEntry method, we see that Hash conflicts are resolved in HashMap by using an array + a chain table, which echoes the above figure

Why roundUpToPowerOf2(toSize) is n-th power of 2

Make an analogy:

  • When the length of the array is 15, H & (length-1) is calculated as hash&14 (0x1110) when the array is added, then the last bit will always be 0, which will result in 1 (0x0001), 3 (0x0011), 5 (0x0101), 7 (0x0111), 9 (0x1001), 11 (0x1011) in the table array never being able to store data, thus wasting space.
  • Worse yet, in this case, the positions available to the array are much smaller than the length of the array, which means that collisions are further increased and query efficiency is slowed down.

ps:
Detailed information on why roundUpToPowerOf2(toSize) is the nth power of 2 can be found
http://blog.csdn.net/yyh35209...

========== THE END ==========

Keywords: Android Java jenkins

Added by phpforever on Wed, 27 Nov 2019 05:01:40 +0200