Hashtable source code analysis: a thread-safe hash table

Overview

This class implements a hash table that stores key-value pairs. Like HashMap, its performance is affected by two parameters: initial capacity and load factor. The capacity is the length of the internal hash array, just as in HashMap, and the default load factor is 0.75. Unlike HashMap, this class is internally synchronized, so its operations are thread safe. If you do not need thread safety, consider using HashMap instead, which is more efficient; choose between them based on the actual situation.
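A quick sketch of the behavioral differences: HashMap tolerates null keys and values, while Hashtable rejects both (the null checks are examined in the source below).

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullHandlingDemo {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "ok"); // HashMap allows one null key and null values
        System.out.println(hashMap.get(null)); // prints "ok"

        Map<String, String> table = new Hashtable<>();
        try {
            table.put(null, "fails"); // Hashtable rejects null keys...
        } catch (NullPointerException e) {
            System.out.println("null key rejected");
        }
        try {
            table.put("k", null); // ...and null values
        } catch (NullPointerException e) {
            System.out.println("null value rejected");
        }
    }
}
```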

Member variables

// The hash array; each slot (bucket) holds a linked list of key-value entries
private transient HashtableEntry<?,?>[] table;

// Number of entries stored in the Hashtable
private transient int count;

// Rehash threshold: when the number of entries exceeds this value, the table
// is rehashed (expanded). It is generally capacity * load factor
private int threshold;

// Load factor
private float loadFactor;

Constructors

// The default constructor initializes the Hashtable with a capacity of 11 and a load factor of 0.75
public Hashtable() {
    this(11, 0.75f);
}
// Initialize the table with the specified capacity and the default load factor
public Hashtable(int initialCapacity) {
    this(initialCapacity, 0.75f);
}
// Specify both capacity and load factor for initialization
public Hashtable(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal Load: "+loadFactor);

    if (initialCapacity==0)
        initialCapacity = 1;
    this.loadFactor = loadFactor;
    table = new HashtableEntry<?,?>[initialCapacity];
    // The initial threshold is the smaller of the specified capacity and MAX_ARRAY_SIZE + 1
    threshold = (int)Math.min(initialCapacity, MAX_ARRAY_SIZE + 1);
}

The constructors are straightforward: they simply validate their arguments and initialize the internal member variables described above.
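As a usage sketch, the constructors can be exercised as follows (the capacity and load-factor values here are arbitrary examples):

```java
import java.util.Hashtable;

public class ConstructorDemo {
    public static void main(String[] args) {
        // Default: capacity 11, load factor 0.75
        Hashtable<String, Integer> defaults = new Hashtable<>();
        defaults.put("a", 1);
        System.out.println(defaults.size()); // 1

        // A custom initial capacity avoids early rehashing when the
        // expected size is known up front
        Hashtable<String, Integer> sized = new Hashtable<>(64, 0.9f);
        sized.put("b", 2);
        System.out.println(sized.size()); // 1

        // Invalid arguments fail fast with IllegalArgumentException
        try {
            new Hashtable<String, Integer>(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("negative capacity rejected");
        }
    }
}
```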

Add entry

public synchronized V put(K key, V value) {
    // Hashtable does not allow null values
    if (value == null) {
        throw new NullPointerException();
    }

    // Check whether the key already exists in the table (a null key also
    // fails here, since key.hashCode() throws NullPointerException)
    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> entry = (HashtableEntry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        // If the key already exists in the HashTable, the old data will be returned directly
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }
    // No element with the same key was found, so add the new element to the Hashtable
    addEntry(hash, key, value, index);
    return null;
}

private void addEntry(int hash, K key, V value, int index) {
    modCount++; // Structural modification count, used by iterators for fail-fast behavior

    HashtableEntry<?,?> tab[] = table;
    if (count >= threshold) { // The number of entries has reached the expansion threshold
        rehash(); // Rehash into a larger table

        tab = table; // rehash() replaced the table, so reload the reference
        hash = key.hashCode();
        index = (hash & 0x7FFFFFFF) % tab.length; // Recompute the bucket index against the new table length
    }
    }

    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> e = (HashtableEntry<K,V>) tab[index];
    tab[index] = new HashtableEntry<>(hash, key, value, e);
    count++;
}
// Rehash the existing entries into a larger table, i.e. expand the capacity
protected void rehash() {
    int oldCapacity = table.length;
    HashtableEntry<?,?>[] oldMap = table;

    int newCapacity = (oldCapacity << 1) + 1; // New capacity = old capacity * 2 + 1
    if (newCapacity - MAX_ARRAY_SIZE > 0) {// The new capacity is greater than the default maximum
        if (oldCapacity == MAX_ARRAY_SIZE)// If the old capacity has reached the maximum, return directly and do nothing
            return;
        newCapacity = MAX_ARRAY_SIZE;// Use default maximum for new capacity
    }
    // Reallocate memory
    HashtableEntry<?,?>[] newMap = new HashtableEntry<?,?>[newCapacity];

    modCount++;
    // Calculate new expansion threshold
    threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    table = newMap;// Set internal variable to new table

    // Re hash the data in each bucket in the original table to the appropriate hash position in the new table
    for (int i = oldCapacity ; i-- > 0 ;) {
        for (HashtableEntry<K,V> old = (HashtableEntry<K,V>)oldMap[i] ; old != null ; ) {
            HashtableEntry<K,V> e = old;
            old = old.next;

            int index = (e.hash & 0x7FFFFFFF) % newCapacity;
            e.next = (HashtableEntry<K,V>)newMap[index];
            newMap[index] = e;
        }
    }
}
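The expression `(hash & 0x7FFFFFFF) % tab.length` appears in put, addEntry, rehash, and the lookup methods below. A small sketch (using a hypothetical `bucketIndex` helper, not part of the class) shows why the sign bit is masked off:

```java
public class IndexDemo {
    // Illustrative helper mirroring Hashtable's bucket computation
    static int bucketIndex(int hash, int tableLength) {
        // hashCode() may return a negative int; clearing the sign bit with
        // 0x7FFFFFFF keeps the remainder non-negative, so the result is
        // always a valid array index
        return (hash & 0x7FFFFFFF) % tableLength;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 11);             // negative in Java: unusable as an index
        System.out.println(bucketIndex(-7, 11)); // masked: non-negative, within [0, 11)
        System.out.println(bucketIndex(5, 11));  // small positive hashes map directly
    }
}
```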

From the put flow, the new element is hashed to a bucket and compared against the existing HashtableEntry nodes there before being added. HashtableEntry, defined below, is a singly linked list node storing the hash value, the key, the value, and a reference to the next node in the chain.

private static class HashtableEntry<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    HashtableEntry<K,V> next;

    protected HashtableEntry(int hash, K key, V value, HashtableEntry<K,V> next) {
        this.hash = hash;
        this.key =  key;
        this.value = value;
        this.next = next;
    }
}

From the method calls above, the process of adding a new element is:

  1. Check parameter validity: a null value, or a null key (via key.hashCode()), throws NullPointerException
  2. Search the target bucket for an element with the same key; if found, overwrite its value and return the old value
  3. If no element with the same key exists, add the new element to the hashtable
  4. Before inserting, check whether expansion is needed; if so, rehash all existing entries into the newly expanded hash array and recompute the bucket index
  5. Insert the new element at the head of its bucket, so the most recently added element in each bucket comes first
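The steps above can be observed from the public API: put returns null for a new key and the old value when an existing key is overwritten.

```java
import java.util.Hashtable;

public class PutDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        System.out.println(table.put("a", 1)); // null: the key was absent
        System.out.println(table.put("a", 2)); // 1: old value returned, entry overwritten
        System.out.println(table.get("a"));    // 2
        System.out.println(table.size());      // 1: overwriting does not grow the table
    }
}
```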

Delete element

public synchronized V remove(Object key) {
    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length; // Determine the bucket position from the hash value
    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> e = (HashtableEntry<K,V>)tab[index];
    // Traverse the linked list stored in the bucket to find the element to be deleted
    for(HashtableEntry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {// Found element to delete
            modCount++;
            if (prev != null) {
                prev.next = e.next;
            } else {
                tab[index] = e.next;
            }
            count--;
            V oldValue = e.value;
            e.value = null;
            return oldValue;
        }
    }
    return null;
}

The process of deleting an element is straightforward:

  1. Locate the bucket position from the hash value of the key
  2. Traverse the linked list in the bucket looking for the element to delete
  3. If found, unlink it from the list and return its value; if not found, return null
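The return-value contract follows directly from the steps above: remove hands back the unlinked value, or null when the key was never present.

```java
import java.util.Hashtable;

public class RemoveDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        System.out.println(table.remove("a"));       // 1: the removed value is returned
        System.out.println(table.remove("missing")); // null: key not present
        System.out.println(table.size());            // 0
    }
}
```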

Get and replace elements

public synchronized V get(Object key) {
    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    // Traverse the linked list in the bucket to find the element equal to key. If found, return its value
    for (HashtableEntry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;// Not found, null returned
}
public synchronized V replace(K key, V value) {
    Objects.requireNonNull(value); // The replacement value cannot be null
    HashtableEntry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;// Calculate bucket position index
    @SuppressWarnings("unchecked")
    HashtableEntry<K,V> e = (HashtableEntry<K,V>)tab[index];
    for (; e != null; e = e.next) {// Traverse the linked list in the bucket
        // If an equal key is found, replace its value and return the old one
        if ((e.hash == hash) && e.key.equals(key)) {
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }
    return null;// The key to be replaced is not found. null is returned
}

Getting and replacing elements is straightforward: both operations simply locate the bucket by hash and traverse its linked list.
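One detail worth noting from the replace source: unlike put, replace never inserts. If the key is absent it returns null and leaves the table unchanged.

```java
import java.util.Hashtable;

public class ReplaceDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        System.out.println(table.replace("a", 10));       // 1: old value returned
        System.out.println(table.replace("missing", 5));  // null: key absent
        System.out.println(table.containsKey("missing")); // false: replace never adds
        System.out.println(table.get("a"));               // 10
    }
}
```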

All of the add, delete, modify, and query methods above are declared with the synchronized keyword, meaning the class is synchronized internally and its individual operations are safe in a multithreaded environment.
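A caveat worth adding: only each individual method call is atomic. A compound check-then-act sequence still needs external synchronization; since Hashtable's methods synchronize on the instance itself, locking the table makes the whole sequence atomic (a minimal sketch, single-threaded here for illustration):

```java
import java.util.Hashtable;

public class CompoundOpDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> counters = new Hashtable<>();

        // Each call below is individually synchronized, but the
        // check-then-act pair as a whole is not: another thread could
        // insert between containsKey and put.
        if (!counters.containsKey("hits")) {
            counters.put("hits", 0);
        }

        // Locking the table itself makes the read-modify-write atomic,
        // because Hashtable's methods synchronize on `this`
        synchronized (counters) {
            Integer v = counters.get("hits");
            counters.put("hits", v + 1);
        }
        System.out.println(counters.get("hits")); // 1
    }
}
```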

Summary

  1. Internally, a hash array combined with singly linked lists
  2. Keys must be unique, values may repeat, and neither keys nor values may be null
  3. Synchronization is implemented internally, guaranteeing thread safety

Keywords: Java data structure

Added by Daniel.Conaghan1 on Sat, 18 Dec 2021 05:18:38 +0200