In depth understanding of "hash collision attack of PHP arrays" from the perspective of HashTable

  • Read the blog post "hash collision attack of PHP array" and take this opportunity to learn more about hash table and deepen understanding!
  • Learn from the in-depth understanding and practical exercise of HashTable Original address

HashTable concept

A definite correspondence h is established between the storage location of the record and its keyword. The location of the record with the function H(key) as the keyword as the key in the table. This correspondence h is called Hash function (also known as Hash function). The table established according to this idea is called Hash table.

Construction of hash function

1. The key of H is a direct hash function (i.e. the key of H is a key) = × key+b, where a and b are constants and a ≠ 0.

For example: the hash function of student table keyword student number and storage location: H(key)=key-1000.

2. Digital analysis method: if the keyword is an r-ary number and all possible keyword values can be predicted, several bits in the keyword can be used to form a hash address.

For example, the last four digits of the mobile phone number are used as the hash address

3. Square middle method: if the keyword is short, square the keyword value first, and then take the middle digits of the operation result as the hash address.

give an example:
Square a group of keywords (01000110101010010111) to get (0010000001210010201010020010012321). If the table length is 1000, the middle three digits can be taken as the hash address set: (100121201020123).

4. Folding method: divide the keyword value into several parts with the same number of bits, and then take the superposition and (round off the carry) of these parts as the hash address.
5. Divide and leave remainder method: the remainder obtained after the keyword is divided by a number p not greater than the hash table length m is the hash address. (generally, p should be prime)

give an example:
There is a set of keywords as follows: (19,14,23,01,68,20,84,27,55,11,10,79), H(key)=key%13.

Code walkthrough of hash function construction

  • Build a student class: student number + name;
  • Construct a hash table class: construct and initialize the sequential array of student records in the magazine;

    Using the simple division and remainder method, the remainder of the student number is taken as the hash value, and according to this hash value, it is used as the subscript of the sequential array to store the student records.

  • Write the main program for testing
/**
 * Code drill for in-depth understanding of hash table HashTable
 */
public class MyHashTable {
    private Student[] addr;
    private Integer addrCount;

    MyHashTable() {
        this.addrCount = 1000;
        this.addr = new Student[this.addrCount];
        for (int i = 0; i < addr.length; i++) {
            this.addr[i] = new Student(null, null);
        }
    }

    // Hash operation with division and residue method
    private int hash(Integer key) {
        return (key % this.addrCount);
    }

    // Add student record according to key value
    public void add(Integer key, Student student) {
        Integer addrIndex = hash(key);
        addr[addrIndex] = student;
    }

    // Delete student records according to key values
    public void remove(Integer key) {
        Integer addrIndex = hash(key);
        addr[addrIndex] = new Student(null, null);
    }

    public boolean exist(Integer key) {
        Integer addrIndex = hash(key);
        return (addr[addrIndex].getId() != null);
    }

    @Override
    public String toString() {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("MyHashTable{");
        for (int i = 0; i < addr.length; i++) {
            stringBuilder.append("hash="+i+":");
            stringBuilder.append(addr[i].toString()+",");
        }
        stringBuilder.append("}");
        return stringBuilder.toString();
    }

    public static void main(String[] args) {
        MyHashTable myHashTable = new MyHashTable();
        myHashTable.add(1001, new Student(1001, "Jack"));
        myHashTable.add(1002, new Student(1002, "Rose"));
        myHashTable.add(1003, new Student(1003, "Mike"));
        myHashTable.add(1004, new Student(1004, "Tom"));
        myHashTable.add(1005, new Student(1005, "Mary"));
        myHashTable.add(1017, new Student(1017, "Kate"));

        if (myHashTable.exist(1001)){
            myHashTable.remove(1001);
        }
        if (myHashTable.exist(1008)){
            myHashTable.remove(1008);
        }
        System.out.println(myHashTable.toString());
    }
}

/**
 * Define student classes
 */
class Student {
    private Integer id;
    private String name;

    Student(Integer id, String name) {
        this.id = id;
        this.name = name;
    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "Student{" +
                "id=" + id +
                ", name='" + name + '\'' +
                '}';
    }
}

Handling of hash address conflict

1. In the above coding exercise, you may encounter the following situations: the student with student number 1001 and the student with student number 2001 get the same hash address through the division and remainder method, and the student record added later will conflict with the record with the same hash address.

Conflict resolution

1. Open addressing method: obtain another address sequence for the conflicting address H(key) until there is no conflict.

Linear detection hashing
Square probe hashing
Random detection hashing

2. Chain address method: store all the keywords with the same hash address obtained according to the given hash function in the same linear linked list, and make the linked list orderly according to the keywords.

Hash address conflict resolution code drill

/**
 * In depth understanding of hash table HashTable code drill -- chain connection method to solve conflicts
 */
public class MyHashTableLinked {
    private LinkedList<Student>[] addr;
    private Integer addrCount;

    MyHashTableLinked() {
        this.addrCount = 1000;
        this.addr = (LinkedList<Student>[]) new LinkedList[addrCount];
        for (int i = 0; i < addr.length; i++) {
            this.addr[i] = new LinkedList<Student>();
        }
    }

    // Hash operation with division and residue method
    private int hash(Integer key) {
        return (key % this.addrCount);
    }

    // Add student record according to key value
    public void add(Integer key, Student student) {
        Integer addrIndex = hash(key);
        addr[addrIndex].add(student);
    }

    // Delete student records according to key values
    public void remove(Integer key) {
        Integer addrIndex = hash(key);
        LinkedList<Student> list = addr[addrIndex];
        if (list != null) {
            Iterator iterator = list.iterator();
            while (iterator.hasNext()) {
                Student student = (Student) iterator.next();
                list.remove(student);
            }
        }
    }

    @Override
    public String toString() {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append("MyHashTable{\r\n");
        for (int i = 0; i < addr.length; i++) {

            Iterator iterator = addr[i].iterator();
            if (iterator.hasNext()) {
                stringBuilder.append("[hash=" + i + ":");
            } else {
                continue;
            }
            while (iterator.hasNext()) {
                Student student = (Student) iterator.next();
                stringBuilder.append(student.toString() + ",");
            }
            stringBuilder.append("]\r\n");

        }
        stringBuilder.append("}");
        return stringBuilder.toString();
    }

    public static void main(String[] args) {
        MyHashTableLinked myHashTableLinked = new MyHashTableLinked();
        myHashTableLinked.add(1001, new Student(1001, "Jack"));
        myHashTableLinked.add(1002, new Student(1002, "Rose"));
        myHashTableLinked.add(1003, new Student(1003, "Mike"));
        myHashTableLinked.add(1004, new Student(1004, "Tom"));
        myHashTableLinked.add(1005, new Student(1005, "Mary"));
        myHashTableLinked.add(2001, new Student(2001, "Jerry"));
        System.out.println(myHashTableLinked.toString());

        myHashTableLinked.remove(1001);
        System.out.println(myHashTableLinked.toString());
    }
}

Hash table lookup

  • What determines the efficiency of hash table lookup is the hash function, the method of dealing with conflicts and the loading factor of hash table.
  • The smaller the loading factor of the hash table, the less likely the conflict will occur, and the lower the utilization of storage space.

// lookup
    public Student find(Integer key){
        Integer addrIndex = hash(key);
        LinkedList<Student> list = addr[addrIndex];
        if (list != null) {
            Iterator iterator = list.iterator();
            while (iterator.hasNext()) {
                Student student = (Student) iterator.next();
                if (student.getId()!=null && key.equals(student.getId())){
                    return student;
                }
            }
        }
        return null;
    }

Added by n8r0x on Sat, 19 Feb 2022 05:10:29 +0200