Kiner algorithm: hash table and bloom filter (hand tearing algorithm)

Guide to series of articles

Open source project

All articles in this series will be included in GitHub for unified collection and management. Welcome ISSUE and Star.

GitHub portal: Kiner algorithm

preface

After understanding the underlying implementation principle of hash table, the design of hash function, the solution of hash conflict and the application scenario of Bloom filter, let's have another wave of algorithm questions to deepen our understanding of hash table.

705. Design hash set

Problem solving ideas

In fact, we can directly use the training method to realize the hash table, because this problem requires deleting elements, and the linked list is very efficient for inserting and deleting elements.

Code demonstration

/*
 * @lc app=leetcode.cn id=705 lang=typescript
 *
 * [705] Design hash set
 *
 * https://leetcode-cn.com/problems/design-hashset/description/
 *
 * algorithms
 * Easy (65.04%)
 * Likes:    163
 * Dislikes: 0
 * Total Accepted:    58.3K
 * Total Submissions: 89.6K
 * Testcase Example:  '["MyHashSet","add","add","contains","contains","add","contains","remove","contains"]\n' +
  '[[],[1],[2],[1],[3],[2],[2],[2],[2]]'
 *
 * Design a hash set without using any built-in hash table library.
 * 
 * Implement the MyHashSet class:
 * 
 * 
 * void add(key) Inserts the value key into the hash collection.
 * bool contains(key) Returns whether this value key exists in the hash set.
 * void remove(key) Deletes the given value key from the hash set. If there is no such value in the hash set, nothing is done.
 * 
 * 
 * 
 * Example:
 * 
 * 
 * Input:
 * ["MyHashSet", "add", "add", "contains", "contains", "add", "contains",
 * "remove", "contains"]
 * [[], [1], [2], [1], [3], [2], [2], [2], [2]]
 * Output:
 * [null, null, null, true, false, null, true, null, false]
 * 
 * Explanation:
 * MyHashSet myHashSet = new MyHashSet();
 * myHashSet.add(1);      // set = [1]
 * myHashSet.add(2);      // set = [1, 2]
 * myHashSet.contains(1); // Return True
 * myHashSet.contains(3); // Return False, (not found)
 * myHashSet.add(2);      // set = [1, 2]
 * myHashSet.contains(2); // Return True
 * myHashSet.remove(2);   // set = [1]
 * myHashSet.contains(2); // Returns False, (removed)
 * 
 * 
 * 
 * Tips:
 * 
 * 
 * 0 
 * add, remove, and contains can be called up to 10 ^ 4 times.
 * 
 * 
 * 
 * 
 * Advanced: can you solve this problem without using the built-in hash collection library?
 * 
 */

// @lc code=start

class LinkNode<T> {
    constructor(public key: T = undefined, public next: LinkNode<T> = null) {

    }
    insertAfter(node: LinkNode<T>) {
        node.next = this.next;
        this.next = node;
    }
    removeAfter() {
        if(!this.next) return;
        this.next = this.next.next;
    }
}


class MyHashSet {
    static size = 100;
    private data: LinkNode<number>[] = new Array<LinkNode<number>>(MyHashSet.size);
    constructor() {
        let n = MyHashSet.size;
        // Each position of the initial session hash table is an empty header node
        while(n) {
            this.data[n-1] = new LinkNode<number>();
            n-=1;
        }
    }

    /**
     * hash function 
     * Since our key itself is of type number, we only need to ensure that the index we return is not negative
     * We can use key & 0x7ffffff to ensure that our number must be positive, and then remainder the result with the hash table size
     * @param key 
     * @returns 
     */
    hashFn(key: number): number {
        return (key & 0x7fffffff) % this.data.length;
    }

    add(key: number): void {
        // If the key already exists in the hash table, it is returned directly
        if(this.contains(key)) return;
        // Calculate array subscript through hash function
        const idx = this.hashFn(key);
        // Insert the target value into the next bit of the chain header node. Why not the next bit of the last bit? Because we don't care at all
        // For the insertion order of the target value, we only need to be able to find the target value. Therefore, the next bit inserted into the head node is inserted into the
        // The next effect of the last node is the same. This is the head insertion method
        this.data[idx].insertAfter(new LinkNode(key));
    }

    remove(key: number): void {
        // There is no need to judge whether there is a key, because if it does not exist, the following p.next will be empty and will not enter the loop
        // p.removeAfter also excludes any processing if next is empty
        // If it exists, the hash function is used to calculate the array subscript first
        let idx = this.hashFn(key);
        let p: LinkNode<number> = this.data[idx];
        // Find the previous node of the target node
        while(p.next && p.next.key !== key) p = p.next;
        // Once found, delete the next node directly
        p.removeAfter();
    }

    contains(key: number): boolean {
        // Calculate array subscript through hash function
        const idx = this.hashFn(key);
        let p: LinkNode<number> = this.data[idx].next;
        while(p && p.key !== key) p = p.next;
        return !!p;
    }
}

/**
 * Your MyHashSet object will be instantiated and called as such:
 * var obj = new MyHashSet()
 * obj.add(key)
 * obj.remove(key)
 * var param_3 = obj.contains(key)
 */
// @lc code=end


706. Design hash mapping

Problem solving ideas

This question is very similar to the previous one at the beginning, but from simply saving a key value to saving a key value pair, this is closer to the json we often use. One thing to note is that when the data corresponding to the key value is found, the value value needs to be replaced

Code demonstration

/*
 * @lc app=leetcode.cn id=706 lang=typescript
 *
 * [706] Design hash mapping
 *
 * https://leetcode-cn.com/problems/design-hashmap/description/
 *
 * algorithms
 * Easy (64.62%)
 * Likes:    200
 * Dislikes: 0
 * Total Accepted:    48.7K
 * Total Submissions: 75.3K
 * Testcase Example:  '["MyHashMap","put","put","get","get","put","get","remove","get"]\n' +
  '[[],[1,1],[2,2],[1],[3],[2,1],[2],[2],[2]]'
 *
 * Design a hash map without using any built-in hash table library.
 * 
 * Implement the MyHashMap class:
 * 
 * 
 * MyHashMap() Initialize object with null mapping
 * void put(int key, int value) Insert a key, value pair into the HashMap. If key
 * If it already exists in the map, update its corresponding value.
 * int get(int key) Returns the value mapped by a specific key; If the mapping does not contain a key, return - 1.
 * void remove(key) If there is a key mapping in the mapping, remove the key and its corresponding value.
 * 
 * 
 * 
 * 
 * Example:
 * 
 * 
 * Input:
 * ["MyHashMap", "put", "put", "get", "get", "put", "get", "remove", "get"]
 * [[], [1, 1], [2, 2], [1], [3], [2, 1], [2], [2], [2]]
 * Output:
 * [null, null, null, 1, -1, null, 1, null, -1]
 * 
 * Explanation:
 * MyHashMap myHashMap = new MyHashMap();
 * myHashMap.put(1, 1); // myHashMap Now [[1,1]]
 * myHashMap.put(2, 2); // myHashMap Now [[1,1], [2,2]]
 * myHashMap.get(1);    // Return 1. myHashMap is now [[1,1], [2,2]]
 * myHashMap.get(3);    // Return - 1 (not found). myHashMap is now [[1,1], [2,2]]
 * myHashMap.put(2, 1); // myHashMap Now it is [[1,1], [2,1]] (update the existing value)
 * myHashMap.get(2);    // Return 1. myHashMap is now [[1,1], [2,1]]
 * myHashMap.remove(2); // Delete the data with key 2. myHashMap is now [[1,1]]
 * myHashMap.get(2);    // Return - 1 (not found), myHashMap is now [[1,1]]
 * 
 * 
 * 
 * 
 * Tips:
 * 
 * 
 * 0 
 * The put, get, and remove methods can be called up to 10 ^ 4 times
 * 
 * 
 * 
 * 
 * Advanced: can you solve this problem without using the built-in HashMap library?
 * 
 */

// @lc code=start



class LinkNode<T=any, P=any> {
    constructor(public key: T = undefined, public value: P = undefined, public next: LinkNode<T> = null) {

    }
    insertAfter(node: LinkNode<T>) {
        node.next = this.next;
        this.next = node;
    }
    removeAfter() {
        if(!this.next) return;
        this.next = this.next.next;
    }
}



class MyHashMap {
    static size = 100;
    private data: LinkNode<number, number>[] = new Array<LinkNode<number, number>>(MyHashMap.size);
    constructor() {
        let n = MyHashMap.size;
        while(n) {
            this.data[n-1] = new LinkNode();
            n--;
        }
    }

    hashFn(key: number): number {
        return (key & 0x7fffffff) % this.data.length;
    }

    put(key: number, value: number): void {
        // First, calculate the position in the hash table through the hash function
        let idx: number = this.hashFn(key);
        // Take out the chain header node at the specified position
        let p: LinkNode<number, number> = this.data[idx];
        // Constantly find the node corresponding to the key
        while(p.next && p.next.key !== key) p = p.next;
        // If the node with the same key is finally found, the value will be replaced directly
        if(p.next) {
            p.next.value = value;
        } else {// Otherwise, insert a new linked list node
            p.insertAfter(new LinkNode(key, value));
        }
    }

    get(key: number): number {
        // First, calculate the position in the hash table through the hash function
        let idx: number = this.hashFn(key);
        // Take out the chain header node at the specified position
        let p: LinkNode<number, number> = this.data[idx].next;
        while(p && p.key !== key) p = p.next;
        return p?p.value:-1;
    }

    remove(key: number): void {
        // First, calculate the position in the hash table through the hash function
        let idx: number = this.hashFn(key);
        // Take out the chain header node at the specified position
        let p: LinkNode<number, number> = this.data[idx];
        while(p.next && p.next.key !== key) p = p.next;
        p.removeAfter();
    }
}

/**
 * Your MyHashMap object will be instantiated and called as such:
 * var obj = new MyHashMap()
 * obj.put(key,value)
 * var param_2 = obj.get(key)
 * obj.remove(key)
 */
// @lc code=end


Interview question 16.25 LRU cache

Problem solving ideas

Do you think this concept seems familiar? We introduced the LRU cache elimination algorithm when we studied the linked list. In addition, the LRU cache elimination algorithm is also used in our Vue source code to optimize our single component compilation process.

The LRU cache elimination algorithm is like this: data can be inserted at the end of a certain size of space. Whenever we access a data in the storage space, we move the currently accessed data to the end of the storage area. When we find that our storage area is full when inserting data, we need to eliminate the data in the front of the storage space, The new data is then inserted at the end of the storage area.

In view of the above characteristics, LRU caching algorithm is usually implemented by hash linked list, which not only takes into account the advantages of fast index value of hash table, but also retains the advantages of efficient insertion, deletion and location adjustment of linked list. In order to facilitate deletion and insertion, we choose to use two-way linked list combined with hash table to solve this problem.

Code demonstration

class DoubleLinkNode {
    constructor(
        public key: number = 0, 
        public value: number = 0, 
        public prev: DoubleLinkNode = null, 
        public next: DoubleLinkNode = null
    ){}
    // Remove the current node in the linked list, that is, if it is stored in the previous node, let the next node of the previous node point to the next node of the current node
    // If there is a next node, that is, the previous node of the next node points to the previous node of the current node
    // Then set the front and back pointers of the current node to null and recycle
    removeThis(): DoubleLinkNode {
        if(this.prev) {
            this.prev.next = this.next
        };
        if(this.next) {
            this.next.prev = this.prev
        };
        this.next = this.prev = null;
        return this;
    }
    // Inserts a node before the current node
    insertBefore(node: DoubleLinkNode) {
        node.next = this;
        node.prev = this.prev;
        if(this.prev){
            this.prev.next = node;
        }
        this.prev = node;
    }
    // For debugging output
    output(){
        let p = this.next;
        while (p) {
            let str = '';
            if(p.prev) str+=`prev:${p.prev.key};`;
            if(p.next) str+=`next:${p.next.key};`;
            str+=`key:${p.key};`;
            console.log(str);
            p = p.next;
        }
        console.log("=============\n")
    }
}

class HashList {

    // Define virtual head node and virtual tail node to facilitate the rapid operation of elements at the head and tail
    private hair: DoubleLinkNode = new DoubleLinkNode(-1,-1);
    private tail: DoubleLinkNode = new DoubleLinkNode(100,100);
    // map is a ready-made hash table object implemented in js
    private map: Map<number, DoubleLinkNode> = new Map<number, DoubleLinkNode>();
    constructor(private capacity: number){
        // Initially, the head node and the next node point to the tail node
        // The last node of the tail node refers to the head node
        // hair ↔ tail
        // In this way, when we insert a node later, we can directly insert a node in front of the tail node, which becomes
        // hair ↔ node ↔ tail
        this.hair.next = this.tail;
        this.tail.prev = this.hair;
    }
    put(key: number, value: number) {
        const p: DoubleLinkNode = this.map.get(key);
        if(p) { // If it exists, update value and move the current node to the back
            p.value = value;
            // Delete the current node in the linked list first
            p.removeThis();
            // Then re insert the current node to the end of the linked list
            this.tail.insertBefore(p);
        } else {// If it does not exist, add one to the hash table and insert it at the end of the linked list
            const node = new DoubleLinkNode(key, value);
            this.tail.insertBefore(node);
            this.map.set(key, node);
        }
        // When the hash table size exceeds the limit size, delete the hash table header element
        if(this.map.size > this.capacity) {
            this.delete();
        }
        // this.hair.output();
    }
    get(key: number): number {
        if(this.map.get(key)) {
            // When obtaining, you need to move the currently accessed data to the end of the linked list
            const node: DoubleLinkNode = this.map.get(key);
            // Delete the current node from the linked list first
            node.removeThis();
            // Then insert this node to the end of the linked list
            this.tail.insertBefore(node);
            return node.value;
        };
        return -1;
    }
    delete() {
        // Delete data from hash table
        this.map.delete(this.hair.next.key);
        this.hair.next.removeThis();
    }
}



class LRUCache {
    private hashList: HashList;

    constructor(capacity: number) {
        this.hashList = new HashList(capacity);
    }

    get(key: number): number {
        return this.hashList.get(key);
    }

    put(key: number, value: number): void {
        this.hashList.put(key, value);
    }
}

/**
 * Your LRUCache object will be instantiated and called as such:
 * var obj = new LRUCache(capacity)
 * var param_1 = obj.get(key)
 * obj.put(key,value)
 */

535. Encryption and decryption of tinyurl

Problem solving ideas

This problem is a simple short chain service, which supports encoding and decoding functions. The difficulty lies in how to randomly generate a short web address. The short chain id we want to generate contains only the following characters:

  • a-z: ASCII: 97~122
  • A-Z: ASCII: 65-90
  • 0-9: ASCII: 48-57

Therefore, we need a method to generate a string of the specified length containing the above characters, and then use the hash table to map it.

Code demonstration

/*
 * @lc app=leetcode.cn id=535 lang=typescript
 *
 * [535] TinyURL Encryption and decryption of
 *
 * https://leetcode-cn.com/problems/encode-and-decode-tinyurl/description/
 *
 * algorithms
 * Medium (83.97%)
 * Likes:    119
 * Dislikes: 0
 * Total Accepted:    14.3K
 * Total Submissions: 17.1K
 * Testcase Example:  '"https://leetcode.com/problems/design-tinyurl"'
 *
 * TinyURL Is a URL simplification service, for example, when you enter a URL https://leetcode.com/problems/design-tinyurl
 * When, it returns a simplified URL http://tinyurl.com/4e9iAk.
 * 
 * Requirements: design an encrypted encode and decrypted decode of TinyURL
 * Methods. There is no limit to how your encryption and decryption algorithm is designed and operated. You only need to ensure that a URL can be encrypted into a TinyURL, and the TinyURL can be restored to the original URL by decryption.
 * 
 */

// @lc code=start
/**
 * Encodes a URL to a shortened URL.
 */

function randomNum(): number {
    // Randomly generate an integer random number below 62
    // Why 62?
    // If the generated random number is 0-25, we generate lowercase letters
    // If it is 26-51, capital letters are generated
    // If it is 52-61, the number is generated
    let num = Math.floor(Math.random()*62);
    // Programming tips: directly obtain the ASCII of the first character in the relevant interval and add the generated random number to generate a random character in the target interval
    if(num < 26) num += 'a'.charCodeAt(0);
    else if(num < 52) num += 'A'.charCodeAt(0) - 26;
    else num += '0'.charCodeAt(0) - 52;
    return num;
}

function randomStr(len:number = 6): string {
    let res = '';
    while(len--) {
        res += String.fromCharCode(randomNum());
    }
    return res;
}

// Store the relationship between the original link and id
let map: Map<string, string> = new Map<string, string>();
// The relationship between the id and the original link is stored for duplicate judgment
let map2: Map<string, string> = new Map<string, string>();
function encode(longUrl: string): string {
    // If this link has been generated, return directly
    if(map2.has(longUrl)) {
        return map2.get(longUrl);
    }
    // Generate a random id
	const id = randomStr();
    map.set(id, longUrl);
    map2.set(longUrl, id);
    return id;
};

/**
 * Decodes a shortened URL to its original URL.
 */
function decode(shortUrl: string): string {
    return map.get(shortUrl);
};

/**
 * Your functions will be called as such:
 * decode(encode(strs));
 */
// @lc code=end


187. Repetitive DNA sequences

Problem solving ideas

This problem is also solved by using hash table, but when we store it again, our value is the number of times the string appears. Then, there is another problem to be solved. How do we find all strings? We still have a programming trick. During the loop, i represents the start position. According to the length of the substring is 10, the end point of our loop should be j=s.length-9. Because the length of the string is less than 10, the loop is meaningless.

Code demonstration

/*
 * @lc app=leetcode.cn id=187 lang=typescript
 *
 * [187] Repetitive DNA sequence
 *
 * https://leetcode-cn.com/problems/repeated-dna-sequences/description/
 *
 * algorithms
 * Medium (47.53%)
 * Likes:    169
 * Dislikes: 0
 * Total Accepted:    33.8K
 * Total Submissions: 71.2K
 * Testcase Example:  '"AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"'
 *
 * All DNA consists of A series of nucleotides abbreviated as' A ',' C ',' G 'and'T', such as "ACGAATTCCG". When studying DNA, recognize DNA
 * Repeats in can sometimes be very helpful for research.
 * 
 * Write a function to find all target substrings. The length of the target substring is 10 and appears more than once in the DNA string s.
 * 
 * 
 * 
 * Example 1:
 * 
 * 
 * Input: S = "aaaacccccaaaaccccccaaaggttt"
 * Output: ["aaaaccccc", "cccccccaaaa"]
 * 
 * 
 * Example 2:
 * 
 * 
 * Input: S = "AAAA"
 * Output: [AAAA]
 * 
 * 
 * 
 * 
 * Tips:
 * 
 * 
 * 0 
 * s[i] Is' A ',' C ',' G ', or'T'
 * 
 * 
 */

// @lc code=start
function findRepeatedDnaSequences(s: string): string[] {
    // Used to store the number of occurrences of a string
    const map: Map<string, number> = new Map<string, number>();
    // Since the length of the string is 10, the range of our loop should be 0~s.length-9
    for(let i=0,j=s.length - 9; i<j;i++) {
        // Intercept a string of length 10 from the position of i
        const subStr = s.substr(i, 10);
        // Determine whether the string has been stored in the hash table
        const target = map.get(subStr);
        if(target!==undefined) {// If it has been deposited, add 1 directly
            map.set(subStr, target+1);
        } else {// Otherwise, set to 1
            map.set(subStr, 1);
        }
    }
    // Traverse all strings and put strings with occurrences greater than 1 into the result array
    let res: string[] = [];
    for(let [key, value] of map) {
        if(value>1) res.push(key);
    }
    return res;
};
// @lc code=end


318. Maximum word length product

Problem solving ideas

This question is a question that tests the thinking of hash algorithm. We can use this method to answer: because the title requires that there can be no duplicate letters between words, we can first use the idea of hash mapping to map the letters of each word to a binary array with a length of 26. If a letter appears in the word, it will be marked as 1 at the position of the binary array. The following operations are performed

# The following is a binary array with a length of 26
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
# The target word is ace
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1]

In this way, to judge whether there are duplicate letters in two words, we only need to carry out bitwise and operation. As long as one letter is repeated, the result must not be 0. Only when all letters are not repeated can we get 0.

Found a word that does not repeat, just take the maximum value with the last result.

Code demonstration

/*
 * @lc app=leetcode.cn id=318 lang=typescript
 *
 * [318] Maximum word length product
 *
 * https://leetcode-cn.com/problems/maximum-product-of-word-lengths/description/
 *
 * algorithms
 * Medium (67.32%)
 * Likes:    171
 * Dislikes: 0
 * Total Accepted:    16.3K
 * Total Submissions: 24.3K
 * Testcase Example:  '["abcw","baz","foo","bar","xtfn","abcdef"]'
 *
 * Given a string array words, find length(word[i]) * length(word[j])
 * And these two words do not contain common letters. You can think that each word contains only lowercase letters. If there are no such two words, 0 is returned.
 * 
 * Example 1:
 * 
 * Input: ["abcw","baz","foo","bar","xtfn","abcdef"]
 * Output: 16 
 * Explanation: these two words are "abcw" and "xtfn".
 * 
 * Example 2:
 * 
 * Input: [a","ab","abc","d","cd","bcd","abcd "]
 * Output: 4 
 * Explanation: these two words are "ab" and "CD".
 * 
 * Example 3:
 * 
 * Input: [a","aa","aaa","aaaa "]
 * Output: 0 
 * Explanation: there are no such two words.
 * 
 */

// @lc code=start
function maxProduct(words: string[]): number {
    // Defines an array of binary bit markers with a length of 26 bits
    const mask: number[] = new Array<number>(26);
    // For binary operations, you do not need to fill zero by default. Filling 0 here is for ease of understanding. In fact, this step can be omitted
    mask.fill(0);
    for(let i=0;i<words.length;i++) {
        const word = words[i];
        for(let char of word) {
            // Several programming techniques are used here
            // 1. Get the offset between the current character banana and 'a'. We need to determine the position of the marker bit according to this offset
            // 2. Shift left by bit. We want to mark the letter of the corresponding position on a binary array as 1. We only need to shift left by bit offset for 1
            // 3. Bitwise OR: one of the two is 1, which takes into account the situation that the word itself has repeated letters, such as aaa
            mask[i] |= 1 << (char.charCodeAt(0) - 'a'.charCodeAt(0));
        }
    }
    let res = 0;
    for(let i=0;i<words.length;i++) {
        for(let j=i+1;j<words.length;j++) {
            // As long as it is not 0, there are repeated letters
            if(mask[i] & mask[j]) continue;
            // Calculate the maximum word length product
            res = Math.max(res, (words[i].length * words[j].length));
        }
    }
    return res;
};
// @lc code=end


Keywords: Front-end Algorithm data structure Hash table

Added by irishprogrammin on Sun, 23 Jan 2022 17:17:54 +0200