ARTS Carding Activities - Third Week

1. Algorithms do an algorithmic problem of leetcode

146. LRU Cache
Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and put.

get(key) - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
put(key, value) - Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

The cache is initialized with a positive capacity.

Follow up:
Could you do both operations in O(1) time complexity?

Example:

LRUCache cache = new LRUCache( 2 /* capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // evicts key 2
cache.get(2);       // returns -1 (not found)
cache.put(4, 4);    // evicts key 1
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns 3
cache.get(4);       // returns 4

Answer: This question is not difficult, mainly to examine the use of data structures. Because LRU rules need to move an element from the linear structure to the head frequently, it can only choose the linked list structure; and because it needs to query whether the key is in the cache in a constant time, single linked list can not be satisfied, so it needs to add hash table to query quickly. That is to say, the data structure needed for this problem includes: bidirectional linked list + hash table.
My first version uses STL list s and unordered_map, the code is as follows:

class LRUCache {
public:
    LRUCache(int capacity) {
        m_cache.clear();
        m_hash.clear();
        m_capacity = capacity;
        m_size = 0;
    }
    
    int get(int key) {
        return _get(key, -1);
    }
    
    void put(int key, int value) {
        int ret = _get(key, value);
        if (ret > 0) {
            //exist and refresh, do nothing
            return;
        }
        else {
            //push_front
            m_cache.push_front(pair<int, int>(key, value));
            m_hash.insert(pair<int, list<pair<int, int>>::iterator>(key, m_cache.begin()));
            if (m_size >= m_capacity) {
                //pop the old value
                list<pair<int, int>>::iterator last = --m_cache.end();
                m_hash.erase((*last).first);
                m_cache.pop_back();
            }
            else {
                m_size++;
            }
        }
    }
private:
    int m_capacity;
    int m_size;
    map<int, list<pair<int, int>>::iterator> m_hash;
    list<pair<int, int>> m_cache;

    int _get(int key, int newVal) {
        map<int, list<pair<int, int>>::iterator>::iterator iter;
        iter = m_hash.find(key);
        if (iter == m_hash.end()) {
            return -1;
        }
        else {
            list<pair<int, int>>::iterator pos = iter->second;
            int val = (*pos).second;
            if (newVal > 0) {
                val = newVal;
            }
            m_cache.erase(pos);
            m_cache.push_front(pair<int, int>(key, val));
            m_hash.erase(key);
            m_hash.insert(pair<int, list<pair<int, int>>::iterator>(key, m_cache.begin()));
            return val;
        }
    }
};

Using STL list, although the implementation of two-way linked list is eliminated, it is not smooth to use because:

Adding new elements when the cache is full requires removing the last element, so the elements of the linked list need to record the key and the corresponding val (value is used for the return of get, key is mainly used to remove the corresponding data of the hash table).
The val ue of a hash table is an iterator of a linked list, which is cumbersome to define (typedef can be used to simplify, of course).
With STL's list implementation, the internal implementation adds a lot of complexity, and the execution time is higher than the pure two-way linked list.

So, finally, I encapsulated the implementation of the linked list by myself. The specific code is as follows:

struct Node {
    int key;
    int val;
    Node *prev;
    Node *next;
};

class LRUCache {
public:
    LRUCache(int capacity) {
        m_capacity = capacity;
        m_size = 0;
        m_head = new Node{0, 0};
        m_tail = new Node{0, 0};
        //m_head and m_tail are Sentinels
        m_head->next = m_tail;
        m_tail->prev = m_head;
    }
    
    int get(int key) {
        return _touch(key, -1);
    }
    
    void put(int key, int value) {
        if (_touch(key, value) < 0) {
            Node *p = new Node{key, value};
            _push_front(p);
            m_map[key] = p;
            if (m_size < m_capacity) {
                m_size++;
            }
            else {
                m_map.erase(_pop_back());
            }           
        }
    }
private:
    unordered_map<int, Node *> m_map;
    Node *m_head, *m_tail;
    int m_capacity;
    int m_size;

    int _touch(int key, int val) {
        unordered_map<int, Node *>::iterator iter = m_map.find(key);
        if (iter == m_map.end()) {
            return -1;
        }
        Node *p = iter->second;
        if (val > 0)
            p->val = val;
        _move_to_head(p);
        return p->val;
    }

    void _move_to_head(Node *p) {
        p->prev->next = p->next;
        p->next->prev = p->prev;
        _push_front(p);
    }

    int _pop_back() {
        Node *last = m_tail->prev;
        int key = last->key;
        last->prev->next = m_tail;
        m_tail->prev = last->prev;
        last = NULL;
        return key;
    }

    void _push_front(Node *p) {
        p->prev = m_head;
        p->next = m_head->next;
        m_head->next->prev = p;
        m_head->next = p;
    }
};

In the implementation of the linked list, I added two sentries (m_head, m_tail head and tail node pointer). When adding nodes and removing nodes, there is no need for blanking processing and a lot of code is saved. Ultimately, this kind of implementation, whether time-consuming or space-consuming, is better than the first one.

2.Review read and comment on at least one English technical article

This week's reading was written by Prateek Gogia Redis: What and Why? What is redis and what advantages it has over traditional databases?
First of all, redis is a kind of memory database, which supports string, hash table, ordered set, set and other data structures. Because data is stored in memory, the processing speed is much faster than that of traditional database.
Secondly, there are several reasons why redis should be used.

redis is written in C language, so the processing speed is very fast.
redis is NoSql;
At present, many technology giants use redis, such as GitHub, Weibo, Pinterest, Snapchat, Craigslist, Digg, Stack Overflow, Flickr;
Using redis cache can save the cost of accessing cloud database directly.
redis is friendly to developers and supports many languages, such as (JavaScript, Java, Go, C, C++, C#, Python, Objective-C, PHP and other popular languages);
redis is open source and so far stable.

3.Tip learns at least one technical skill

Sentinel usage: When writing arithmetic logic, often because of the limitation of the scene, it is necessary to make special judgments on some boundary conditions, such as pointer space judgment, upper and lower boundaries, the determination of the last element of the array, etc. At this time, the concept of Sentinel can be used to avoid such boundary judgments. Point space simplifies code logic. How sentinels are used is illustrated in the following cases:

Sentinel of two-dimensional matrix, many scenarios need to use two-dimensional matrix, such as maze, assuming the size of the maze is n*m, then when walking the maze, we need to judge the upper and lower boundaries, such as x > 0 & x < n, y coordinates are similar, each processing needs to make four judgments, which is very cumbersome. In this case, I usually add sentinels around two-dimensional matrices, that is, I will apply for (n+2)*(m+2) matrices. For coordinates (x, y) (x == 0 | x == n + 1 | y == 0 | y == m + 1), setting corresponding values is not passable, so the logic processing is much simpler, traversing 1 <= x <= n + 1 directly. And 1 <= y <= m will do.
To operate the head and tail pointer of a two-way linked list, we often need to decide whether it is empty first. If we initialize the linked list, we assign the head and tail two nodes to the linked list (these two nodes will not change in the life cycle of the linked list), so that when we operate the nodes of the linked list later, we do not need to do the empty operation.
Looking for elements i n an unordered array usually traverses once from beginning to end and finds a jump-out cycle. Each cycle requires two comparisons. One is to determine whether subscript I is less than n, and the other is to determine whether the corresponding data is equal to the key to be looked for. The way to set up a sentry is to have the last element of the array be the key to be looked for, so that the judgment of whether subscript I is less than n can be reduced during the loop, because it will be found eventually.

Share shares a technical article with opinions and thoughts

The article shared this week is Text Emotional Analysis Using TensorFlow It describes in detail how to use TensorFlow to analyze the emotions of the text. It has data step by step. It's a good article.

Keywords: C++ Redis Database less github

Added by Tandem on Wed, 04 Sep 2019 18:06:34 +0300

Programming VIP

ARTS Carding Activities - Third Week