Redis design and Implementation - learning notes

Recently, I was preparing for an interview. When I asked about redis, I could only scratch the surface. What I said was neither in-depth nor comprehensive, so I paid close attention to the assault and learned redis design and implementation first.

The reasons for choosing to read are:

The book is comprehensive and in-depth, and it must be very attentive to publish the book;

Search blog can not find more comprehensive articles than books, and it takes time;

Looking at the source code directly, one is that you don't have a good grasp of C, and it's easy to be sleepy and inefficient. Therefore, learning the source code with books is the best choice I think now.

 

I: five common data types

Simple dynamic string

redis makes an SDS used as a string. Except for some scenes that do not need to be modified, it uses SDS

The underlying implementation of C string is always an array with a length of N+1 characters

sds.h:

struct sdshdr {
    
    // buf Length of occupied space in
    int len;

    // buf Length of free space remaining in
    int free;

    // Data space
    char buf[];
};

 

Difference between C string and SDS
1 C string SDS
2 To find the length, you need to traverse O(n) Take len directly to O(1)
3 It is easy to cause buffer overflow  
4  

Reduced memory reallocation times

(the speed requirement is strict to avoid the time-consuming allocation of memory)

5   Binary security

4: SDS has free, which can reserve more space than C string. There are two main space optimization strategies:

  • Space pre allocation
    1. When SDS is modified, additional unused space will be obtained.
    2. Modified space n
      1. n<1M; Free allocation n + 1byte (null character)
      2. n>=1M; Free allocation 1M+1byte
  • Inert space release
    1. When SDS is shortened, the deleted space is not released and added to free.

API of SDS

  • sdsnew create
  • sdsempty create an empty SDS
  • sdsfree release
  • sdslen get len
  • sdsvail gets the number of free
  • Sdup creates a copy of sds
  • . .

 

Linked list

redis builds its own linked list:

/*
 * Double ended linked list node
 */
typedef struct listNode {

    // Front node
    struct listNode *prev;

    // Post node
    struct listNode *next;

    // Value of node
    void *value;

} listNode;

/*
 * Double ended linked list iterator
 */
typedef struct listIter {

    // Node currently iterated to
    listNode *next;

    // Direction of iteration
    int direction;

} listIter;
/*
 * Double ended linked list structure
 */
typedef struct list {

    // header node 
    listNode *head;

    // Footer node
    listNode *tail;

    // Node value copy function
    void *(*dup)(void *ptr);

    // Node value release function
    void (*free)(void *ptr);

    // Node value comparison function
    int (*match)(void *ptr, void *key);

    // Number of nodes contained in the linked list
    unsigned long len;

} list;

Dictionaries

Also known as: symbol table, associative array and mapping, which are used to save key value pairs;

/*
 * Dictionaries
 */
typedef struct dict {

    // Type specific function
    dictType *type;

    // Private data
    void *privdata;

    // Hashtable 
    dictht ht[2];

    // rehash Indexes
    // When rehash When not in progress, the value is -1
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */

    // Number of security iterators currently running
    int iterators; /* number of iterators currently running */

} dict;

 

Understanding:

  • Dictionary dict;
    • Attribute type refers to dictType; dictType is a function that holds key value pairs of a specific type; redis will set different functions for different dictionaries.
    • The property privateData is an optional parameter that stores the corresponding type.
    • ht is a reference to dictht;
      • Where table is a two-dimensional reference to dictEntry, with two levels.
      • The first level is the value after the hash. When it reaches the threshold, it is necessary to rehash the expansion capacity
        • dictEntry is a hash node
        • Key is the key
        • Value is a value
        • There is also a reference next to the next node for chaining.

 

hash algorithm

conflict resolution

The hash table uses the chain address method to resolve key conflicts:

The pointers of the hash table node dictEntry form a chain, and the same hash is listed in the next of the current dictEntry

rehash

1. Allocate space for ht[1], compare with ht[0], and allocate space for capacity expansion

 // The size of the new hash table is at least twice the number of nodes currently in use
 // T = O(N)
return dictExpand(d, d->ht[0].used*2);

Shrink allocates a size equal to ht[0]?

/*
 * Reduce the given dictionary
 * Make the ratio between the number of used nodes and the size of the dictionary close to 1:1
 * Return to DICT_ERR indicates that the dictionary is already in rehash or dict_can_resize is false.
 * When the smaller ht[1] is successfully created and can be resize d, it returns DICT_OK. 
 */
int dictResize(dict *d)
{
    int minimal;
    // Cannot close at rehash Or is rehash Called when
    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    // Calculate the minimum number of nodes required to make the ratio close to 1:1
    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;
    // Resize dictionary
    // T = O(N)
    return dictExpand(d, minimal);
}

2. Copy ht[0] and re hash it to ht[1]

3. Release ht[0] and set ht[1] to ht[0].

 

Expansion and contraction of hash table

The hash table automatically adjusts the size of the table to the second power.

When there is no bgSave or bgRewriteAOF command, the load factor is greater than 1; Or when there are bgSave or bgRewriteAOF commands, the load factor is greater than 5; Time execution

Load factor = saved / hash table size

Progressive rehash

 

Jump table

skipList 

todo

Integer set

 

Compression table

 

object

 

 

 

 

 

2: Single machine

principle

 

Method of saving key value pairs

 

Expiration Policies

 

Persistence mechanism

 

Other events, redis initialization process, etc

 

3: Distributed

Sentinel

 

copy

 

colony

 

4: Independent function

 

Publish and subscribe

 

affair

 

Lua script

 

sort

 

Slow query

 

 

 

1

1

1

1

1

1

1

1

1

1

1

Added by Illusionist on Tue, 25 Jan 2022 20:17:28 +0200