Implementation principle of Redis cluster

1, Node

At the beginning, each Redis instance is A cluster itself. The work of connecting each node is completed through the CLUSTER MEET command. After A and B shake hands, A will spread B's information to other nodes in A's cluster through the Gossip protocol, so that other nodes can also shake hands with B;

// Cluster nodes
struct clusterNode{
 
    // When the node was created
    mstime_t ctime;
 
    // Node name
    char name[REDIS_CLUSTER_NAMELEN];
 
    // Node status
    int flags;
 
    // Configure era for failover
    uint64_t configEpoch;
 
    // Node IP
    char ip[REDIS_IP_STR_LEN];
 
    // Node port number
    int port;
 
    // Connected information
    clusterlink *link;
 
};

// link attribute of clusterNode
typedef struct clusterlink{
 
    // Connection creation time
    mstime_t ctime;
 
    // socket descriptor 
    int fd;
   
    // Output buffer
    sds sndbuf;
 
    // Input buffer
    sds rcvbuf;
 
    // The node associated with this connection
    struct clusterNode *node;
 
} clusterlink;

Each node also saves a clusterState structure, that is, each node saves the state of the cluster;

typedef struct clusterState{
 
    clusterNode *myself;
 
    // Configuration Era
    uint64_t currentEpoch;
 
    // Cluster status, online / offline
    int state;
 
    // Number of nodes in the cluster that handle at least one slot
    int size;
   
    // Cluster node list, map < name, clusternode >
    dict *nodes;
 
} clusterState;

2, Slot assignment

Redis cluster saves key value pairs in slices. The whole database of the cluster is divided into 16384 (2 ^ 14) slots. If any slot is not processed by the node, the cluster is offline. By sending the cluster addlots 0 1 2 3... 100 command to the node, one or more slots can be assigned to the node. A node will not only record its own slots in the slots of the clusterNode, but also send messages to inform other nodes of the cluster;

struct clusterNode{
 
    // Save the slot of the node with bitmap
    unsigned char slots[16384/8];
 
    int numslots;
 
};

The slots array in clusterState records the assignment information of all slots in the cluster, while clusternode Slots array only records the slot assignment of one node;

typedef struct clusterState{
    // It records which clusterNode manages each slot
    clusterNode *slots[16384];
 
} clusterState;

The Redis cluster has 16384 hash slots. Each key takes the module of 16384 after passing the CRC16 verification to determine which slot to place. Each node of the cluster is responsible for part of the hash slots. When the client sends a command to the node, if the slot where the key is located is not assigned to the current node, a MOVED error is returned to direct the client to redirect to the correct node;

Cluster nodes are similar to stand-alone databases. They also save key value pair dictionaries and expired dictionaries. Nodes can only use database No. 0. Nodes also use slots in clusterState_ to_ Keys skip the table to save the relationship between slot and key. Which key is saved in which slot, and the score is the slot number. You can quickly obtain several keys in the same slot;

typedef struct clusterState{
 
    zskiplist *slots_to_keys;
 
} clusterState;

3, Repartition

CLUSTER SETSLOT IMPORTING <source_ The ID > command imports slot from the source node;
CLUSTER SETSLOT MIGRATING <target_ ID > command export slot to target node;

typedef struct clusterState{
 
    // The slot that the current node is importing from another node
    clusterNode *importing_slots_from[16384];
 
    // The slot that the current node is migrating
    clusterNode *migrating_slots_to[16384];
 
} clusterState;

An ASK error will be generated during re fragmentation, that is, the K-V responsible by the source node is transferred to the target node and needs to be redirected. Unlike MOVED, MOVED means that the responsibility of the slot has been transferred, and ASK is only a temporary measure when re dividing;

4, Replication and failover

Cluster replicate < node can be used in the cluster_ ID > make node node_id of the slave node, and start copying the master node, and inform other nodes of the cluster;

Each node regularly sends PING to other nodes. If a master node is marked offline by more than half of the master nodes, it is marked as offline. A new master node is elected from its slave node, which is similar to the election of the leader Sentinel. It is first come, first served, with more than half. Then execute SLAVE no one to become the new master node, which is responsible for processing the slot and informing other nodes;

5, News

There are mainly five kinds of messages among nodes: MEET, PING, PONG, FAIL (offline) and PUBLISH (group sending, the client sends messages to a node, and the node sends messages to other nodes instead of the client sending messages directly);

Keywords: Redis Distribution

Added by OttoBufonto on Fri, 11 Feb 2022 03:34:10 +0200