1, Node
At the beginning, each Redis instance is A cluster itself. The work of connecting each node is completed through the CLUSTER MEET command. After A and B shake hands, A will spread B's information to other nodes in A's cluster through the Gossip protocol, so that other nodes can also shake hands with B;
// Cluster nodes struct clusterNode{ // When the node was created mstime_t ctime; // Node name char name[REDIS_CLUSTER_NAMELEN]; // Node status int flags; // Configure era for failover uint64_t configEpoch; // Node IP char ip[REDIS_IP_STR_LEN]; // Node port number int port; // Connected information clusterlink *link; }; // link attribute of clusterNode typedef struct clusterlink{ // Connection creation time mstime_t ctime; // socket descriptor int fd; // Output buffer sds sndbuf; // Input buffer sds rcvbuf; // The node associated with this connection struct clusterNode *node; } clusterlink;
Each node also saves a clusterState structure, that is, each node saves the state of the cluster;
typedef struct clusterState{ clusterNode *myself; // Configuration Era uint64_t currentEpoch; // Cluster status, online / offline int state; // Number of nodes in the cluster that handle at least one slot int size; // Cluster node list, map < name, clusternode > dict *nodes; } clusterState;
2, Slot assignment
Redis cluster saves key value pairs in slices. The whole database of the cluster is divided into 16384 (2 ^ 14) slots. If any slot is not processed by the node, the cluster is offline. By sending the cluster addlots 0 1 2 3... 100 command to the node, one or more slots can be assigned to the node. A node will not only record its own slots in the slots of the clusterNode, but also send messages to inform other nodes of the cluster;
struct clusterNode{ // Save the slot of the node with bitmap unsigned char slots[16384/8]; int numslots; };
The slots array in clusterState records the assignment information of all slots in the cluster, while clusternode Slots array only records the slot assignment of one node;
typedef struct clusterState{ // It records which clusterNode manages each slot clusterNode *slots[16384]; } clusterState;
The Redis cluster has 16384 hash slots. Each key takes the module of 16384 after passing the CRC16 verification to determine which slot to place. Each node of the cluster is responsible for part of the hash slots. When the client sends a command to the node, if the slot where the key is located is not assigned to the current node, a MOVED error is returned to direct the client to redirect to the correct node;
Cluster nodes are similar to stand-alone databases. They also save key value pair dictionaries and expired dictionaries. Nodes can only use database No. 0. Nodes also use slots in clusterState_ to_ Keys skip the table to save the relationship between slot and key. Which key is saved in which slot, and the score is the slot number. You can quickly obtain several keys in the same slot;
typedef struct clusterState{ zskiplist *slots_to_keys; } clusterState;
3, Repartition
CLUSTER SETSLOT IMPORTING <source_ The ID > command imports slot from the source node;
CLUSTER SETSLOT MIGRATING <target_ ID > command export slot to target node;
typedef struct clusterState{ // The slot that the current node is importing from another node clusterNode *importing_slots_from[16384]; // The slot that the current node is migrating clusterNode *migrating_slots_to[16384]; } clusterState;
An ASK error will be generated during re fragmentation, that is, the K-V responsible by the source node is transferred to the target node and needs to be redirected. Unlike MOVED, MOVED means that the responsibility of the slot has been transferred, and ASK is only a temporary measure when re dividing;
4, Replication and failover
Cluster replicate < node can be used in the cluster_ ID > make node node_id of the slave node, and start copying the master node, and inform other nodes of the cluster;
Each node regularly sends PING to other nodes. If a master node is marked offline by more than half of the master nodes, it is marked as offline. A new master node is elected from its slave node, which is similar to the election of the leader Sentinel. It is first come, first served, with more than half. Then execute SLAVE no one to become the new master node, which is responsible for processing the slot and informing other nodes;
5, News
There are mainly five kinds of messages among nodes: MEET, PING, PONG, FAIL (offline) and PUBLISH (group sending, the client sends messages to a node, and the node sends messages to other nodes instead of the client sending messages directly);