Redis source code analysis (27) radius tree source code analysis

This work adopts Knowledge sharing Attribution - non commercial use - sharing in the same way 4.0 international license agreement License.
This work( Zhao Long Li Blog, by Zhao Long Li Creation), by Zhao Long Li Confirm, reprint please indicate the copyright.

introduction

I believe that Radix Tree is a "mysterious" data structure that many people have heard of but have not actually used. address space in OS uses it to store the mapping from address to page; Redis uses it to store all the key information corresponding to the slot and as the main data structure of the stream, but when it is used and what advantages it has, fewer people can answer it.

Radix Tree can actually be regarded as a compressed version of Tire Tree, which can minimize the memory consumption of storage, and the time complexity of query is also O(k), where k is the longest length of stored key. It provides excellent performance and memory consumption when short string is used as key. For example, when a string of fixed length ID S is used as a key.

Algorithm analysis

Mainly talk about the insertion algorithm. Here we mainly refer to the annotation of antirez in the source code. Here we can blow it again, brother. The annotation is really beautiful!

It is difficult to understand the algorithm directly, and it needs to be supplemented with code.

Insertion is mainly divided into two cases:

  1. The key already exists in the radius tree.
  2. The found node is a compressed node and conflicts when it matches the subscript in the compressed node.
  3. The found node is a compressed node with a prefix of key. At this time, only one node needs to be inserted.
  4. The found node is not a compressed node. It is inserted after sorting directly.

Suppose the initial tree structure is as follows:

 /*
 *     "ANNIBALE" -> "SCO" -> []
 */

The detailed algorithm of the second step is as follows. There are four matching cases:

/*
 * 1) Inserting "ANNIENTARE"
 *
 *               |B| -> "ALE" -> "SCO" -> []
 *     "ANNI" -> |-|
 *               |E| -> (... continue algo ...) "NTARE" -> []
 *
 * 2) Inserting "ANNIBALI"
 *
 *                  |E| -> "SCO" -> []
 *     "ANNIBAL" -> |-|
 *                  |I| -> (... continue algo ...) []
 *
 * 3) Inserting "AGO" (Like case 1, but set iscompr = 0 into original node)
 *
 *            |N| -> "NIBALE" -> "SCO" -> []
 *     |A| -> |-|
 *            |G| -> (... continue algo ...) |O| -> []
 *
 * 4) Inserting "CIAO"
 *
 *     |A| -> "NNIBALE" -> "SCO" -> []
 *     |-|
 *     |C| -> (... continue algo ...) "IAO" -> []
 */
  1. Save the NEXT pointer of the current compressed node (the pointer of the child element, which always exists in the compressed node).
  2. Create a "split node" with non-public letters at the compressed node as child nodes.
  3. a. Replace the old node with "split node".
    b. Split the compressed node (and reassign it) to contain the characters of the conflict point. Change the sub pointer to connect to the split node. If the new compression node len is only 1, set iscompr to 0.
  4. a. If the suffix len (the remaining string length of the original compressed node after splitting the character) is non-zero, a "suffix node" is created. If the suffix node has only one character set, iscompr is 0, otherwise iscompr is 1. Set the suffix node sub pointer to NEXT.
    b. If the suffix len is zero, just use NEXT as the suffix pointer.
  5. Set the child[0] of the split node to the postfix node.
  6. Set "split node" as the current node, set the current index at child[1], and continue the insertion algorithm as usual.

The detailed algorithm of the third step is as follows:

 	 /* Inserting "ANNI"
     *
     *     "ANNI" -> "BALE" -> "SCO" -> []
     */

SPLITPOS is the character that matches the conflict point.

For example, we insert "ANNIBBLE" and "ANNIANTARE", and the value is "B".

  1. Save the NEXT pointer of the current compression node.
  2. Create a "suffix node" that contains all characters from $SPLITPOS to the end. Use $NEXT as the suffix node sub pointer. If the suffix node length is 1, set iscompr to 0. Set the node to the key with the associated value of the new insert key.
  3. Trim the current node to include the first $SPLITPOS character. Typically, if the new node length is only 1, set iscompr to 0. Use the iskey / association value in the original node.
  4. Set the suffix node as the only child pointer of the trim node created in step 1.

Delete interested friends, you can understand it yourself.

Traversal is a simple deep search in the code.

Source code analysis

Basic data structure

typedef struct rax {
    raxNode *head;		// Head node
    uint64_t numele;	// Total number of elements
    uint64_t numnodes;	// Total nodes
} rax;

typedef struct raxNode {
    uint32_t iskey:1;     // Whether the node is a key
    uint32_t isnull:1;    // Whether the node is null, and the key can be null
    uint32_t iscompr:1;   // Is it a compressed node
    uint32_t size:29;     // Number of characters stored by the node
    unsigned char data[]; // Store the information corresponding to the word node. The compressed node is different from the uncompressed node. Please refer to [1]
} raxNode;

typedef struct raxIterator {
    int flags;				// Tag bits added during traversal
    rax *rt;                // The radius tree we traverse
    unsigned char *key;     // Current node during traversal
    void *data;             // key associated data
    size_t key_len;         // Length of key
    size_t key_max;         // Maximum valid range of key
    unsigned char key_static_string[RAX_ITER_STATIC_LEN];	// Use this instead of dynamic allocation for short key s
    raxNode *node;          // Node return value
    raxStack stack;         // Record the call stack during traversal
    raxNodeCallback node_cb; /* callback */
} raxIterator;

typedef struct raxStack {
	// The number of paths is the same as the key above. Short array optimization
    void **stack; /* Points to static_items or an heap allocated array. */
    size_t items, maxitems; /* Number of items contained and total space. */
    /* Up to RAXSTACK_STACK_ITEMS items we avoid to allocate on the heap
     * and use this static array of pointers instead. */
    void *static_items[RAX_STACK_STATIC_ITEMS];
    int oom; /* True if pushing into this stack failed for OOM at some point. */
} raxStack;

raxGenericInsert

/*
 * @rax: radix Tree itself
 * @s: Key string
 * @len: The length of the string for the key
 * @data: Value to insert
 * @old: If the key exists, return the old key
 * @overwrite: If the key already exists, do you want to overwrite it
 **/
int raxGenericInsert(rax *rax, unsigned char *s, size_t len, void *data, void **old, int overwrite) {
    size_t i;	// Represents the subscript in s when the mismatched character is found in the radius tree
    int j = 0;  // Which character does not match in the node representing the mismatched character found in the radius tree
    // H stores the found node and plink stores its parent node. Because of the organization form of radix, plink must exist if h exists
    raxNode *h, **parentlink;

    debugf("### Insert %.*s with value %p\n", (int)len, s, data);
    // Executing the search process, the core code has only one loop, which can be said to be very clever
    i = raxLowWalk(rax,s,len,&h,&parentlink,&j,NULL);

	// This judgment condition indicates that the key already exists in the radius tree
	// Or the matching end point is not in the middle of a compression node, so this node can be used as a key
    if (i == len && (!h->iscompr || j == 0 /* not in the middle if j is 0 */)) {
        debugf("### Insert: node representing key exists\n");
        // In the second case, we need to allocate memory
        if (!h->iskey || (h->isnull && overwrite)) {
            h = raxReallocForData(h,data);
            if (h) memcpy(parentlink,&h,sizeof(h));
        }
        if (h == NULL) {
            errno = ENOMEM;
            return 0;
        }

        // In the first case, you can update directly according to overwrite
        // In fact, this is the condition H - > iskey & & (! H - > isnull |! Overwrite)
        if (h->iskey) {
            if (old) *old = raxGetData(h);
            if (overwrite) raxSetData(h,data);
            errno = 0;
            return 0; /* Element already exists. */
        }
		
        /* Otherwise set the node as a key. Note that raxSetData()
         * will set h->iskey. */
        // In fact, the judgment conditions here are as follows
        // !h->iskey 
        raxSetData(h,data);
        rax->numele++;
        return 1; /* Element inserted. */
    }
	// Here I have to say that antriez is a real boss, and the comments can't be picky. The big brother of this part of the code combs the context of the algorithm very clearly with comments
	// Because the space is not in the code, we will talk about this part of the algorithm later.
    /* ------------------------- ALGORITHM 1 --------------------------- */
    // The node found is a compressed node and I= Len, prove that it needs to be split
    if (h->iscompr && i != len) {
        debugf("ALGO 1: Stopped at compressed node %.*s (%p)\n",
            h->size, h->data, (void*)h);
        debugf("Still to insert: %.*s\n", (int)(len-i), s+i);
        debugf("Splitting at %d: '%c'\n", j, ((char*)h->data)[j]);
        debugf("Other (key) letter is '%c'\n", s[i]);

        /* 1: Save next pointer. */
        raxNode **childfield = raxNodeLastChildPtr(h);
        raxNode *next;
        memcpy(&next,childfield,sizeof(next));
        debugf("Next is %p\n", (void*)next);
        debugf("iskey %d\n", h->iskey);
        if (h->iskey) {
            debugf("key value is %p\n", raxGetData(h));
        }

        /* Set the length of the additional nodes we will need. */
        size_t trimmedlen = j;
        size_t postfixlen = h->size - j - 1;
        // If h is the key, it indicates that the current node has a val, that is, the first node after splitting is the key
        // Obviously, if trimmedlen is zero, splitnode is key, otherwise trimmedlen is key
        // This sentence must be understood thoroughly
        int split_node_is_key = !trimmedlen && h->iskey && !h->isnull;
        size_t nodesize;

        /* 2: Create the split node. Also allocate the other nodes we'll need
         *    ASAP, so that it will be simpler to handle OOM. */
        // splitnode contains only one node
        raxNode *splitnode = raxNewNode(1, split_node_is_key);
        raxNode *trimmed = NULL;
        raxNode *postfix = NULL;
		
		// Normal allocation of memory
        if (trimmedlen) {
            nodesize = sizeof(raxNode)+trimmedlen+raxPadding(trimmedlen)+
                       sizeof(raxNode*);
            if (h->iskey && !h->isnull) nodesize += sizeof(void*);
            trimmed = rax_malloc(nodesize);
        }

        if (postfixlen) {
            nodesize = sizeof(raxNode)+postfixlen+raxPadding(postfixlen)+
                       sizeof(raxNode*);
            postfix = rax_malloc(nodesize);
        }

        /* OOM? Abort now that the tree is untouched. */
        if (splitnode == NULL ||
            (trimmedlen && trimmed == NULL) ||
            (postfixlen && postfix == NULL))
        {
            rax_free(splitnode);
            rax_free(trimmed);
            rax_free(postfix);
            errno = ENOMEM;
            return 0;
        }
        // In either case, splitnode always has a valid bit
        splitnode->data[0] = h->data[j];
		
        if (j == 0) {
            /* 3a: Replace the old node with the split node. */
            // The current node will be replaced by splitnode. A node is divided into splitnode and postfix
            if (h->iskey) {
                void *ndata = raxGetData(h);
                raxSetData(splitnode,ndata);
            }
            memcpy(parentlink,&splitnode,sizeof(splitnode));
        } else {
            /* 3b: Trim the compressed node. */
            // The current node is replaced by trim, and a node is split into trim, splitnode, postfix or trim, splitnode
            trimmed->size = j;
            memcpy(trimmed->data,h->data,j);
            trimmed->iscompr = j > 1 ? 1 : 0;
            trimmed->iskey = h->iskey;
            trimmed->isnull = h->isnull;
            if (h->iskey && !h->isnull) {
                void *ndata = raxGetData(h);
                raxSetData(trimmed,ndata);
            }
            raxNode **cp = raxNodeLastChildPtr(trimmed);
            memcpy(cp,&splitnode,sizeof(splitnode));
            memcpy(parentlink,&trimmed,sizeof(trimmed));
            parentlink = cp; /* Set parentlink to splitnode parent. */
            rax->numnodes++;
        }

        /* 4: Create the postfix node: what remains of the original
         * compressed node after the split. */
        if (postfixlen) {
            /* 4a: create a postfix node. */
            postfix->iskey = 0;
            postfix->isnull = 0;
            postfix->size = postfixlen;
            postfix->iscompr = postfixlen > 1;
            // If postfix exists, connect postfix to next
            memcpy(postfix->data,h->data+j+1,postfixlen);
            raxNode **cp = raxNodeLastChildPtr(postfix);
            memcpy(cp,&next,sizeof(next));
            rax->numnodes++;
        } else {
            /* 4b: just use next as postfix node. */
           	// Postfix is NULL. Obviously, setting postfix to next is the most convenient for the node created in step 3
           	// Hide complexity from step 5
            postfix = next;
        }

        /* 5: Set splitnode first child as the postfix node. */
        raxNode **splitchild = raxNodeLastChildPtr(splitnode);
        memcpy(splitchild,&postfix,sizeof(postfix));
        // Connecting splitnode and postfix

        /* 6. Continue insertion: this will cause the splitnode to
         * get a new child (the non common character at the currently
         * inserted key). */
        rax_free(h);
        h = splitnode;
        // Take h as the current node. After splitting the original node, we start to insert the newly added node
    } else if (h->iscompr && i == len) {
    /* ------------------------- ALGORITHM 2 --------------------------- */
        debugf("ALGO 2: Stopped at compressed node %.*s (%p) j = %d\n",
            h->size, h->data, (void*)h, j);

        /* Allocate postfix & trimmed nodes ASAP to fail for OOM gracefully. */
        // Allocate memory for postfix and trimmed
        size_t postfixlen = h->size - j;
        size_t nodesize = sizeof(raxNode)+postfixlen+raxPadding(postfixlen)+
                          sizeof(raxNode*);
        if (data != NULL) nodesize += sizeof(void*);
        raxNode *postfix = rax_malloc(nodesize);

        nodesize = sizeof(raxNode)+j+raxPadding(j)+sizeof(raxNode*);
        if (h->iskey && !h->isnull) nodesize += sizeof(void*);
        raxNode *trimmed = rax_malloc(nodesize);

        if (postfix == NULL || trimmed == NULL) {
            rax_free(postfix);
            rax_free(trimmed);
            errno = ENOMEM;
            return 0;
        }

        /* 1: Save next pointer. */
        raxNode **childfield = raxNodeLastChildPtr(h);
        raxNode *next;
        memcpy(&next,childfield,sizeof(next));

        /* 2: Create the postfix node. */
        postfix->size = postfixlen;
        postfix->iscompr = postfixlen > 1;
        // The code in this place is true
        // At first glance, we don't consider that data is null, and set isnull error to zero
        // If the data is null, the last data pointer will not be allocated, and there may be problems with direct setting
        // But big brother solves all problems in the order of raxSetData and raxNodeLastChildPtr
        postfix->iskey = 1;
        postfix->isnull = 0;
        memcpy(postfix->data,h->data+j,postfixlen);
        raxSetData(postfix,data);
        raxNode **cp = raxNodeLastChildPtr(postfix);
        memcpy(cp,&next,sizeof(next));
        rax->numnodes++;

        /* 3: Trim the compressed node. */
        trimmed->size = j;
        trimmed->iscompr = j > 1;
        trimmed->iskey = 0;
        // There seems to be some problems with the setting here. It should be set to 1, otherwise it is too much
        trimmed->isnull = 0;
        memcpy(trimmed->data,h->data,j);
        memcpy(parentlink,&trimmed,sizeof(trimmed));
     	
        if (h->iskey) {
            void *aux = raxGetData(h);
            raxSetData(trimmed,aux);
        }

        /* Fix the trimmed node child pointer to point to
         * the postfix node. */
        // Insert trimmed after postfix
        cp = raxNodeLastChildPtr(trimmed);
        memcpy(cp,&postfix,sizeof(postfix));

        /* Finish! We don't need to continue with the insertion
         * algorithm for ALGO 2. The key is already inserted. */
        rax->numele++;
        rax_free(h);
        return 1; /* Key inserted. */
    }

    /* We walked the radix tree as far as we could, but still there are left
     * chars in our string. We need to insert the missing nodes. */
    // We have finished the splitting step, and then we need to insert new nodes
    // Although it looks like a while, it only runs two cycles
    while(i < len) {
        raxNode *child;

        /* If this node is going to have a single child, and there
         * are other characters, so that that would result in a chain
         * of single-childed nodes, turn it into a compressed node. */
        if (h->size == 0 && len-i > 1) {
            debugf("Inserting compressed node\n");
            size_t comprsize = len-i;
            if (comprsize > RAX_NODE_MAX_SIZE)
                comprsize = RAX_NODE_MAX_SIZE;
            // Give n realloc and create child nodes
            raxNode *newh = raxCompressNode(h,s+i,comprsize,&child);
            if (newh == NULL) goto oom;
            h = newh;
            memcpy(parentlink,&h,sizeof(h));
            parentlink = raxNodeLastChildPtr(h);
            i += comprsize;
        } else {
        	// In the first step, create child nodes of splitnode
            debugf("Inserting normal node\n");
            raxNode **new_parentlink;
            raxNode *newh = raxAddChild(h,s[i],&child,&new_parentlink);
            if (newh == NULL) goto oom;
            h = newh;
            memcpy(parentlink,&h,sizeof(h));
            parentlink = new_parentlink;
            i++;
        }
        rax->numnodes++;
        h = child;
    }
    // Reloc the new node and copy the data
    raxNode *newh = raxReallocForData(h,data);
    if (newh == NULL) goto oom;
    h = newh;
    if (!h->iskey) rax->numele++;
    raxSetData(h,data);
    memcpy(parentlink,&h,sizeof(h));
    return 1; /* Element inserted. */

oom:
    /* This code path handles out of memory after part of the sub-tree was
     * already modified. Set the node as a key, and then remove it. However we
     * do that only if the node is a terminal node, otherwise if the OOM
     * happened reallocating a node in the middle, we don't need to free
     * anything. */
    if (h->size == 0) {
        h->isnull = 1;
        h->iskey = 1;
        rax->numele++; /* Compensate the next remove. */
        assert(raxRemove(rax,s,i,NULL) != 0);
    }
    errno = ENOMEM;
    return 0;
}

raxSeek

int raxSeek(raxIterator *it, const char *op, unsigned char *ele, size_t len) {
    int eq = 0, lt = 0, gt = 0, first = 0, last = 0;
	// Here's a point, rax_ ITER_ The EOF flag indicates that the element we need is not found
	// After the raxSeek function is completed, use flags to determine whether the required element is found
	// flags with RAX_ITER_EOF indicates that no value was found
    it->stack.items = 0; /* Just resetting. Initialized by raxStart(). */
    it->flags |= RAX_ITER_JUST_SEEKED;	// Set subscript bit
    it->flags &= ~RAX_ITER_EOF;			// Remove the EOF mark
    it->key_len = 0;
    it->node = NULL;

    /* Set flags according to the operator used to perform the seek. */
    if (op[0] == '>') {
        gt = 1;
        if (op[1] == '=') eq = 1;
    } else if (op[0] == '<') {
        lt = 1;
        if (op[1] == '=') eq = 1;
    } else if (op[0] == '=') {
        eq = 1;
    } else if (op[0] == '^') {
        first = 1;
    } else if (op[0] == '$') {
        last = 1;
    } else {
        errno = 0;
        return 0; /* Error. */
    }

    /* If there are no elements, set the EOF condition immediately and
     * return. */
    if (it->rt->numele == 0) {
        it->flags |= RAX_ITER_EOF;
        return 1;
    }

    if (first) {
        /* Seeking the first key greater or equal to the empty string
         * is equivalent to seeking the smaller key available. */
        return raxSeek(it,">=",NULL,0);
    }

    if (last) {
        /* Find the greatest key taking always the last child till a
         * final node is found. */
        it->node = it->rt->head;
        // The name should be the largest node, but in fact, the matching process is very naked
        // That is, the compressed node takes all the data, and the uncompressed node takes the last character
        if (!raxSeekGreatest(it)) return 0;
        assert(it->node->iskey);
        it->data = raxGetData(it->node);
        return 1;
    }

    /* We need to seek the specified key. What we do here is to actually
     * perform a lookup, and later invoke the prev/next key code that
     * we already use for iteration. */
    int splitpos = 0;
    // Find the corresponding value of ele in rt and set the mismatched index on key and node
    size_t i = raxLowWalk(it->rt,ele,len,&it->node,NULL,&splitpos,&it->stack);

    /* Return OOM on incomplete stack info. */
    if (it->stack.oom) return 0;
	
	// If you find an exact matching point and there is a key, you can refer to the first judgment condition of raxGenericInsert
    if (eq && i == len && (!it->node->iscompr || splitpos == 0) &&
        it->node->iskey)
    {	// Insert the key data into it - > key in the Iterator. Here is a short string optimization, which reminds me of the 15 bytes of std::string
        if (!raxIteratorAddChars(it,ele,len)) return 0; /* OOM. */
        it->data = raxGetData(it->node);
    } else if (lt || gt) {
   		// The first condition needs to match exactly and set eq,
        /* Exact key not found or eq flag not set. We have to set as current
         * key the one represented by the node we stopped at, and perform
         * a next/prev operation to seek. To reconstruct the key at this node
         * we start from the parent and go to the current node, accumulating
         * the characters found along the way. */
        // Continue to enter the stack, where the role of the stack will appear, which is to realize lt
        // But the logic here is a bit confusing. It seems that it is just to rebuild it - > key. Why not use i to update it directly with key?
        // Handed in a PR: #9115, see if it will merge
        if (!raxStackPush(&it->stack,it->node)) return 0; // In order to cater to the child below
        for (size_t j = 1; j < it->stack.items; j++) {
            raxNode *parent = it->stack.stack[j-1];
            raxNode *child = it->stack.stack[j];
            if (parent->iscompr) {	// Obviously, the compression node needs to copy size characters
                if (!raxIteratorAddChars(it,parent->data,parent->size))
                    return 0;
            } else {
                raxNode **cp = raxNodeFirstChildPtr(parent);
                unsigned char *p = parent->data;
                while(1) {
                    raxNode *aux;
                    memcpy(&aux,cp,sizeof(aux));
                    if (aux == child) break;
                    cp++;
                    p++;
                }
                if (!raxIteratorAddChars(it,p,1)) return 0;
            }
        }
        raxStackPop(&it->stack);

        /* We need to set the iterator in the correct state to call next/prev
         * step in order to seek the desired element. */
        debugf("After initial seek: i=%d len=%d key=%.*s\n",
            (int)i, (int)len, (int)it->key_len, it->key);
        if (i != len && !it->node->iscompr) {
        	// When the intermediate matching in the uncompressed node is unsuccessful
            if (!raxIteratorAddChars(it,ele+i,1)) return 0;
            debugf("Seek normal node on mismatch: %.*s\n",
                (int)it->key_len, (char*)it->key);

            it->flags &= ~RAX_ITER_JUST_SEEKED; // Clear the searched flag when looking for nodes
            if (lt && !raxIteratorPrevStep(it,1)) return 0;
            // For the two functions that are very important to understand the radius tree, describe the next function that is greater than / less than the current key
            // After parsing raxSeek, see raxIteratorNextStep
            // The algorithm is very similar to prev/next in red black tree
            if (gt && !raxIteratorNextStep(it,1)) return 0;
            it->flags |= RAX_ITER_JUST_SEEKED; /* Ignore next call. */
        } else if (i != len && it->node->iscompr) {
            debugf("Compressed mismatch: %.*s\n",
                (int)it->key_len, (char*)it->key);
            /* In case of a mismatch within a compressed node. */
            // It is similar to the above logic, but the bottom-up search process is performed manually
            int nodechar = it->node->data[splitpos];
            int keychar = ele[i];
            it->flags &= ~RAX_ITER_JUST_SEEKED;
            if (gt) {
                /* If the key the compressed node represents is greater
                 * than our seek element, continue forward, otherwise set the
                 * state in order to go back to the next sub-tree. */
                // The value of the split point is greater than the ele we passed in, and it will be finished directly next
                if (nodechar > keychar) {
                    if (!raxIteratorNextStep(it,0)) return 0;
                } else {	// Otherwise, you need to add this node to find the first element larger than the key constructed now. This element must be larger than ele
               		// I personally think that adding characters here is just to adapt to the character deletion operation in raxIteratorNextStep
               		// So I changed it to (it - > node = raxstackpop) &,
               		// But after lt is changed, it can't pass the test. gdb comes out because it crosses the boundary. It's OK to judge it as zero
                    if (!raxIteratorAddChars(it,it->node->data,it->node->size))
                        return 0;
                    if (!raxIteratorNextStep(it,1)) return 0;
                }
            }
            if (lt) {
                /* If the key the compressed node represents is smaller
                 * than our seek element, seek the greater key in this
                 * subtree, otherwise set the state in order to go back to
                 * the previous sub-tree. */
                if (nodechar < keychar) {
                    if (!raxSeekGreatest(it)) return 0;
                    it->data = raxGetData(it->node);
                } else {
                	// There are some conditions that are not considered here. Replace them with (it - > node = raxstackpop (& it - > stack);) He died suddenly
                    if (!raxIteratorAddChars(it,it->node->data,it->node->size))
                        return 0;
                    // prev is simpler than next, because it only needs to go to the root first, and it is possible on both sides of next
                    // I've seen the test. Here's the key_len may be equal to zero. If you go in - 1, you will die suddenly, so you can judge it as zero here
                    // But it's not as simple as the existing one, but it's obviously more efficient. Give it a PR try. After all, it took an afternoon to work out the solution and study with pay
                    // Handed in a PR:#9120 to see how the community views this issue
                    if (!raxIteratorPrevStep(it,1)) return 0;
                }
            }
            it->flags |= RAX_ITER_JUST_SEEKED; /* Ignore next call. */
        } else {
            debugf("No mismatch: %.*s\n",
                (int)it->key_len, (char*)it->key);
            /* If there was no mismatch we are into a node representing the
             * key, (but which is not a key or the seek operator does not
             * include 'eq'), or we stopped in the middle of a compressed node
             * after processing all the key. Continue iterating as this was
             * a legitimate key we stopped at. */
            it->flags &= ~RAX_ITER_JUST_SEEKED;
            if (it->node->iscompr && it->node->iskey && splitpos && lt) {
                /* If we stopped in the middle of a compressed node with
                 * perfect match, and the condition is to seek a key "<" than
                 * the specified one, then if this node is a key it already
                 * represents our match. For instance we may have nodes:
                 *
                 * "f" -> "oobar" = 1 -> "" = 2
                 *
                 * Representing keys "f" = 1, "foobar" = 2. A seek for
                 * the key < "foo" will stop in the middle of the "oobar"
                 * node, but will be our match, representing the key "f".
                 *
                 * So in that case, we don't seek backward. */
                // In the most perfect case, just get the value directly 
                it->data = raxGetData(it->node);
            } else {
            	// The value of ele is matched, but the desired value is not matched. Just find the previous or next one
                if (gt && !raxIteratorNextStep(it,0)) return 0;
                if (lt && !raxIteratorPrevStep(it,0)) return 0;
            }
            it->flags |= RAX_ITER_JUST_SEEKED; /* Ignore next call. */
        }
    } else {
        // Set equal to but no match
        it->flags |= RAX_ITER_EOF;
        return 1;
    }
    return 1;
}

raxIteratorNextStep

// The noup here is actually difficult to understand
// Personal understanding is that when it is false, it is better to directly find a larger key than the current key. Otherwise, you need to find a larger parent node and read the code better
int raxIteratorNextStep(raxIterator *it, int noup) {
    if (it->flags & RAX_ITER_EOF) {
        return 1;
    } else if (it->flags & RAX_ITER_JUST_SEEKED) {
        it->flags &= ~RAX_ITER_JUST_SEEKED;
        return 1;
    }

    /* Save key len, stack items and the node where we are currently
     * so that on iterator EOF we can restore the current key and state. */
    // Save the initial value and perform the recovery operation when there is no successful match
    size_t orig_key_len = it->key_len;
    size_t orig_stack_items = it->stack.items;
    raxNode *orig_node = it->node;

    while(1) {
    	// Number of child nodes
        int children = it->node->iscompr ? 1 : it->node->size;
        if (!noup && children) {
            debugf("GO DEEPER\n");
            /* Seek the lexicographically smaller key in this subtree, which
             * is the first one found always going towards the first child
             * of every successive node. */
            // Find the first element with smaller dictionary order in the subtree
            if (!raxStackPush(&it->stack,it->node)) return 0;
            raxNode **cp = raxNodeFirstChildPtr(it->node);
            if (!raxIteratorAddChars(it,it->node->data,
                it->node->iscompr ? it->node->size : 1)) return 0;
            memcpy(&it->node,cp,sizeof(it->node));
            /* Call the node callback if any, and replace the node pointer
             * if the callback returns true. */
            if (it->node_cb && it->node_cb(&it->node))
                memcpy(cp,&it->node,sizeof(it->node));
          	// If it is a key, it will be returned directly. Otherwise, it will continue to traverse downward
            if (it->node->iskey) {
                it->data = raxGetData(it->node);
                return 1;
            }
        } else {
            // We have finished exploring the previous subtree and will switch to the new subtree
            // Up until a node is found. The child nodes of the node are larger than the current key in dictionary order
            // Of course, noup controls the behavior of entering for the first time
            // It's said that this function can be written without error. It's really powerful, big brother
            while(1) {
                int old_noup = noup;

               	// head is found at the end of the search, but it hasn't been found yet. Set rax at this time_ ITER_ EOF, this is important
               	// RAX_ITER_EOF is used to judge whether it can be found. Write a small demo below for a simple test
                if (!noup && it->node == it->rt->head) {
                    it->flags |= RAX_ITER_EOF;
                    it->stack.items = orig_stack_items;
                    it->key_len = orig_key_len;
                    it->node = orig_node;
                    return 1;
                }
                /* If there are no children at the current node, try parent's
                 * next child. */
                unsigned char prevchild = it->key[it->key_len-1];
                if (!noup) {	// This node has no children. Go back to the previous node on the stack
                    it->node = raxStackPop(&it->stack);
                } else {
                    noup = 0;
                }
                // Delete all the data of this node in the key
                int todel = it->node->iscompr ? it->node->size : 1;
                raxIteratorDelChars(it,todel);

                /* Try visiting the next child if there was at least one
                 * additional child. */
                // Note here that only non compressed nodes can be compared, and compressed nodes can skip directly
                // Because the compression node must be followed by the key we just abandoned
                if (!it->node->iscompr && it->node->size > (old_noup ? 0 : 1)) {
                    raxNode **cp = raxNodeFirstChildPtr(it->node);
                    int i = 0;
                    while (i < it->node->size) {
                        debugf("SCAN NEXT %c\n", it->node->data[i]);
                      	// A subtree larger than the current node is found in the parent node, which is an uncompressed node
                      	// Of course, there is no need to worry about why prevchild takes [it - > key_len-1], because at most one letter of the uncompressed node is recorded
                        if (it->node->data[i] > prevchild) break;
                        i++;
                        cp++;
                    }
                    if (i != it->node->size) {
                        debugf("SCAN found a new node\n");
                        raxIteratorAddChars(it,it->node->data+i,1);
                        if (!raxStackPush(&it->stack,it->node)) return 0;
                        memcpy(&it->node,cp,sizeof(it->node));
                        /* Call the node callback if any, and replace the node
                         * pointer if the callback returns true. */
                        if (it->node_cb && it->node_cb(&it->node))
                            memcpy(cp,&it->node,sizeof(it->node));
                        if (it->node->iskey) {
                            it->data = raxGetData(it->node);
                            return 1;
                        }
                        break;
                    }
                }
            }
        }
    }
}

Small example

Remember the rax we mentioned above_ ITER_ Is there a small example of EOF? It doesn't matter if you don't remember. Just ctrl + f,

int main() {
    rax *radixTree;
    radixTree = raxNew();
    char *vector[] = {"ANNIBBLE",
                      "ANNIANTARE",
                      "ANNIANTBDE"};
    char *data[] = {"111",
                    "222"};
    void *old = NULL;
    for (int i = 0; i < sizeof(vector)/sizeof(char*); ++i) {
        raxInsert(radixTree, vector[i], strlen(vector[i]), data[i], &old);
    }
    
    if (old != NULL) {
        printf("%s\n", (char*)old);
    }
    raxIterator ri;
    raxStart(&ri,radixTree);
    //char item[] = "ANNIBADE";   // non-compress node
    char item[] = "ANNIC";
    raxSeek(&ri,">=",item,strlen(item));
    
    printf("key : %.*s, data : %s flag:%d\n", ri.key_len ,ri.key, ri.data, ri.flags);
    return 0;
}
// gcc -std=c99 -g -lm main.c rax.c -DRAX_DEBUG_MSG

The debugf information in the middle will not be printed. Look at the results directly:

key : ANNIC, data : (null) flag:3

Obviously, it can't be found. What does this show? After reading the above code, you can see why this problem occurs. The reason is that raxIteratorAddChars set EOF at the (! Noup & & it - > node = = it - > RT - > head) judgment before raxIteratorNextStep.

Therefore, it can be seen that raxSeek uses flags to judge whether the search is successful.

Let's mention rax There is an interesting function in C, namely raxShow, which can graphically display the current radius tree. Take a simple example:

The above Radix tree looks like this after using raxShow:

"ANNI" -> [AB]
        `-(A) "NT" -> [AB]
                       `-(A) "RE" -> []=0x406c23
                       `-(B) "DE" -> []=0x406c00
        `-(B) "BLE" -> []=0x406c1f

It can be said to be very simple and convenient.

Next, there is nothing more to say about other functions. Except for raxRemove, other functions are not auxiliary functions, or the packaging of raxInsert and raxSeek. They are relatively not so troublesome.

summary

Radix Tree is a relatively new data structure in Redis. For me, it's to see some code of stream, so I spent some time here. Of course, it's not necessary for ordinary enthusiasts. Just understand its basic principle.

Here you can recommend a short and concise article by sister Xia YuXun [1]. I have to say that the picture is very good! No, this article may have to spend another 30% of the time understanding this part of the code. What makes people angry is that the quality of most people on the current network really needs to be improved. The articles stained with radio tree source code to analyze these keywords basically copied this article without source.

reference resources:

  1. Redis · engine features · radius tree source code analysis

Keywords: Redis

Added by Leipe_Po on Thu, 27 Jan 2022 19:25:45 +0200