Data type analysis of Redis source code - DICT

Data type analysis of Redis source code - DICT

The current Redis analysis version is 6.2, which should be noted.

DICT, dictionary data structure, is essentially a hash table. Related documents mainly include dict.h and dict.c.

Infrastructure

dictEntry is a hash node and a key value pair, in which the value adopts the union v. it can be seen from * next in the structure that this is a one-way linked list, and the necessary header node is saved separately and hung on the hash table.

typedef struct dictEntry {
    void *key; // key
    union {
        void *val;
        uint64_t u64; // unsigned
        int64_t s64; // signed
        double d;
    } v; // value
    struct dictEntry *next; // Next node address
} dictEntry;

Then there is the dictType dictionary type, which distinguishes dictionaries. The related processing functions of different types of dictionaries can be different.

typedef struct dictType {
    // The hash function generates the corresponding hash value according to the Key
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key); // Key copy function
    void *(*valDup)(void *privdata, const void *obj); // Value copy function
    int (*keyCompare)(void *privdata, const void *key1, const void *key2); // Key comparison function
    void (*keyDestructor)(void *privdata, void *key); // Destroy key function
    void (*valDestructor)(void *privdata, void *obj); // Destroy value function
    int (*expandAllowed)(size_t moreMem, double usedRatio); // Expansion function?
} dictType;

dictht hash table structure.

typedef struct dictht {
    dictEntry **table; // The secondary pointer stores the head pointer of the hash node linked list, mainly to solve hash conflicts
    unsigned long size; // Hash table size
    unsigned long sizemask; // Hash table size mask size-1. Calculate index idx = hash & sizemask
    unsigned long used; // Number of existing hash nodes
} dictht; // Hashtable 

dict dictionary.

typedef struct dict {
    dictType *type; // Dictionary type
    void *privdata; // Private data
    // Hash table array. There are only two hash tables. Generally, only the first hash table is used.
    // ht[1] can only be used in rehash of ht[0]
    dictht ht[2]; 
    // Rehash IDX ID is - 1 rehash process index when there is no rehash
    long rehashidx;
    // rehash pause ID
    // >0 indicates pause < 0 indicates coding error
    int16_t pauserehash;
} dict;

dictIterator dictionary iterator.

typedef struct dictIterator {
    dict *d; // Current iteration dictionary
    long index; // The hash table index location that the iterator is currently pointing to
    // Table hash table 0 | 1 currently iterated
    // Safe identifies whether the current iterator is safe
    // If safe is 1, it is safe. During the iteration, dictAdd, dictFind and other functions can be executed to modify the dictionary
    // If safe is not 1, you can only call dictNext to iterate over the dictionary. Limited modification
    int table, safe;
    // entry current iteration node pointer
    // nextEntry is the next node of the current iteration node. The entry node may be modified when the security iterator runs
    // Additional pointers are required to save the next node to prevent pointer loss
    dictEntry *entry, *nextEntry;
    // Unsafe iterator signature for misuse detection
    long long fingerprint;
} dictIterator;

Macro definition function

DICT_NOTUSED

Do unused parameters generate warnings?

dictFreeVal

Releases the specified hash node value of the specified dictionary. If the value destructor is defined, it will not be processed if it is not defined.

#define dictFreeVal(d, entry) \
    if ((d)->type->valDestructor) \
        (d)->type->valDestructor((d)->privdata, (entry)->v.val)
dictSetVal

Set the pointing hash node value of the specified dictionary, and adopt the do/while structure to ensure that the code is executed once. When valDup is defined, the return value is used. On the contrary, it is set directly.

#define dictSetVal(d, entry, _val_) do { \
    if ((d)->type->valDup) \
        (entry)->v.val = (d)->type->valDup((d)->privdata, _val_); \
    else \
        (entry)->v.val = (_val_); \
} while(0)
dictSetSignedIntegerVal

Set hash node value, signed integer. Directly assign entry - > v.s64.

dictSetUnsignedIntegerVal

Set hash node value, unsigned integer. Directly assign entry - > v.u64.

dictSetDoubleVal

Set hash node value, double. Directly assign entry - > v.d.

dictFreeKey

Releases the specified hash node key of the specified dictionary, similar to dictFreeVal.

dictSetKey

Set the specified hash node key of the specified dictionary, similar to dictSetVal.

dictCompareKeys

Compares two hash keys of the specified dictionary. If D - > type - > keyCompare exists, call keyCompare for comparison. Otherwise, directly compare key1 == key2.

dictHashKey

Generates a hash value for the specified key of the specified dictionary.

dictGetKey

Gets the specified hash node key.

dictGetVal

Gets the value of the specified hash node.

dictGetSignedIntegerVal

Gets the signed integer value of the specified hash node.

dictGetUnsignedIntegerVal

Gets the unsigned integer value of the specified hash node.

dictGetDoubleVal

Gets the double value of the specified hash node.

dictSlots

Gets the total number of slots ht[0].size + ht[1].size for the specified dictionary.

dictSize

Gets the total number of nodes used by the hash table of the specified dictionary.

dictIsRehashing

Determines whether the specified dictionary is rehash.

dictPauseRehashing

Pauses the rehash process for the specified dictionary.

dictResumeRehashing

Continue the rehash process of the specified dictionary.

Constants and variables

DICT_OK

Macro definition constants. Indicates that the dictionary was processed successfully.

DICT_ERR

Macro definition constants. Indicates that dictionary processing failed.

DICT_HT_INITIAL_SIZE

Macro definition constants. Hash table initialization size.

dictTypeHeapStringCopyKey

External variables. Dictionary type, heap string copy key?

dictTypeHeapStrings

External variables. Dictionary type, heap string?

dictTypeHeapStringCopyKeyValue

External variables. Dictionary type, heap string copy key value?

dict_can_resize

Static variables. Whether the hash table size can be adjusted. However, if the value is 0, not all adjustments are blocked.

dict_force_resize_ratio

Static variables. Hash table size force scaling. used/size, if the ratio is greater than dict_force_resize_ratio, you need to force adjustment.

dict_hash_function_seed[16]

Static variables. Hash function seed.

hash function

dictSetHashFunctionSeed

Set hash seed. Implemented using memcpy.

dictGetHashFunctionSeed

Get hash seed. Direct return to Dict_ hash_ function_ Just seed.

dictGenHashFunction

Hash function. The shell is implemented in the sipash. C file.

dictGenCaseHashFunction

Is also a hash function. Shell sipash_ Nocase is implemented in the sipash. C file.

Private function

_dictExpandIfNeeded

Expand the hash table if necessary. Call the dictExpand interface function to implement.

static int _dictExpandIfNeeded(dict *d)
{
    // The rehash process is in progress. Extension is not supported. Just return directly
    if (dictIsRehashing(d)) return DICT_OK;

    // If the hash table is empty, it is directly extended to the initial size DICT_HT_INITIAL_SIZE
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    // Condition 1: hash table used size > = hash table size, indicating saturation or overflow
    // Condition 2 Dict_ can_ Whether resize supports resizing or has reached the forced expansion ratio (load factor)
    // Condition 3: the dictionary type can be extended and adjusted
    // Expand to used+1 Size... In this way, the function may be triggered more frequently. Only add existing nodes + 1.
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio) &&
        dictTypeExpandAllowed(d))
    {
        return dictExpand(d, d->ht[0].used + 1);
    }
    return DICT_OK;
}
_dictNextPower

Power expand the initial hash table size until it is greater than or equal to the given size.

static unsigned long _dictNextPower(unsigned long size)
{
    unsigned long i = DICT_HT_INITIAL_SIZE; // Initial size 4

    if (size >= LONG_MAX) return LONG_MAX + 1LU; // No overflow?
    while(1) {
        if (i >= size) // i = 2^n
            return i;
        i *= 2;
    }
}
dictTypeExpandAllowed

Determines whether the specified dictionary allows hash table extension. It is mainly based on the expandAllowed function in the dictionary type.

static int dictTypeExpandAllowed(dict *d) {
    if (d->type->expandAllowed == NULL) return 1;// Not set, OK by default?
    return d->type->expandAllowed(
                    _dictNextPower(d->ht[0].used + 1) * sizeof(dictEntry*),
                    (double)d->ht[0].used / d->ht[0].size);
}
_dictKeyIndex

Returns the index (head node) of the available slot. If in the rehash process, the index in ht[2] is always returned.

static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
{ // Hash is generated by key. Then why not call the hash function to generate and pass parameters instead?
    unsigned long idx, table;
    dictEntry *he; // Hash node
    if (existing) *existing = NULL; // Hash node already exists. Force reset.
    if (_dictExpandIfNeeded(d) == DICT_ERR) return -1; // Expand the hash table if necessary
    
    for (table = 0; table <= 1; table++) { // Retrieved in ht[0] and ht[1], possibly in the rehash process. Need cycle
        idx = hash & d->ht[table].sizemask; // Index hash value and hash table size mask
        he = d->ht[table].table[idx]; // Get the corresponding hash header node and start the circular search key list
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) { // Presence return - 1
                if (existing) *existing = he;
                return -1;
            }
            he = he->next;
        }
        if (!dictIsRehashing(d)) break; // If not in the rehash process. Direct break. ht[1]=null
    }
    return idx; // If it does not exist, the return value is? You may want to drill down into hash functions.
}
_dictInit

Dictionary initialization.

int _dictInit(dict *d, dictType *type,
        void *privDataPtr)
{
    _dictReset(&d->ht[0]); // Reset hash table
    _dictReset(&d->ht[1]);
    d->type = type;
    d->privdata = privDataPtr;
    d->rehashidx = -1; // Default no rehash procedure
    d->pauserehash = 0;
    return DICT_OK;
}
// There is a problem here. The function implementation does not match the declaration. dict.c@66
static int _dictInit(dict *ht, dictType *type, void *privDataPtr);

application program interface

_dictReset

Reset used HT_ The hash table initialized by init (the function implementation is not found yet). This function is only used in HT_ Called in destroy, but now we see it in There is also a call in dictInit. Reset the related attributes in the hash table to NULL(table)/0(size|sizemask|used).

dictCreate

Interface. Create a dictionary, mainly a hash table.

dict *dictCreate(dictType *type,
        void *privDataPtr)
{
    dict *d = zmalloc(sizeof(*d)); // Allocate space

    _dictInit(d,type,privDataPtr);
    return d;
}
dictResize

Resize the hash to include the minimum value of the node.

int dictResize(dict *d)
{
    unsigned long minimal; // minimum value
    // If the hash table resizing is not supported or the rehash process is in progress, return directly
    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    minimal = d->ht[0].used; // Number of slots used
    if (minimal < DICT_HT_INITIAL_SIZE) // Less than the initialization size. replace
        minimal = DICT_HT_INITIAL_SIZE;
    return dictExpand(d, minimal);
}
_dictExpand

Expand or create a hash table. When malloc_ When failed is non null, the program will not be terminated when memory allocation fails (in this case, it is 1).

int _dictExpand(dict *d, unsigned long size, int* malloc_failed)
{
    if (malloc_failed) *malloc_failed = 0; // Reset memory allocation ID
    // In the rehash process, or the used slot is larger than the hash table size (needs to be expanded), it is returned directly
    if (dictIsRehashing(d) || d->ht[0].used > size) return DICT_ERR;

    dictht n; // New hash table
    unsigned long realsize = _dictNextPower(size); // New power expansion size
    // If the new size is equal to the hash table size, it is invalid.
    if (realsize == d->ht[0].size) return DICT_ERR;

    // Allocate a new hash table and reset all pointers to NULL
    n.size = realsize;
    n.sizemask = realsize-1;
    if (malloc_failed) { // I don't think I'll get in here, do I? The above has been reset to 0
        n.table = ztrycalloc(realsize*sizeof(dictEntry*)); // attempt
        *malloc_failed = n.table == NULL;
        if (*malloc_failed) // Here, malloc_ The value of failed is not NULL?
            return DICT_ERR;
    } else
        n.table = zcalloc(realsize*sizeof(dictEntry*)); // Allocate and initialize memory

    n.used = 0;

    // If the first hash table is NULL, it is the initialization of the first hash table. Direct assignment and return. Not rehash procedure
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    // Prepare the second hash table for incremental rehash
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}
dictExpand

Create or extend a hash table. Direct call_ dictExpand implementation, malloc_failed=NULL.

dictTryExpand

Attempt to create or extend a hash table. If allocation fails, return DICT_ERR.

int dictTryExpand(dict *d, unsigned long size) {
    int malloc_failed; // uninitialized
    _dictExpand(d, size, &malloc_failed);
    return malloc_failed? DICT_ERR : DICT_OK;
}
dictRehash

Perform an n-step incremental rehash process (incremental). Returns 1 if there are still keys to move from the old hash table to the new hash table, otherwise returns 0.

int dictRehash(dict *d, int n) {
    // Browse the maximum number of empty slots (head nodes), 10 at a time?
    int empty_visits = n*10;
    if (!dictIsRehashing(d)) return 0; // Not in rehash process

    while(n-- && d->ht[0].used != 0) { // n is not 0, and the number of slots used in the first hash table is not 0
        dictEntry *de, *nextde;

        // rehashidx cannot overflow. It ensures that there are nodes in the first hash table, because HT [0]. Used= 0
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        // Short node filtering. Non dense? Offset back to non short nodes. If empty_ If the visits is 0, it means there is no.
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx]; // Get the header node corresponding to rehashidx
        // Transfer all keys in this bucket from the old hash table to the new hash table, that is, HT [0] = > HT [1]
        while(de) {
            uint64_t h;

            nextde = de->next; // Get the next hash node, loop variable.
            // Get the new hash index in ht[1]
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            // Update header node
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde; // Update loop variable
        }
        // Backward offset. Update rehashidx
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    // Detect complete transfer
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table); // Free original hash table space
        d->ht[0] = d->ht[1]; // Overall transfer HT [1] = > HT [0]
        _dictReset(&d->ht[1]); // Reset ht[1]
        d->rehashidx = -1; // Identify rehash end
        return 0; // nothing
    }

    // rehash is also required
    return 1;
}
timeInMilliseconds

Gets the time in milliseconds.

long long timeInMilliseconds(void) {
    struct timeval tv;

    gettimeofday(&tv,NULL);
    return (((long long)tv.tv_sec)*1000)+(tv.tv_usec/1000);
}
dictRehashMilliseconds

rehash in milliseconds + increments. The value of "delta" is greater than 0 and less than 1 in most cases. The exact upper limit depends on the running time of dictRehash(d,100).

int dictRehashMilliseconds(dict *d, int ms) {
    if (d->pauserehash > 0) return 0;

    long long start = timeInMilliseconds();
    int rehashes = 0;

    while(dictRehash(d,100)) {
        rehashes += 100;
        if (timeInMilliseconds()-start > ms) break;
    }
    return rehashes;
}
_dictRehashStep

Single node rehash. If pauserehash==0.

static void _dictRehashStep(dict *d) {
    if (d->pauserehash == 0) dictRehash(d,1);
}
dictAdd

Adds a hash element to the specified dictionary.

int dictAdd(dict *d, void *key, void *val)
{
    dictEntry *entry = dictAddRaw(d,key,NULL);

    if (!entry) return DICT_ERR;
    dictSetVal(d, entry, val);
    return DICT_OK;
}
dictAddRaw

Add or find hash nodes at a lower level.

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    dictht *ht;

    if (dictIsRehashing(d)) _dictRehashStep(d); // rehash process single node operation in progress

    // Get the new node index, and return - 1 to indicate that the node already exists.
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

    // Locate the hash table. rehash process is ht[1], otherwise it is ht[0]
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry)); // Allocate node space
    entry->next = ht->table[index]; // Head node
    ht->table[index] = entry; // Fill in hash node
    ht->used++; // Update used slots

    dictSetKey(d, entry, key); // Set hash key
    return entry;
}
dictReplace

Add or replace hash nodes.

int dictReplace(dict *d, void *key, void *val)
{
    dictEntry *entry, *existing, auxentry;

    // If the entry is not NULL, the hash node is added
    entry = dictAddRaw(d,key,&existing);
    if (entry) {
        dictSetVal(d, entry, val);
        return 1;
    }

    // Set the new value and then release the old value (this order is important because val may be exactly the same as the existing value)
    auxentry = *existing;
    dictSetVal(d, existing, val);
    dictFreeVal(d, &auxentry);
    return 0;
}
dictAddOrFind

Add or find hash nodes. Directly call the dictAddRaw implementation. If the return value is not empty, it means adding. Otherwise, it means finding existing.

dictGenericDelete

Find and remove hash nodes. Auxiliary functions.

static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {
    uint64_t h, idx;
    dictEntry *he, *prevHe;
    int table;
	// Empty hash table
    if (d->ht[0].used == 0 && d->ht[1].used == 0) return NULL;

    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key); // Generate corresponding hash value according to key

    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask; // Get hash header node index
        he = d->ht[table].table[idx]; // Get the corresponding hash header node
        prevHe = NULL;
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) { // Key matching
                // Remove element from linked list
                if (prevHe) // Update the next of the previous node. If prevHe exists
                    prevHe->next = he->next;
                else // Otherwise, set the next node of the current node as the header node
                    d->ht[table].table[idx] = he->next;
                if (!nofree) { // nofree does not need to be released.! 0, release
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                    zfree(he);
                }
                d->ht[table].used--; // Update used slots
                return he; // Returns the removed node
            }
            prevHe = he; // Record the current node as the previous node
            he = he->next; // Update current node
        }
        if (!dictIsRehashing(d)) break; // Non rehash, interrupt loop
    }
    return NULL; // Corresponding node not found
}
dictDelete

Delete hash node. Call dictGenericDelete directly, where nofree=0.

dictUnlink

Contact hash node binding. There is no need to actually release the corresponding hash key node. It is also a direct call to the dictGenericDelete implementation, where nofree=1 means that the corresponding space does not need to be released.

dictFreeUnlinkedEntry

Release the hash node that calls dictUnlink to unbind.

void dictFreeUnlinkedEntry(dict *d, dictEntry *he) {
    if (he == NULL) return;
    dictFreeKey(d, he);
    dictFreeVal(d, he);
    zfree(he);
}

_dictClear

Destroy the entire dictionary.

int _dictClear(dict *d, dictht *ht, void(callback)(void *)) {
    unsigned long i;
    // Release all hash nodes
    for (i = 0; i < ht->size && ht->used > 0; i++) {
        dictEntry *he, *nextHe;
		// The private data of the dictionary is processed only once when i=0
        if (callback && (i & 65535) == 0) callback(d->privdata);
		// The current header node is empty. Skip directly
        if ((he = ht->table[i]) == NULL) continue;
        while(he) {
            nextHe = he->next;
            // Release key node
            dictFreeKey(d, he);
            dictFreeVal(d, he);
            zfree(he);
            // Update the number of available slots, and the loop variable he
            ht->used--;
            he = nextHe;
        }
    }
    // Release hash table
    zfree(ht->table);
    // Reinitialize hash table
    _dictReset(ht);
    return DICT_OK;
}
dictRelease

Destroy and release the hash table.

void dictRelease(dict *d)
{
    _dictClear(d,&d->ht[0],NULL);
    _dictClear(d,&d->ht[1],NULL);
    zfree(d); // Release dictionary
}
dictFind

Find hash node. Press the key.

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    if (dictSize(d) == 0) return NULL; // Empty dictionary
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}
dictFetchValue

Gets the value of the corresponding hash node.

void *dictFetchValue(dict *d, const void *key) {
    dictEntry *he;

    he = dictFind(d,key);
    return he ? dictGetVal(he) : NULL;
}
dictFingerprint

Dictionary signature.

long long dictFingerprint(dict *d) {
    long long integers[6], hash = 0;
    int j;

    integers[0] = (long) d->ht[0].table;
    integers[1] = d->ht[0].size;
    integers[2] = d->ht[0].used;
    integers[3] = (long) d->ht[1].table;
    integers[4] = d->ht[1].size;
    integers[5] = d->ht[1].used;

    for (j = 0; j < 6; j++) {
        hash += integers[j];
        /* For the hashing step we use Tomas Wang's 64 bit integer hash. */
        hash = (~hash) + (hash << 21); // hash = (hash << 21) - hash - 1;
        hash = hash ^ (hash >> 24);
        hash = (hash + (hash << 3)) + (hash << 8); // hash * 265
        hash = hash ^ (hash >> 14);
        hash = (hash + (hash << 2)) + (hash << 4); // hash * 21
        hash = hash ^ (hash >> 28);
        hash = hash + (hash << 31);
    }
    return hash;
}
dictGetIterator

Gets the dictionary iterator.

dictIterator *dictGetIterator(dict *d)
{
    dictIterator *iter = zmalloc(sizeof(*iter));

    iter->d = d; // Iterative dictionary
    iter->table = 0; // Hashtable 
    iter->index = -1; // Iterative index
    iter->safe = 0;
    iter->entry = NULL; // Current hash node
    iter->nextEntry = NULL; // Next hash node
    return iter;
}
dictGetSafeIterator

Gets the dictionary security iterator. After calling dictGetIterator, set ITER - > safe to 1.

dictNext

Iterative loop.

dictEntry *dictNext(dictIterator *iter)
{
    while (1) {
        if (iter->entry == NULL) { // The current node is NULL, indicating the first iteration process? No. The same is true for chain switching.
            dictht *ht = &iter->d->ht[iter->table]; // Get the corresponding hash table
            if (iter->index == -1 && iter->table == 0) { // It can be proved to be the first iterative process
                if (iter->safe) // If it is a secure iterator, pause the rehash process
                    dictPauseRehashing(iter->d);
                else // Otherwise, the signature is detected
                    iter->fingerprint = dictFingerprint(iter->d);
            }
            iter->index++; // Update iteration index
            if (iter->index >= (long) ht->size) { // If overflow occurs, detect whether the current hash table involves rehash. After judgment, switch the table and reset the index
                if (dictIsRehashing(iter->d) && iter->table == 0) {
                    iter->table++; // Switch hash tables, and reset indexes
                    iter->index = 0;
                    ht = &iter->d->ht[1]; // Update ht to ht[1]
                } else {
                    break; // If not, end
                }
            }
            iter->entry = ht->table[iter->index]; // Update current node
        } else {
            iter->entry = iter->nextEntry; // Update current node
        }
        if (iter->entry) {
            // Save the next node, and the iterator caller may delete the current returned node
            iter->nextEntry = iter->entry->next; 
            return iter->entry; // Returns the current iteration node
        }
    }
    return NULL; // Iteration end NULL
}
dictReleaseIterator

Releases the specified iterator.

void dictReleaseIterator(dictIterator *iter)
{
    if (!(iter->index == -1 && iter->table == 0)) { // It has been iterated
        if (iter->safe) // The security iterator needs to actively restart the rehash process
            dictResumeRehashing(iter->d); // That is, update D - > pauserehash--
        else
            assert(iter->fingerprint == dictFingerprint(iter->d)); // Verify iterator signature
    }
    zfree(iter); // Free iterator space
}
dictGetRandomKey

Randomly get a hash node.

dictEntry *dictGetRandomKey(dict *d)
{
    dictEntry *he, *orighe; // Target node, and original node?
    unsigned long h;
    int listlen, listele;

    if (dictSize(d) == 0) return NULL; //Empty dictionary
    if (dictIsRehashing(d)) _dictRehashStep(d); // I still don't know the effect
    if (dictIsRehashing(d)) { // In rehash process
        do {
            // 0 - > rehashidx-1 no element
            h = d->rehashidx + (randomULong() % (dictSlots(d) - d->rehashidx));
            he = (h >= d->ht[0].size) ? d->ht[1].table[h - d->ht[0].size] :
                                      d->ht[0].table[h];
        } while(he == NULL);
    } else {
        do {
            h = randomULong() & d->ht[0].sizemask;
            he = d->ht[0].table[h];
        } while(he == NULL);
    }
	
    // A non empty bucket is found, but it is a linked list. You need to get a random element from the linked list
    listlen = 0; // Linked list length
    orighe = he; // Original node
    while(he) { // next calculate the length of the linked list until it is empty
        he = he->next;
        listlen++;
    }
    listele = random() % listlen; // Random node location
    he = orighe; // return
    while(listele--) he = he->next; // Loop to target node
    return he;
}
dictGetSomeKeys

Sample the dictionary and return several keys at random.

unsigned int dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count) {
    // des stores an array of random nodes 
    unsigned long j; // Internal hash table ID 0|1
    unsigned long tables; // Number of hash tables to be sampled 1|2
    unsigned long stored = 0, maxsizemask;
    unsigned long maxsteps;

    if (dictSize(d) < count) count = dictSize(d); // Detect and reset count (as needed)
    maxsteps = count*10; // Maximum steps? 10 at a time?

    // Try to make a rehash process batch proportional to count?
    for (j = 0; j < count; j++) {
        if (dictIsRehashing(d)) // You still need to judge whether it is in the rehash process
            _dictRehashStep(d);
        else
            break;
    }

    tables = dictIsRehashing(d) ? 2 : 1; // Distinguish according to current rehash
    maxsizemask = d->ht[0].sizemask; // Maximum size mask?
    if (tables > 1 && maxsizemask < d->ht[1].sizemask) // Indeed, the largest sizemask
        maxsizemask = d->ht[1].sizemask;

    // Select a random point (index?) in a larger hash table class
    unsigned long i = randomULong() & maxsizemask;
    unsigned long emptylen = 0; // So far, continuous empty node length
    while(stored < count && maxsteps--) { // count the specified number of random nodes
        for (j = 0; j < tables; j++) { // 0|1
            // Constant rehash: update the index to ht[0], which has been browsed (moved to ht[1]) during rehash, and there is no bucket during the period
            // Therefore, you can directly filter the 0 - > idx-1 buckets of ht[0]
            if (tables == 2 && j == 0 && i < (unsigned long) d->rehashidx) {
                // However, if the current index is outside the range of the second table
                // In both tables, no elements are being updated to the current rehash index
                // Switch from large table to small table
                if (i >= d->ht[1].size)
                    i = d->rehashidx;
                else
                    continue;
            }
            if (i >= d->ht[j].size) continue; // The current table has overflowed. Switch to the next hash table ht[1]
            dictEntry *he = d->ht[j].table[i];

            // Count the continuous empty buckets and jump to other locations until the count quantity is met (5 is the minimum?)
            if (he == NULL) {
                emptylen++;
                if (emptylen >= 5 && emptylen > count) {
                    i = randomULong() & maxsizemask; // Reselect index
                    emptylen = 0;
                }
            } else {
                emptylen = 0;
                while (he) {
                    // Collect all non empty buckets in the iteration process
                    *des = he;
                    des++; // des offset
                    he = he->next;
                    stored++; // Number of storage nodes updated
                    if (stored == count) return stored; // Reach the required quantity
                }
            }
        }
        i = (i+1) & maxsizemask; // Update index
    }
    return stored;
}
dictGetFairRandomKey

Obtaining a more reasonable random node is more likely to extract more buckets than dictGetRandomKey.

dictEntry *dictGetFairRandomKey(dict *d) {
    dictEntry *entries[GETFAIR_NUM_ENTRIES]; // GETFAIR_NUM_ENTRIES 15
    unsigned int count = dictGetSomeKeys(d,entries,GETFAIR_NUM_ENTRIES); // Random key
    // When there are elements in the hash table, dictGetSomeKeys may also return 0 elements. Worse
    // Therefore, you still need to call dictGetRandomKey to return at least one node.
    if (count == 0) return dictGetRandomKey(d);
    unsigned int idx = rand() % count; // Random again
    return entries[idx];
}
rev

Inverse transposition, tool function.

static unsigned long rev(unsigned long v) {
    unsigned long s = CHAR_BIT * sizeof(v); // Bit size, must be a power of 2
    unsigned long mask = ~0UL; // Set mask 0000 0000
    while ((s >>= 1) > 0) { // Shift bit right > >
        mask ^= (mask << s); // << ^
        v = ((v >> s) & mask) | ((v << s) & ~mask);
    }
    return v;
}
dictScan

Iterates over the elements of the dictionary.

unsigned long dictScan(dict *d,
                       unsigned long v,
                       dictScanFunction *fn,
                       dictScanBucketFunction* bucketfn,
                       void *privdata)
{
    dictht *t0, *t1;
    const dictEntry *de, *next;
    unsigned long m0, m1;

    if (dictSize(d) == 0) return 0;
    // Pause rehash, do not judge whether it is rehash?
    dictPauseRehashing(d);

    if (!dictIsRehashing(d)) {
        t0 = &(d->ht[0]);
        m0 = t0->sizemask;

        // At the current position, call the bucket callback handler
        if (bucketfn) bucketfn(privdata, &t0->table[v & m0]);
        de = t0->table[v & m0];
        while (de) {
            next = de->next;
            fn(privdata, de); // Node callback handler
            de = next; // continue
        }

        /* Set unmasked bits so incrementing the reversed cursor
         * operates on the masked bits */
        v |= ~m0;

        /* Increment the reverse cursor */
        v = rev(v);
        v++;
        v = rev(v);

    } else {
        t0 = &d->ht[0];
        t1 = &d->ht[1];

        /* Make sure t0 is the smaller and t1 is the bigger table */
        if (t0->size > t1->size) {
            t0 = &d->ht[1];
            t1 = &d->ht[0];
        }

        m0 = t0->sizemask;
        m1 = t1->sizemask;

        /* Emit entries at cursor */
        if (bucketfn) bucketfn(privdata, &t0->table[v & m0]);
        de = t0->table[v & m0];
        while (de) {
            next = de->next;
            fn(privdata, de);
            de = next;
        }

        /* Iterate over indices in larger table that are the expansion
         * of the index pointed to by the cursor in the smaller table */
        do {
            /* Emit entries at cursor */
            if (bucketfn) bucketfn(privdata, &t1->table[v & m1]);
            de = t1->table[v & m1];
            while (de) {
                next = de->next;
                fn(privdata, de);
                de = next;
            }

            /* Increment the reverse cursor not covered by the smaller mask.*/
            v |= ~m1;
            v = rev(v);
            v++;
            v = rev(v);

            /* Continue while bits covered by mask difference is non-zero */
        } while (v & (m0 ^ m1));
    }

    dictResumeRehashing(d); // Restore rehash

    return v;
}
dictEmpty

Reset dictionary.

void dictEmpty(dict *d, void(callback)(void*)) {
    _dictClear(d,&d->ht[0],callback);
    _dictClear(d,&d->ht[1],callback);
    d->rehashidx = -1;
    d->pauserehash = 0;
}
dictEnableResize

Enable dictionary sizing parameter, dict_can_resize = 1.

dictDisableResize

Disable dictionary sizing parameter, dict_can_resize = 0.

dictGetHash

Get the hash value of the key and directly call dictHashKey.

dictFindEntryRefByPtrAndHash

Find references to nodes based on pointers and hash values?

dictEntry **dictFindEntryRefByPtrAndHash(dict *d, const void *oldptr, uint64_t hash) {
    dictEntry *he, **heref;
    unsigned long idx, table;

    if (dictSize(d) == 0) return NULL;
    for (table = 0; table <= 1; table++) {
        idx = hash & d->ht[table].sizemask;
        heref = &d->ht[table].table[idx];
        he = *heref;
        while(he) {
            if (oldptr==he->key) // Key pointer match
                return heref;
            heref = &he->next;
            he = *heref;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

Summary of this chapter

Through the partial interpretation of the source code, we can see that the implementation of the dictionary is based on hash table, which is not available in C language. Redis has implemented a set of dictionary by itself, which adopts the structure of dict > HT > Table > headptr. However, hash conflict is a very important problem (when two or more keys are assigned to the same index of the hash table array), the public chain address method is used to solve this problem. Because each hash node has a next pointer, a one-way linked list can be formed when multiple nodes are assigned to the same index. It is also inevitable to use circular structure in search. However, this is also the reason why adding new nodes is always added to the head of the linked list. It is impossible to iterate to the tail node to add, which is too waste.

When the hash node increases or decreases, the rehash process will also be triggered to adjust the size of the hash table accordingly. ht[1] is used in this case. When expanding, the size of ht[1] is always the first 2^n greater than or equal to ht[0].used*2. If shrink, ht[1].size=ht[0].used*2. This can effectively avoid frequent adjustment of the hash table, resulting in unnecessary loss. The rehash process is to recalculate the hash value and index value of the key, place the key value pair in the corresponding position of ht[1], migrate from ht[0] to ht[1], release ht[0], set ht[1] to ht[0], and ht[1] will create a new blank hash table for the next time. Of course, the migration process is not completed at one time, but gradually. The main reason is that when there are too many key value pairs, the process has too much impact on the machine performance.

Keywords: Database Redis

Added by rUmX on Wed, 27 Oct 2021 06:25:39 +0300