Dictionary Implementation Based on Jump Table

A list that adds additional forward pointers to an ordered list is called a skip list. It uses random technology to determine which nodes in the list should add forward pointers and how many pointers should be added. Based on this stochastic technique, the average time complexity of finding, inserting and deleting jump tables is O(logN). In the worst case, however, the time complexity is asymptotically equal to N.

On the Determination of Jump Table Series

In the regular jump list structure, the ratio of the number of the first-order linked list to the number of the first-order linked list is a fraction p, so the probability that the number pairs belonging to the first-order linked list belong to the second-order linked list at the same time is p (for a structure like binary tree, P is 0.5). We use a random number generator to generate real numbers of 0-1 to simulate its series. If the number generated is less than p, the series is added one until the number generated is greater than P.
The disadvantage of this stochastic method is that some newly inserted nodes may be assigned a large series, far exceeding log1/pN, where N is the maximum expected number of dictionaries. To avoid this situation, we set an upper line maxLevel of a series with a maximum value of ceil(log1/pN)-1. Another disadvantage is that even if the number of new inserted nodes is less than maxLevel, it is much larger than the number of chain layers owned by the current jump table, so if this happens, we will adjust it to the number of chain layers owned by the current jump table + 1.

Complexity

When there are n pairs of dictionaries, the time complexity of searching, inserting and deleting operations is O(n+maxLevel). In the worst case, there is only one maxLevel series pair, and all the remaining pairs are on the 0-level list. When i > 0, the time spent on the i-level list is O(maxLevel). Despite poor performance in the worst case, jump tables are valuable data descriptions because the average complexity of finding, inserting and deleting is O(logN). For example, Redis database uses jump table to realize storage. Compared with the red-black tree, the performance of jump table is similar and the implementation is simpler. As for spatial complexity, at worst, each record is maxLevel level, requiring maxLevel+1 pointer. Therefore, in addition to the need to store n pairs of space, also need to store O (nmax Level) pointer space. However, in general (regular jump table), the first-level list has np pointers, the second-level list has np 2 pointers, and the first-level list has np i lists. When p=0.5, the average space requirement (plus n pairs of pointers) is about 2n pointer spaces. Although the space demand in the worst case is relatively large, the average space demand is not large.

Specific C++ Implementation

Data structure of dictionary

#pragma once
//Abstract description of dictionary structure
template <typename K,typename V>
class dictionary {
public:
	virtual ~dictionary(){ }
	virtual bool empty() const = 0;
	//Returns the number of pairs in the dictionary
	virtual size_t size() const = 0;
	//Returns a pointer to a matching pair
	virtual std::pair<const K, V>* find(const K& _key) const = 0;
	virtual void erase(const K& _key) = 0;
	virtual void insert(const std::pair<const K, V>& _pair) = 0;
};

Pointer node

#pragma once
template <typename K,typename V>
struct skipNode
{
	typedef std::pair<const K, V> pairType;

	pairType element;
	skipNode<K, V>** next;	//The pointer array next[i] denotes the pointer of the level-I linked list
		
	//For a pair of linked lists of lev level, its size value is lev+1	
	skipNode(const pairType& _element,size_t size)
		:element(_element),next(new skipNode<K,V>* [size]){ }

	//Use RAII to manage next pointer domain resources for each node
	~skipNode()
	{
		delete[] next;
		cout << "delete next[] array!" <<

 endl;
	}
};

Implementation of Jump Table

#pragma once
#ifndef skipList_H
#define skipList_H

#include <string>
using std::string;
#include <iostream>
using std::cout; using std::cin; using std::endl; using std::ends;
#include <cmath>
#include <random>
#include <sstream>
#include <ctime>

#include "dictionaryADT.h"
#include "skipNode.h"
#include "..//..//..//ch06/ChainList/ChainList/illegalParameterValue.h"

template <typename K,typename V>
class skipList :public dictionary<K, V> {
public:
	skipList(const K& _MAX_KEY,int _MAX_PAIRS =100,float _prob=0.5);	//2 ^ 7 = 128 is initialized to a 6-1 level list
	~skipList();
	
	bool empty() const { return dict_size == 0; }
	size_t size() const { return dict_size; }
	std::pair<const K, V>* find(const K& _Key)const;
	void insert(const std::pair<const K,V>& _pair);
	void erase(const K& _Key);
	void output(std::ostream& os)const;

private:
	skipNode<K, V>* headerNode;	//Header node
	skipNode<K, V>* tailNode;	//Tail node
	skipNode<K, V>** last;	//During the search for a specified Key, save the location of the last node of each level
							//Used to update the pointer of the node involved when deleting and inserting the node
	size_t dict_size;		//Number of pairs in a dictionary
	int levels;			//The largest number of links in the current dictionary
	int MAX_LEVEL;		//Maximum number of linked list layers allowed by dictionary
	float probability;		//The Series of Probability Used to Simulate and Judge Link Lists
							//That is to say, the number-to-number ratio of i-1 linked list is 0.5 if the number-to-number form of i-1 linked list is binary tree.
	K MAX_KEY;				//Maximum keyword

private:
	skipNode<K, V>* search(const K& _Key) const;	//Search for a given Key and store the nodes traversed by each level of the list in the last array
	int get_level()const;	//Distribution of analog series for newly inserted nodes
};




template <typename K,typename V>
skipList<K, V>::skipList(const K& _MAX_KEY, int _MAX_PAIRS, float _prob) {
	dict_size = 0;
	levels = 0;
	probability = _prob;
	//Although the code does not require an upper limit on the number of containers, performance is better if the number is less than this value.
	//Because our MAX_LEVEL is calculated from this value. 
	MAX_LEVEL = std::ceil(std::logf(static_cast<float>(_MAX_PAIRS)) / std::logf(1 / probability)) - 1;
	cout << "initial MAX_LEVEL= " << MAX_LEVEL << endl;
	MAX_KEY = _MAX_KEY;

	//Initialize the head and tail nodes, and the data fields of the head nodes can be arbitrary values
	std::pair<K, V> initialPair;
	initialPair.first = MAX_KEY;
	headerNode = new skipNode<K, V>(initialPair, MAX_LEVEL + 1);
	tailNode = new skipNode<K, V>(initialPair, 0);
	last = new skipNode<K, V>* [MAX_LEVEL + 1];

	//Initialization points each level list of the head node to the tail node
	for (int i = 0; i <= MAX_LEVEL;++i) {
		headerNode->next[i] = tailNode;
	}
}


template <typename K,typename V>
skipList<K, V>::~skipList() {
	skipNode<K, V>* nextNode;
	while (headerNode!=tailNode) {
		nextNode = headerNode->next[0];
		cout << "delete node " <<headerNode->element.first<< endl;
		delete headerNode;
		headerNode = nextNode;
	}
	delete tailNode;
	delete[] last;
}


template <typename K,typename V>
std::pair<const K, V>* skipList<K, V>::find(const K& _Key)const {
	//Find a number of pairs with the keyword _Key
	if (_Key >= MAX_KEY)
		return nullptr;//No matchable pair of numbers greater than or equal to a given Key upper limit
	//The pointer beforeNode points to the last position of _Key after the lookup
	skipNode<K, V>* beforeNode = headerNode;
	//Search from the upper list to the lower list in turn
	for (int i = levels; i >=0 ; --i) {
		while (beforeNode->next[i]->element.first < _Key)
			beforeNode = beforeNode->next[i];
	}
	//Although you can stop when you find the keyword ==_Key, the extra operation of checking for equality is generally unnecessary.
	//Because most of the equal pairs of numbers appear in the 0-level list, we judge them directly after we have cycled through the list of all layers.
	if (beforeNode->next[0]->element.first == _Key)
		return &beforeNode->next[0]->element;
	//Otherwise, I couldn't find it.
	return nullptr;
}


template <typename K,typename V>
int skipList<K, V>::get_level()const {
	//Level assignment method
	static std::default_random_engine e(std::time(0));//Use time(0) as seed to get random number engine
	std::uniform_real_distribution<double> u(0, 1);//Limit the value of double between 0 and 1
	int currentLev=0;
	while (u(e) < probability)
		++currentLev;
	//If the series obtained is too large, the limit is max_level.
	return (currentLev <= MAX_LEVEL) ? currentLev:MAX_LEVEL;
}


template <typename K,typename V>
skipNode<K, V>* skipList<K, V>::search(const K& _Key) const{
	//Search for the keyword _Key and store the last node in each level list in last
	//Returns the node containing the keyword _key
	skipNode<K, V>* beforeNode = headerNode;
	for (int i = levels; i >= 0;--i) {
		while (beforeNode->next[i]->element.first < _Key)
			beforeNode = beforeNode->next[i];
		last[i] = beforeNode;
	}
	return beforeNode->next[0];
}


template <typename K,typename V>
void skipList<K, V>::insert(const std::pair<const K,V>& _pair) {
	//Insert a pair of numbers into a dictionary and override its value if it exists
	//The number of pairs of keys to be inserted exceeds the upper limit
	if (_pair.first >= MAX_KEY) {
		std::ostringstream oss;
		oss << "Key = " << _pair.first << " of inserted pair must be < " << MAX_KEY;
		throw illegalParameterValue(oss.str());
	}
	//Find out if the pair of _pairs exists
	skipNode<K, V>* theNode = search(_pair.first);
	if (theNode->element.first == _pair.first) {//Update the value if it exists
		theNode->element.second = _pair.second;
		return;
	}
	//If there is no linked list series to determine where the new node is located
	int theLevel = get_level();
	cout << "get level of node " << _pair.first << " = " << theLevel;
	if (theLevel > levels) {//If the series is larger than the highest one in the current dictionary, let it be level + 1
		theLevel = ++levels;
		last[theLevel] = headerNode;
		cout << " , but make it = " << theLevel;
	}
	cout << endl;
	//Insert a new node after the node theNode and update the node pointer
	skipNode<K, V>* newNode = new skipNode<K, V>(_pair, theLevel + 1);
	for (int i = 0; i <= theLevel;++i) {//Step by step update
		newNode->next[i] = last[i]->next[i];
		last[i]->next[i] = newNode;
	}
	++dict_size;
	return;
}

template<typename K,typename V>
void skipList<K, V>::erase(const K& _Key) {
	//Delete a number of pairs with the keyword _Key
	if (_Key >= MAX_KEY) {
		cout << "Key= " << _Key << " ,while must be < " << MAX_KEY << endl;
		return;
	}
	//Check for matching pairs
	skipNode<K, V>* theNode = search(_Key);
	if (theNode->element.first != _Key) {
		cout << "Key= " << _Key << " is not in the dictionary" << endl;
		return;
	}
	//Delete the node
	for (int i = 0; i <= levels && last[i]->next[i] == theNode;++i)
		last[i]->next[i] = theNode->next[i];
	//Updating Link List Level
	while (levels > 0 && headerNode->next[levels] == theNode)
		--levels;
	delete theNode;
	--dict_size;
}


template <typename K,typename V>
void skipList<K, V>::output(std::ostream& os) const{
	skipNode<K, V>* theNode;
	//The Number of Output Chain Layers
	for (int i = levels; i >= 0;--i) {
		theNode = headerNode;
		os << "level " << i << " : ";
		while (theNode!=tailNode) {
			os << "(" << theNode->element.first << "," << theNode->element.second << ")-->";
			theNode = theNode->next[i];
		}
		//Output tail node
		os << "(" << theNode->element.first << "," << theNode->element.second << ")" << endl;
	}
	os << endl;
}

template <typename K,typename V>
std::ostream& operator<<(std::ostream& os,const skipList<K,V>& list) {
	list.output(os);
	return os;
}

#endif

Keywords: less Redis Database

Added by karimali831 on Fri, 26 Jul 2019 17:04:01 +0300

Programming VIP