1, Experiment Name: Huffman Coding
2, Purpose of the experiment:
- Master the structure of Huffman tree;
- It can realize the basic operations of Huffman tree, such as construction, insertion, etc
- The minimum heap is used to reduce the time complexity of Huffman tree.
- Master the data structure of the minimum heap and the characteristics of the structure;
- It can realize the basic operations of minimum heap, such as construction, insertion, deletion, etc
3, Experiment content:
-
Determine the data structures for a binary Huffman tree
//Huffman tree is constructed with tree structure //Adopt the structure of array to construct the minimum heap typedef struct TreeNode*Huffman; typedef struct TreeNode** heap; struct TreeNode { int data; Huffman left; Huffman right; };
-
Implement the algorithm of creating a binary Huffman tree for a given postive integer sequence
Input a sequence and use the sequence array to construct the minimum heap, so as to facilitate the construction of Huffman tree.
//Constructing minimum heap using sequence void BuildMinHeap(heap h,int size); //Constructing Huffman tree with minimum heap Huffman BuildHuffman(heap h,int size);
-
Design and implement a Huffman coding prototype
//Print all Huffman codes by deep search method void getHuffmanCode(Huffman h,char* codeStore,int* index);
IV Program list (more detailed comments):
#include <stdio.h> #include <stdlib.h> typedef struct TreeNode*Huffman; typedef struct TreeNode** heap; struct TreeNode { int data; Huffman left; Huffman right; }; //Filter down. On the premise that all subtrees are the minimum heap, find the appropriate location of the node. The tree with this node as the root node is a minimum heap void FixDown(heap h,int size,int index){ int parent; int child; Huffman temp=h[index]; for(parent=index;parent*2<=size;parent=child){ child=parent*2; if(child+1<=size&&h[child+1]->data<h[child]->data) child++; if(h[child]->data<temp->data){ h[parent]=h[child]; } else break; } h[parent]=temp; } //The idea of recursion is adopted to establish the minimum heap from bottom to top, which ensures that the subtrees of the current tree are the minimum heap. void BuildMinHeap(heap h,int size){ for(int i=size/2;i>=1;i--){ FixDown(h,size,i); } } //Return and delete the minimum value of the minimum heap (i.e. the head node), and keep the tree as the minimum heap /*Move the last node to the head node and size --, Because the subtree is the minimum heap at this time, direct downward filtering can ensure that the tree is still the minimum heap*/ Huffman PopMinHeap(heap h,int* size){ Huffman result=h[1]; h[1]=h[(*size)--]; //Filter down FixDown(h,*size,1); return result; } //Insert node values and keep the tree at the minimum heap /**/ void InsertMinHeap(heap h,int* size,Huffman node){ h[++(*size)]=node; //Upward filtering int child=*size; int parent; for(;child>1;child=parent){ parent=child/2; if(h[parent]->data>node->data){ h[child]=h[parent]; } else { break; } } h[child]=node; } //Establish Huffman tree Huffman BuildHuffman(heap h,int size){ Huffman node=(Huffman)malloc(sizeof(struct TreeNode)); Huffman fristNode,lastNode; int time=size-1; for(int i=0;i<time;i++){//size-1 consolidation fristNode=PopMinHeap(h,&size); lastNode=PopMinHeap(h,&size); Huffman node=(Huffman)malloc(sizeof(struct TreeNode)); node->data=fristNode->data+lastNode->data; node->left=fristNode; node->right=lastNode; InsertMinHeap(h,&size,node); } return PopMinHeap(h,&size); } //Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is a Huffman tree void print(Huffman h){ if(h){ printf("%d(",h->data); print(h->left); printf(")("); print(h->right); printf(")"); } } //Freeing Huffman tree space void deleteHuff(Huffman h){ if(h){ deleteHuff(h->left); deleteHuff(h->right); free(h); } } //Print all Huffman codes by deep search method void getHuffmanCode(Huffman h,char* codeStore,int* index){ if(!h)return; if(h->left==NULL&&h->right==NULL){ printf("\n%d's HuffmanCode:",h->data); codeStore[*index]='\0'; printf("%s",codeStore); } else{ codeStore[(*index)++]='0'; getHuffmanCode(h->left,codeStore,index); --(*index);//to flash back codeStore[(*index)++]='1'; getHuffmanCode(h->right,codeStore,index); --(*index);//to flash back } } int main() { int n;//Enter the number of sequences; printf("Please enter the number of sequences:\n"); scanf("%d",&n); heap MinHeap=(heap)malloc(sizeof(struct TreeNode*)*(2*n));//Make sure there is enough space in the stack //Establish minimum heap printf("Please enter the sequence one by one:\n"); for(int i=1;i<=n;i++){ MinHeap[i]=(Huffman)malloc(sizeof(struct TreeNode)); scanf("%d",&MinHeap[i]->data); MinHeap[i]->left=NULL; MinHeap[i]->right=NULL; } BuildMinHeap(MinHeap,n); // for(int i=1;i<=n;i++){ // printf("%d ",MinHeap[i]->data); // } //Establish Huffman tree Huffman huff=BuildHuffman(MinHeap,n); //Print Huffman tree in the format of "parent node (left subtree) (right subtree)" to prove that it is a Huffman tree printf("Print Huffman Tree\n"); print(huff); //Output Huffman code of all original input numbers printf("\nOutput the Huffman code of all original input numbers"); char* codeStore=(char*)malloc(sizeof(char)*n); int index=0; getHuffmanCode(huff,codeStore,&index); free(codeStore); //Memory release work free(MinHeap); deleteHuff(huff); return 0; }#include <stdio.h> #include <stdlib.h> //Add height members on the basis of binary tree data structure struct ALVnode { int val; struct ALVnode* left; struct ALVnode* right; int high; }; typedef struct ALVnode* ALV; int getHigh(ALV tree){ if(!tree) return 0; int i=getHigh(tree->left); int j=getHigh(tree->right); return i>j?i+1:j+1; } /*Make left single rotation (right rotation) between root and its left node, because the tree height of root and its left node changes ,Therefore, we need to update the height of both, point the root node to the new root node (i.e. the left node) and return*/ ALV LeftLeft(ALV root){ ALV temp=root; root=root->left; temp->left=root->right; root->right=temp; //Update tree height temp->high=getHigh(temp); root->high=getHigh(root); return root; } /*Make a right single rotation (left rotation) between root and its right node, because the tree height of root and its left node changes ,Therefore, we need to update the height of both, point the root node to the new root node (i.e. the right node) and return*/ ALV RightRight(ALV root){ ALV temp=root; root=root->right; temp->right=root->left; root->left=temp; //Update tree height temp->high=getHigh(temp); root->high=getHigh(root); return root; } /*The right single rotation of the left subtree of root can transfer the insertion node to the left subtree of the left subtree. At this time, the left single rotation of the whole root can be carried out*/ ALV LeftRight(ALV root){ root->left=RightRight(root->left); root=LeftLeft(root); return root; } /*The left single rotation of the right node of root can transfer the insertion node to the right subtree of the right subtree. At this time, the right single rotation of the whole root can be carried out*/ ALV RightLeft(ALV root){ root->right=LeftLeft(root->right); root=RightRight(root); return root; } ALV insert(ALV root,int val){ if(root==NULL){//val is the first number struct ALVnode*newNode=(struct ALVnode*)malloc(sizeof(struct ALVnode)); newNode->val=val; newNode->high=1; newNode->left=NULL; newNode->right=NULL; root=newNode;//root points to the new node } else{ //If the value greater than root is inserted to the right and less than the value to the left, it means that the insertion is invalid (unable to insert) if(val<root->val){ root->left=insert(root->left,val); if((root->right==NULL&&root->left->high==2)||(root->right!=NULL&&root->left->high-root->right->high==2)){ if(val<root->left->val) root=LeftLeft(root);//Sinistral else { root=LeftRight(root);//Left and right double rotation } } } else if(val>root->val){ root->right=insert(root->right,val); if((root->left==NULL&&root->right->high==2)||(root->left!=NULL&&root->left->high-root->right->high==-2)){ if(val<root->right->val) root=RightLeft(root);//Right left double spin else { root=RightRight(root);//Right single rotation } } } //Update tree height root->high=getHigh(root); } return root; } /*Reclaim tree memory*/ void clear(ALV root){ if(!root)return; clear(root->left); clear(root->right); free(root); } //Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree void print(ALV root){ if(!root)return; printf("%d",root->val); printf("("); print(root->left); printf(")"); printf("("); print(root->right); printf(")"); } int main(){ int num;//Number of nodes to insert ALV head=NULL; scanf("%d",&num); int temp;//Used to store the number of to insert for(int i=0;i<num;i++){ scanf("%d",&temp); head=insert(head,temp); } //Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree print(head); clear(head); return 0; }
5, Test results:
Input / output format description:
-
Input format:
Enter the number of sequences first according to the display prompt;
Input sequence elements in sequence; -
Output format:
Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree -
Test example 1:
Input:
5 5 4 3 2 1
Output:
Please enter the number of sequences: 5 Please enter the sequence one by one: 5 4 3 2 1 Print Huffman Tree 15(6(3()())(3(1()())(2()())))(9(4()())(5()())) Output the Huffman code of all original input numbers 3's HuffmanCode:00 1's HuffmanCode:010 2's HuffmanCode:011 4's HuffmanCode:10
-
Test example 2:
Input:
9 33 22 11 44 55 6 43 22 11
Output:
Please enter the number of sequences: 9 Please enter the sequence one by one: 33 22 11 44 55 6 43 22 11 Print Huffman Tree 247(99(44()())(55()()))(148(61(28(11()())(17(6()())(11()())))(33()()))(87(43()())(44(22()())(22()())))) Output the Huffman code of all original input numbers 44's HuffmanCode:00 55's HuffmanCode:01 11's HuffmanCode:1000 6's HuffmanCode:10010 11's HuffmanCode:10011 33's HuffmanCode:101 43's HuffmanCode:110 22's HuffmanCode:1110 22's HuffmanCode:1111
-
Test example 3:
Input:
7 88 70 61 96 120 90 65
Output:
Please enter the number of sequences: 7 Please enter the sequence one by one: 88 70 61 96 120 90 65 Print Huffman Tree 590(246(120()())(126(61()())(65()())))(344(158(70()())(88()()))(186(90()())(96()()))) Output the Huffman code of all original input numbers 120's HuffmanCode:00 61's HuffmanCode:010 65's HuffmanCode:011 70's HuffmanCode:100 88's HuffmanCode:101 90's HuffmanCode:110 96's HuffmanCode:111
-
Test example 4:
Input:
0
Output:
-
Test example 5: (random number example)
Input:
100 1610 1280 299 53 1760 1677 149 1769 1325 1991 721 1451 2168 465 824 1517 1162 944 1782 69 1188 1112 16 148 1418 1429 1675 1232 2184 775 761 1714 13 623 1399 971 112 1299 1778 1819 1199 1463 924 603 356 1462 230 1828 320 2218 1429 24 1697 936 640 1462 1510 150 114 1101 206 194 357 1895 999 2082 15 320 2051 228 1325 216 741 1768 69 1311 1019 150 2190 824 210 790 788 2144 1039 1154 269 83 1521 491 1238 539 2007 1054 433 1186 855 1450 833 2049
Output:
Please enter the number of sequences: 100 Please enter the sequence one by one: 1610 1280 299 53 1760 1677 149 1769 1325 1991 721 1451 2168 465 824 1517 1162 944 1782 69 1188 1112 16 148 1418 1429 1675 1232 2184 775 761 1714 13 623 1399 971 112 1299 1778 1819 1199 1463 924 603 356 1462 230 1828 320 2218 1429 24 1697 936 640 1462 1510 150 114 1101 206 194 357 1895 999 2082 15 320 2051 228 1325 216 741 1768 69 1311 1019 150 2190 824 210 790 788 2144 1039 1154 269 83 1521 491 1238 539 2007 1054 433 1186 855 1450 833 2049 Print Huffman Tree 102441(42410(19313(9208(4484(2218()())(2266(1112()())(1154()())))(4724(2348(1162()())(1186()()))(2376(1188(585(286(138(69()())(69()()))(148()()))(299(149()())(150()())))(603()()))(1188()()))))(10105(4911(2431(1199()())(1232()()))(2480(1238()())(1242(619(299()())(320()()))(623()()))))(5194(2579(1280()())(1299()()))(2615(1304(640()())(664(320()())(344(150()())(194()()))))(1311()())))))(23097(11209(5467(2650(1325()())(1325()()))(2817(1399()())(1418()())))(5742(2858(1429()())(1429()()))(2884(1434(713(356()())(357()()))(721()()))(1450()()))))(11888(5838(2913(1451()())(1462()()))(2925(1462()())(1463()())))(6050(3012(1502(741()())(761()()))(1510()()))(3038(1517()())(1521()()))))))(60031(27380(13175(6438(3173(1563(775()())(788()()))(1610()()))(3265(1614(790()())(824()()))(1651(824()())(827(401(195(83()())(112()()))(206()()))(426(210()())(216()()))))))(6737(3352(1675()())(1677()()))(3385(1688(833()())(855()()))(1697()()))))(14205(7011(3474(1714()())(1760()()))(3537(1768()())(1769()())))(7194(3560(1778()())(1782()()))(3634(1815(891(433()())(458(228()())(230()())))(924()()))(1819()())))))(32651(15595(7530(3708(1828()())(1880(936()())(944()())))(3822(1895()())(1927(956(465()())(491()()))(971()()))))(8065(3998(1991()())(2007()()))(4067(2018(999()())(1019()()))(2049()()))))(17056(8359(4133(2051()())(2082()()))(4226(2082(1039()())(1043(504(235(114()())(121(53()())(68(28(13()())(15()()))(40(16()())(24()())))))(269()()))(539()())))(2144()())))(8697(4323(2155(1054()())(1101()()))(2168()()))(4374(2184()())(2190()())))))) Output the Huffman code of all original input numbers 2218's HuffmanCode:00000 1112's HuffmanCode:000010 1154's HuffmanCode:000011 1162's HuffmanCode:000100 1186's HuffmanCode:000101 69's HuffmanCode:0001100000 69's HuffmanCode:0001100001 148's HuffmanCode:000110001 149's HuffmanCode:000110010 150's HuffmanCode:000110011 603's HuffmanCode:0001101 1188's HuffmanCode:000111 1199's HuffmanCode:001000 1232's HuffmanCode:001001 1238's HuffmanCode:001010 299's HuffmanCode:00101100 320's HuffmanCode:00101101 623's HuffmanCode:0010111 1280's HuffmanCode:001100 1299's HuffmanCode:001101 640's HuffmanCode:0011100 320's HuffmanCode:00111010 150's HuffmanCode:001110110 194's HuffmanCode:001110111 1311's HuffmanCode:001111 1325's HuffmanCode:010000 1325's HuffmanCode:010001 1399's HuffmanCode:010010 1418's HuffmanCode:010011 1429's HuffmanCode:010100 1429's HuffmanCode:010101 356's HuffmanCode:01011000 357's HuffmanCode:01011001 721's HuffmanCode:0101101 1450's HuffmanCode:010111 1451's HuffmanCode:011000 1462's HuffmanCode:011001 1462's HuffmanCode:011010 1463's HuffmanCode:011011 741's HuffmanCode:0111000 761's HuffmanCode:0111001 1510's HuffmanCode:011101 1517's HuffmanCode:011110 1521's HuffmanCode:011111 775's HuffmanCode:1000000 788's HuffmanCode:1000001 1610's HuffmanCode:100001 790's HuffmanCode:1000100 824's HuffmanCode:1000101 824's HuffmanCode:1000110 83's HuffmanCode:1000111000 112's HuffmanCode:1000111001 206's HuffmanCode:100011101 210's HuffmanCode:100011110 216's HuffmanCode:100011111 1675's HuffmanCode:100100 1677's HuffmanCode:100101 833's HuffmanCode:1001100 855's HuffmanCode:1001101 1697's HuffmanCode:100111 1714's HuffmanCode:101000 1760's HuffmanCode:101001 1768's HuffmanCode:101010 1769's HuffmanCode:101011 1778's HuffmanCode:101100 1782's HuffmanCode:101101 433's HuffmanCode:10111000 228's HuffmanCode:101110010 230's HuffmanCode:101110011 924's HuffmanCode:1011101 1819's HuffmanCode:101111 1828's HuffmanCode:110000 936's HuffmanCode:1100010 944's HuffmanCode:1100011 1895's HuffmanCode:110010 465's HuffmanCode:11001100 491's HuffmanCode:11001101 971's HuffmanCode:1100111 1991's HuffmanCode:110100 2007's HuffmanCode:110101 999's HuffmanCode:1101100 1019's HuffmanCode:1101101 2049's HuffmanCode:110111 2051's HuffmanCode:111000 2082's HuffmanCode:111001 1039's HuffmanCode:1110100 114's HuffmanCode:1110101000 53's HuffmanCode:11101010010 13's HuffmanCode:1110101001100 15's HuffmanCode:1110101001101 16's HuffmanCode:1110101001110 24's HuffmanCode:1110101001111 269's HuffmanCode:111010101 539's HuffmanCode:11101011 2144's HuffmanCode:111011 1054's HuffmanCode:1111000 1101's HuffmanCode:1111001 2168's HuffmanCode:111101 2184's HuffmanCode:111110 2190's HuffmanCode:111111
6, Algorithm analysis:
The process starts with the leaf node containing the probability of the symbol it represents. Then, the process obtains the minimum two nodes (minimum probability) and creates a new internal node with these two nodes as child nodes. The weight of the new node is set to the sum of the child node weights. Then, we apply the process again on the new internal node and the remaining nodes (that is, excluding two leaf nodes), and repeat the process until there is only one node left, which is the root of the Huffman tree.
The construction algorithm uses a priority queue (minimum heap), in which the node with the lowest probability is given the highest priority:
- Create a leaf node for each symbol and add it to the priority queue.
- When there are multiple nodes in the queue:
- Delete the two nodes with the highest priority (lowest probability) from the queue
- Create a new internal node and take the two nodes as child nodes, and its probability is equal to the sum of the probabilities of the two nodes.
- Add a new node to the queue.
- The remaining nodes are root nodes and the tree has been completed.
Because the efficient priority queue data structure requires o (log n) time for each insertion, and the tree with n leaves has 2n - 1 nodes, the algorithm operates in O (log n) time. The spatial complexity of the algorithm is the memory of Huffman tree and the minimum heap for storing Huffman tree nodes.