# Huffman Coding Huffman tree

1, Experiment Name: Huffman Coding

2, Purpose of the experiment:

1. Master the structure of Huffman tree;
2. It can realize the basic operations of Huffman tree, such as construction, insertion, etc
3. The minimum heap is used to reduce the time complexity of Huffman tree.
4. Master the data structure of the minimum heap and the characteristics of the structure;
5. It can realize the basic operations of minimum heap, such as construction, insertion, deletion, etc

3, Experiment content:

1. Determine the data structures for a binary Huffman tree

```//Huffman tree is constructed with tree structure
//Adopt the structure of array to construct the minimum heap
typedef struct TreeNode*Huffman;
typedef  struct TreeNode** heap;
struct TreeNode
{
int data;
Huffman left;
Huffman right;
};
```
2. Implement the algorithm of creating a binary Huffman tree for a given postive integer sequence

Input a sequence and use the sequence array to construct the minimum heap, so as to facilitate the construction of Huffman tree.

```//Constructing minimum heap using sequence
void BuildMinHeap(heap h,int size);
//Constructing Huffman tree with minimum heap
Huffman BuildHuffman(heap h,int size);
```
3. Design and implement a Huffman coding prototype

```//Print all Huffman codes by deep search method
void getHuffmanCode(Huffman h,char* codeStore,int* index);
```

IV Program list (more detailed comments):

```#include <stdio.h>
#include <stdlib.h>
typedef struct TreeNode*Huffman;
typedef  struct TreeNode** heap;
struct TreeNode
{
int data;
Huffman left;
Huffman right;
};
//Filter down. On the premise that all subtrees are the minimum heap, find the appropriate location of the node. The tree with this node as the root node is a minimum heap
void FixDown(heap h,int size,int index){
int parent;
int child;
Huffman temp=h[index];
for(parent=index;parent*2<=size;parent=child){
child=parent*2;
if(child+1<=size&&h[child+1]->data<h[child]->data)
child++;
if(h[child]->data<temp->data){
h[parent]=h[child];
}
else break;
}
h[parent]=temp;
}
//The idea of recursion is adopted to establish the minimum heap from bottom to top, which ensures that the subtrees of the current tree are the minimum heap.
void BuildMinHeap(heap h,int size){
for(int i=size/2;i>=1;i--){
FixDown(h,size,i);
}
}
//Return and delete the minimum value of the minimum heap (i.e. the head node), and keep the tree as the minimum heap
/*Move the last node to the head node and size --,
Because the subtree is the minimum heap at this time, direct downward filtering can ensure that the tree is still the minimum heap*/
Huffman PopMinHeap(heap h,int* size){
Huffman result=h;
h=h[(*size)--];
//Filter down
FixDown(h,*size,1);
return result;
}
//Insert node values and keep the tree at the minimum heap
/**/
void InsertMinHeap(heap h,int* size,Huffman node){
h[++(*size)]=node;
//Upward filtering
int child=*size;
int parent;
for(;child>1;child=parent){
parent=child/2;
if(h[parent]->data>node->data){
h[child]=h[parent];
}
else
{
break;
}

}
h[child]=node;
}
//Establish Huffman tree
Huffman BuildHuffman(heap h,int size){
Huffman node=(Huffman)malloc(sizeof(struct TreeNode));
Huffman fristNode,lastNode;
int time=size-1;
for(int i=0;i<time;i++){//size-1 consolidation
fristNode=PopMinHeap(h,&size);
lastNode=PopMinHeap(h,&size);
Huffman node=(Huffman)malloc(sizeof(struct TreeNode));
node->data=fristNode->data+lastNode->data;
node->left=fristNode;
node->right=lastNode;
InsertMinHeap(h,&size,node);
}
return PopMinHeap(h,&size);

}
//Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is a Huffman tree
void print(Huffman h){
if(h){
printf("%d(",h->data);
print(h->left);
printf(")(");
print(h->right);
printf(")");
}
}
//Freeing Huffman tree space
void deleteHuff(Huffman h){
if(h){
deleteHuff(h->left);
deleteHuff(h->right);
free(h);
}
}
//Print all Huffman codes by deep search method
void getHuffmanCode(Huffman h,char* codeStore,int* index){
if(!h)return;
if(h->left==NULL&&h->right==NULL){
printf("\n%d's HuffmanCode:",h->data);
codeStore[*index]='\0';
printf("%s",codeStore);
}
else{
codeStore[(*index)++]='0';
getHuffmanCode(h->left,codeStore,index);
--(*index);//to flash back
codeStore[(*index)++]='1';
getHuffmanCode(h->right,codeStore,index);
--(*index);//to flash back
}
}
int main() {
int n;//Enter the number of sequences;
printf("Please enter the number of sequences:\n");
scanf("%d",&n);
heap MinHeap=(heap)malloc(sizeof(struct TreeNode*)*(2*n));//Make sure there is enough space in the stack
//Establish minimum heap
printf("Please enter the sequence one by one:\n");
for(int i=1;i<=n;i++){
MinHeap[i]=(Huffman)malloc(sizeof(struct TreeNode));
scanf("%d",&MinHeap[i]->data);
MinHeap[i]->left=NULL;
MinHeap[i]->right=NULL;
}
BuildMinHeap(MinHeap,n);
// for(int i=1;i<=n;i++){
//     printf("%d ",MinHeap[i]->data);
// }
//Establish Huffman tree
Huffman huff=BuildHuffman(MinHeap,n);
//Print Huffman tree in the format of "parent node (left subtree) (right subtree)" to prove that it is a Huffman tree
printf("Print Huffman Tree\n");
print(huff);
//Output Huffman code of all original input numbers
printf("\nOutput the Huffman code of all original input numbers");
char* codeStore=(char*)malloc(sizeof(char)*n);
int index=0;
getHuffmanCode(huff,codeStore,&index);
free(codeStore);
//Memory release work
free(MinHeap);
deleteHuff(huff);
return 0;
}#include <stdio.h>
#include <stdlib.h>
//Add height members on the basis of binary tree data structure
struct ALVnode
{
int val;
struct ALVnode* left;
struct ALVnode* right;
int high;
};
typedef struct ALVnode* ALV;
int getHigh(ALV tree){
if(!tree) return 0;
int i=getHigh(tree->left);
int j=getHigh(tree->right);
return i>j?i+1:j+1;
}
/*Make left single rotation (right rotation) between root and its left node, because the tree height of root and its left node changes
，Therefore, we need to update the height of both, point the root node to the new root node (i.e. the left node) and return*/
ALV LeftLeft(ALV root){
ALV temp=root;
root=root->left;
temp->left=root->right;
root->right=temp;
//Update tree height
temp->high=getHigh(temp);
root->high=getHigh(root);
return root;
}
/*Make a right single rotation (left rotation) between root and its right node, because the tree height of root and its left node changes
，Therefore, we need to update the height of both, point the root node to the new root node (i.e. the right node) and return*/
ALV RightRight(ALV root){
ALV temp=root;
root=root->right;
temp->right=root->left;
root->left=temp;
//Update tree height
temp->high=getHigh(temp);
root->high=getHigh(root);
return root;
}
/*The right single rotation of the left subtree of root can transfer the insertion node to the left subtree of the left subtree. At this time, the left single rotation of the whole root can be carried out*/
ALV LeftRight(ALV root){
root->left=RightRight(root->left);
root=LeftLeft(root);
return root;
}
/*The left single rotation of the right node of root can transfer the insertion node to the right subtree of the right subtree. At this time, the right single rotation of the whole root can be carried out*/
ALV RightLeft(ALV root){
root->right=LeftLeft(root->right);
root=RightRight(root);
return root;
}
ALV insert(ALV root,int val){
if(root==NULL){//val is the first number
struct ALVnode*newNode=(struct ALVnode*)malloc(sizeof(struct ALVnode));
newNode->val=val;
newNode->high=1;
newNode->left=NULL;
newNode->right=NULL;
root=newNode;//root points to the new node
}
else{
//If the value greater than root is inserted to the right and less than the value to the left, it means that the insertion is invalid (unable to insert)
if(val<root->val){
root->left=insert(root->left,val);
if((root->right==NULL&&root->left->high==2)||(root->right!=NULL&&root->left->high-root->right->high==2)){
if(val<root->left->val)
root=LeftLeft(root);//Sinistral
else
{
root=LeftRight(root);//Left and right double rotation
}

}

}
else if(val>root->val){
root->right=insert(root->right,val);
if((root->left==NULL&&root->right->high==2)||(root->left!=NULL&&root->left->high-root->right->high==-2)){
if(val<root->right->val)
root=RightLeft(root);//Right left double spin
else
{
root=RightRight(root);//Right single rotation
}

}
}
//Update tree height
root->high=getHigh(root);
}
return root;
}
/*Reclaim tree memory*/
void clear(ALV root){
if(!root)return;
clear(root->left);
clear(root->right);
free(root);
}
//Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree
void print(ALV root){
if(!root)return;
printf("%d",root->val);
printf("(");
print(root->left);
printf(")");
printf("(");
print(root->right);
printf(")");
}
int main(){
int num;//Number of nodes to insert
scanf("%d",&num);
int temp;//Used to store the number of to insert
for(int  i=0;i<num;i++){
scanf("%d",&temp);
}
//Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree
return 0;
}
```

5, Test results:

Input / output format description:

• Input format:
Enter the number of sequences first according to the display prompt;
Input sequence elements in sequence;

• Output format:
Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree

• Test example 1:

Input:

```5
5 4 3 2 1
```

Output:

```Please enter the number of sequences:
5
Please enter the sequence one by one:
5 4 3 2 1
Print Huffman Tree
15(6(3()())(3(1()())(2()())))(9(4()())(5()()))
Output the Huffman code of all original input numbers
3's HuffmanCode:00
1's HuffmanCode:010
2's HuffmanCode:011
4's HuffmanCode:10
```
• Test example 2:

Input:

```9
33 22 11 44 55 6 43 22 11
```

Output:

```Please enter the number of sequences:
9
Please enter the sequence one by one:
33 22 11 44 55 6 43 22 11
Print Huffman Tree
247(99(44()())(55()()))(148(61(28(11()())(17(6()())(11()())))(33()()))(87(43()())(44(22()())(22()()))))
Output the Huffman code of all original input numbers
44's HuffmanCode:00
55's HuffmanCode:01
11's HuffmanCode:1000
6's HuffmanCode:10010
11's HuffmanCode:10011
33's HuffmanCode:101
43's HuffmanCode:110
22's HuffmanCode:1110
22's HuffmanCode:1111
```
• Test example 3:

Input:

```7
88 70 61 96 120 90 65
```

Output:

```Please enter the number of sequences:
7
Please enter the sequence one by one:
88 70 61 96 120 90 65
Print Huffman Tree
590(246(120()())(126(61()())(65()())))(344(158(70()())(88()()))(186(90()())(96()())))
Output the Huffman code of all original input numbers
120's HuffmanCode:00
61's HuffmanCode:010
65's HuffmanCode:011
70's HuffmanCode:100
88's HuffmanCode:101
90's HuffmanCode:110
96's HuffmanCode:111
```
• Test example 4:

Input:

```0
```

Output:

• Test example 5: (random number example)

Input:

```100
1610 1280 299 53 1760 1677 149 1769 1325 1991 721 1451 2168 465 824 1517 1162 944 1782 69 1188 1112 16 148 1418 1429 1675 1232 2184 775 761 1714 13 623 1399 971 112 1299 1778 1819 1199 1463 924 603 356 1462 230 1828 320 2218 1429 24 1697 936 640 1462 1510 150 114 1101 206 194 357 1895 999 2082 15 320 2051 228 1325 216 741 1768 69 1311 1019 150 2190 824 210 790 788 2144 1039 1154 269 83 1521 491 1238 539 2007 1054 433 1186 855 1450 833 2049
```

Output:

```Please enter the number of sequences:
100
Please enter the sequence one by one:
1610 1280 299 53 1760 1677 149 1769 1325 1991 721 1451 2168 465 824 1517 1162 944 1782 69 1188 1112 16 148 1418 1429 1675 1232 2184 775 761 1714 13 623 1399 971 112 1299 1778 1819 1199 1463 924 603 356 1462 230 1828 320 2218 1429 24 1697 936 640 1462 1510 150 114 1101 206 194 357 1895 999 2082 15 320 2051 228 1325 216
741 1768 69 1311 1019 150 2190 824 210 790 788 2144 1039 1154 269 83 1521 491 1238 539 2007 1054 433 1186 855 1450 833 2049
Print Huffman Tree
102441(42410(19313(9208(4484(2218()())(2266(1112()())(1154()())))(4724(2348(1162()())(1186()()))(2376(1188(585(286(138(69()())(69()()))(148()()))(299(149()())(150()())))(603()()))(1188()()))))(10105(4911(2431(1199()())(1232()()))(2480(1238()())(1242(619(299()())(320()()))(623()()))))(5194(2579(1280()())(1299()()))(2615(1304(640()())(664(320()())(344(150()())(194()()))))(1311()())))))(23097(11209(5467(2650(1325()())(1325()()))(2817(1399()())(1418()())))(5742(2858(1429()())(1429()()))(2884(1434(713(356()())(357()()))(721()()))(1450()()))))(11888(5838(2913(1451()())(1462()()))(2925(1462()())(1463()())))(6050(3012(1502(741()())(761()()))(1510()()))(3038(1517()())(1521()()))))))(60031(27380(13175(6438(3173(1563(775()())(788()()))(1610()()))(3265(1614(790()())(824()()))(1651(824()())(827(401(195(83()())(112()()))(206()()))(426(210()())(216()()))))))(6737(3352(1675()())(1677()()))(3385(1688(833()())(855()()))(1697()()))))(14205(7011(3474(1714()())(1760()()))(3537(1768()())(1769()())))(7194(3560(1778()())(1782()()))(3634(1815(891(433()())(458(228()())(230()())))(924()()))(1819()())))))(32651(15595(7530(3708(1828()())(1880(936()())(944()())))(3822(1895()())(1927(956(465()())(491()()))(971()()))))(8065(3998(1991()())(2007()()))(4067(2018(999()())(1019()()))(2049()()))))(17056(8359(4133(2051()())(2082()()))(4226(2082(1039()())(1043(504(235(114()())(121(53()())(68(28(13()())(15()()))(40(16()())(24()())))))(269()()))(539()())))(2144()())))(8697(4323(2155(1054()())(1101()()))(2168()()))(4374(2184()())(2190()()))))))
Output the Huffman code of all original input numbers
2218's HuffmanCode:00000
1112's HuffmanCode:000010
1154's HuffmanCode:000011
1162's HuffmanCode:000100
1186's HuffmanCode:000101
69's HuffmanCode:0001100000
69's HuffmanCode:0001100001
148's HuffmanCode:000110001
149's HuffmanCode:000110010
150's HuffmanCode:000110011
603's HuffmanCode:0001101
1188's HuffmanCode:000111
1199's HuffmanCode:001000
1232's HuffmanCode:001001
1238's HuffmanCode:001010
299's HuffmanCode:00101100
320's HuffmanCode:00101101
623's HuffmanCode:0010111
1280's HuffmanCode:001100
1299's HuffmanCode:001101
640's HuffmanCode:0011100
320's HuffmanCode:00111010
150's HuffmanCode:001110110
194's HuffmanCode:001110111
1311's HuffmanCode:001111
1325's HuffmanCode:010000
1325's HuffmanCode:010001
1399's HuffmanCode:010010
1418's HuffmanCode:010011
1429's HuffmanCode:010100
1429's HuffmanCode:010101
356's HuffmanCode:01011000
357's HuffmanCode:01011001
721's HuffmanCode:0101101
1450's HuffmanCode:010111
1451's HuffmanCode:011000
1462's HuffmanCode:011001
1462's HuffmanCode:011010
1463's HuffmanCode:011011
741's HuffmanCode:0111000
761's HuffmanCode:0111001
1510's HuffmanCode:011101
1517's HuffmanCode:011110
1521's HuffmanCode:011111
775's HuffmanCode:1000000
788's HuffmanCode:1000001
1610's HuffmanCode:100001
790's HuffmanCode:1000100
824's HuffmanCode:1000101
824's HuffmanCode:1000110
83's HuffmanCode:1000111000
112's HuffmanCode:1000111001
206's HuffmanCode:100011101
210's HuffmanCode:100011110
216's HuffmanCode:100011111
1675's HuffmanCode:100100
1677's HuffmanCode:100101
833's HuffmanCode:1001100
855's HuffmanCode:1001101
1697's HuffmanCode:100111
1714's HuffmanCode:101000
1760's HuffmanCode:101001
1768's HuffmanCode:101010
1769's HuffmanCode:101011
1778's HuffmanCode:101100
1782's HuffmanCode:101101
433's HuffmanCode:10111000
228's HuffmanCode:101110010
230's HuffmanCode:101110011
924's HuffmanCode:1011101
1819's HuffmanCode:101111
1828's HuffmanCode:110000
936's HuffmanCode:1100010
944's HuffmanCode:1100011
1895's HuffmanCode:110010
465's HuffmanCode:11001100
491's HuffmanCode:11001101
971's HuffmanCode:1100111
1991's HuffmanCode:110100
2007's HuffmanCode:110101
999's HuffmanCode:1101100
1019's HuffmanCode:1101101
2049's HuffmanCode:110111
2051's HuffmanCode:111000
2082's HuffmanCode:111001
1039's HuffmanCode:1110100
114's HuffmanCode:1110101000
53's HuffmanCode:11101010010
13's HuffmanCode:1110101001100
15's HuffmanCode:1110101001101
16's HuffmanCode:1110101001110
24's HuffmanCode:1110101001111
269's HuffmanCode:111010101
539's HuffmanCode:11101011
2144's HuffmanCode:111011
1054's HuffmanCode:1111000
1101's HuffmanCode:1111001
2168's HuffmanCode:111101
2184's HuffmanCode:111110
2190's HuffmanCode:111111
```

6, Algorithm analysis:

The process starts with the leaf node containing the probability of the symbol it represents. Then, the process obtains the minimum two nodes (minimum probability) and creates a new internal node with these two nodes as child nodes. The weight of the new node is set to the sum of the child node weights. Then, we apply the process again on the new internal node and the remaining nodes (that is, excluding two leaf nodes), and repeat the process until there is only one node left, which is the root of the Huffman tree.

The construction algorithm uses a priority queue (minimum heap), in which the node with the lowest probability is given the highest priority:

1. Create a leaf node for each symbol and add it to the priority queue.
2. When there are multiple nodes in the queue:
1. Delete the two nodes with the highest priority (lowest probability) from the queue
2. Create a new internal node and take the two nodes as child nodes, and its probability is equal to the sum of the probabilities of the two nodes.
3. Add a new node to the queue.
3. The remaining nodes are root nodes and the tree has been completed.

Because the efficient priority queue data structure requires o (log n) time for each insertion, and the tree with n leaves has 2n - 1 nodes, the algorithm operates in O (log n) time. The spatial complexity of the algorithm is the memory of Huffman tree and the minimum heap for storing Huffman tree nodes.

Keywords: C data structure

Added by Operandi on Wed, 02 Mar 2022 08:31:27 +0200