Huffman Coding Huffman tree

1, Experiment Name: Huffman Coding

2, Purpose of the experiment:

  1. Master the structure of Huffman tree;
  2. It can realize the basic operations of Huffman tree, such as construction, insertion, etc
  3. The minimum heap is used to reduce the time complexity of Huffman tree.
  4. Master the data structure of the minimum heap and the characteristics of the structure;
  5. It can realize the basic operations of minimum heap, such as construction, insertion, deletion, etc

3, Experiment content:

  1. Determine the data structures for a binary Huffman tree

    //Huffman tree is constructed with tree structure
    //Adopt the structure of array to construct the minimum heap
    typedef struct TreeNode*Huffman;
    typedef  struct TreeNode** heap;
    struct TreeNode
    {
        int data;
        Huffman left;
        Huffman right;
    };
    
  2. Implement the algorithm of creating a binary Huffman tree for a given postive integer sequence

    Input a sequence and use the sequence array to construct the minimum heap, so as to facilitate the construction of Huffman tree.

    //Constructing minimum heap using sequence
    void BuildMinHeap(heap h,int size);
    //Constructing Huffman tree with minimum heap
    Huffman BuildHuffman(heap h,int size);
    
  3. Design and implement a Huffman coding prototype

    //Print all Huffman codes by deep search method
    void getHuffmanCode(Huffman h,char* codeStore,int* index);
    

IV Program list (more detailed comments):

#include <stdio.h>
#include <stdlib.h>
typedef struct TreeNode*Huffman;
typedef  struct TreeNode** heap;
struct TreeNode
{
    int data;
    Huffman left;
    Huffman right;
};
//Filter down. On the premise that all subtrees are the minimum heap, find the appropriate location of the node. The tree with this node as the root node is a minimum heap
void FixDown(heap h,int size,int index){
    int parent;
    int child;
    Huffman temp=h[index];
    for(parent=index;parent*2<=size;parent=child){
        child=parent*2;
        if(child+1<=size&&h[child+1]->data<h[child]->data)
            child++;
        if(h[child]->data<temp->data){
            h[parent]=h[child];
        }
        else break;
    }
    h[parent]=temp;
}
//The idea of recursion is adopted to establish the minimum heap from bottom to top, which ensures that the subtrees of the current tree are the minimum heap.
void BuildMinHeap(heap h,int size){
    for(int i=size/2;i>=1;i--){
        FixDown(h,size,i);
    }
}
//Return and delete the minimum value of the minimum heap (i.e. the head node), and keep the tree as the minimum heap
/*Move the last node to the head node and size --,
Because the subtree is the minimum heap at this time, direct downward filtering can ensure that the tree is still the minimum heap*/
Huffman PopMinHeap(heap h,int* size){
    Huffman result=h[1];
    h[1]=h[(*size)--];
    //Filter down
    FixDown(h,*size,1);
    return result;
}
//Insert node values and keep the tree at the minimum heap
/**/
void InsertMinHeap(heap h,int* size,Huffman node){
    h[++(*size)]=node;
    //Upward filtering
    int child=*size;
    int parent;
    for(;child>1;child=parent){
        parent=child/2;
        if(h[parent]->data>node->data){
            h[child]=h[parent];
        }
        else
        {
            break;
        }
        
    }
    h[child]=node;
}
//Establish Huffman tree
Huffman BuildHuffman(heap h,int size){
    Huffman node=(Huffman)malloc(sizeof(struct TreeNode));
    Huffman fristNode,lastNode;
    int time=size-1;
    for(int i=0;i<time;i++){//size-1 consolidation
        fristNode=PopMinHeap(h,&size);
        lastNode=PopMinHeap(h,&size);
        Huffman node=(Huffman)malloc(sizeof(struct TreeNode));
        node->data=fristNode->data+lastNode->data;
        node->left=fristNode;
        node->right=lastNode;
        InsertMinHeap(h,&size,node);
    }
    return PopMinHeap(h,&size);
    
}
//Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is a Huffman tree
void print(Huffman h){
    if(h){
        printf("%d(",h->data);
        print(h->left);
        printf(")(");
        print(h->right);
        printf(")");
    }
}
//Freeing Huffman tree space
void deleteHuff(Huffman h){
    if(h){
        deleteHuff(h->left);
        deleteHuff(h->right);
        free(h);
    }
}
//Print all Huffman codes by deep search method
void getHuffmanCode(Huffman h,char* codeStore,int* index){
    if(!h)return;
    if(h->left==NULL&&h->right==NULL){
        printf("\n%d's HuffmanCode:",h->data);
        codeStore[*index]='\0';
        printf("%s",codeStore);
    }
    else{
        codeStore[(*index)++]='0';
        getHuffmanCode(h->left,codeStore,index);
        --(*index);//to flash back
        codeStore[(*index)++]='1';
        getHuffmanCode(h->right,codeStore,index);
        --(*index);//to flash back
    }
}
int main() {
    int n;//Enter the number of sequences;
    printf("Please enter the number of sequences:\n");
    scanf("%d",&n);
    heap MinHeap=(heap)malloc(sizeof(struct TreeNode*)*(2*n));//Make sure there is enough space in the stack
    //Establish minimum heap
    printf("Please enter the sequence one by one:\n");
    for(int i=1;i<=n;i++){
        MinHeap[i]=(Huffman)malloc(sizeof(struct TreeNode));
        scanf("%d",&MinHeap[i]->data);
        MinHeap[i]->left=NULL;
        MinHeap[i]->right=NULL;
    }
    BuildMinHeap(MinHeap,n);
    // for(int i=1;i<=n;i++){
    //     printf("%d ",MinHeap[i]->data);
    // }
    //Establish Huffman tree
    Huffman huff=BuildHuffman(MinHeap,n);
    //Print Huffman tree in the format of "parent node (left subtree) (right subtree)" to prove that it is a Huffman tree
    printf("Print Huffman Tree\n");
    print(huff);
    //Output Huffman code of all original input numbers
    printf("\nOutput the Huffman code of all original input numbers");
    char* codeStore=(char*)malloc(sizeof(char)*n);
    int index=0;
    getHuffmanCode(huff,codeStore,&index);
    free(codeStore);
    //Memory release work
    free(MinHeap);
    deleteHuff(huff);
    return 0;
}#include <stdio.h>
#include <stdlib.h>
//Add height members on the basis of binary tree data structure
struct ALVnode
{
    int val;
    struct ALVnode* left;
    struct ALVnode* right;
    int high;
};
typedef struct ALVnode* ALV;
int getHigh(ALV tree){
    if(!tree) return 0;
    int i=getHigh(tree->left);
    int j=getHigh(tree->right);
    return i>j?i+1:j+1;
}
/*Make left single rotation (right rotation) between root and its left node, because the tree height of root and its left node changes
,Therefore, we need to update the height of both, point the root node to the new root node (i.e. the left node) and return*/
ALV LeftLeft(ALV root){
    ALV temp=root;
    root=root->left;
    temp->left=root->right;
    root->right=temp;
    //Update tree height
    temp->high=getHigh(temp);
    root->high=getHigh(root);
    return root;
}
/*Make a right single rotation (left rotation) between root and its right node, because the tree height of root and its left node changes
,Therefore, we need to update the height of both, point the root node to the new root node (i.e. the right node) and return*/
ALV RightRight(ALV root){
    ALV temp=root;
    root=root->right;
    temp->right=root->left;
    root->left=temp;
    //Update tree height
    temp->high=getHigh(temp);
    root->high=getHigh(root);
    return root;
}
/*The right single rotation of the left subtree of root can transfer the insertion node to the left subtree of the left subtree. At this time, the left single rotation of the whole root can be carried out*/
ALV LeftRight(ALV root){
    root->left=RightRight(root->left);
    root=LeftLeft(root);
    return root;
}
/*The left single rotation of the right node of root can transfer the insertion node to the right subtree of the right subtree. At this time, the right single rotation of the whole root can be carried out*/
ALV RightLeft(ALV root){
    root->right=LeftLeft(root->right);
    root=RightRight(root);
    return root;
}
ALV insert(ALV root,int val){
    if(root==NULL){//val is the first number
        struct ALVnode*newNode=(struct ALVnode*)malloc(sizeof(struct ALVnode));
        newNode->val=val;
        newNode->high=1;
        newNode->left=NULL;
        newNode->right=NULL;
        root=newNode;//root points to the new node
    }
    else{
        //If the value greater than root is inserted to the right and less than the value to the left, it means that the insertion is invalid (unable to insert)
        if(val<root->val){
            root->left=insert(root->left,val);
            if((root->right==NULL&&root->left->high==2)||(root->right!=NULL&&root->left->high-root->right->high==2)){
                if(val<root->left->val)
                    root=LeftLeft(root);//Sinistral
                else
                {
                    root=LeftRight(root);//Left and right double rotation
                }
                
            }
        
        }
        else if(val>root->val){
            root->right=insert(root->right,val);
            if((root->left==NULL&&root->right->high==2)||(root->left!=NULL&&root->left->high-root->right->high==-2)){
                if(val<root->right->val)
                    root=RightLeft(root);//Right left double spin
                else
                {
                    root=RightRight(root);//Right single rotation
                }
                
            }
        }
        //Update tree height
        root->high=getHigh(root);
    }
    return root;
}
/*Reclaim tree memory*/
void clear(ALV root){
    if(!root)return;
    clear(root->left);
    clear(root->right);
    free(root);
}
//Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree
void print(ALV root){
    if(!root)return;
    printf("%d",root->val);
    printf("(");
    print(root->left);
    printf(")");
    printf("(");
    print(root->right);
    printf(")");
}
int main(){
    int num;//Number of nodes to insert
    ALV head=NULL;
    scanf("%d",&num);
    int temp;//Used to store the number of to insert
    for(int  i=0;i<num;i++){
        scanf("%d",&temp);
        head=insert(head,temp);
    }
    //Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree
    print(head);
    clear(head);
    return 0;
}

5, Test results:

Input / output format description:

  • Input format:
    Enter the number of sequences first according to the display prompt;
    Input sequence elements in sequence;

  • Output format:
    Print the tree in the form of "node value (left subtree) (right subtree)" and prove that it is an AVL tree

  • Test example 1:

    Input:

    5
    5 4 3 2 1
    

    Output:

    Please enter the number of sequences:
    5
    Please enter the sequence one by one:
    5 4 3 2 1
    Print Huffman Tree
    15(6(3()())(3(1()())(2()())))(9(4()())(5()()))
    Output the Huffman code of all original input numbers
    3's HuffmanCode:00
    1's HuffmanCode:010
    2's HuffmanCode:011
    4's HuffmanCode:10
    
  • Test example 2:

    Input:

    9
    33 22 11 44 55 6 43 22 11
    

    Output:

    Please enter the number of sequences:
    9 
    Please enter the sequence one by one:
    33 22 11 44 55 6 43 22 11
    Print Huffman Tree
    247(99(44()())(55()()))(148(61(28(11()())(17(6()())(11()())))(33()()))(87(43()())(44(22()())(22()()))))
    Output the Huffman code of all original input numbers
    44's HuffmanCode:00
    55's HuffmanCode:01
    11's HuffmanCode:1000
    6's HuffmanCode:10010
    11's HuffmanCode:10011
    33's HuffmanCode:101
    43's HuffmanCode:110
    22's HuffmanCode:1110
    22's HuffmanCode:1111
    
  • Test example 3:

    Input:

    7
    88 70 61 96 120 90 65
    

    Output:

    Please enter the number of sequences:
    7
    Please enter the sequence one by one:
    88 70 61 96 120 90 65
    Print Huffman Tree
    590(246(120()())(126(61()())(65()())))(344(158(70()())(88()()))(186(90()())(96()())))
    Output the Huffman code of all original input numbers
    120's HuffmanCode:00
    61's HuffmanCode:010
    65's HuffmanCode:011
    70's HuffmanCode:100
    88's HuffmanCode:101
    90's HuffmanCode:110
    96's HuffmanCode:111
    
  • Test example 4:

    Input:

    0
    

    Output:

  • Test example 5: (random number example)

    Input:

    100
    1610 1280 299 53 1760 1677 149 1769 1325 1991 721 1451 2168 465 824 1517 1162 944 1782 69 1188 1112 16 148 1418 1429 1675 1232 2184 775 761 1714 13 623 1399 971 112 1299 1778 1819 1199 1463 924 603 356 1462 230 1828 320 2218 1429 24 1697 936 640 1462 1510 150 114 1101 206 194 357 1895 999 2082 15 320 2051 228 1325 216 741 1768 69 1311 1019 150 2190 824 210 790 788 2144 1039 1154 269 83 1521 491 1238 539 2007 1054 433 1186 855 1450 833 2049
    

    Output:

    Please enter the number of sequences:
    100
    Please enter the sequence one by one:
    1610 1280 299 53 1760 1677 149 1769 1325 1991 721 1451 2168 465 824 1517 1162 944 1782 69 1188 1112 16 148 1418 1429 1675 1232 2184 775 761 1714 13 623 1399 971 112 1299 1778 1819 1199 1463 924 603 356 1462 230 1828 320 2218 1429 24 1697 936 640 1462 1510 150 114 1101 206 194 357 1895 999 2082 15 320 2051 228 1325 216
    741 1768 69 1311 1019 150 2190 824 210 790 788 2144 1039 1154 269 83 1521 491 1238 539 2007 1054 433 1186 855 1450 833 2049
    Print Huffman Tree
    102441(42410(19313(9208(4484(2218()())(2266(1112()())(1154()())))(4724(2348(1162()())(1186()()))(2376(1188(585(286(138(69()())(69()()))(148()()))(299(149()())(150()())))(603()()))(1188()()))))(10105(4911(2431(1199()())(1232()()))(2480(1238()())(1242(619(299()())(320()()))(623()()))))(5194(2579(1280()())(1299()()))(2615(1304(640()())(664(320()())(344(150()())(194()()))))(1311()())))))(23097(11209(5467(2650(1325()())(1325()()))(2817(1399()())(1418()())))(5742(2858(1429()())(1429()()))(2884(1434(713(356()())(357()()))(721()()))(1450()()))))(11888(5838(2913(1451()())(1462()()))(2925(1462()())(1463()())))(6050(3012(1502(741()())(761()()))(1510()()))(3038(1517()())(1521()()))))))(60031(27380(13175(6438(3173(1563(775()())(788()()))(1610()()))(3265(1614(790()())(824()()))(1651(824()())(827(401(195(83()())(112()()))(206()()))(426(210()())(216()()))))))(6737(3352(1675()())(1677()()))(3385(1688(833()())(855()()))(1697()()))))(14205(7011(3474(1714()())(1760()()))(3537(1768()())(1769()())))(7194(3560(1778()())(1782()()))(3634(1815(891(433()())(458(228()())(230()())))(924()()))(1819()())))))(32651(15595(7530(3708(1828()())(1880(936()())(944()())))(3822(1895()())(1927(956(465()())(491()()))(971()()))))(8065(3998(1991()())(2007()()))(4067(2018(999()())(1019()()))(2049()()))))(17056(8359(4133(2051()())(2082()()))(4226(2082(1039()())(1043(504(235(114()())(121(53()())(68(28(13()())(15()()))(40(16()())(24()())))))(269()()))(539()())))(2144()())))(8697(4323(2155(1054()())(1101()()))(2168()()))(4374(2184()())(2190()()))))))
    Output the Huffman code of all original input numbers
    2218's HuffmanCode:00000
    1112's HuffmanCode:000010
    1154's HuffmanCode:000011
    1162's HuffmanCode:000100
    1186's HuffmanCode:000101
    69's HuffmanCode:0001100000
    69's HuffmanCode:0001100001
    148's HuffmanCode:000110001
    149's HuffmanCode:000110010
    150's HuffmanCode:000110011
    603's HuffmanCode:0001101
    1188's HuffmanCode:000111
    1199's HuffmanCode:001000
    1232's HuffmanCode:001001
    1238's HuffmanCode:001010
    299's HuffmanCode:00101100
    320's HuffmanCode:00101101
    623's HuffmanCode:0010111
    1280's HuffmanCode:001100
    1299's HuffmanCode:001101
    640's HuffmanCode:0011100
    320's HuffmanCode:00111010
    150's HuffmanCode:001110110
    194's HuffmanCode:001110111
    1311's HuffmanCode:001111
    1325's HuffmanCode:010000
    1325's HuffmanCode:010001
    1399's HuffmanCode:010010
    1418's HuffmanCode:010011
    1429's HuffmanCode:010100
    1429's HuffmanCode:010101
    356's HuffmanCode:01011000
    357's HuffmanCode:01011001
    721's HuffmanCode:0101101
    1450's HuffmanCode:010111
    1451's HuffmanCode:011000
    1462's HuffmanCode:011001
    1462's HuffmanCode:011010
    1463's HuffmanCode:011011
    741's HuffmanCode:0111000
    761's HuffmanCode:0111001
    1510's HuffmanCode:011101
    1517's HuffmanCode:011110
    1521's HuffmanCode:011111
    775's HuffmanCode:1000000
    788's HuffmanCode:1000001
    1610's HuffmanCode:100001
    790's HuffmanCode:1000100
    824's HuffmanCode:1000101
    824's HuffmanCode:1000110
    83's HuffmanCode:1000111000
    112's HuffmanCode:1000111001
    206's HuffmanCode:100011101
    210's HuffmanCode:100011110
    216's HuffmanCode:100011111
    1675's HuffmanCode:100100
    1677's HuffmanCode:100101
    833's HuffmanCode:1001100
    855's HuffmanCode:1001101
    1697's HuffmanCode:100111
    1714's HuffmanCode:101000
    1760's HuffmanCode:101001
    1768's HuffmanCode:101010
    1769's HuffmanCode:101011
    1778's HuffmanCode:101100
    1782's HuffmanCode:101101
    433's HuffmanCode:10111000
    228's HuffmanCode:101110010
    230's HuffmanCode:101110011
    924's HuffmanCode:1011101
    1819's HuffmanCode:101111
    1828's HuffmanCode:110000
    936's HuffmanCode:1100010
    944's HuffmanCode:1100011
    1895's HuffmanCode:110010
    465's HuffmanCode:11001100
    491's HuffmanCode:11001101
    971's HuffmanCode:1100111
    1991's HuffmanCode:110100
    2007's HuffmanCode:110101
    999's HuffmanCode:1101100
    1019's HuffmanCode:1101101
    2049's HuffmanCode:110111
    2051's HuffmanCode:111000
    2082's HuffmanCode:111001
    1039's HuffmanCode:1110100
    114's HuffmanCode:1110101000
    53's HuffmanCode:11101010010
    13's HuffmanCode:1110101001100
    15's HuffmanCode:1110101001101
    16's HuffmanCode:1110101001110
    24's HuffmanCode:1110101001111
    269's HuffmanCode:111010101
    539's HuffmanCode:11101011
    2144's HuffmanCode:111011
    1054's HuffmanCode:1111000
    1101's HuffmanCode:1111001
    2168's HuffmanCode:111101
    2184's HuffmanCode:111110
    2190's HuffmanCode:111111
    

6, Algorithm analysis:

The process starts with the leaf node containing the probability of the symbol it represents. Then, the process obtains the minimum two nodes (minimum probability) and creates a new internal node with these two nodes as child nodes. The weight of the new node is set to the sum of the child node weights. Then, we apply the process again on the new internal node and the remaining nodes (that is, excluding two leaf nodes), and repeat the process until there is only one node left, which is the root of the Huffman tree.

The construction algorithm uses a priority queue (minimum heap), in which the node with the lowest probability is given the highest priority:

  1. Create a leaf node for each symbol and add it to the priority queue.
  2. When there are multiple nodes in the queue:
    1. Delete the two nodes with the highest priority (lowest probability) from the queue
    2. Create a new internal node and take the two nodes as child nodes, and its probability is equal to the sum of the probabilities of the two nodes.
    3. Add a new node to the queue.
  3. The remaining nodes are root nodes and the tree has been completed.

Because the efficient priority queue data structure requires o (log n) time for each insertion, and the tree with n leaves has 2n - 1 nodes, the algorithm operates in O (log n) time. The spatial complexity of the algorithm is the memory of Huffman tree and the minimum heap for storing Huffman tree nodes.

Keywords: C data structure

Added by Operandi on Wed, 02 Mar 2022 08:31:27 +0200