0. Basic concepts
0.1 The position of the video stream in the video player is as follows:
A raw H.264 bitstream (also known as a "naked stream") is composed of NALUs, one after another. Their structure is shown in the figure below.
0.2 More precisely, each NALU in the raw stream is composed of:
[start code] + [NALU header] + [NALU payload]
[start code] takes up 3 or 4 bytes: 0x000001 or 0x00000001.
[NALU header] is a single byte, composed of the following:
forbidden_zero_bit(1bit) + nal_ref_idc(2bit) + nal_unit_type(5bit)
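The bit layout above can be sketched as a small helper that splits one header byte into its three fields. This is a minimal sketch: the NaluHeader type and the parse_nalu_header name are made up for illustration and are not part of the parser shown later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three fields of the one-byte NALU header. */
typedef struct {
    unsigned forbidden_zero_bit; /* 1 bit  (most significant bit) */
    unsigned nal_ref_idc;        /* 2 bits */
    unsigned nal_unit_type;      /* 5 bits (least significant bits) */
} NaluHeader;

/* Split the header byte into its fields, highest bit first. */
static NaluHeader parse_nalu_header(uint8_t byte)
{
    NaluHeader h;
    h.forbidden_zero_bit = (byte >> 7) & 0x01;
    h.nal_ref_idc        = (byte >> 5) & 0x03;
    h.nal_unit_type      = byte & 0x1F;
    return h;
}

/* Small self-check: 0x67 is 0110 0111 in binary. */
static void check_nalu_header(void)
{
    NaluHeader h = parse_nalu_header(0x67);
    assert(h.forbidden_zero_bit == 0);
    assert(h.nal_ref_idc == 3);
    assert(h.nal_unit_type == 7);
}
```

Calling check_nalu_header() from any main() exercises the helper. The byte 0x67 decodes to forbidden_zero_bit 0, nal_ref_idc 3 and nal_unit_type 7.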
0.3 NALU type
0.3.1 forbidden_zero_bit:
The forbidden bit. It is initialized to 0 and can be set to 1 when the network detects a bit error in the NAL unit,
so that the receiver can correct or discard the unit.
0.3.2 nal_ref_idc:
NAL importance: it indicates how important the NAL unit is. The larger the value, the more important the unit. When the decoder cannot keep up,
it may discard NALUs whose importance is 0.
0.3.3 nal_unit_type:
The syntax table of NALU is as follows:
Generally, the first two NALUs of an H.264 stream are SPS and PPS, and the third is IDR. SPS, PPS and SEI are three kinds of NALU that do not belong to the frame category. Their definitions are as follows:
SPS (Sequence Parameter Set): a set of sequence parameters, which applies to a series of consecutive coded pictures.
PPS (Picture Parameter Set): a set of picture parameters, which applies to one or more individual pictures in the coded video sequence.
SEI (Supplemental Enhancement Information): additional enhancement information, such as picture timing. It is generally placed before the primary coded picture data and can be omitted in some applications.
IDR (Instantaneous Decoding Refresh): instantaneous decoding refresh.
HRD (Hypothetical Reference Decoder): hypothetical reference decoder.
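To make the distinction between frame and non-frame NALUs concrete, here is a minimal sketch that classifies a NALU by the type in its header byte. The numeric values match the NaluType enum used in the code below. The helper names are hypothetical, and the header bytes 0x67, 0x68 and 0x65 are only assumed as a typical SPS, PPS, IDR opening sequence.

```c
#include <assert.h>
#include <stdint.h>

/* A few nal_unit_type values, numbered as in the NaluType enum of this article. */
enum {
    NAL_SLICE = 1,
    NAL_IDR   = 5,
    NAL_SEI   = 6,
    NAL_SPS   = 7,
    NAL_PPS   = 8
};

/* The type is the low 5 bits of the header byte. */
static int nal_type_of(uint8_t header_byte)
{
    return header_byte & 0x1F;
}

/* Types 1 to 5 carry coded slice data; SPS, PPS and SEI do not. */
static int is_frame_data(int nal_unit_type)
{
    return nal_unit_type >= 1 && nal_unit_type <= 5;
}

/* Self-check with an assumed SPS, PPS, IDR opening sequence. */
static void check_opening_sequence(void)
{
    const uint8_t headers[3] = { 0x67, 0x68, 0x65 };
    assert(nal_type_of(headers[0]) == NAL_SPS);
    assert(nal_type_of(headers[1]) == NAL_PPS);
    assert(nal_type_of(headers[2]) == NAL_IDR);
    assert(!is_frame_data(nal_type_of(headers[0])));
    assert(!is_frame_data(nal_type_of(headers[1])));
    assert(is_frame_data(nal_type_of(headers[2])));
}
```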
1. Code
As before, this follows Lei Xiaohua's code, to which I have added more detailed comments.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum {
    NALU_TYPE_SLICE    = 1,
    NALU_TYPE_DPA      = 2,
    NALU_TYPE_DPB      = 3,
    NALU_TYPE_DPC      = 4,
    NALU_TYPE_IDR      = 5,
    NALU_TYPE_SEI      = 6,
    NALU_TYPE_SPS      = 7,
    NALU_TYPE_PPS      = 8,
    NALU_TYPE_AUD      = 9,
    NALU_TYPE_EOSEQ    = 10,
    NALU_TYPE_EOSTREAM = 11,
    NALU_TYPE_FILL     = 12,
} NaluType;

typedef enum {
    NALU_PRIORITY_DISPOSABLE = 0,
    NALU_PRIORITY_LOW        = 1,
    NALU_PRIORITY_HIGH       = 2,
    NALU_PRIORITY_HIGHEST    = 3
} NaluPriority;

typedef struct {
    //! 4 for parameter sets and first slice in picture, 3 for everything else (suggested)
    // In an H.264 bitstream the start code is "0x00 0x00 0x01" or "0x00 0x00 0x00 0x01",
    // so startcodeprefix_len is either 3 or 4 bytes
    int startcodeprefix_len;
    //! Length of the NAL unit (excluding the start code, which does not belong to the NALU)
    unsigned len;
    //! NAL unit buffer size
    unsigned max_size;
    //! Should always be FALSE
    // Forbidden bit, initially 0; it can be set to 1 when the network detects a bit error
    // in the NAL unit, so that the receiver can correct or discard the unit
    int forbidden_bit;
    //! NALU_PRIORITY_xxxx
    // Importance of the NAL unit. The higher the value, the more important it is.
    // When the decoder cannot keep up, it may discard NALUs with importance 0.
    int nal_reference_idc;
    //! NALU_TYPE_xxxx
    int nal_unit_type;
    //! Contains the first byte followed by the EBSP
    char* buf;
} NALU_t;

// h264bitstream is a global file pointer
FILE* h264bitstream = NULL;  //!< the bit stream file
int info2 = 0, info3 = 0;

// In an H.264 bitstream the start code is "0x00 0x00 0x01" or "0x00 0x00 0x00 0x01"

// Returns 1 if the three bytes at Buf are 0x00 0x00 0x01, otherwise 0
static int FindStartCode2(unsigned char* Buf) {
    if (Buf[0] != 0 || Buf[1] != 0 || Buf[2] != 1) return 0;  // 0x000001?
    else return 1;
}

// Returns 1 if the four bytes at Buf are 0x00 0x00 0x00 0x01, otherwise 0
static int FindStartCode3(unsigned char* Buf) {
    if (Buf[0] != 0 || Buf[1] != 0 || Buf[2] != 0 || Buf[3] != 1) return 0;  // 0x00000001?
    else return 1;
}

int GetAnnexbNALU(NALU_t* nalu) {
    int pos = 0;
    int StartCodeFound, rewind;
    unsigned char* Buf;

    // Allocate max_size (100000) bytes of scratch space
    if ((Buf = (unsigned char*)calloc(nalu->max_size, sizeof(char))) == NULL) {
        printf("GetAnnexbNALU: Could not allocate Buf memory\n");
        return -1;
    }

    // Default start code length is 3
    nalu->startcodeprefix_len = 3;

    // Read the first three bytes. If they cannot be read, the data is incomplete, so return.
    if (3 != fread(Buf, 1, 3, h264bitstream)) {
        free(Buf);
        return 0;
    }
    info2 = FindStartCode2(Buf);
    if (info2 != 1) {
        // The first three bytes are not 0x00 0x00 0x01, so read one more byte
        if (1 != fread(Buf + 3, 1, 1, h264bitstream)) {
            free(Buf);
            return 0;
        }
        // info3 is 1 when the first four bytes are 0x00 0x00 0x00 0x01
        info3 = FindStartCode3(Buf);
        if (info3 != 1) {
            // Neither start code matches: the data is invalid, return directly
            free(Buf);
            return -1;
        } else {
            // Reaching this point means the start code is four bytes long
            pos = 4;
            nalu->startcodeprefix_len = 4;
        }
    } else {
        // Otherwise the start code is three bytes long
        nalu->startcodeprefix_len = 3;
        pos = 3;
    }

    StartCodeFound = 0;
    info2 = 0;
    info3 = 0;

    // Note: pos is never checked against max_size, so a NALU larger than the
    // buffer would overflow Buf
    while (!StartCodeFound) {
        if (feof(h264bitstream)) {
            nalu->len = (pos - 1) - nalu->startcodeprefix_len;
            memcpy(nalu->buf, &Buf[nalu->startcodeprefix_len], nalu->len);
            nalu->forbidden_bit = nalu->buf[0] & 0x80;      // 1 bit
            nalu->nal_reference_idc = nalu->buf[0] & 0x60;  // 2 bit
            nalu->nal_unit_type = (nalu->buf[0]) & 0x1f;    // 5 bit
            free(Buf);
            return pos - 1;
        }
        Buf[pos++] = fgetc(h264bitstream);
        info3 = FindStartCode3(&Buf[pos - 4]);
        if (info3 != 1)
            info2 = FindStartCode2(&Buf[pos - 3]);
        StartCodeFound = (info2 == 1 || info3 == 1);
    }

    // Here, we have found another start code and read length of startcode bytes
    // more than we should have. Hence, go back in the file.
    // If the next start code is 4 bytes, go back 4 positions; if it is 3 bytes, go back 3.
    rewind = (info3 == 1) ? -4 : -3;

    // Move the file pointer h264bitstream back to SEEK_CUR + rewind.
    // fseek returns 0 on success. If the target lies before the start of the file,
    // the file position is left unchanged and a nonzero value is returned.
    if (0 != fseek(h264bitstream, rewind, SEEK_CUR)) {
        free(Buf);
        printf("GetAnnexbNALU: Cannot fseek in the bit stream file");
        return -1;
    }

    // Here the start code, the complete NALU, and the next start code are in Buf.
    // The size of Buf is pos, pos+rewind is the number of bytes excluding the next
    // start code, and (pos+rewind)-startcodeprefix_len is the size of the NALU
    // excluding the start code
    nalu->len = (pos + rewind) - nalu->startcodeprefix_len;
    // Copy the NALU data that follows the start code into nalu->buf
    memcpy(nalu->buf, &Buf[nalu->startcodeprefix_len], nalu->len);
    nalu->forbidden_bit = nalu->buf[0] & 0x80;      // 1 bit
    nalu->nal_reference_idc = nalu->buf[0] & 0x60;  // 2 bit
    nalu->nal_unit_type = (nalu->buf[0]) & 0x1f;    // 5 bit
    free(Buf);

    return (pos + rewind);
}

/**
 * Analyze an H.264 bitstream
 * @param url Location of input H.264 bitstream file.
 */
int simplest_h264_parser(const char* url) {
    NALU_t* n;
    int buffersize = 100000;

    //FILE *myout=fopen("output_log.txt","wb+");
    FILE* myout = stdout;

    // h264bitstream is the global file pointer, initialized to NULL
    h264bitstream = fopen(url, "rb");
    if (h264bitstream == NULL) {
        printf("Open file error\n");
        return 0;
    }

    // Allocate one zero-initialized NALU_t and store its address in n
    n = (NALU_t*)calloc(1, sizeof(NALU_t));
    if (n == NULL) {
        printf("Alloc NALU Error\n");
        fclose(h264bitstream);
        return 0;
    }

    // NAL unit buffer size: 100000
    n->max_size = buffersize;
    // buf is a 100000-byte buffer
    n->buf = (char*)calloc(buffersize, sizeof(char));
    // Bail out if the allocation failed, to avoid using a null pointer
    if (n->buf == NULL) {
        free(n);
        fclose(h264bitstream);
        printf("AllocNALU: n->buf");
        return 0;
    }

    int data_offset = 0;
    int nal_num = 0;
    printf("-----+-------- NALU Table ------+---------+\n");
    printf(" NUM |    POS  |    IDC |  TYPE |   LEN   |\n");
    printf("-----+---------+--------+-------+---------+\n");

    while (!feof(h264bitstream)) {
        int data_lenth;
        data_lenth = GetAnnexbNALU(n);
        // Stop when no further NALU could be read (0) or the data is invalid (-1)
        if (data_lenth <= 0) break;

        char type_str[20] = { 0 };
        switch (n->nal_unit_type) {
        case NALU_TYPE_SLICE:    sprintf(type_str, "SLICE"); break;
        case NALU_TYPE_DPA:      sprintf(type_str, "DPA"); break;
        case NALU_TYPE_DPB:      sprintf(type_str, "DPB"); break;
        case NALU_TYPE_DPC:      sprintf(type_str, "DPC"); break;
        case NALU_TYPE_IDR:      sprintf(type_str, "IDR"); break;
        case NALU_TYPE_SEI:      sprintf(type_str, "SEI"); break;
        case NALU_TYPE_SPS:      sprintf(type_str, "SPS"); break;
        case NALU_TYPE_PPS:      sprintf(type_str, "PPS"); break;
        case NALU_TYPE_AUD:      sprintf(type_str, "AUD"); break;
        case NALU_TYPE_EOSEQ:    sprintf(type_str, "EOSEQ"); break;
        case NALU_TYPE_EOSTREAM: sprintf(type_str, "EOSTREAM"); break;
        case NALU_TYPE_FILL:     sprintf(type_str, "FILL"); break;
        }

        char idc_str[20] = { 0 };
        // 0x60 is 0110 0000 in binary, so shifting right by 5 bits gives the
        // nal_reference_idc value, which ranges from 0 to 3
        switch (n->nal_reference_idc >> 5) {
        case NALU_PRIORITY_DISPOSABLE: sprintf(idc_str, "DISPOS"); break;
        case NALU_PRIORITY_LOW:        sprintf(idc_str, "LOW"); break;
        case NALU_PRIORITY_HIGH:       sprintf(idc_str, "HIGH"); break;
        case NALU_PRIORITY_HIGHEST:    sprintf(idc_str, "HIGHEST"); break;
        }

        // nal_num is the index of this NALU.
        // data_offset is the byte offset of this NALU in the file; it keeps increasing.
        // idc_str is the importance of the NAL unit. The higher the value, the more
        // important it is; when the decoder cannot keep up, it may discard NALUs
        // with importance 0.
        // type_str is the type of the NALU.
        // n->len is set by GetAnnexbNALU through nalu->len. It is the number of bytes
        // occupied by this NALU, not counting the start code.
        fprintf(myout, "%5d| %8d| %7s| %6s| %8d|\n", nal_num, data_offset, idc_str, type_str, n->len);

        data_offset = data_offset + data_lenth;
        nal_num++;
    }

    // Free
    if (n) {
        if (n->buf) {
            free(n->buf);
            n->buf = NULL;
        }
        free(n);
    }
    fclose(h264bitstream);
    return 0;
}

int main() {
    simplest_h264_parser("sintel.h264");
    return 0;
}
The code compiles successfully in Visual Studio 2019.
2. Key explanation
while (!StartCodeFound) {
    if (feof(h264bitstream)) {
        nalu->len = (pos - 1) - nalu->startcodeprefix_len;
        memcpy(nalu->buf, &Buf[nalu->startcodeprefix_len], nalu->len);
        nalu->forbidden_bit = nalu->buf[0] & 0x80;      // 1 bit
        nalu->nal_reference_idc = nalu->buf[0] & 0x60;  // 2 bit
        nalu->nal_unit_type = (nalu->buf[0]) & 0x1f;    // 5 bit
        free(Buf);
        return pos - 1;
    }
    Buf[pos++] = fgetc(h264bitstream);
    info3 = FindStartCode3(&Buf[pos - 4]);
    if (info3 != 1)
        info2 = FindStartCode2(&Buf[pos - 3]);
    StartCodeFound = (info2 == 1 || info3 == 1);
}
This code is the key point, and I did not understand it for a while. There is not much to say about nalu->forbidden_bit, nalu->nal_reference_idc and nalu->nal_unit_type: the NALU header information is stored in the single byte that follows the start code.
if (feof(h264bitstream)) {
This branch is only taken once the end of the file has been reached. It then does for the last NALU what the general case does after the loop: it fills in the NALU header fields and copies the data that was read.
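The pos - 1 in this branch follows from how feof() behaves: the end-of-file flag is only set after a read has already failed, so the loop calls fgetc() one more time than there are bytes left, and that extra result (EOF) is also stored in Buf. The sketch below illustrates this with a temporary file. The function name count_fgetc_calls is hypothetical, and the use of tmpfile() is just a convenient assumption for the demonstration.

```c
#include <assert.h>
#include <stdio.h>

/* Writes n bytes to a temporary file, then reads it back with the same
 * "while (!feof)" pattern as the parser and returns how many times fgetc()
 * was called. Returns -1 if the temporary file cannot be created. */
static int count_fgetc_calls(int n)
{
    FILE *f = tmpfile();
    int pos = 0;
    int i;

    if (f == NULL)
        return -1;
    for (i = 0; i < n; i++)
        fputc('x', f);
    rewind(f);

    while (!feof(f)) {
        /* The final call returns EOF; only after it does feof() become true. */
        (void)fgetc(f);
        pos++;
    }
    fclose(f);
    return pos;
}
```

For a file of n bytes the function returns n + 1, which is why the parser subtracts 1 from pos before computing the length of the last NALU.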
Buf[pos++] = fgetc(h264bitstream);
info3 = FindStartCode3(&Buf[pos - 4]);
if (info3 != 1)
    info2 = FindStartCode2(&Buf[pos - 3]);
StartCodeFound = (info2 == 1 || info3 == 1);
These lines are the key point. In this example the start code is four bytes long, as the earlier analysis shows (see the comments I wrote in the code),
so pos is 4 just before this code runs for the first time.
Buf[0], Buf[1], Buf[2] and Buf[3] already hold the four bytes of the start code, read from the file through the h264bitstream file pointer. Buf[pos++] = fgetc(h264bitstream); then reads the next byte into Buf[4].
This byte is the NALU header.
pos is now 5. The four bytes starting at &Buf[pos - 4], that is Buf[1] to Buf[4], are 0x00 0x00 0x01 followed by the header byte, so they cannot be the start code
0x00 0x00 0x00 0x01. The three bytes starting at &Buf[pos - 3] are not 0x00 0x00 0x01 either, so the exit condition of the while (!StartCodeFound) loop is not met and the loop keeps reading.
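The claim that the loop can never exit right after reading the header byte can be checked mechanically. The sketch below copies the two start-code tests from the parser and replays the first iteration for a given header byte. The wrapper name exits_after_header_byte is hypothetical.

```c
#include <assert.h>

/* Same tests as FindStartCode2 and FindStartCode3 in the parser. */
static int FindStartCode2(const unsigned char *Buf)
{
    return Buf[0] == 0 && Buf[1] == 0 && Buf[2] == 1;
}

static int FindStartCode3(const unsigned char *Buf)
{
    return Buf[0] == 0 && Buf[1] == 0 && Buf[2] == 0 && Buf[3] == 1;
}

/* Replays the first loop iteration after a four-byte start code:
 * Buf[0..3] hold the start code, Buf[4] is the header byte, pos is 5.
 * Returns 1 if the loop would exit, 0 if it would keep reading. */
static int exits_after_header_byte(unsigned char header_byte)
{
    unsigned char Buf[5] = { 0x00, 0x00, 0x00, 0x01, 0x00 };
    int pos = 5;
    int info2 = 0;
    int info3;

    Buf[4] = header_byte;
    info3 = FindStartCode3(&Buf[pos - 4]);  /* checks Buf[1..4] */
    if (info3 != 1)
        info2 = FindStartCode2(&Buf[pos - 3]);  /* checks Buf[2..4] */
    return info2 == 1 || info3 == 1;
}

/* Tries every possible header byte value. */
static int any_header_byte_exits(void)
{
    int b;
    for (b = 0; b < 256; b++) {
        if (exits_after_header_byte((unsigned char)b))
            return 1;
    }
    return 0;
}
```

any_header_byte_exits() returns 0: whatever the header byte is, Buf[3] is 0x01, which rules out both patterns at that position.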
A further worked example
// Here the Start code, the complete NALU, and the next start code is in the Buf.
// The size of Buf is pos, pos+rewind are the number of bytes excluding the next
// start code, and (pos+rewind)-startcodeprefix_len is the size of the NALU excluding the start code
Let's assume that a NALU is 99 bytes long, not counting its start code, and that it is preceded by a four-byte start code. Before entering the while loop,
pos is 4, and Buf[0], Buf[1], Buf[2] and Buf[3] hold the four bytes of the start code. The 99 bytes of the NALU are then read into Buf[4] to Buf[102].
info3 = FindStartCode3(&Buf[pos - 4]);
For info3 to become 1, the loop has to reach the start code in front of the second NALU, that is, pos - 4 must be 103.
By then, Buf[pos++] = fgetc(h264bitstream) has stored Buf[103], Buf[104],
Buf[105] and Buf[106]. Buf[103] is the first byte after the first start code and the first NALU,
and these four bytes are the next start code. pos is now 107.
So, as the code comment says, pos is the number of bytes currently in Buf, and pos+rewind = 103 is the number of bytes excluding the next start code
(rewind is -4 because that start code is four bytes long; see the code for details).
pos+rewind-startcodeprefix_len = 103 - 4 = 99 is the length of the NALU without its start code.
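The arithmetic of this example can be replayed on an in-memory buffer. The following is a minimal sketch, assuming the 99 bytes do not include the start code and that the payload contains no start-code pattern (it is filled with 0xAB here). ScanResult and scan_example are hypothetical names, and pos++ stands in for the fgetc() call of the real loop.

```c
#include <assert.h>
#include <string.h>

enum { START_LEN = 4, NALU_LEN = 99 };

/* Values of interest once the loop has found the next start code. */
typedef struct {
    int pos;       /* number of bytes consumed, including the next start code */
    int rewind;    /* -4 or -3, depending on the length of the next start code */
    int nalu_len;  /* (pos + rewind) - START_LEN */
} ScanResult;

static ScanResult scan_example(void)
{
    static const unsigned char start4[4] = { 0x00, 0x00, 0x00, 0x01 };
    unsigned char Buf[START_LEN + NALU_LEN + START_LEN];
    ScanResult r = { 0, 0, 0 };
    int pos = START_LEN;
    int found = 0;

    /* Layout: start code, 99 payload bytes, next start code. */
    memcpy(Buf, start4, START_LEN);
    memset(Buf + START_LEN, 0xAB, NALU_LEN);
    memcpy(Buf + START_LEN + NALU_LEN, start4, START_LEN);

    while (!found && pos < (int)sizeof(Buf)) {
        pos++;  /* stands in for Buf[pos++] = fgetc(h264bitstream) */
        if (memcmp(&Buf[pos - 4], start4, 4) == 0) {
            found = 1;
            r.rewind = -4;
        } else if (memcmp(&Buf[pos - 3], start4 + 1, 3) == 0) {
            found = 1;
            r.rewind = -3;
        }
    }

    r.pos = pos;
    r.nalu_len = (pos + r.rewind) - START_LEN;
    return r;
}
```

With this layout the next start code occupies Buf[103] to Buf[106], so the scan stops with pos = 107, rewind = -4 and nalu_len = 99.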
3. Reference link
Thank you for reading. The reference links for this article are as follows:
- https://blog.csdn.net/leixiaohua1020/article/details/50534369
- https://www.jianshu.com/p/5ec31394649a
- https://github.com/leixiaohua1020/simplest_mediadata_test