Analysis of ELF format III: sections

Transferred from: https://www.cnblogs.com/jiqingwu/p/elf_explore_3.html

Today we will talk about sections, which are important to both object files (relocatable files) and executable files.

When we discussed the ELF header, we mentioned the section header table. It is an array of section headers, each of which is a structure describing one section. Within a single ELF file, every section header has the same size. (As the source code shows, all section headers in a 32-bit ELF file share one size, and all section headers in a 64-bit ELF file share another.)

Every section has a section header that describes it, but a section header does not necessarily have a corresponding section in the file, because some sections occupy no file space. Each section is a contiguous sequence of bytes in the file, and sections do not overlap.

An object file may contain space that nothing covers: the headers and sections together do not necessarily account for every byte of the file. The contents of those uncovered bytes are unspecified.

Section header definition

The section header structure is defined in /usr/include/elf.h.

/* Section header.  */

typedef struct
{
  Elf32_Word	sh_name;		/* Section name (string tbl index) */
  Elf32_Word	sh_type;		/* Section type */
  Elf32_Word	sh_flags;		/* Section flags */
  Elf32_Addr	sh_addr;		/* Section virtual addr at execution */
  Elf32_Off	sh_offset;		/* Section file offset */
  Elf32_Word	sh_size;		/* Section size in bytes */
  Elf32_Word	sh_link;		/* Link to another section */
  Elf32_Word	sh_info;		/* Additional section information */
  Elf32_Word	sh_addralign;		/* Section alignment */
  Elf32_Word	sh_entsize;		/* Entry size if section holds table */
} Elf32_Shdr;

typedef struct
{
  Elf64_Word	sh_name;		/* Section name (string tbl index) */
  Elf64_Word	sh_type;		/* Section type */
  Elf64_Xword	sh_flags;		/* Section flags */
  Elf64_Addr	sh_addr;		/* Section virtual addr at execution */
  Elf64_Off	sh_offset;		/* Section file offset */
  Elf64_Xword	sh_size;		/* Section size in bytes */
  Elf64_Word	sh_link;		/* Link to another section */
  Elf64_Word	sh_info;		/* Additional section information */
  Elf64_Xword	sh_addralign;		/* Section alignment */
  Elf64_Xword	sh_entsize;		/* Entry size if section holds table */
} Elf64_Shdr;

Let's explain the fields of the structure in turn:

  1. sh_name, 4 bytes, is an index into .shstrtab (the section header string table, a string table that holds section names and is itself a section). As described in part II, the ELF header has a dedicated field, e_shstrndx, which gives the index of the section header for .shstrtab within the section header table.

  2. sh_type, 4 bytes, describes the type of section. Common values are as follows:

    • SHT_NULL 0, marks the section header as inactive; it has no associated section.
    • SHT_PROGBITS 1, the section holds data defined by the program. Its format and meaning are determined by the program.
    • SHT_SYMTAB 2, the section holds a symbol table. Currently an ELF file may have only one section of this type. SHT_SYMTAB provides symbols for the link editor, although they may also be used for dynamic linking. It is the complete symbol table, so it may contain many symbols that dynamic linking does not need.
    • SHT_STRTAB 3, the section holds a string table. An object file may contain several string tables, such as .strtab (symbol names) and .shstrtab (section names).
    • SHT_RELA 4, relocation section holding relocation entries, see Elf32_Rela. A file may have several relocation sections, for example .rela.text and .rela.dyn.
    • SHT_HASH 5, the section holds a symbol hash table. Object files that take part in dynamic linking must contain one. Currently an ELF file may have only one hash table. It will be covered in detail when we discuss linking.
    • SHT_DYNAMIC 6, holds dynamic linking information. Currently an ELF file may have only one dynamic section.
    • SHT_NOTE 7, note section, holds information that marks the file in some way; it will be described in detail later.
    • SHT_NOBITS 8, the section contains no bytes and occupies no file space. The sh_offset field of its section header is only a conceptual offset.
    • SHT_REL 9, relocation section holding relocation entries. It is basically the same as SHT_RELA; the differences between the two will be discussed when we talk about relocation.
    • SHT_SHLIB 10, reserved, semantics unspecified. An ELF file containing a section of this type does not conform to the ABI.
    • SHT_DYNSYM 11, holds a minimal set of symbols for dynamic linking, a subset of the symbols in .symtab.
    • SHT_LOPROC 0x70000000 to SHT_HIPROC 0x7fffffff, reserved for processor-specific semantics.
    • SHT_LOUSER 0x80000000 and SHT_HIUSER 0xffffffff give the lower and upper bounds of the range reserved for application programs. Section types in this range may be used by applications.
  3. sh_flags, 4 bytes in 32-bit files and 8 bytes in 64-bit files, holds one-bit flags. The flags of every section can be inspected with readelf -S <elf>. Common ones are:

    • SHF_WRITE 0x1, the data in the section is writable while the process executes.
    • SHF_ALLOC 0x2, the section occupies memory while the process executes.
    • SHF_EXECINSTR 0x4, the section contains executable machine instructions.
    • SHF_STRINGS 0x20, the section consists of null-terminated strings.
    • SHF_MASKOS 0x0ff00000, all bits in this mask (8 bits) are reserved for OS-specific semantics.
    • SHF_MASKPROC 0xf0000000, all bits in this mask (the upper 4 bits of the highest byte) are reserved for processor-specific semantics.
  4. sh_addr, 4 bytes in 32-bit files and 8 bytes in 64-bit files. If the section appears in the memory image of the process, this is the virtual address of the first byte of the section; otherwise it is 0.

  5. sh_offset, 4 bytes in 32-bit files and 8 bytes in 64-bit files, is the byte offset of the section from the beginning of the file. For a section that occupies no file space (such as SHT_NOBITS), sh_offset only gives the conceptual position of the section.

  6. sh_size is the size of the section in bytes. For a section of type SHT_NOBITS, sh_size may be non-zero, but the section still occupies no space in the file.

  7. sh_link, which contains the index of a section header. The interpretation of this value depends on the section type.

    • For SHT_DYNAMIC, sh_link is the section header index of the string table used by the entries in the section.
    • For SHT_HASH, sh_link is the section header index of the symbol table to which the hash table applies.
    • For relocation sections, SHT_REL or SHT_RELA, sh_link is the section header index of the associated symbol table.
    • For SHT_SYMTAB or SHT_DYNSYM, sh_link is the section header index of the associated string table, which holds the symbol names.
    • For other section types, sh_link is SHN_UNDEF.
  8. sh_info, which stores additional information. The interpretation of the value depends on the section type.

    • For relocation sections of type SHT_REL or SHT_RELA, sh_info is the section header index of the section to which the relocations apply.
    • For SHT_SYMTAB and SHT_DYNSYM, sh_info is the index of the first non-local symbol in the symbol table. Local symbols come first and non-local symbols follow them, which is why the specification also describes sh_info as one greater than the index of the last local symbol.
    • For other section types, sh_info is 0.
  9. sh_addralign, address alignment. For example, if a section holds a doubleword, the system must load the whole section at a doubleword-aligned memory address. That is, sh_addr must be an integer multiple of sh_addralign. Only 0 and positive integral powers of 2 are valid; 0 and 1 mean that the section has no alignment constraint.

  10. sh_entsize, some sections contain fixed size records, such as symbol tables. This value gives the size of each record. For sections that do not contain fixed size records, this value is 0.

Predefined section names

The system predefines some section names (those beginning with a dot). These sections have specific types and meanings.

  • .bss: holds uninitialized data (global and static variables). The data is initialized to 0 when the program starts running. Its type is SHT_NOBITS, so it occupies no file space. Flags SHF_ALLOC + SHF_WRITE (occupies memory at run time, writable).
  • .comment: holds version control information. (Does it hold the comments of the program source? No, those are removed during preprocessing.) Type SHT_PROGBITS.
  • .data and .data1: hold initialized global and static variables. Type SHT_PROGBITS, flags SHF_ALLOC + SHF_WRITE (occupies memory, writable).
  • .debug: holds information for symbolic debugging, which is needed to debug programs with tools such as gdb. Type SHT_PROGBITS.
  • .dynamic: type SHT_DYNAMIC, holds dynamic linking information. Flags include SHF_ALLOC; whether SHF_WRITE is set is processor-specific.
  • .dynstr: type SHT_STRTAB, holds the strings needed for dynamic linking, usually the names associated with symbol table entries. Flag SHF_ALLOC.
  • .dynsym: type SHT_DYNSYM, holds the dynamic linking symbol table. Flag SHF_ALLOC.
  • .fini: type SHT_PROGBITS, holds instructions that are executed when the program exits normally. Flags SHF_ALLOC + SHF_EXECINSTR (occupies memory, executable). ELF files nowadays also contain a .fini_array section.
  • .got: type SHT_PROGBITS, the global offset table, which will be covered in depth later.
  • .hash: type SHT_HASH, holds the symbol hash table, which will be described in detail later. Flag SHF_ALLOC.
  • .init: type SHT_PROGBITS, holds code that is executed first when the program starts. Flags SHF_ALLOC + SHF_EXECINSTR. It is the counterpart of .fini. ELF files nowadays also contain a .init_array section.
  • .interp: type SHT_PROGBITS, holds a string giving the path name of the program interpreter. If the file has a loadable segment that includes the section, its flags include SHF_ALLOC; otherwise they do not.
  • .line: type SHT_PROGBITS, holds line number information for symbolic debugging, which describes the correspondence between the source program and the machine code. Debuggers such as gdb need this information.
  • .note: note section, type SHT_NOTE. It will be discussed separately later.
  • .plt: procedure linkage table, type SHT_PROGBITS, which will be covered in depth later.
  • .relNAME: type SHT_REL, holds relocation information. If the file has a loadable segment that includes the section, its flags include SHF_ALLOC; otherwise they do not. NAME is the name of the section to which the relocations apply; for example, the relocation information for .text is stored in .rel.text.
  • .relaNAME: type SHT_RELA, otherwise the same as .relNAME. The difference between SHT_RELA and SHT_REL will be explained when we talk about relocation.
  • .rodata and .rodata1: type SHT_PROGBITS, hold read-only data that ends up in a non-writable segment. Flag SHF_ALLOC.
  • .shstrtab: type SHT_STRTAB, holds the section names. Some readers may ask: isn't the name already in the section header, so why store the names here? In fact sh_name only holds an index into .shstrtab; the actual string is stored in .shstrtab. Then why store section names in one central place? My understanding is this: identical strings can share one piece of storage, and a string that is the tail of another string can share its storage too.
  • .strtab: type SHT_STRTAB, holds strings, usually the names associated with symbol table entries. If the file has a loadable segment that includes the section, its flags include SHF_ALLOC. Every string ends with \0, and the section itself begins with \0 and ends with \0. A string table may be empty, in which case its sh_size is 0. Non-zero indexes are invalid for an empty string table.
  • .symtab: type SHT_SYMTAB, the symbol table. It holds the information needed to locate and relocate symbol definitions and references. The symbol table is an array; the entry at index 0 is reserved and stands for the undefined symbol index, STN_UNDEF. If the file has a loadable segment that includes the section, its flags include SHF_ALLOC.

Exercise: reading section names

From this point on, there will be exercises to facilitate the comprehensive application of the previous theoretical knowledge.

The goal of this exercise is to read the string table storing section name from an ELF file. As mentioned earlier, the string table is also a section. There is a corresponding section header in the section header table, and the index of the section header corresponding to the section name string table is given in the ELF file header, e_shstrndx.

Our idea is as follows:

  1. Read the starting position of the section header table, the size of each section header, and the index of the section header corresponding to the section name string table from the ELF header.
  2. Calculate section_header_table_offset + section_header_size * e_shstrndx, which is the offset of the section header corresponding to the section name string table.
  3. Read the section header to get the offset and size of the section name string table in the file.
  4. Read the section name string table into memory and print its contents.

The code is as follows:

/* Read the section name string table of a 64-bit ELF file */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    /* Open the local ELF executable hello */
    FILE *fp = fopen("./hello", "rb");
    if(!fp) {
        perror("open ELF file");
        exit(1);
    }

    /* 1. Get the offset of section header table by reading ELF header */
    /* for 64 bit ELF,
       e_ident(16) + e_type(2) + e_machine(2) +
       e_version(4) + e_entry(8) + e_phoff(8) = 40 */
    fseek(fp, 40, SEEK_SET);
    uint64_t sh_off;
    /* Reading raw bytes assumes the file and the host share a byte order */
    size_t r = fread(&sh_off, 1, 8, fp);
    if (r != 8) {
        perror("read section header offset");
        exit(2);
    }
    /* The obtained offset value can be verified with 'readelf -h hello' */
    printf("section header offset in file: %llu (0x%llx)\n",
           (unsigned long long)sh_off, (unsigned long long)sh_off);

    /* 2. Read the size of each section header (e_shentsize),
       the number of section headers (e_shnum),
       and the index of the section header corresponding to the
       section name string table (e_shstrndx).
       After obtaining these values, you can use 'readelf -h hello' to verify whether they are correct */
    /* Skip e_flags(4) + e_ehsize(2) + e_phentsize(2) + e_phnum(2) = 10 */
    fseek(fp, 10, SEEK_CUR);
    uint16_t sh_ent_size;            /* Size of each section header */
    r = fread(&sh_ent_size, 1, 2, fp);
    if (r != 2) {
        perror("read section header entry size");
        exit(2);
    }
    printf("section header entry size: %d\n", sh_ent_size);

    uint16_t sh_num;            /* Number of section headers */
    r = fread(&sh_num, 1, 2, fp);
    if (r != 2) {
        perror("read section header number");
        exit(2);
    }
    printf("section header number: %d\n", sh_num);

    uint16_t sh_strtab_index;   /* The index of the section header corresponding to the section name string table */
    r = fread(&sh_strtab_index, 1, 2, fp);
    if (r != 2) {
        perror("read section header string table index");
        exit(2);
    }
    printf("section header string table index: %d\n", sh_strtab_index);

    /* 3. read section name string table offset, size */
    /* First find the offset position of the section header corresponding to the section header string table */
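    /* Multiply in 64 bits so the product of two 16-bit values cannot overflow an int */
    fseek(fp, (long)(sh_off + (uint64_t)sh_strtab_index * sh_ent_size), SEEK_SET);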
    /* Then find the offset of the section header string table from the section header */
    /* sh_name(4) + sh_type(4) + sh_flags(8) + sh_addr(8) = 24 */
    fseek(fp, 24, SEEK_CUR);
    uint64_t str_table_off;
    r = fread(&str_table_off, 1, 8, fp);
    if (r != 8) {
        perror("read section name string table offset");
        exit(2);
    }
    printf("section name string table offset: %llu\n", (unsigned long long)str_table_off);

    /* Find the size of the section header string table from the section header */
    uint64_t str_table_size;
    r = fread(&str_table_size, 1, 8, fp);
    if (r != 8) {
        perror("read section name string table size");
        exit(2);
    }
    printf("section name string table size: %llu\n", (unsigned long long)str_table_size);

    /* Dynamically allocate memory and read the section header string table into memory */
    char *buf = (char *)malloc(str_table_size);
    if(!buf) {
        perror("allocate memory for section name string table");
        exit(3);
    }
    fseek(fp, str_table_off, SEEK_SET);
    r = fread(buf, 1, str_table_size, fp);
    if(r != str_table_size) {
        perror("read section name string table");
        free(buf);
        exit(2);
    }
    uint64_t i;   /* same width as str_table_size, so the loop cannot wrap around */
    for(i = 0; i < str_table_size; ++i) {
        /* If the byte in the section header string table is 0, print `\0` */
        if (buf[i] == 0)
            printf("\\0");
        else
            printf("%c", buf[i]);
    }
    printf("\n");
    free(buf);
    fclose(fp);
    return 0;
}

Save the above code as chap3_read_section_names.c and compile it with gcc -Wall -o secnames chap3_read_section_names.c; the resulting executable is called secnames. With a 64-bit ELF executable named hello in the current directory, running ./secnames produces the following output:

./secnames
section header offset in file: 14768 (0x39b0)
section header entry size: 64
section header number: 29
section header string table index: 28
section name string table offset: 14502
section name string table size: 259
\0.symtab\0.strtab\0.shstrtab\0.interp\0.note.ABI-tag\0.note.gnu.build-id\0.gnu.hash\0.dynsym\0.dynstr\0.gnu.version\0.gnu.version_r\0.rela.dyn\0.rela.plt\0.init\0.text\0.fini\0.rodata\0.eh_frame_hdr\0.eh_frame\0.init_array\0.fini_array\0.dynamic\0.got\0.got.plt\0.data\0.bss\0.comment\0

As the output shows, the section header string table starts with \0 and ends with \0. If the sh_name field of a section is 0, it points at that leading zero byte, so the section has no name (or, equivalently, its name is the empty string).

Summary

This chapter explained the definition of the section header and the meaning and possible values of each field, then introduced some predefined section names. Finally, we used the knowledge from Chapter 2 and Chapter 3 in an exercise that reads the section names.

In the next chapter, we will talk about the symbol table and the principles of relocation. This series of articles will also be published on the WeChat official account "wings of joy".

Added by pennythetuff on Fri, 04 Feb 2022 12:54:48 +0200