Operating system series: object files explained in detail

Topic of this post:
Object files explained in detail

1. Definition and classification of object files

Object file definition:

The file that the compiler generates from source code is called an object file.
Structurally, an object file has not been linked yet, so some symbols and addresses may not have been adjusted (relocated).

Today the main executable file formats on PCs are PE (Portable Executable) on Windows and ELF (Executable and Linkable Format) on Linux. An object file is the intermediate file produced when source code has been compiled but not yet linked, such as .obj files on Windows and .o files on Linux. Its content and structure are almost the same as those of an executable file. In a broad sense, therefore, object files and executable files can be regarded as one kind of file, which on Linux is collectively called the ELF file format.
ELF files can accordingly be divided into the following categories:

ELF file type | Description | Example
Relocatable file | Compiled code that can be linked to produce an executable file | .o files on Linux; .obj files on Windows
Executable file | A program that can be run directly; the typical ELF file | /bin/bash on Linux; .exe files on Windows
Shared object file | A dynamically linkable file; it can be linked with other relocatable files and shared object files to produce a new object file | .so files on Linux; DLL files on Windows
Core dump file | A file that stores process information. When a process terminates unexpectedly, the system saves the process's address space and some state at the time of termination to this file | core dump files on Linux

2. What does an object file look like?

1. An intuitive look at the object file

Write a simple test program, main.c:

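#include <stdio.h>

int global_init_var = 100;      /* initialized global: .data */
int global_uninit_var;          /* uninitialized global: .bss, or COMMON as in the dump below */

int main(void)
{
        static int static_var = 200;    /* initialized local static: .data */
        static int static_var2;         /* uninitialized local static: .bss */

        int a = 1;      /* lives on the stack; the instruction that stores 1 is in .text */
        int b;          /* lives on the stack; takes no space in the object file */

        return 0;
}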
jason@ubuntu:~/WorkSpace/3.OS_study/1.object_file$ gcc -c main.c 
jason@ubuntu:~/WorkSpace/3.OS_study/1.object_file$ file main.o 
main.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

You can see that main.o is a relocatable file. An object file stores its information in sections. Its layout can be summarized simply as follows:

  • File header: an ELF file begins with a file header that describes the attributes of the whole file and locates the section header table, which in turn describes each section
  • .text section: the compiled statements, translated into machine code
  • .data section: initialized global variables and local static variables
  • .bss section: uninitialized global variables and local static variables

2. Using ELF analysis tools

Common tools include objdump and readelf

objdump:
Usage: objdump <option(s)> <file(s)>
Display information from object <file(s)>.
At least one of the following options must be given:
-a, --archive-headers Display archive header information
-f, --file-headers Display the contents of the overall file header
-p, --private-headers Display object format specific file header contents
-P, --private=OPT,OPT... Display object format specific contents
-h, --[section-]headers Display the contents of the section headers
-x, --all-headers Display the contents of all headers
-d, --disassemble Display assembler contents of executable sections
-D, --disassemble-all Display assembler contents of all sections
--disassemble=<sym> Display assembler contents from <sym>
-S, --source Intermix source code with disassembly
--source-comment[=<txt>] Prefix lines of source code with <txt>
-s, --full-contents Display the full contents of all sections requested
-g, --debugging Display debug information in object file
-e, --debugging-tags Display debug information using ctags style
-G, --stabs Display (in raw form) any STABS info in the file
-W[lLiaprmfFsoRtUuTgAckK] or
--dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
        =frames-interp,=str,=loc,=Ranges,=pubtypes,
        =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
        =addr,=cu_index,=links,=follow-links]
        Display DWARF info in the file
--ctf=SECTION Display CTF info from SECTION
-t, --syms Display the contents of the symbol table(s)
-T, --dynamic-syms Display the contents of the dynamic symbol table
-r, --reloc Display the relocation entries in the file
-R, --dynamic-reloc Display the dynamic relocation entries in the file
@<file> Read options from <file>
-v, --version Display this program's version number
-i, --info List object formats and architectures supported
-H, --help Display this information

objdump -x: display all header information of the ELF file
objdump -h: display the section header information of the ELF file
objdump -s: display the full contents of each section in hexadecimal (combine with -d to also see the disassembly)

jason@ubuntu:~/WorkSpace/3.OS_study/1.object_file$ objdump -x main.o

main.o:      file format elf64-x86-64
main.o
architecture: i386:x86-64, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x0000000000000000

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000016  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000008  0000000000000000  0000000000000000  00000058  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000004  0000000000000000  0000000000000000  00000060  2**2
                  ALLOC
  3 .comment      0000002b  0000000000000000  0000000000000000  00000060  2**0
                  CONTENTS, READONLY
  4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  0000008b  2**0
                  CONTENTS, READONLY
  5 .note.gnu.property 00000020  0000000000000000  0000000000000000  00000090  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .eh_frame     00000038  0000000000000000  0000000000000000  000000b0  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l    df *ABS*	0000000000000000 main.c
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .data	0000000000000000 .data
0000000000000000 l    d  .bss	0000000000000000 .bss
0000000000000000 l     O .bss	0000000000000004 static_var2.2319
0000000000000004 l     O .data	0000000000000004 static_var.2318
0000000000000000 l    d  .note.GNU-stack	0000000000000000 .note.GNU-stack
0000000000000000 l    d  .note.gnu.property	0000000000000000 .note.gnu.property
0000000000000000 l    d  .eh_frame	0000000000000000 .eh_frame
0000000000000000 l    d  .comment	0000000000000000 .comment
0000000000000000 g     O .data	0000000000000004 global_init_var
0000000000000004       O *COM*	0000000000000004 global_uninit_var
0000000000000000 g     F .text	0000000000000016 main


RELOCATION RECORDS FOR [.eh_frame]:
OFFSET           TYPE              VALUE 
0000000000000020 R_X86_64_PC32     .text

jason@ubuntu:~/WorkSpace/3.OS_study/1.object_file$ objdump -h main.o 

main.o:      file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000016  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000008  0000000000000000  0000000000000000  00000058  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000004  0000000000000000  0000000000000000  00000060  2**2
                  ALLOC
  3 .comment      0000002b  0000000000000000  0000000000000000  00000060  2**0
                  CONTENTS, READONLY
  4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  0000008b  2**0
                  CONTENTS, READONLY
  5 .note.gnu.property 00000020  0000000000000000  0000000000000000  00000090  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .eh_frame     00000038  0000000000000000  0000000000000000  000000b0  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

In the dump above, the columns to focus on are "Size" and "File off". Together they give each section's position and length inside main.o, which is enough to sketch the rough structure of the ELF file.

3. Analysis of each section

Now look at what objdump -s -d actually dumps for main.o.

1. .text section (code)
In the disassembly of the code, the hexadecimal bytes in the section contents correspond to the disassembled instructions.
2. .data section
The .data section holds the initialized global variables and local static variables.
For example, global_init_var = 100 corresponds to the hexadecimal value 0x64, which can be found in the hexadecimal contents of the .data section.
3. .bss section
The .bss section holds uninitialized global variables and local static variables.
Question:

The earlier dump shows that the .bss section is only 4 bytes, yet the code contains two uninitialized variables, global_uninit_var and static_var2. Shouldn't they take up 8 bytes?

So we analyze this problem through the symbol table.
The SYMBOL TABLE part of the objdump -x output shown earlier gives the answer:

We find that static_var2 is in the .bss section, but global_uninit_var is in *COM* (COMMON). This involves how linking works, which I won't go into in detail here, but a preliminary conclusion is:

  • A static variable that is visible only inside the compilation unit is placed in the .bss section if it is not initialized (such as static_var2 in the example above)
  • An uninitialized global variable is treated like a weak symbol, because its final definition is decided at link time, so it is recorded under the COMMON type instead

Keywords: Linux Operation & Maintenance bash

Added by triphis on Wed, 29 Dec 2021 00:22:59 +0200