Preface
This article builds on the earlier article "Detail the process and decoupling from code to executable"; make sure you are familiar with it before reading on. The example program is still gemfield.c. The code is as follows:
#include <stdio.h>

int main(int argc, char *argv[])
{
    int a = 0;
    char gemfield[32];

    printf("input gemfield's blog: ");
    scanf("%31s", gemfield);    /* the width limit keeps the input inside gemfield[32] */
    printf("gemfield's blog is %s\n", gemfield);
    return 0;
}
Step 1: compile.
gcc gemfield.c -o gemfield
Step 2: run.
./gemfield &
Output:
[1] 5500
input gemfield's blog:
Step 3: ps command
ps -e | grep gemfield
Output:
5500 00:00:00 gemfield
This shows that a gemfield process with process ID 5500 has been created.
Step 4: check the command line of the gemfield process by switching to its directory under /proc:
cd /proc/5500
cat cmdline
Output:
./gemfield
These are the arguments the program was started with.
Step 5: check the environment variables of the gemfield process by switching to its directory under /proc:
cd /proc/5500 cat environ Output: XDG_SESSION_ID=20468HOSTNAME=iZwz94wr80gpxpbjwp3i3tZTERM=xtermSHELL=/bin/bashHISTSIZE=100 0SSH_CLIENT=27.38.242.219 23432 22SSH_TTY=/dev/pts/0USER=rootLS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do= 01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30 ;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;3 1:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*. t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*. lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01; 31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*. ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=0 1;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35: *.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.p cx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm= 01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35 :*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl =01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35 :*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01; 36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:MAIL=/var/spool/mail/rootPATH=/usr/local/clang+ll vm-9.0.0-x86_64-linux-sles11.3/bin:/usr/local/clang+llvm-9.0.0-x86_64-linux- sles11.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/lib/jvm/java-1.8.0- openjdk-1.8.0.252.b09- 2.el7_8.x86_64/bin:/root/bin:/root/bin:/usr/local/python3/bin:/opt/rh/devtoolset- 
9/root/bin:/root/bin:/root/bin:/usr/local/python3/binPWD=/homeJAVA_HOME=/usr/lib/jvm/java -1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64LANG=en_US.UTF- 8HISTCONTROL=ignoredupsSHLVL=1HOME=/rootLOGNAME=rootCLASSPATH=.:/usr/lib/jvm/java-1.8.0- openjdk-1.8.0.252.b09-2.el7_8.x86_64/jre/lib/rt.jar:/usr/lib/jvm/java-1.8.0-openjdk- 1.8.0.252.b09-2.el7_8.x86_64/lib/dt.jar:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09- 2.el7_8.x86_64/lib/tools.jarSSH_CONNECTION=27.38.242.219 23432 172.18.226.139 22LESSOPEN=||/usr/bin/lesspipe.sh %sXDG_RUNTIME_DIR=/run/user/0CMAKE_ROOT=/home/cmake- 3.15.4-Linux-x86_64_=./gemfieldOLDPWD=/root[root@iZ
Step 6: check the files used by the gemfield process by switching to its directory under /proc:
cd /proc/5500
ls fd
Output:
0 1 2
This shows that the gemfield process currently uses only standard input, standard output and standard error.
Step 7: check the I/O statistics of the gemfield process by switching to its directory under /proc:
cd /proc/5500
cat io
Output:
rchar: 2012
wchar: 25
syscr: 8
syscw: 1
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0
Step 8: check the memory mappings of the gemfield process by switching to its directory under /proc:
cd /proc/5500
cat maps
Output:
00400000-00401000 r-xp 00000000 fd:01 1060081 /home/gemfield
00600000-00601000 r--p 00000000 fd:01 1060081 /home/gemfield
00601000-00602000 rw-p 00001000 fd:01 1060081 /home/gemfield
7f77c3ff2000-7f77c41b4000 r-xp 00000000 fd:01 658657 /usr/lib64/libc-2.17.so
7f77c41b4000-7f77c43b4000 ---p 001c2000 fd:01 658657 /usr/lib64/libc-2.17.so
7f77c43b4000-7f77c43b8000 r--p 001c2000 fd:01 658657 /usr/lib64/libc-2.17.so
7f77c43b8000-7f77c43ba000 rw-p 001c6000 fd:01 658657 /usr/lib64/libc-2.17.so
7f77c43ba000-7f77c43bf000 rw-p 00000000 00:00 0
7f77c43bf000-7f77c43e1000 r-xp 00000000 fd:01 658372 /usr/lib64/ld-2.17.so
7f77c45d0000-7f77c45d3000 rw-p 00000000 00:00 0
7f77c45dd000-7f77c45e0000 rw-p 00000000 00:00 0
7f77c45e0000-7f77c45e1000 r--p 00021000 fd:01 658372 /usr/lib64/ld-2.17.so
7f77c45e1000-7f77c45e2000 rw-p 00022000 fd:01 658372 /usr/lib64/ld-2.17.so
7f77c45e2000-7f77c45e3000 rw-p 00000000 00:00 0
7ffc60a37000-7ffc60a58000 rw-p 00000000 00:00 0 [stack]
7ffc60b7b000-7ffc60b7d000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Step 9: check the status of the gemfield process by switching to its directory under /proc:
cd /proc/5500
cat status
Output:
Name: gemfield
Umask: 0022
State: T (stopped)
Tgid: 5500
Ngid: 0
Pid: 5500
PPid: 5286
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 256
Groups: 0
VmPeak: 4256 kB
VmSize: 4216 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 356 kB
VmRSS: 356 kB
RssAnon: 76 kB
RssFile: 280 kB
RssShmem: 0 kB
VmData: 56 kB
VmStk: 132 kB
VmExe: 4 kB
VmLib: 1936 kB
VmPTE: 28 kB
VmSwap: 0 kB
Threads: 1
SigQ: 0/15076
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000001fffffffff
CapEff: 0000001fffffffff
CapBnd: 0000001fffffffff
CapAmb: 0000000000000000
Seccomp: 0
Speculation_Store_Bypass: vulnerable
Cpus_allowed: 1
Cpus_allowed_list: 0
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 1
nonvoluntary_ctxt_switches: 0
Step 10: check the scheduling information of the gemfield process by switching to its directory under /proc:
cd /proc/5500
cat sched
Reading this file invokes the proc_sched_show_task() function defined in kernel/sched_debug.c.
Output:
gemfield (5500, #threads: 1)
-------------------------------------------------------------------
se.exec_start : 8106596666.907056
se.vruntime : 118833600.764166
se.sum_exec_runtime : 0.696281
se.nr_migrations : 0
nr_switches : 1
nr_voluntary_switches : 1
nr_involuntary_switches : 0
se.load.weight : 1024
policy : 0
prio : 120
clock-delta : 41
mm->numa_scan_seq : 0
numa_migrations, 0
numa_faults_memory, 0, 0, 1, 0, -1
numa_faults_memory, 1, 0, 0, 0, -1
Step 11: a brief note
We can also view the partition (mount) information used by the gemfield process through /proc/mount*, and the network devices it uses through /proc/net/*.
Let's set the steps aside for a moment. What is a process?
A process is an instance of a program running on a computer system. It manages the various resources the system allocates to it, such as:
- A memory image of the program's runnable machine code;
- Allocated memory, including the runnable code, process-specific data (input and output), and the heap and stack (used to hold data generated at run time);
- Operating system descriptors for resources allocated to the process, such as file descriptors (Unix terminology) or file handles (Windows), and data sources and sinks;
- Security attributes, such as the process owner and the process's permission set (the operations it is allowed to perform);
- Processor state (context), such as register contents. While a process is running, this state is usually held in the CPU registers; otherwise it is saved in memory.
This is why the article spent so much space on the contents of /proc/<pid>: to build an intuitive impression first. Now let's go into the Linux kernel to look at processes. In the kernel, a process is maintained by a structure called task_struct.
Step 12: analyze the task_struct structure.
Let's guess what this structure should contain. Judging from the previous eleven steps, it should have the following:
1. The process ID and similar identifiers;
2. The state of the process: whether it is waiting, running, or stopped;
3. The process family: like a vampire family, it has ancestors, descendants and zombies;
4. The memory mappings of the process;
5. Time slice information for CPU scheduling;
6. The file descriptors in use;
7. The processor register context.
Now let's look at the actual definition of task_struct in include/linux/sched.h:
struct task_struct {
    volatile long state;                // run state: -1 unrunnable, 0 runnable, >0 stopped
    unsigned long flags;                // process flags: being created, about to exit, forked but not yet exec'ed, killed by a signal from another process, and so on
    int sigpending;                     // whether the process has pending signals
    mm_segment_t addr_limit;
    volatile long need_resched;         // scheduling flag: whether the process needs to be rescheduled
    int lock_depth;                     // lock depth
    long nice;                          // determines the base time slice of the process
    unsigned long policy;               // scheduling policy: SCHED_FIFO and SCHED_RR for real-time processes, SCHED_OTHER for time-sharing processes
    struct mm_struct *mm;               // memory information of the process
    int processor;
    unsigned long cpus_runnable, cpus_allowed; // cpus_runnable is ~0 if the process is not running on any CPU and (1 << cpu) if it is; updated under the runqueue lock
    struct list_head run_list;          // links the process into the run queue
    unsigned long sleep_time;           // sleep time of the process
    struct task_struct *next_task, *prev_task; // link all processes in the system into a circular doubly linked list rooted at init_task
    struct mm_struct *active_mm;
    struct list_head local_pages;       // list of local pages
    unsigned int allocation_order, nr_local_pages;
    struct linux_binfmt *binfmt;        // format of the executable the process is running
    int exit_code, exit_signal;
    int pdeath_signal;                  // signal sent to this process when its parent dies
    unsigned long personality;
    int did_exec:1;
    pid_t pid;                          // process ID
    pid_t pgrp;                         // ID of the process group the process belongs to
    pid_t tty_old_pgrp;                 // process group ID of the controlling terminal
    pid_t session;                      // session ID of the process
    pid_t tgid;                         // thread group ID
    int leader;                         // whether the process is a session leader
    struct task_struct *p_opptr, *p_pptr, *p_cptr, *p_ysptr, *p_osptr; // family links: original parent, parent, youngest child, younger sibling, older sibling
    struct list_head thread_group;      // thread list
    struct task_struct *pidhash_next;   // chains the process into the PID hash table
    struct task_struct **pidhash_pprev;
    wait_queue_head_t wait_chldexit;    // for wait4()
    struct completion *vfork_done;      // for vfork()
    unsigned long rt_priority;          // real-time priority, used to compute the scheduling weight of a real-time process
    unsigned long it_real_value, it_prof_value, it_virt_value;
    unsigned long it_real_incr, it_prof_incr, it_virt_incr;
    struct timer_list real_timer;       // real-time interval timer
    struct tms times;                   // time consumed by the process
    unsigned long start_time;           // time the process was created
    long per_cpu_utime[NR_CPUS], per_cpu_stime[NR_CPUS];
    unsigned long min_flt, maj_flt, nswap, cmin_flt, cmaj_flt, cnswap; // page fault and swap counters, explained after this structure
    int swappable:1;                    // whether the virtual address space of the process may be swapped out
    uid_t uid, euid, suid, fsuid;
    gid_t gid, egid, sgid, fsgid;
    int ngroups;                        // number of groups the process belongs to
    gid_t groups[NGROUPS];              // the groups the process belongs to
    kernel_cap_t cap_effective, cap_inheritable, cap_permitted;
    int keep_capabilities:1;
    struct user_struct *user;
    struct rlimit rlim[RLIM_NLIMITS];   // resource limits of the process
    unsigned short used_math;           // whether the FPU has been used
    char comm[16];                      // name of the executable the process is running
    int link_count, total_link_count;
    struct tty_struct *tty;             // NULL if no tty
    unsigned int locks;
    struct sem_undo *semundo;           // all undo operations of the process on semaphores
    struct sem_queue *semsleeping;      // when the process sleeps on a semaphore operation, records the pending operation in this queue
    struct thread_struct thread;        // saved CPU register context, see thread_struct after this structure
    struct fs_struct *fs;               // file system information
    struct files_struct *files;         // open file information
    spinlock_t sigmask_lock;            // protects sig and blocked
    struct signal_struct *sig;          // signal handlers
    sigset_t blocked;                   // signals the process currently blocks, one bit per signal
    struct sigpending pending;          // pending signals of the process
    unsigned long sas_ss_sp;
    size_t sas_ss_size;
    int (*notifier)(void *priv);
    void *notifier_data;
    sigset_t *notifier_mask;
    u32 parent_exec_id;
    u32 self_exec_id;
    spinlock_t alloc_lock;
    void *journal_info;
};
Page fault and swap counters: min_flt and maj_flt accumulate the number of minor page faults (copy-on-write pages and anonymous pages) and major page faults (pages read from a mapped file or from the swap device). nswap records the cumulative number of pages the process has swapped out, that is, the number of pages written to the swap device. cmin_flt, cmaj_flt and cnswap record the cumulative minor faults, major faults and swapped-out pages of all descendant processes of this process. When a parent reaps a terminated child, it adds the child's counters to these fields of its own structure.
struct thread_struct {
    struct desc_struct tls_array[GDT_ENTRY_TLS_ENTRIES];
    unsigned long esp0;
    unsigned long sysenter_cs;
    unsigned long eip;
    unsigned long esp;
    unsigned long fs;
    unsigned long gs;
    unsigned long debugreg[8];          // contents of the debug registers
    unsigned long cr2, trap_no, error_code;
    union i387_union i387;              // contents of the math coprocessor registers
    struct vm86_struct __user *vm86_info;
    unsigned long screen_bitmap;
    unsigned long v86flags, v86mask, saved_esp0;
    unsigned int saved_fs, saved_gs;
    unsigned long *io_bitmap_ptr;       // I/O permission bitmap of the current process
    unsigned long io_bitmap_max;
};
Step 13: setting sail from ELF: fork
After gemfield.c has been compiled into gemfield, how does running ./gemfield load the ELF file gemfield into memory?
1. First of all, the fork system call is used whether you create a child process inside a program or start the program from the shell.
2. The fork call path is roughly: the program calls fork() -> the library function fork() -> the system call (with fork's function number) -> the function number is used to find the address of sys_fork() in sys_call_table[] -> sys_fork() is called -> do_fork(). This completes the transition from user mode to kernel mode.
3. The implementation of do_fork():
p = copy_process(clone_flags, stack_start, regs, stack_size, child_tidptr, NULL, trace);
wake_up_new_task(p, clone_flags);
The first step is to call copy_process() to copy a process and set the corresponding flag bits. Then, if copy_process() succeeds, the system deliberately lets the newly created process run first, because the child will generally call exec() immediately to do other work.
The implementation of copy_process():
1. p = dup_task_struct(current);
2. This creates a kernel stack, a thread_info and a task_struct for the new process, whose contents are copied in full from the parent (equivalent to a complete copy of the task_struct above). At this point there is no difference between the parent and the child.
3. Check whether the total number of processes exceeds the maximum specified by the system. If not, start setting the initial values in the process descriptor. From this point on, the parent and the child begin to differ.
4. Set the state of the child to TASK_UNINTERRUPTIBLE, so that the process cannot be put into operation yet, because many flag bits and data have not been set.
5. Set up the flag bits (the flags member), the privilege bit (PF_SUPERPRIV) and some other flags.
6. Call get_pid() to obtain a valid, unique process identifier (PID) for the child.
7. Copy the corresponding content according to the clone flags passed in (reference: http://civilnet.cn/bbs/topicno/71163), for example open file descriptors, signals, and so on.
8. The parent and the child split the parent's remaining time slice equally.
9. return p; returns a pointer to the child process.
Step 14: the exec() function family
The fork in step 13 has produced a new process in the system, but it is not yet of much use: the logic and data of the new program are still in the gemfield binary file, so something must load the gemfield ELF file into the new task_struct. This is the job of the exec() family.
1. Every function of the exec family eventually calls execve() in the C library. Its prototype is: int execve(const char *filename, char *const argv[], char *const envp[]); It takes three parameters: the program file name, the program arguments and the program environment variables, all of which appeared in the earlier steps. The fact that the first parameter is the program file name tells us that the gemfield executable will be loaded by this function;
2. execve() invokes the system call sys_execve(), which checks the parameters and then calls do_execve();
3. do_execve() looks for the gemfield program according to the parameters passed in.
Step 15: the do_execve() function
1. Allocate a linux_binprm structure (defined in /usr/src/linux/include/linux/binfmts.h) and fill it in according to the gemfield binary file;
struct linux_binprm {
    char buf[BINPRM_BUF_SIZE];
    struct page *page[MAX_ARG_PAGES];   // under #ifdef __KERNEL__: #define MAX_ARG_PAGES 32
    struct mm_struct *mm;
    unsigned long p;                    // current top of mem
    int sh_bang;
    struct file *file;
    int e_uid, e_gid;
    kernel_cap_t cap_inheritable, cap_permitted, cap_effective;
    void *security;
    int argc, envc;
    char *filename;                     // Name of binary as seen by procps
    char *interp;                       // Name of the binary really executed
    unsigned interp_flags;
    unsigned interp_data;
    unsigned long loader, exec;
};
2. Call path_lookup(), dentry_open() and path_release() to obtain the dentry object, file object and inode object associated with the gemfield file;
3. Check the i_writecount field of the inode to make sure that gemfield is not being written, then store -1 in i_writecount to prevent further writes;
4. On a multi-CPU system, use sched_exec() to determine which CPU will execute gemfield;
5. Call init_new_context() to determine whether the current process is using a custom Local Descriptor Table. If so, this function allocates and fills a new LDT for gemfield;
6. Call prepare_binprm() to populate the linux_binprm data structure:
* Check whether gemfield is executable;
* Initialize the e_uid and e_gid fields of the linux_binprm structure;
* Fill the buf field of the linux_binprm structure with the first 128 bytes of gemfield. These 128 bytes contain the magic number and other information that identifies the executable file (reference: from code to executable file).
7. Copy the gemfield file pathname, command line arguments and environment strings into one or more newly allocated page frames (they will eventually be assigned to user space);
8. Call search_binary_handler(), which scans the list of executable formats to find one whose load_binary function can handle an ELF file such as gemfield. If one is found, the linux_binprm structure is passed to that load_binary function, and finally the linux_binprm data structure is released;
Step 16: what load_binary does
The load_binary function works as follows:
1. Check whether the magic number of gemfield matches;
2. Use kernel_read() to read the program header table of gemfield, whose offset and entry count come from the ELF header. The program headers describe the program segments and the shared library information. The code is as follows:
size = loc->elf_ex.e_phnum * sizeof(struct elf_phdr);
retval = -ENOMEM;
elf_phdata = (struct elf_phdr *) kmalloc(size, GFP_KERNEL);
if (!elf_phdata)
    goto out;
retval = kernel_read(bprm->file, loc->elf_ex.e_phoff, (char *) elf_phdata, size);
......
files = current->files;
......
retval = get_unused_fd();
... ...
get_file(bprm->file);
fd_install(elf_exec_fileno = retval, bprm->file);
elf_ppnt = elf_phdata;
elf_bss = 0;
elf_brk = 0;
start_code = ~0UL;
end_code = 0;
start_data = 0;
end_data = 0;
The already opened gemfield image file is also given a second entry in the open file table of the current process, similar to calling dup() once. The purpose is to maintain two different contexts for gemfield so that it can be read from different positions;
Next comes the initialization of elf_bss, elf_brk, start_code, end_code and the other variables. These variables record the positions (so far) of the BSS segment, the code segment, the data segment and the dynamically allocated "heap" of the target image in user space. Apart from start_code, whose initial value is 0xffffffff, they all start at 0. As the image contents are loaded, these variables are adjusted step by step.
3. Get the pathname of the dynamic linker (such as /lib/ld-linux.so.2); the dynamic linker will map the shared libraries into memory. A binary image in ELF format needs the help of a tool while it is loaded and started, whose main purpose is to establish the dynamic links between the target image and the shared libraries. This tool is called the "dynamic linker". Which dynamic linker an ELF image uses is decided at compile/link time, and this information is stored in the "interpreter" part of the image. The type of the "interpreter" part is PT_INTERP; once it is found, the whole "interpreter" part is read into a buffer according to its position p_offset and size p_filesz. The whole "interpreter" part is actually just a string, namely the file name of the interpreter, such as "/lib/ld-linux.so.2". With the file name of the interpreter in hand, open_exec() opens this file, and kernel_read() then reads its first 128 bytes, which are the head of the image. Early dynamic linker images were in a.out format, but now they are all in ELF format; /lib/ld-linux.so.2 is an ELF image.
4. Get the dentry object, inode object and file object of the dynamic linker;
5. Check the execute permission of the dynamic linker;
6. Copy the first 128 bytes of the dynamic linker into a buffer;
7. Perform some consistency checks on the dynamic linker type;
8. At this point everything is ready to load the target image and the dynamic linker image. The current process (thread) can now break away from its parent, become a process in its own right and go its own way:
retval = flush_old_exec(bprm);
... ...
/* OK, This is the point of no return */
current->mm->start_data = 0;
current->mm->end_data = 0;
current->mm->end_code = 0;
current->mm->mmap = NULL;
current->flags &= ~PF_FORKNOEXEC;
current->mm->def_flags = def_flags;
... ...
retval = setup_arg_pages(bprm, randomize_stack_top(STACK_TOP), executable_stack);
flush_old_exec() is called to release the resources used previously. It performs the following steps:
a. If the signal handler table is shared with other processes, allocate a new table and decrement the relevant counter of the old one; also detach from the old thread group. All of this is done by calling de_thread();
b. Call unshare_files() to make a copy of the files_struct structure, which holds the information about the files opened by the process;
c. Call exec_mmap() to release the memory descriptor, all memory regions and all page frames allocated to the process, and clear the page tables of the process;
d. Set the comm field in the gemfield process descriptor to the pathname of the gemfield executable file;
e. Call flush_thread() to clear the values of the floating-point registers and debug registers saved in the TSS segment;
f. Call flush_signal_handlers() to reset the signal handler table to the default values;
g. Call flush_old_files() to close every open file whose bit is set in the files->close_on_exec field of the process descriptor;
Through the above steps, flush_old_exec() frees all the pages in the user space of the current process, so the user space of the current process is now brand new. Next, the user space mappings must be rebuilt. A user space stack is necessary for the new image to run, so first a virtual address range in user space is set aside for the stack. Furthermore, when the CPU enters the entry point of the new image, argc, argv[], envc, envp[] and other parameters should be on the stack. These parameters come from the old program and have to be passed to the new image through the stack. In fact, argv[] and envp[] hold string pointers, and it is meaningless to pass the pointers to the new image without passing the strings themselves. For this reason, before search_binary_handler() enters load_elf_binary(), do_execve() has already allocated several pages for these strings and copied them from user space into these pages with copy_strings(). Now these pages must be mapped back into user space (at different addresses, of course). This is what setup_arg_pages() does here. These pages are mapped at the top of the user space stack. For x86 processors, the user space stack extends downward from the 3GB boundary. The pages holding these strings come first, and below them is the real user space stack. The argc and argv[] parameters are on the user space stack in the real sense.
Now the new image can be loaded. "Loading" actually means mapping (part of) the contents of the image to some ranges of the user (virtual address) space. Thanks to demand paging through the MMU, this does not even require reading the image contents into physical pages right away; the actual reading is left to future page faults.
for (i = 0, elf_ppnt = elf_phdata; i < loc->elf_ex.e_phnum; i++, elf_ppnt++) {
    int elf_prot = 0, elf_flags;
    unsigned long k, vaddr;

    if (elf_ppnt->p_type != PT_LOAD)
        continue;
    ... ...
    vaddr = elf_ppnt->p_vaddr;
    if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
        elf_flags |= MAP_FIXED;
    } else if (loc->elf_ex.e_type == ET_DYN) {
        /* Try and get dynamic programs out of the way of the default mmap
           base, as well as whatever program they might try to exec. This
           is because the brk will follow the loader, and is not movable. */
        load_bias = ELF_PAGESTART(ELF_ET_DYN_BASE - vaddr);
    }
    error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt, elf_prot, elf_flags);
    ......
    if (!load_addr_set) {
        load_addr_set = 1;
        load_addr = (elf_ppnt->p_vaddr - elf_ppnt->p_offset);
        if (loc->elf_ex.e_type == ET_DYN) {
            load_bias += error - ELF_PAGESTART(load_bias + vaddr);
            load_addr += load_bias;
            reloc_func_desc = load_bias;
        }
    }
    k = elf_ppnt->p_vaddr;
    if (k < start_code)
        start_code = k;
    if (start_data < k)
        start_data = k;
    .....
    k = elf_ppnt->p_vaddr + elf_ppnt->p_filesz;
    if (k > elf_bss)
        elf_bss = k;
    if ((elf_ppnt->p_flags & PF_X) && end_code < k)
        end_code = k;
    if (end_data < k)
        end_data = k;
    k = elf_ppnt->p_vaddr + elf_ppnt->p_memsz;
    if (k > elf_brk)
        elf_brk = k;
} // end for() loop

loc->elf_ex.e_entry += load_bias;
elf_bss += load_bias;
elf_brk += load_bias;
start_code += load_bias;
end_code += load_bias;
start_data += load_bias;
end_data += load_bias;

/* Calling set_brk effectively mmaps the pages that we need
 * for the bss and break sections. We must do this before
 * mapping in the interpreter, to make sure it doesn't wind
 * up getting placed where the bss needs to go.
 */
retval = set_brk(elf_bss, elf_brk);
......
9. The PF_FORKNOEXEC flag in the gemfield process descriptor is cleared. This flag is set when the gemfield process is forked, and is cleared once it executes a new program;
10. Set the personality field in the process descriptor to a new value;
11. Call arch_pick_mmap_layout() to select the memory region layout of the gemfield process;
12. Call setup_arg_pages() to allocate a new memory region descriptor for the user space stack of the gemfield process and insert that memory region into the address space of the gemfield process; setup_arg_pages() also assigns the page frames containing the command line arguments and environment strings to this new memory region;
13. Call do_mmap() to create a new memory region that maps the code segment of the gemfield executable file. The starting linear address of this memory region depends on the format of the executable file: because the code segment of a program usually cannot be relocated, do_mmap() assumes that the code segment starts from a specific address, and an ELF program (like gemfield) starts from linear address 0x08048000 (on 32-bit x86);
14. Call do_mmap() to create a new memory region that maps the data segment of the gemfield executable file. The starting linear address of this memory region depends on the format of the executable file, because the executable code expects to find its variables at specific offsets. The data segment of an ELF program (like gemfield) is loaded right after the code segment;
15. Allocate additional memory regions for the other loadable segments of the gemfield executable;
16. Call load_elf_interp() to load the dynamic linker. In general, this step is similar to items 12-14. To avoid memory conflicts with executables such as gemfield, the starting address of the dynamic linker is above 0x40000000. The code is as follows:
if (elf_interpreter) {
    if (interpreter_type == INTERPRETER_AOUT)
        elf_entry = load_aout_interp(&loc->interp_ex, interpreter);
    else
        elf_entry = load_elf_interp(&loc->interp_elf_ex, interpreter, &interp_load_addr);
    ......
    reloc_func_desc = interp_load_addr;
    allow_write_access(interpreter);
    fput(interpreter);
    kfree(elf_interpreter);
} else {
    elf_entry = loc->elf_ex.e_entry;
}
If a dynamic linker needs to be loaded and its image is in ELF format, load_elf_interp() loads the image, and the entry address used when entering user space later is set to the return value of load_elf_interp(), which is the program entry of the dynamic linker. If no dynamic linker is loaded, this address is the program entry of the target image itself.
17. Store the address of the linux_binfmt object of the executable format in the binfmt field of the process descriptor;
18. Determine the new capabilities of the gemfield process;
19. Create the process-specific dynamic linking tables and place them on the user-mode stack, between the command line arguments and the array of environment string pointers;
20. Set the start_code, end_code, start_data, end_data, start_brk, brk and start_stack fields in the memory descriptor of the gemfield process;
struct mm_struct {
    struct vm_area_struct *mmap;        // list of virtual memory areas (VMAs)
    rb_root_t mm_rb;                    // root of the red-black tree
    struct vm_area_struct *mmap_cache;  // most recently used virtual memory area
    pgd_t *pgd;                         // page directory of the process
    atomic_t mm_users;                  // how many users share this user space
    atomic_t mm_count;                  // how many references there are to "struct mm_struct"
    int map_count;                      // number of virtual memory areas
    struct rw_semaphore mmap_sem;
    spinlock_t page_table_lock;         // protects the task page tables and mm->rss
    struct list_head mmlist;            // list of all active mm's
    unsigned long start_code, end_code, start_data, end_data;
    unsigned long start_brk, brk, start_stack;
    unsigned long arg_start, arg_end, env_start, env_end;
    unsigned long rss, total_vm, locked_vm;
    unsigned long def_flags;
    unsigned long cpu_vm_mask;
    unsigned long swap_address;
    unsigned dumpable:1;
    /* Architecture-specific MM context */
    mm_context_t context;
};
21. Call do_brk() to create a new anonymous memory region that maps the bss section of the gemfield file. When gemfield writes to one of these variables, a page fault is triggered and a page frame is allocated. The size of the new memory region is calculated when the program is linked, and its starting linear address must be specified. In ELF programs (such as gemfield), bss is loaded after the data segment;
22. Call the start_thread() macro to modify the values of the user-mode registers eip and esp saved on the kernel-mode stack, so that they point to the entry point of the dynamic linker and the top of the new user-mode stack respectively. start_thread() is a macro, defined as follows:
#define start_thread(regs, new_eip, new_esp) do {               \
    __asm__("movl %0,%%fs ; movl %0,%%gs": :"r" (0));            \
    set_fs(USER_DS);                                             \
    regs->xds = __USER_DS;                                       \
    regs->xes = __USER_DS;                                       \
    regs->xss = __USER_DS;                                       \
    regs->xcs = __USER_CS;                                       \
    regs->eip = new_eip;                                         \
    regs->esp = new_esp;                                         \
} while (0)
These instructions store the user space program entry and stack pointer, passed down as parameters, into the regs data structure. This structure actually lives on the system stack; it was formed by SAVE_ALL when the current process entered the kernel through the system call, and the pointer regs to this saved context was passed to sys_execve() as a parameter and handed down layer by layer. Changing the eip and esp in the saved context to new addresses makes the CPU enter the new program entry when it returns to user space. If a dynamic linker image exists, this is the program entry of the dynamic linker image; otherwise it is the program entry of the target image.
23. If the gemfield process is being traced, notify the debugger that the execve() system call has completed;
24. If successful, return zero.
Step 17: return to user mode
After returning to user mode, because the EIP register was set to the entry point of the dynamic linker in item 22 of step 16, execution starts in the dynamic linker, and finally the new process starts running.