[linux] process control

01. Process ID functions

  • getpid function
pid_t getpid(void);
Function:
    Get the process ID (PID) of the calling process.
Parameters:
    None
Return value:
    The PID of the calling process
  • getppid function
pid_t getppid(void);
Function:
    Get the parent process ID (PPID) of the calling process.
Parameters:
    None
Return value:
    The PPID of the calling process
  • getpgid function
pid_t getpgid(pid_t pid);
Function:
    Get the process group ID (PGID).
Parameters:
    pid: Process ID
Return value:
    If pid is 0, the process group ID of the calling process is returned; otherwise, the process group ID of the process specified by pid is returned. On error, -1 is returned.

02. Process creation

The system allows a process to create a new process, which is called a child process. A child process can in turn create children of its own, so the processes form a tree.

  • fork function
pid_t fork(void);
Function:
    Creates a new process from an existing process. The new process is called the child process and the original process is called the parent process.
Parameters:
    None
Return value:
    Success: 0 is returned in the child process, and the child's process ID is returned in the parent process. pid_t is a signed integer type.
    Failure: -1 is returned in the parent and no child is created.
    The two main reasons for failure are:
        1) The number of processes has reached the limit set by the system; errno is set to EAGAIN.
        2) The system is out of memory; errno is set to ENOMEM.

03. Parent child relationship

The child process created by fork() is a replica of the parent process. It receives a copy of the parent's entire address space, including the process context (a static description of the process's execution state), the process stack, open file descriptors, signal handling settings, process priority, process group ID, and so on.
Only a small amount of information, such as its process ID and timers, is unique to the child. Copying everything else is what makes a naive fork() expensive.

In short, after a process calls fork(), the system first allocates resources for the new process, such as space for its data and code, and then copies the values of the original process into it. Only a few values differ from the original. It is like the process cloning itself.
To be more precise, fork() in Linux is implemented with copy-on-write, a technique that delays or even avoids copying data. The kernel does not copy the whole address space at fork time. Instead, the parent and child share the same physical pages, mapped read-only. A page is copied only when one of the processes writes to it, so that each process ends up with its own private copy. In other words, resources are duplicated only when they need to be written; until then they are shared read-only.

Note: after fork(), the parent and child share open files. Each file descriptor in the child refers to the same file table entry as the corresponding descriptor in the parent, the reference count of that entry is incremented, and the file offset is shared between the two processes.

04. Distinguish between parent-child processes

A child process is a copy of its parent, so you can simply think of the two as running the same code. But if that is the case, how do we make the parent do one thing and the child another, as multitasking requires? We need a way to tell the parent and child apart, and that is the return value of fork().
The fork() function is called once but returns twice. The difference between the two returns is that the return value in the child process is 0, while the return value in the parent process is the process ID of the new child.

Test program

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0)
    {   // fork failed: no child was created
        perror("fork");
        return 1;
    }
    if (0 == pid)
    {   // Child process: fork() returned 0
        while (1)
        {
            printf("I am son\n");
            sleep(1);
        }
    }
    else
    {   // Parent process: fork() returned the child's PID
        while (1)
        {
            printf("I am father\n");
            sleep(1);
        }
    }
    return 0;
}

In general, it is not defined whether the parent or the child runs first after fork(). This depends on the scheduling algorithm used by the kernel.
Note that the child does not start again from main(): in its own address space, it continues executing from the point where fork() returns.

05. Debugging multiple processes with GDB

By default, gdb tracks only one process. Before the fork function is called, you can tell gdb whether to follow the parent process or the child process. The parent process is followed by default.

  • set follow-fork-mode child makes gdb follow the child process after fork.
  • set follow-fork-mode parent makes gdb follow the parent process (default).

Note that this setting must be made before the fork function is called.

06. Process exit function

#include <stdlib.h>
void exit(int status);

#include <unistd.h>
void _exit(int status);
Function:
    Ends the process that calls this function.
Parameters:
    status: The value returned to the parent process (only the lower 8 bits are valid). Choose the value as needed.
Return value:
    None

exit() and _exit() both end the calling process, but they are declared in different header files and do different amounts of cleanup. exit() is a standard library function: it runs the handlers registered with atexit(), flushes and closes the standard I/O streams, and then ends the process. _exit() is a system call that ends the process immediately, without flushing stdio buffers.

07. Functions for waiting on child process exit

7.1 Overview

When a process exits, the kernel releases all of its resources, including open files and occupied memory. However, some information is kept, mainly the contents of its process control block (PCB), such as the process ID, exit status, and running time.
The parent process can obtain the child's exit status by calling wait() or waitpid(), which also removes the process completely.
wait() and waitpid() serve the same purpose. The difference is that wait() always blocks, while waitpid() can be made non-blocking and can also specify which child process to wait for.

Header file

#include <sys/types.h>
#include <sys/wait.h>

Note: a single wait() or waitpid() call cleans up only one child process. Use a loop to clean up multiple child processes.

7.2 wait function

pid_t wait(int *status);
Function:
    Waits for any child process to end. When a child process ends, this function reclaims its resources.
Parameters:
    status: Receives the status information of the child process when it exits.
Return value:
    Success: the process ID of the child process that ended
    Failure: -1

The process that calls wait() is suspended (blocked) until one of its child processes exits, or until the caller receives a signal that is not ignored, in which case it continues executing.

If the calling process has no child processes, the function returns immediately with -1. If a child process has already ended, the function also returns immediately and reclaims the resources of that child.
Therefore, the main purpose of wait() is to reclaim the resources of child processes that have finished.
If the status argument is not NULL, wait() stores the child's exit state in it. This is an integer (int) that indicates whether the child process exited normally or ended abnormally.

This exit information packs several fields into one int, so using the value directly is meaningless. We need macros to extract each field.

The macros fall into three groups:

  • WIFEXITED(status)
    Non-zero → the process ended normally
    WEXITSTATUS(status)
    If the above macro is true, use this macro → to obtain the process exit status (the argument of exit)
  • WIFSIGNALED(status)
    Non-zero → the process was terminated abnormally by a signal
    WTERMSIG(status)
    If the above macro is true, use this macro → to obtain the number of the signal that terminated the process
  • WIFSTOPPED(status)
    Non-zero → the process is in a stopped (suspended) state
    WSTOPSIG(status)
    If the above macro is true, use this macro → to obtain the number of the signal that stopped the process
    WIFCONTINUED(status)
    Non-zero → the process has continued running after being stopped

7.3 waitpid function

pid_t waitpid(pid_t pid, int *status, int options);
Function:
    Waits for a child process to terminate. When the child process terminates, this function reclaims its resources.

Parameters:
    pid : The pid parameter can take several kinds of values:
      pid > 0  Wait for the child process whose process ID equals pid.
      pid = 0  Wait for any child process in the same process group as the caller. A child that has joined another process group is not waited for.
      pid = -1 Wait for any child process. In this case waitpid() behaves like wait().
      pid < -1 Wait for any child process in the process group whose ID equals the absolute value of pid.

    status : Status information of the process when it exits. Used the same way as in wait().

    options : Additional options that control the behavior of waitpid().
            0: Same as wait(): block the parent process until a child process exits.
            WNOHANG: Do not block. If no child process has ended yet, return immediately.
            WUNTRACED: Also return when a child process is stopped (suspended), not only when it ends. (This involves tracing and debugging and is rarely used.)

Return value:
    The return value of waitpid() is slightly more complicated than that of wait(). There are three cases:
        1) On normal return, waitpid() returns the process ID of the child process that was reaped;
        2) If WNOHANG is set in options and waitpid() finds no child process that has exited, 0 is returned;
        3) If the call fails, -1 is returned and errno is set to indicate the error. For example, if the process given by pid does not exist, or exists but is not a child of the calling process, waitpid() fails with errno set to ECHILD.

08. Orphan process

An orphan process is a child process that is still running after its parent process has ended.

Whenever an orphan process appears, the kernel sets its parent to the init process, and init calls wait() in a loop to collect its exited children. So when an orphan process eventually ends its life cycle, init takes care of the cleanup. Orphan processes therefore do no harm.

09. Zombie process

When a process terminates but its parent has not yet reaped it, the child's remaining resources (its PCB) stay in the kernel, and the child becomes a zombie process.

This leads to a problem. If the parent never calls wait() or waitpid(), the retained information is never released and the process ID stays occupied. The number of process IDs available to the system is limited, so if a large number of zombies accumulate, the system cannot create new processes because no process IDs are left. This is the harm of zombie processes, and they should be avoided.

10. Process replacement

On Windows, we can run an executable by double-clicking it, which turns the program into a process; on Linux, we can turn an executable into a process by running it, for example from a shell.
But if a program (process) is already running, how can it start an external program from inside itself, so that the kernel loads that program into memory and executes it? This is done with the exec family of functions.

As the name suggests, the exec family is a group of functions. There is no function named exec() in Linux; exec refers to the following group: six classic functions plus execvpe(), which is a GNU extension:

#include <unistd.h>
extern char **environ;

int execl(const char *path, const char *arg, .../* (char  *) NULL */);
int execlp(const char *file, const char *arg, ... /* (char  *) NULL */);
int execle(const char *path, const char *arg, .../*, (char *) NULL, char * const envp[] */);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execvpe(const char *file, char *const argv[], char *const envp[]);
int execve(const char *filename, char *const argv[], char *const envp[]);

Among them, only execve() is a real system call; the others are library functions wrapped around it.

The exec family finds an executable file by the specified file name or path and uses it to replace the contents of the calling process. In other words, it executes an executable file inside the calling process.

When a process calls an exec function, the process is completely replaced by the new program, which starts running from its main function. Because calling exec does not create a new process, the process ID stays the same before and after the call (as do the parent process ID, process group ID, current working directory, and so on). exec only replaces the text (code), data, heap, and stack segments of the current process with those of the new program (process replacement).

Notes on the exec functions

The functions of the exec family look complicated, but they are very similar in purpose and usage, with only slight differences.

The exec functions differ from ordinary functions: they do not return after a successful call, so the code that follows the exec call is never executed. Only when the call fails do they return -1, and execution then continues from the call point in the original program.

See man page for more details

Keywords: Linux multiple processes

Added by simmsy on Sun, 30 Jan 2022 14:11:48 +0200