Explain Linux threads in detail

Concept

  • A process is the smallest unit of resource allocation in the operating system, while a thread is the smallest unit of scheduling. When a program runs, the operating system creates a process for it. The process has at least one thread, and it is this thread that the scheduler actually runs. A process can contain multiple threads, all of which share the address space of the process. That is, when a thread is created, the operating system does not set up a new address space or new page tables for it. A thread belongs to a process: when the process exits, all of its threads exit with it.
  • Linux has no dedicated data structure for describing threads. From the point of view of the Linux kernel, threads are lightweight processes (LWP), so each thread also has its own PCB (a task_struct). Unlike the PCBs of two separate processes, the PCBs of threads in one process refer to the same shared resources, and only thread-specific state is private: the thread's own stack (holding the local variables of the functions it runs), its own registers (holding the thread's context), and thread-local storage.

Create thread

pthread_create();

The first parameter: a pointer to a pthread_t that receives the identifier of the newly created thread. The identifier is used in later operations, such as waiting for the thread.
The second parameter: the attributes of the new thread. Usually NULL is passed, which selects the default attributes.
The third parameter: a function pointer, the function the new thread starts executing.
The fourth parameter: the argument passed to that function.
Return value: 0 on success, or an error number on failure.

#include <stdio.h>
#include <pthread.h>

void* run(void* arg){ // A thread function must return void* and take exactly one void* parameter
    printf("i am a new thread!\n");
    return NULL;
}

int main(){
    pthread_t tid;
    pthread_create(&tid, NULL, run, NULL);
    return 0;
}

Note that when compiling this code we must add -lpthread (or, with GCC and Clang, the -pthread option) so that the thread library is linked. The pthread functions are not part of the C standard library. They are defined by POSIX and, on many Linux systems, are provided by a separate library, so that library has to be named on the command line when compiling and linking.

Now we have created a thread. The thread only prints one sentence, "i am a new thread!\n". Let's execute it now.

The result is not what we expected: the new thread usually prints nothing. A process can have multiple threads, but one of them is the main thread, that is, the thread that runs main() when the process starts. All other threads are created by it, directly or indirectly.
Within a process, when the main thread returns from main(), the whole process exits.
In the example above, the main thread does nothing after creating the new thread and returns immediately. The whole process therefore exits, usually before the newly created thread gets a chance to run and print, and the new thread is terminated along with it.
Therefore, the main thread should not return right after creating a new thread. It should exit only after the threads it created have finished their tasks. This requires the operation described below: thread waiting.

Thread waiting

Waiting for a thread serves two purposes: it lets that thread finish its task, and it releases the resources the thread occupied.

  • The resources of a joinable thread that has exited (such as its stack) are not released until the thread is waited for. Until then they remain in the address space of the process.
  • A newly created thread does not reuse the space of a thread that has exited but has not yet been waited for.

The function for waiting on a thread:

pthread_join();


The first parameter: the identifier of the thread to wait for.
The second parameter: a pointer to a void* that receives the return value of the thread. Pass NULL if you don't care about it.

#include <stdio.h>
#include <pthread.h>

void* run(void* arg){
    printf("i am a new thread!\n");
    return NULL;
}

int main(){
    pthread_t tid;
    pthread_create(&tid, NULL, run, NULL);
    pthread_join(tid, NULL);
    return 0;
}

Here the main thread waits for the new thread, so it does not exit until the new thread has exited. pthread_join() can also obtain the return value of the thread, which can be used to judge whether the thread completed its task.

Thread exit and return value

There are two ways for a thread to exit. The first is for another thread, typically the main thread, to call a function that terminates the thread executing the task. In other words, one thread can control the others.

pthread_cancel();


Parameter: identifier of the thread

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

void* run(void* arg){
    while(1){
        printf("i am a new thread!\n");
        sleep(1);
    }
    return NULL;
}

int main(){
    pthread_t tid;
    pthread_create(&tid, NULL, run, NULL);
    int i = 0;
    for(; i < 5; i++){
        sleep(1);
    }
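    pthread_cancel(tid);      // request cancellation; it takes effect at a cancellation point such as sleep()
    pthread_join(tid, NULL);  // wait until the thread has actually terminated and release its resources
    return 0;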
}

Here, the main thread creates a new thread. The new thread prints every second. The main thread terminates the new thread after 5 seconds and the thread exits.

The second way is for the thread to exit by itself. The exit() function must not be used for this, because it terminates the process: if any thread calls it, the whole process exits immediately. To exit only the calling thread, use the pthread_exit() function (returning from the thread function has the same effect).

pthread_exit();


Parameter: the return value of the thread.
Note: since the return value is a pointer, it must point to memory that outlives the thread, such as a global variable or heap memory, and not to a local variable on the thread's stack. Each thread has its own independent stack, and the variables on it are temporary: once the thread exits, they are destroyed and can no longer be read by the main thread.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>


void* run(void* arg){
    printf("i am a new thread!\n");
    int *p = (int*)malloc(sizeof(int));
    *p = 10;
    pthread_exit((void*)p);
}

int main(){
    pthread_t tid;
    pthread_create(&tid, NULL, run, NULL);
    void* val;
    pthread_join(tid, &val);
    printf("return val is : %d\n",*(int*)val);
    free(val);
    return 0;
}

The thread allocates memory dynamically and returns its address to the main thread. The main thread reads the return value and then frees the memory.

Thread separation

While pthread_join() is waiting, the main thread is blocked. The purpose of waiting is to release the resources of the thread and obtain its return value. What if the main thread doesn't care about the return value? Can a thread release its own resources when it exits? Then the main thread would not need to block waiting for it and could do other work in the meantime. The answer is yes.

pthread_detach();


Parameter: thread identifier

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>


void* run(void* arg){
    printf("i am a new thread!\n");
    return NULL;
}

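int main(){
    pthread_t tid;
    pthread_create(&tid, NULL, run, NULL);
    pthread_detach(tid);  // the thread now releases its own resources when it exits
    pthread_exit(NULL);   // end only the main thread, so the detached thread can still run and print
}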

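After a thread is detached, the main thread does not need to call pthread_join() on it, and the thread's resources are released automatically when it exits. A detached thread is still part of the process, however: if the main thread returns from main(), the process exits and the detached thread is terminated with it.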

Commands for viewing threads

ps -aL

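Lists the lightweight processes (threads) on the system, one line per thread. Note that -a selects only processes attached to a terminal; use ps -eL to include every process.

ps -T -p [pid]

Lists the threads of the process with the given PID, from which the number of threads in that process can be counted.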

There are two threads in the test process.

Thread VS process

  • A process is the smallest unit of resource allocation in the operating system, and a thread is the smallest unit of scheduling.
  • To create a process, the operating system must allocate physical memory, create page tables, set up a virtual address space, and so on. Creating a thread requires none of this.
  • All threads in a process share the resources of that process, whereas processes are independent of each other.
  • Switching between threads and communicating between threads are generally much cheaper than the same operations between processes.
  • If one thread in a process fails, for example by crashing, the other threads in that process are affected too. If one process fails, other processes are not affected.

Keywords: C C++ Linux Multithreading Operating System

Added by Norsk.Firefox on Fri, 18 Feb 2022 08:07:38 +0200