[pooling technology] thread pool technology principle and C language implementation

1, Basic concepts

Before talking about thread pool technology, we first explain some basic concepts in the operating system, such as process, thread, thread creation and destruction.

Processes and threads

process
An application running in memory. Each process has its own independent memory space. A process can have multiple threads. For example, in Windows system, a running XX Exe is a process.

thread
An execution task (control unit) in a process that is responsible for the execution of programs in the current process. A process has at least one thread. A process can run multiple threads, and multiple threads can share data.

Different from the process, multiple threads of the same kind share the heap and method area resources of the process, but each thread has its own program counter, virtual machine stack and local method stack. Therefore, when the system generates a thread or switches between threads, the burden is much smaller than that of the process. For this reason, threads are also called lightweight processes.

Java programs are inherently multithreaded programs. We can see what threads an ordinary Java program has through JMX. The code is as follows.

public class MultiThread {
	public static void main(String[] args) {
		// Get Java thread management MXBean
		ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
		// It is not necessary to obtain the synchronized monitor and synchronizer information, but only the thread and thread stack information
		ThreadInfo[] threadInfos = threadMXBean.dumpAllThreads(false, false);
		// Traverse the thread information, and only print the thread ID and thread name information
		for (ThreadInfo threadInfo : threadInfos) {
			System.out.println("[" + threadInfo.getThreadId() + "] " + threadInfo.getThreadName());
		}
	}
}

The output of the above program is as follows (the output content may be different. Don't worry about the role of each thread below. Just know that the main thread executes the main method):

[6] Monitor Ctrl break / / threads listening for thread dump or thread stack trace
[5] Attach Listener / / is responsible for receiving an external command, executing the command, and returning the result to the sender
[4] Signal Dispatcher / / distributes the threads that process signals to the JVM
[3] Finalizer / / the thread of calling the object finalize method before garbage collection.
[2] Reference Handler / / the thread used to handle garbage collection of the reference object itself (soft reference, weak reference, virtual reference)
[1] Main / / mainthread, program entry

From the above output, we can see that the running of a Java program is the main thread and multiple other threads running at the same time.

Summary of differences between process and thread
Threads have many characteristics of traditional processes, so they are also called Light weight process or process element; The traditional process is called Heavy weight process, which is equivalent to a task with only one thread. In the operating system with threads, usually a process has several threads, including at least one thread.

Fundamental difference: process is the basic unit of operating system resource allocation, while thread is the basic unit of processor task scheduling and execution

Resource overhead: each process has independent code and data space (program context), and switching between programs will have a large overhead; Threads can be regarded as lightweight processes. The same kind of threads share code and data space. Each thread has its own independent running stack and program counter (PC). The switching overhead between threads is small.

Inclusion relationship: if there are multiple threads in a process, the execution process is not one line, but completed by multiple lines (threads); Threads are part of a process, so threads are also called lightweight processes or lightweight processes.

Memory allocation: threads of the same process share the address space and resources of the process, and the address space and resources between processes are independent of each other

Impact relationship: after a process crashes, it will not affect other processes in protected mode, but a thread crashes and the whole process dies. So multiprocessing is more robust than multithreading.

Execution process: each independent process has program entry, sequential execution sequence and program exit. However, threads cannot be executed independently. They must be stored in the application. The application provides multiple thread execution control, and both can be executed concurrently

The heap and method area are resources shared by all threads. The heap is the largest memory in the process, which is mainly used to store newly created objects (all objects allocate memory here). The method area is mainly used to store loaded class information, constants, static variables, code compiled by the real-time compiler and other data.

Difference between multi process and multi thread

Multiprocess: multiple programs running simultaneously in an operating system

Multithreading: multiple tasks running simultaneously in the same process

For example, multi-threaded software download can run multiple threads at the same time, but it is found that the results are inconsistent every time. Because multithreading has a characteristic: randomness. Cause: the CPU is constantly switching to deal with various threads in an instant. It can be understood that multiple threads are scrambling for CPU resources.

Multithreading improves CPU utilization

Multithreading can not improve the running speed, but it can improve the running efficiency and make the CPU utilization higher. However, if multithreading has security problems or frequent context switching, the operation speed may be lower.

2, Thread pool technology

Thread pool technology can ensure that any thread created is in a busy state without frequently creating and destroying threads for a task, because the system consumes a lot of cpu resources when creating and destroying threads. If there are many tasks and the frequency is very high, and the thread is started and then cancelled for a single task, the efficiency is quite low. Thread pool technology It is used to solve such an application scenario.

When do I need to create a thread pool? In short, if an application needs to create and destroy threads frequently and the task execution time is very short, the overhead of thread creation and destruction can not be ignored. This is also an opportunity for the thread pool to appear. If the task execution time T2 is negligible compared with the thread creation time T1 and the destruction time T3, there is no need to use the thread pool. On the contrary, if T1 + T3 > T2, it is necessary to use thread pool.

Working principle of thread pool technology: at the beginning, a certain number of threads are created in the form of queues, and a work queue is allocated to them. When the work queue is empty, it means that there are no tasks. At this time, all threads hang up and wait for new work. When a new job comes, Thread queue The head starts to execute the task, and then the second and third threads execute the new task in turn. When a thread finishes processing the task, the thread immediately starts to accept task assignment, so as to keep all threads busy and improve the efficiency of parallel processing.

Thread pool technology is a typical producer consumer model. Therefore, no matter which language is used, it can work well as long as it follows its principle. So what technical issues do we need to consider to implement the thread pool technology?

Implementation of C language thread pool technology:

The first technical problem to consider is which member variables should be included in the thread pool.

Since a certain number of threads must be opened, this "max_thread_num" must be a member of the thread pool.

How to indicate whether a thread pool has been closed? If it is closed, resources must be released immediately. So "shutdown" is also a member.

To create a thread, you need an id. you need to prepare an id for each thread, so you need an id id array Its length is max_thread_num.

Thread lock is used to ensure mutual exclusion when operating on threads. So you need a lock, queue_lock.

Conditional variable (condition_variable). The condition variable is mainly used to broadcast the incoming message of the task to all threads. When there are idle threads, this thread

Accept task assignment. So you need a conditional variable queue_ready.

The most important thing is the task itself, that is, the work. So what are the needs of the job itself Member variable ? The first must be the mission entrance, routine function

The second is the parameter args of the routine function; Thirdly, the task exists as a queue, so the task itself should contain a next.

The second technical problem to consider is which APIs should be included in the thread pool.

1, Create thread pool_ tpool

2, Destroy thread pool_ tpool

3, Assign task, add_task_2_tpool

Based on the above analysis, we can construct the header file first.

tpool.h

#ifndef T_POOL
#define T_POOL
 
#include <pthread.h>
#include <ctype.h>
 
typedef struct tpool_work{
   void* (*work_routine)(void*); //function to be called
   void* args;                   //arguments 
   struct tool_work* next;
}tpool_work_t;
 
typedef struct tpool{
   size_t               shutdown;       //is tpool shutdown or not, 1 ---> yes; 0 ---> no
   size_t               maxnum_thread; // maximum of threads
   pthread_t            *thread_id;     // a array of threads
   tpool_work_t*        tpool_head;     // tpool_work queue
   pthread_cond_t       queue_ready;    // condition varaible
   pthread_mutex_t      queue_lock;     // queue lock
}tpool_t;
 
 
/***************************************************
*@brief:
*       create thread pool
*@args:   
*       max_thread_num ---> maximum of threads
*       pool           ---> address of thread pool
*@return value: 
*       0       ---> create thread pool successfully
*       othres  ---> create thread pool failed
***************************************************/
 
int create_tpool(tpool_t** pool,size_t max_thread_num);
 
/***************************************************
*@brief:
*       destroy thread pool
*@args:
*        pool  --->  address of pool
***************************************************/
void destroy_tpool(tpool_t* pool);
 
/**************************************************
*@brief:
*       add tasks to thread pool
*@args:
*       pool     ---> thread pool
*       routine  ---> entry function of each thread
*       args     ---> arguments
*@return value:
*       0        ---> add ok
*       others   ---> add failed        
**************************************************/
int add_task_2_tpool(tpool_t* pool,void* (*routine)(void*),void* args);
 
#endif//tpool.h

The third technical problem to be considered is who should have the ownership of the thread pool.

Here we need to consider that it is a good idea to encapsulate the thread pool into an so library, so the ownership of the thread pool should be handed over to the function calling it. So that's what I'm doing here.

tpool.c

#include "tpool.h"
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
 
 
static void* work_routine(void* args)
{
   tpool_t* pool = (tpool_t*)args;
   tpool_work_t* work = NULL;
 
   while(1){
        pthread_mutex_lock(&pool->queue_lock);
        while(!pool->tpool_head && !pool->shutdown){ // if there is no works and pool is not shutdown, it should be suspended for being awake
            pthread_cond_wait(&pool->queue_ready,&pool->queue_lock);
        }
 
        if(pool->shutdown){
           pthread_mutex_unlock(&pool->queue_lock);//pool shutdown,release the mutex and exit
           pthread_exit(NULL);
        }
 
        /* tweak a work*/
        work = pool->tpool_head;
        pool->tpool_head = (tpool_work_t*)pool->tpool_head->next;
        pthread_mutex_unlock(&pool->queue_lock);
 
        work->work_routine(work->args);
 
        free(work);
   }
return NULL;
}
 
int create_tpool(tpool_t** pool,size_t max_thread_num)
{
   (*pool) = (tpool_t*)malloc(sizeof(tpool_t));
   if(NULL == *pool){
        printf("in %s,malloc tpool_t failed!,errno = %d,explain:%s\n",__func__,errno,strerror(errno));
        exit(-1);
   }
   (*pool)->shutdown = 0;
   (*pool)->maxnum_thread = max_thread_num;
   (*pool)->thread_id = (pthread_t*)malloc(sizeof(pthread_t)*max_thread_num);
   if((*pool)->thread_id == NULL){
        printf("in %s,init thread id failed,errno = %d,explain:%s",__func__,errno,strerror(errno));
        exit(-1);
   }
   (*pool)->tpool_head = NULL;
   if(pthread_mutex_init(&((*pool)->queue_lock),NULL) != 0){
        printf("in %s,initial mutex failed,errno = %d,explain:%s",__func__,errno,strerror(errno));
        exit(-1);
   }
 
   if(pthread_cond_init(&((*pool)->queue_ready),NULL) != 0){
        printf("in %s,initial condition variable failed,errno = %d,explain:%s",__func__,errno,strerror(errno));
        exit(-1);
   }
 
   for(int i = 0; i < max_thread_num; i++){
        if(pthread_create(&((*pool)->thread_id[i]),NULL,work_routine,(void*)(*pool)) != 0){
           printf("pthread_create failed!\n");
           exit(-1);
        }
   }
return 0;
}
 
void destroy_tpool(tpool_t* pool)
{
   tpool_work_t* tmp_work;
 
   if(pool->shutdown){
        return;
   }
   pool->shutdown = 1;
 
   pthread_mutex_lock(&pool->queue_lock);
   pthread_cond_broadcast(&pool->queue_ready);
   pthread_mutex_unlock(&pool->queue_lock);
 
   for(int i = 0; i < pool->maxnum_thread; i++){
        pthread_join(pool->thread_id[i],NULL);
   }
   free(pool->thread_id);
   while(pool->tpool_head){
        tmp_work = pool->tpool_head;
        pool->tpool_head = (tpool_work_t*)pool->tpool_head->next;
        free(tmp_work);
   }
 
   pthread_mutex_destroy(&pool->queue_lock);
   pthread_cond_destroy(&pool->queue_ready);
   free(pool);
}
 
int add_task_2_tpool(tpool_t* pool,void* (*routine)(void*),void* args)
{
   tpool_work_t* work,*member;
 
   if(!routine){
        printf("rontine is null!\n");
        return -1;
   }
 
   work = (tpool_work_t*)malloc(sizeof(tpool_work_t));
   if(!work){
        printf("in %s,malloc work error!,errno = %d,explain:%s\n",__func__,errno,strerror(errno));
        return -1;
   }
 
   work->work_routine = routine;
   work->args = args;
   work->next = NULL;
 
   pthread_mutex_lock(&pool->queue_lock);
   member = pool->tpool_head;
   if(!member){
        pool->tpool_head = work;
   }
   else{
        while(member->next){
           member = (tpool_work_t*)member->next;
        }
        member->next = work;
   }
 
   //notify the pool that new task arrived!
   pthread_cond_signal(&pool->queue_ready);
   pthread_mutex_unlock(&pool->queue_lock);
return 0;
}

demo.c

#include "tpool.h"
#include <stdio.h>
#include <unistd.h>
#include <time.h>
 
void* fun(void* args)
{
   int thread = (int)args;
   printf("running the thread of %d\n",thread);
return NULL;
}
 
int main(int argc, char* args[])
{
   tpool_t* pool = NULL;
   if(0 != create_tpool(&pool,5)){
        printf("create_tpool failed!\n");
        return -1;
   }
 
   for(int i = 0; i < 1000; i++){
        add_task_2_tpool(pool,fun,(void*)i);
   }
   sleep(2);
   destroy_tpool(pool);
return 0;
}

Makefile

pool: tpool.c demo.c
        gcc tpool.c demo.c -o pool -lpthread -std=c99 
 
clean:
        rm ./pool

For other C language implementations, please refer to the following articles:

Implementation of thread pool technology with C language

Implementation of thread pool in C language

Principle of thread pool and implementation of thread pool in C language

Implementation of thread pool function in C language

Difference between process and thread (super detailed)

Keywords: C C++ thread pool

Added by driverdave on Sun, 12 Dec 2021 17:13:29 +0200