Multithread Matrix Multiplication

Job Requirements: Multithread programming is used to realize matrix multiplication under Windows and Linux respectively.
Analysis: Linux multi-threading and Windows multi-threading implementation principle is basically the same, but the API provided by the system is different. In fact, if the API provided by Windows is not applicable under Windows, the pthread interface API that comes with GUN compiler is used. Its interface function is the same as that of linux. Because of the job requirement, the program under windows is developed with the interface function provided by windows.h.
Matrix multiplication is an O(n^4) operation with time complexity. The amount of computation increases rapidly with the increase of n. Therefore, multithreading of cpu can be used for parallel operation in large-scale matrix multiplication.
Multithread Matrix Multiplication in Windows
Testing System: Windows 10
IDE: codeblocks
Header file: windows.h
Language: C++ format for code, because windows API is provided for c++, to use the concept of reference
  • Main function core code
int main{
......
InitializeCriticalSection(&cs);
HANDLE thread1 = CreateThread(NULL, NULL, MultWork, NULL, NULL, NULL);
......
WaitForSingleObject(thread1, INFINITE);
......
}
Initialized Critical Section (& cs) is the initialization function of mutex. CS is a lock variable of CRITICAL_SECTION type. It appears as a global variable in this program.
Among them:
thread1 is the thread name of the handle object, creating four means creating four threads at the same time
The CreatThread function creates threads for the Windows API as follows:
The first parameter represents the security attributes of the thread kernel object, and NULL is typically passed in to indicate the default settings.
The second parameter represents the thread stack space size. Input 0 indicates the default size (1MB).
The third parameter represents the address of the thread function executed by the new thread, and multiple threads can use the same function address.
The fourth parameter is the parameter passed to the thread function.
The fifth parameter specifies additional flags to control thread creation. For 0, the thread can be scheduled immediately after creation. For CREATE_SUSPENDED, the thread is suspended after creation, so it cannot be scheduled until ResumeThread() is called.
The sixth parameter returns the ID number of the thread, and the NULL entry indicates that the thread ID number does not need to be returned.
Function return value: The handle of the new thread is returned successfully, and NULL is returned failed.
WaitForSingleObject is a function of waiting object, parameter one is a handle type object, parameter two is the maximum waiting time.
  • Working Function Core Code
DWORD WINAPI MultWork(LPVOID pM){
......
EnterCriticalSection(&cs);
......
LeaveCriticalSection(&cs);
......
ExitThread(NULL);
}
Enter Critical Section (&cs) and Leave Critical Section (&cs) are mutually exclusive locking and unlocking functions. The code or instruction between these two functions is a critical value to protect read and write. CS needs to be initialized.
The ExitThread(NULL) function is the thread termination function, and the parameter is the exit function called by the thread.
  • Realization of Matrix Multiplication Function
In order to realize matrix multiplication conveniently, this program matches matrix creation function, matrix printing function and single unit calculation function of matrix C. Creating matrices uses the method of filling in vacancies with random numbers. In the code of thread working function, the main problems to be solved are as follows:
a. Assignment of Multiplication Tasks
For this problem, the program has two design methods.
One is to assign tasks in advance, such as forcing tasks into four parts under four threads for calculation. There are two main problems in this allocation: first, it is difficult to design algorithms when the task load is small, such as only three units need to be computed, so that the code volume increases; second, when a large amount of data needs to be computed, there may be a situation that the thread is not completed synchronously, the thread completed first and then the thread, which wastes CPU. Resources.
Second, assign tasks dynamically. When C matrix row s and col s are known, a _C matrix is generated and all elements of the matrix are set to NULL. The following steps are looped during threading:
a. Inspect the nearest NULL unit in the _C matrix and set it to 1. If there is no such unit, set the end flag to 1 and exit the critical area to unlock.
b. Calculate the data value of the corresponding position in the C matrix and update the data value to its position; if there is no data to calculate, exit the thread.
The advantage of this method lies in dynamic allocation, while the disadvantage lies in occupying part of memory to store _C matrix and occupying CPU in critical area calculation. Large-scale operations can effectively improve efficiency.
  • The source code is as follows
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <windows.h>


typedef struct{
    int x;
}unit;

typedef struct{
    int row;
    int col;
    unit *x;
}matrix;

matrix A,B,C,_C;
CRITICAL_SECTION cs;

matrix CreatMatrix(int row, int col)      //Row is the number of rows, col is the number of columns, that is, the matrix is row*col
{
    matrix target;
    int i,j;
    int data;
    //target = (matrix *)malloc(sizeof(matrix));

    target.row=row;
    target.col=col;
    target.x=(unit *)malloc(row*col*sizeof(unit));
    for(i=0;i<row;i++)
        for(j=0;j<col;j++)
        {
            data = rand()%100;
            (target.x+i*target.col+j)->x = data;
        }
    return target;
}

int CalcuOneUnit(int first,int second)
{
    int i,res = 0;

    //EnterCriticalSection(&cs);
    //printf("%d,%d is working\n",first,second);
    //Leave Critical Section (& cs); used here to test the running state of threads
    for(i = 0;i<A.col;i++)
        res += (A.x+first*A.col+i)->x * (B.x+i*B.col+second)->x;
    return res;
}

DWORD WINAPI MultWork(LPVOID pM)
{
    while(1)
    {

        int firstNum;
        int secondNum;
        int res,i,j,flag = 0,close = 0;

        EnterCriticalSection(&cs);
        for(i = 0;i<_C.row;i++)
        {
            for(j = 0;j<_C.col;j++)
            {
                if((_C.x+i*_C.col+j)->x == NULL)
                {
                    firstNum = i;
                    secondNum = j;
                    (_C.x+i*_C.col+j)->x = 1;
                    close = 1;
                    break;
                }
            }
            if(close == 1)
                break;
            else if(i == _C.row-1)
                flag = 1;
        }
        LeaveCriticalSection(&cs);

        if(flag == 1)
            ExitThread(NULL);
        res = CalcuOneUnit(firstNum,secondNum);
        (C.x+firstNum*C.col+secondNum)->x = res;
    }
    ExitThread(NULL);
}

void ShowMatrix(matrix shows)
{
    int row = shows.row;
    int col = shows.col;
    int i,j;
    for(i = 0;i<row;i++)
    {
        for(j = 0;j<col;j++)
            printf("%d\t",(shows.x+i*col+j)->x);
        printf("\n");
    }
}

int main()
{
    int row,col;
    int i;
    printf("     The simplest way to create multithreaded instances\n");

    srand((unsigned)time(NULL));
    printf("Please enter the matrix A Number of rows:\n");
    scanf("%d %d",&row,&col);
    printf("________matrix A by_______: \n");
    A=CreatMatrix(row,col);
    //ShowMatrix(A);
    printf("Please enter the matrix B Number of rows:\n");
    scanf("%d %d",&row,&col);
    printf("________matrix B by_______: \n");
    B=CreatMatrix(row,col);
    //ShowMatrix(B);
    if(A.col != B.row)
    {
        printf("error input");
        return 1;
    }

    C = CreatMatrix(A.row,B.col);
    for(i = 0;i<C.col*C.row;i++)
        (C.x+i)->x = NULL;
    _C = CreatMatrix(A.row,B.col);
    for(i = 0;i<_C.col*_C.row;i++)
        (_C.x+i)->x = NULL;

    InitializeCriticalSection(&cs);
    HANDLE thread1 = CreateThread(NULL, NULL, MultWork, NULL, NULL, NULL);
    HANDLE thread2 = CreateThread(NULL, NULL, MultWork, NULL, NULL, NULL);
    HANDLE thread3 = CreateThread(NULL, NULL, MultWork, NULL, NULL, NULL);
    HANDLE thread4 = CreateThread(NULL, NULL, MultWork, NULL, NULL, NULL);
    WaitForSingleObject(thread1, INFINITE);
    WaitForSingleObject(thread2, INFINITE);
    WaitForSingleObject(thread3, INFINITE);
    WaitForSingleObject(thread4, INFINITE);

    printf("________matrix C by_______: \n");
    //ShowMatrix(C);
    system("pause");
    return 0;
}

Multithread Matrix Multiplication under Linux
System: Kali Linux
IDE: codeblocks
Language: Pure C
Linux uses its own multi-threaded function. The overall logic of the program is no different from that of windows. It only needs to change the header file to pthread.h and use the corresponding interface function.
Note that cpp needs to be compiled with g++ and compiled with the condition - lpthread, otherwise ptread-related functions will not be found.
  • Thread Control Related Functions under Linux
The pthread_t thread1 thread1 variable is a thread number type, generally unsigned int type, but in some cases it will be different.
The pthread_mutex_t mutex mutex variable is mutex and needs to be initialized.
The pthread_mutex_init(& mutex, NULL) function is the mutex initialization function with two parameters. The first parameter, mutex, is a pointer to the mutex to be initialized; the second parameter, mutexattr, is a pointer to the attribute object, which defines the properties of the mutex to be initialized. If the pointer is NULL, the default attribute is used.
Pthread_mutex_lock(&mutex) and pthread_mutex_unlock(&mutex) are used to lock and unlock critical variables with mutex parameters.
Pthread_create(&thread1, NULL, MultWork, NULL) is used to create threads. The specific parameters are as follows:
The first parameter thread: thread identifier;
The second parameter attr: Thread property settings;
The third parameter, start_routine, is the starting address of the thread function.
The fourth parameter arg: the parameter passed to start_routine;
Pthread_join (NULL) waits for the end of a thread. The first parameter is the target thread number, and the second parameter is the location where the target thread exits the information.
pthread_exit(NULL) terminates the current thread with the parameter pthread_exit() calling the return value of the thread, which can be retrieved by other functions such as pthread_join.
  • Program structure
The structure and implementation of functions in Linux are basically the same as those in windows, so it is no longer necessary to elaborate on them.
  • Program source code
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>


typedef struct{
    int x;
}unit;

typedef struct{
    int row;
    int col;
    unit *x;
}matrix;

matrix A,B,C,_C;
pthread_mutex_t mutex;

matrix CreatMatrix(int row, int col)      //Row is the number of rows, col is the number of columns, that is, the matrix is row*col
{
    matrix target;
    int i,j;
    int data;
    //target = (matrix *)malloc(sizeof(matrix));

    target.row=row;
    target.col=col;
    target.x=(unit *)malloc(row*col*sizeof(unit));
    for(i=0;i<row;i++)
        for(j=0;j<col;j++)
        {
            data = rand()%100;
            (target.x+i*target.col+j)->x = data;
        }
    return target;
}

int CalcuOneUnit(int first,int second)
{
    int i,res = 0;

    //pthread_mutex_lock(&mutex);
    //printf("%d,%d is working\n",first,second);
    //Pthread_mutex_unlock(& mutex); // Here for testing thread running status
    for(i = 0;i<A.col;i++)
        res += (A.x+first*A.col+i)->x * (B.x+i*B.col+second)->x;
    return res;
}

void * MultWork(void *param)
{
    while(1)
    {

        int firstNum;
        int secondNum;
        int res,i,j,flag = 0,close = 0;

        pthread_mutex_lock(&mutex);
        for(i = 0;i<_C.row;i++)
        {
            for(j = 0;j<_C.col;j++)
            {
                if((_C.x+i*_C.col+j)->x == NULL)
                {
                    firstNum = i;
                    secondNum = j;
                    (_C.x+i*_C.col+j)->x = 1;
                    close = 1;
                    break;
                }
            }
            if(close == 1)
                break;
            else if(i == _C.row-1)
                flag = 1;
        }
        pthread_mutex_unlock(&mutex);

        if(flag == 1)
            pthread_exit(NULL);
        res = CalcuOneUnit(firstNum,secondNum);
        (C.x+firstNum*C.col+secondNum)->x = res;
    }
    pthread_exit(NULL);
}

void ShowMatrix(matrix shows)
{
    int row = shows.row;
    int col = shows.col;
    int i,j;
    for(i = 0;i<row;i++)
    {
        for(j = 0;j<col;j++)
            printf("%d\t",(shows.x+i*col+j)->x);
        printf("\n");
    }
}

int main()
{
    int row,col;
    int i;
    pthread_t thread1,thread2,thread3,thread4;
    printf("Simple example \n");

    srand((unsigned)time(NULL));
    printf("input row and col of Matrix A :\n");
    scanf("%d %d",&row,&col);
    printf("Matrix A is :\n");
    A=CreatMatrix(row,col);
    ShowMatrix(A);
    printf("input row and col of Matrix B :\n");
    scanf("%d %d",&row,&col);
    printf("Matrix B is :\n");
    B=CreatMatrix(row,col);
    ShowMatrix(B);
    if(A.col != B.row)
    {
        printf("error input");
        return 1;
    }

    C = CreatMatrix(A.row,B.col);
    for(i = 0;i<C.col*C.row;i++)
        (C.x+i)->x = NULL;
    _C = CreatMatrix(A.row,B.col);
    for(i = 0;i<_C.col*_C.row;i++)
        (_C.x+i)->x = NULL;

    pthread_mutex_init(&mutex, NULL);
    pthread_create(&thread1,NULL,MultWork,NULL);
    pthread_create(&thread2,NULL,MultWork,NULL);
    pthread_create(&thread3,NULL,MultWork,NULL);
    pthread_create(&thread4,NULL,MultWork,NULL);
    pthread_join(thread1,NULL);
    pthread_join(thread2,NULL);
    pthread_join(thread3,NULL);
    pthread_join(thread4,NULL);

    printf("Matrix C is :\n");
    ShowMatrix(C);
    return 0;
}


Keywords: Windows Linux Attribute Programming

Added by wiccanwoman1103 on Sun, 14 Jul 2019 00:50:20 +0300