[multithreading] learning notes

Multithreading Foundation

Processes and threads

In a computer, a running task is called a process. The browser is one process, and the video player is another. Similarly, the music player and Word are both processes.

Some processes also need to execute multiple subtasks at the same time. For example, Word lets us check spelling while typing and print in the background. We call such a subtask a thread.

Relationship between process and thread: a process can contain one or more threads, but there will be at least one thread.

                        ┌──────────┐
                        │Process   │
                        │┌────────┐│
            ┌──────────┐││ Thread ││┌──────────┐
            │Process   ││└────────┘││Process   │
            │┌────────┐││┌────────┐││┌────────┐│
┌──────────┐││ Thread ││││ Thread ││││ Thread ││
│Process   ││└────────┘││└────────┘││└────────┘│
│┌────────┐││┌────────┐││┌────────┐││┌────────┐│
││ Thread ││││ Thread ││││ Thread ││││ Thread ││
│└────────┘││└────────┘││└────────┘││└────────┘│
└──────────┘└──────────┘└──────────┘└──────────┘
┌──────────────────────────────────────────────┐
│               Operating System               │
└──────────────────────────────────────────────┘

The smallest unit of work scheduled by the operating system is not a process but a thread. How threads are scheduled is determined entirely by the operating system. The program itself cannot decide when a thread runs or for how long.

Because the same application can have multiple processes or threads, there are several methods to realize multitasking:

1. Multi-process mode (each process has only one thread)

┌──────────┐ ┌──────────┐ ┌──────────┐
│Process   │ │Process   │ │Process   │
│┌────────┐│ │┌────────┐│ │┌────────┐│
││ Thread ││ ││ Thread ││ ││ Thread ││
│└────────┘│ │└────────┘│ │└────────┘│
└──────────┘ └──────────┘ └──────────┘

2. Multithreading mode (one process has multiple threads)

┌────────────────────┐
│Process             │
│┌────────┐┌────────┐│
││ Thread ││ Thread ││
│└────────┘└────────┘│
│┌────────┐┌────────┐│
││ Thread ││ Thread ││
│└────────┘└────────┘│
└────────────────────┘

3. Multi-process + multithreading mode (the most complex)

┌──────────┐┌──────────┐┌──────────┐
│Process   ││Process   ││Process   │
│┌────────┐││┌────────┐││┌────────┐│
││ Thread ││││ Thread ││││ Thread ││
│└────────┘││└────────┘││└────────┘│
│┌────────┐││┌────────┐││┌────────┐│
││ Thread ││││ Thread ││││ Thread ││
│└────────┘││└────────┘││└────────┘│
└──────────┘└──────────┘└──────────┘

Process vs thread

Compared with multithreading, the disadvantages of multi-process are:

Creating a process is more expensive than creating a thread, especially on Windows;
Inter-process communication is slower than inter-thread communication, because threads communicate by reading and writing the same variables, which is very fast.

The advantages of multi-process are:

Multi-process is more stable than multithreading: the crash of one process does not affect other processes, whereas a crash in any thread can bring down the whole process.

Create a new thread

The Java language has built-in multithreading support. When a Java program starts, it actually starts a JVM process, and the JVM then starts the main thread to execute the main() method. Inside main(), we can start other threads.
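The startup sequence described above can be observed directly. The following is a minimal sketch, not part of the original notes: the class name JvmStartupDemo and its helper methods are made up for illustration, and ProcessHandle assumes Java 9 or later.

```java
// Hypothetical demo class; the name and helpers are illustrative only.
class JvmStartupDemo {
    // Name of the thread that calls this method.
    static String currentThreadName() {
        return Thread.currentThread().getName();
    }

    // Operating-system process id of the running JVM (assumes Java 9+).
    static long jvmPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        // One JVM process, and inside it the thread that executes main().
        System.out.println("JVM process id: " + jvmPid());
        System.out.println("main() runs on thread: " + currentThreadName());
    }
}
```

When launched with the java command, the thread name printed is normally main.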

Creating a new thread is easy: instantiate a Thread and then call its start() method.

public class Main {
    public static void main(String[] args) {
        Thread t = new Thread();
        t.start(); // Start a new thread
    }
}

However, this thread ends immediately after it starts, without doing anything. To make the new thread execute specific code, there are the following ways:

1. Derive a class from Thread

Derive a custom class from Thread and override the run() method:

public class Main {
    public static void main(String[] args) {
        Thread t = new MyThread();
        t.start(); // Start a new thread
    }
}

class MyThread extends Thread {
    @Override
    public void run() {
        System.out.println("start new thread!");
    }
}

Execute the above code and notice that the start() method will automatically call the run() method of the instance internally.

2. Pass a Runnable implementation to the Thread constructor

When creating a Thread instance, pass in a Runnable instance:

public class Main {
    public static void main(String[] args) {
        Thread t = new Thread(new MyRunnable());
        t.start(); // Start a new thread
    }
}

class MyRunnable implements Runnable {
    @Override
    public void run() {
        System.out.println("start new thread!");
    }
}

With the lambda syntax introduced in Java 8, this can be shortened further:

public class Main {
    public static void main(String[] args) {
        Thread t = new Thread(() -> {
            System.out.println("start new thread!");
        });
        t.start(); // Start a new thread
    }
}

Special note: calling the run() method of a Thread instance directly does not start a new thread

public class Main {
    public static void main(String[] args) {
        Thread t = new MyThread();
        t.run(); // Ordinary method call: runs on the main thread
    }
}

class MyThread extends Thread {
    public void run() {
        System.out.println("hello");
    }
}

Directly calling the run() method is equivalent to calling an ordinary Java method. The current thread has not changed, and no new thread will be started. The above code actually calls the run() method inside the main() method. The print hello statement is executed in the main thread, and no new thread is created.
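To make the difference visible, the thread that executes run() can record its own name. This is a small sketch under assumptions: the class RunVsStartDemo, the nested NameRecorder and the thread name "worker" are all invented for the example.

```java
// Hypothetical demo class; names are illustrative only.
class RunVsStartDemo {
    // A thread that remembers the name of the thread that executed run().
    static class NameRecorder extends Thread {
        volatile String executedOn;

        NameRecorder() {
            super("worker"); // name given to the new thread, if one is started
        }

        @Override
        public void run() {
            executedOn = Thread.currentThread().getName();
        }
    }

    // Calls run() directly: an ordinary method call on the caller's thread.
    static String viaRun() {
        NameRecorder t = new NameRecorder();
        t.run();
        return t.executedOn;
    }

    // Calls start(): run() is executed on the newly created "worker" thread.
    static String viaStart() throws InterruptedException {
        NameRecorder t = new NameRecorder();
        t.start();
        t.join(); // wait until the worker has finished
        return t.executedOn;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("run()   executed on: " + viaRun());
        System.out.println("start() executed on: " + viaStart());
    }
}
```

With run(), the recorded name is that of the calling thread; with start(), it is worker.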

The start() method of the Thread instance must be called to start a new thread. If we look at the source code of the Thread class, we will see that start() internally calls a private native void start0() method. The native modifier indicates that this method is implemented by native code inside the JVM, not by Java code.

What is the difference between using threads to execute print statements and directly executing them in the main method?

public class Main {
    public static void main(String[] args) {
        System.out.println("main start...");
        Thread t = new Thread() {
            public void run() {
                System.out.println("thread run...");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {}
                System.out.println("thread end.");
            }
        };
        t.start();
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {}
        System.out.println("main end...");
    }
}

The main thread must print main start first and then main end;
t thread must print thread run first and then thread end.

However, apart from the certainty that main start is printed first, it cannot be determined whether main end is printed before, between or after thread run and thread end. Once the t thread has started, the two threads run concurrently and are scheduled by the operating system. The program itself cannot determine the order in which threads are scheduled.

Thread state

In Java programs, a thread object can only call the start() method once to start a new thread and execute the run() method in the new thread. Once the run() method is executed, the thread ends. Therefore, the states of Java threads are as follows:

New: the thread has been created but not yet started;
Runnable: the thread is running, or ready to run, the Java code of the run() method;
Blocked: the thread is suspended because an operation is blocked, typically while waiting to acquire a lock;
Waiting: the thread is suspended while it waits for another thread, for example inside join();
Timed Waiting: the thread is suspended while it waits with a time limit, for example inside sleep();
Terminated: the thread has terminated because the run() method has finished executing.

It is represented by a state transition diagram as follows:

         ┌─────────────┐
         │     New     │
         └─────────────┘
                │
                ▼
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
 ┌─────────────┐ ┌─────────────┐
││  Runnable   │ │   Blocked   ││
 └─────────────┘ └─────────────┘
│┌─────────────┐ ┌─────────────┐│
 │   Waiting   │ │Timed Waiting│
│└─────────────┘ └─────────────┘│
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
                │
                ▼
         ┌─────────────┐
         │ Terminated  │
         └─────────────┘

When the thread starts, it can switch between Runnable, Blocked, Waiting and Timed Waiting until it finally becomes Terminated and the thread terminates.
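These states can be observed with Thread.getState(). The sketch below is illustrative only: the class ThreadStateDemo and its methods are made-up names, and the middle sample depends on timing, so it is normally TIMED_WAITING but is not strictly guaranteed. The second method checks the earlier statement that start() can only be called once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class ThreadStateDemo {
    // Samples a worker's state before start(), while it sleeps, and after it ends.
    static List<Thread.State> observeStates() throws InterruptedException {
        List<Thread.State> seen = new ArrayList<>();
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        seen.add(t.getState()); // NEW: created but not started
        t.start();
        // Poll until the worker has reached sleep() (or has already ended).
        while (t.getState() != Thread.State.TIMED_WAITING && t.isAlive()) {
            Thread.sleep(10);
        }
        seen.add(t.getState()); // normally TIMED_WAITING
        t.join();
        seen.add(t.getState()); // TERMINATED: run() has returned
        return seen;
    }

    // Returns true if a second start() on a finished thread is rejected.
    static boolean secondStartFails() throws InterruptedException {
        Thread t = new Thread(() -> { });
        t.start();
        t.join();
        try {
            t.start(); // a Thread object can only be started once
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeStates());
        System.out.println("second start() rejected: " + secondStartFails());
    }
}
```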

Reasons for thread termination:

Normal termination: the run() method runs to the end or reaches a return statement;
Unexpected termination: an uncaught exception in the run() method terminates the thread;
Forced termination: calling the stop() method on the Thread instance (deprecated and strongly discouraged).
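The second case can be sketched as follows. This is an illustration under assumptions, not code from the original notes: the class UncaughtExceptionDemo and the message "boom" are invented, and an uncaught-exception handler is installed only so that the example can report what happened instead of printing a stack trace.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; names are illustrative only.
class UncaughtExceptionDemo {
    // Starts a thread whose run() throws, and returns the message the handler saw.
    static String runFailingThread() throws InterruptedException {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            throw new IllegalStateException("boom"); // never caught inside run()
        });
        // Without a handler, the default behavior is to print the stack trace to System.err.
        t.setUncaughtExceptionHandler((thread, ex) -> seen.set(ex.getMessage()));
        t.start();
        t.join(); // returns once t has terminated
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("handler saw: " + runFailingThread());
        System.out.println("main thread is still running");
    }
}
```

In this sketch only the failing thread ends; the thread that called join() carries on afterwards.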

A thread can also wait for another thread to finish. For example, after the main thread starts the t thread, it can wait for t to end by calling t.join():

public class Main {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            System.out.println("hello");
        });
        System.out.println("start");
        t.start();
        t.join();
        System.out.println("end");
    }
}

When the main thread calls join() on the thread object t, the main thread waits for the thread represented by the variable t to end. In other words, join means waiting for that thread to end before the calling thread continues. Therefore, the output order of the above code is: the main thread prints start first, the t thread then prints hello, and the main thread finally prints end.

If the t thread has already ended, calling join() on t returns immediately. In addition, the overloaded method join(long) takes a maximum waiting time in milliseconds. Once that time has passed, the caller stops waiting.
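A timed join can be sketched like this. The class name JoinTimeoutDemo and the durations are assumptions chosen for illustration; the worker sleeps much longer than the caller is willing to wait.

```java
// Hypothetical demo class; names and durations are illustrative only.
class JoinTimeoutDemo {
    // Returns true if the worker is still alive after a 100 ms timed join.
    static boolean stillAliveAfterTimedJoin() throws InterruptedException {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(2000); // simulate long-running work
            } catch (InterruptedException e) {
                // interrupted: fall through and let run() return
            }
        });
        t.start();
        t.join(100);               // wait at most 100 ms, then give up waiting
        boolean alive = t.isAlive();
        t.interrupt();             // clean up: ask the worker to stop
        t.join();                  // now wait until it has really ended
        return alive;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("still alive after join(100): " + stillAliveAfterTimedJoin());
    }
}
```

Note that join(long) returning does not tell the caller whether the thread ended or the timeout expired; checking isAlive() afterwards is one way to find out.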

Interrupt thread

Interrupting a thread means that another thread sends it a signal. On receiving the signal, the target thread finishes executing its run() method so that it can end promptly.

Interrupt the thread through the interrupt() method

To interrupt a thread, you only need to call the interrupt() method on the target thread in other threads. The target thread needs to repeatedly check whether its state is interrupted. If so, it will immediately end the operation.

public class Main {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new MyThread();
        t.start();
        Thread.sleep(5); // Pause for 5 ms
        t.interrupt(); // Interrupt t thread
        t.join(); // Wait for the t thread to end
        System.out.println("end");
    }
}

class MyThread extends Thread {
    public void run() {
        int n = 0;
        while (! isInterrupted()) {
            n ++;
            System.out.println(n + " hello!");
        }
    }
}

Look carefully at the above code. The main thread interrupts the t thread by calling t.interrupt(), but note that interrupt() only sends an "interrupt request" to the t thread. Whether the t thread responds immediately depends on its code. Here the while loop of the t thread checks isInterrupted(), so the code responds correctly to the interrupt() request and ends the run() method promptly.

If the thread is in a waiting state, the behavior is different. For example, t.join() puts the main thread into the waiting state. If another thread calls interrupt() on the main thread at that moment, join() immediately throws an InterruptedException. So when a thread catches an InterruptedException thrown by join(), it knows that another thread has called interrupt() on it, and it should normally end promptly.
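The waiting case can be sketched as follows. This is a minimal illustration with made-up names (InterruptWaitingDemo, sleeper, waiter): the waiter thread blocks in join() on a long-running sleeper thread, and the caller interrupts the waiter.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; names are illustrative only.
class InterruptWaitingDemo {
    // Returns true if join() in the waiter thread threw InterruptedException.
    static boolean joinWasInterrupted() throws InterruptedException {
        AtomicBoolean interrupted = new AtomicBoolean(false);
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000); // would run for a long time if left alone
            } catch (InterruptedException e) {
                // interrupted: let run() return
            }
        });
        Thread waiter = new Thread(() -> {
            sleeper.start();
            try {
                sleeper.join(); // waiter is now in the waiting state
            } catch (InterruptedException e) {
                interrupted.set(true); // someone called waiter.interrupt()
            }
            sleeper.interrupt(); // pass the request on so the sleeper also ends
        });
        waiter.start();
        Thread.sleep(100);  // give the waiter time to reach join()
        waiter.interrupt(); // wakes the waiter out of join()
        waiter.join();
        sleeper.join();
        return interrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("join() was interrupted: " + joinWasInterrupted());
    }
}
```

The waiter passes the interrupt on to the sleeper before it exits; otherwise the sleeper would keep the JVM alive until its sleep finished.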

Set flag bit

Another common approach is to use a running flag that indicates whether the thread should keep running. An external thread ends the thread by setting HelloThread.running to false:

public class Main {
    public static void main(String[] args)  throws InterruptedException {
        HelloThread t = new HelloThread();
        t.start();
        Thread.sleep(1);
        t.running = false; // Set the flag to false
    }
}

class HelloThread extends Thread {
    public volatile boolean running = true;
    public void run() {
        int n = 0;
        while (running) {
            n ++;
            System.out.println(n + " hello!");
        }
        System.out.println("end!");
    }
}

Note that the running flag of HelloThread is a variable shared between threads. Variables shared between threads should be marked with the volatile keyword so that every thread reads the updated value.

Why declare variables shared between threads with the volatile keyword? This involves the Java memory model. In the Java virtual machine, the value of a variable is kept in main memory, but when a thread accesses the variable, it first obtains a copy and keeps it in its own working memory. If the thread modifies the value, the virtual machine writes the modified value back to main memory at some point, but the time is not determined!

┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
           Main Memory
│                               │
   ┌───────┐┌───────┐┌───────┐
│  │ var A ││ var B ││ var C │  │
   └───────┘└───────┘└───────┘
│     │ ▲               │ ▲     │
 ─ ─ ─│─│─ ─ ─ ─ ─ ─ ─ ─│─│─ ─ ─
      │ │               │ │
┌ ─ ─ ┼ ┼ ─ ─ ┐   ┌ ─ ─ ┼ ┼ ─ ─ ┐
      ▼ │               ▼ │
│  ┌───────┐  │   │  ┌───────┐  │
   │ var A │         │ var C │
│  └───────┘  │   │  └───────┘  │
   Thread 1          Thread 2
└ ─ ─ ─ ─ ─ ─ ┘   └ ─ ─ ─ ─ ─ ─ ┘

As a result, after one thread updates a variable, another thread may still read the value from before the update. For example, suppose the main memory variable a = true and thread 1 executes a = false. At that moment only thread 1's copy of a becomes false, while a in main memory is still true. Until the JVM writes the modified a back to main memory, other threads still read a as true, so the variable is inconsistent across threads.

Therefore, the purpose of volatile keyword is to tell the virtual machine:

Each time a variable is accessed, the latest value of the main memory is always obtained;
Write back to main memory immediately after modifying variables.

The volatile keyword solves the visibility problem: when a thread modifies the value of a shared variable, other threads can immediately see the modified value.

If we remove the volatile keyword and run the above program, the effect usually looks the same as with volatile. This is because on the x86 architecture the JVM writes back to main memory very quickly. On the ARM architecture, however, there may be a noticeable delay. Without volatile, the program gives no guarantee about when the update becomes visible.

Daemon thread

A Java program begins when the JVM starts the main thread, and the main thread can start other threads. When all threads have finished running, the JVM exits and the process ends.

If a thread does not exit, the JVM process will not exit. Therefore, we must ensure that all threads can end in time.

However, some threads are meant to loop indefinitely, for example a thread that triggers a task at regular intervals:

import java.time.LocalTime;

class TimerThread extends Thread {
    @Override
    public void run() {
        while (true) {
            System.out.println(LocalTime.now());
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                break;
            }
        }
    }
}

If this thread does not end, the JVM process cannot end. The question is, who is responsible for ending this thread?

Such threads often have no owner responsible for ending them. Yet when all the other threads have ended, the JVM process must end too. What should we do?

The answer is to use daemon threads.

A daemon thread is a thread that serves other threads. Once all non-daemon threads in the JVM have finished, the virtual machine exits automatically, whether or not daemon threads are still running.

Therefore, when the JVM exits, it does not have to care whether the daemon thread has ended.

How do we create a daemon thread? It is created like an ordinary thread, except that setDaemon(true) is called before start() to mark it as a daemon thread:

Thread t = new MyThread();
t.setDaemon(true);
t.start();

When writing code for a daemon thread, note that it should not hold any resource that needs to be closed, such as an open file: when the virtual machine exits, the daemon thread gets no chance to close the file, which can lead to data loss.
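Putting the pieces together, a periodic timer can run as a daemon thread so that it does not keep the JVM alive. This is a minimal sketch: the class DaemonDemo, the method startTimer() and the chosen delays are assumptions for illustration.

```java
import java.time.LocalTime;

// Hypothetical demo class; names and delays are illustrative only.
class DaemonDemo {
    // Starts a daemon thread that prints the current time once per second.
    static Thread startTimer() {
        Thread t = new Thread(() -> {
            while (true) {
                System.out.println(LocalTime.now());
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    break; // interrupted: leave the loop and end the thread
                }
            }
        });
        t.setDaemon(true); // must be called before start()
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        startTimer();
        Thread.sleep(2500); // the main thread does its own work for a while
        System.out.println("main end");
        // main() returns here; no non-daemon thread remains, so the JVM exits
        // even though the timer loop never finishes on its own.
    }
}
```

If the setDaemon(true) call is removed, the same program keeps printing the time and never exits by itself.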

Keywords: Java Back-end Multithreading

Added by Shovinus on Mon, 07 Feb 2022 20:47:57 +0200
