Java multithreading: detailed explanation of ThreadLocal

Scenario: saving and retrieving the logged-in User's information. In a conventional system design, the back end usually has a long call chain (Controller -> Service -> Dao). After login, the User information is typically saved in a session or token. If several methods across the controller, service and DAO layers need the User information, we can pass the User object as a parameter, that is, use the User as the context. However, this is cumbersome and inelegant for a long call chain. Moreover, if the call chain passes through a third-party library, or through an overridden method whose parameters cannot be changed, the object cannot be passed in at all. We also cannot simply store the User object in a static field, because concurrent requests from multiple users would overwrite each other. In this situation, we can use a ThreadLocal object to store the User information globally.

Questions:

  • What is ThreadLocal, and what problem does it solve?
  • Use of ThreadLocal
  • The underlying implementation of ThreadLocal
  • ThreadLocal memory leak

What is ThreadLocal

ThreadLocal is a tool for saving thread local variables. Each thread can independently save and obtain its own variables through ThreadLocal, and the variables will not be affected by other threads.

Use of ThreadLocal

ThreadLocal mainly provides three public methods: get(), set(T) and remove().
Usually, we declare the ThreadLocal object as static so that it can be accessed globally.
set(T): a thread stores a value that belongs only to itself and cannot be read by other threads.
get(): the thread gets the value it set itself.
remove(): the thread removes the value it set itself.

public class Test {

    // Minimal User type so that the example compiles on its own
    static class User {
        private final int age;
        private final String name;

        User(int age, String name) {
            this.age = age;
            this.name = name;
        }

        int getAge() { return age; }
        String getName() { return name; }
    }

    public static ThreadLocal<User> threadLocal = new ThreadLocal<>();

    public static void main(String[] args){
        Thread thread1 = new Thread(()->{
            User user = new User(10, "jun");   // Simulate user 1
            threadLocal.set(user);

            playGame();                        // User 1 plays games
        }, "Thread 1");
        Thread thread2 = new Thread(()->{
            User user = new User(20, "ge");    // Simulate user 2
            threadLocal.set(user);

            playGame();                        // User 2 plays games
        }, "Thread 2");

        thread1.start();
        thread2.start();
    }

    public static void playGame(){
        // Simulate the business logic: check the age of the logged-in user
        int age = threadLocal.get().getAge();
        if(age < 18){
            System.out.println("Sorry, you are under 18 (currently " + age + " years old) and cannot participate in the current game! The current thread is: " + Thread.currentThread().getName());
            return;
        }
        System.out.println("You are over 18 (currently " + age + " years old), have a good time! The current thread is: " + Thread.currentThread().getName());
    }
}

We simulate two users logging in, save each of them into threadLocal, and then execute the playGame method on each thread. Inside playGame, each thread obtains its own user object directly from threadLocal, without any method parameter.
This removes the cumbersome parameter passing along a long call chain. Because each thread's call to threadLocal.get() returns the value that thread set itself, the threads do not interfere with each other.
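
One caveat of the example above: get() returns null on a thread that never called set(), so a call such as threadLocal.get().getAge() would throw a NullPointerException there. The following is a minimal sketch of two defensive options, not part of the original example. The class name CurrentUserDemo, the field names and the "guest" default are hypothetical; ThreadLocal.withInitial is available from JDK 8.

```java
import java.util.Optional;

public class CurrentUserDemo {
    // Plain ThreadLocal: get() returns null on a thread that never called set().
    static final ThreadLocal<String> USER_NAME = new ThreadLocal<>();

    // withInitial (JDK 8+) supplies a per-thread default on the first get().
    static final ThreadLocal<String> USER_NAME_OR_GUEST =
            ThreadLocal.withInitial(() -> "guest");

    // Wrap the possibly-null value so callers must handle the "not logged in" case.
    static Optional<String> currentUser() {
        return Optional.ofNullable(USER_NAME.get());
    }

    // Runs on a fresh thread that never called set() and reports what it sees.
    static String[] readOnFreshThread() {
        String[] seen = new String[2];
        Thread other = new Thread(() -> {
            seen[0] = currentUser().orElse("<none>");
            seen[1] = USER_NAME_OR_GUEST.get();
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        USER_NAME.set("jun");
        USER_NAME_OR_GUEST.set("jun");

        String[] seen = readOnFreshThread();
        System.out.println("main thread:  " + currentUser().orElse("<none>"));
        System.out.println("fresh thread: " + seen[0] + ", default: " + seen[1]);
    }
}
```

The main thread sees "jun", while the fresh thread sees no user from the plain ThreadLocal and the "guest" default from the one created with withInitial.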

ThreadLocal principle

So, how does ThreadLocal implement thread local variables?
First, let's look at the set method:

    public void set(T value) {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
    }

    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }

Here is the answer: when a value is set through threadLocal.set(T), the method first obtains the ThreadLocalMap of the current thread. Each thread holds its own ThreadLocalMap object, which createMap creates on first use. The map stores the value with the threadLocal itself as the key.
get():

    public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }

As shown above, the get method also obtains the ThreadLocalMap held by the current thread and looks up the value in that map with the current threadLocal as the key. If no entry exists, setInitialValue() stores and returns the initial value, which is null by default.
Summary: ThreadLocal implements thread local variables through a ThreadLocalMap held by each thread. The get and set methods of a ThreadLocal object obtain the map of the current thread and then operate on the value with the current ThreadLocal as the key, which isolates the variables of different threads.
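
To make the summary concrete, here is a toy model of the idea, written only for illustration. It is not the JDK implementation: the real Thread class keeps the map in its own threadLocals field and uses ThreadLocalMap, while this sketch simulates that field with a static map keyed by Thread. The class name ToyThreadLocal and its helper methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: the real Thread class stores the map in its own threadLocals field.
public class ToyThreadLocal<T> {
    // Stand-in for the Thread.threadLocals field: one private map per thread.
    // (A real implementation must not keep threads alive like this map does.)
    private static final Map<Thread, Map<ToyThreadLocal<?>, Object>> MAPS =
            new ConcurrentHashMap<>();

    private static Map<ToyThreadLocal<?>, Object> mapOfCurrentThread() {
        return MAPS.computeIfAbsent(Thread.currentThread(), t -> new HashMap<>());
    }

    public void set(T value) {
        mapOfCurrentThread().put(this, value);   // key = this ToyThreadLocal
    }

    @SuppressWarnings("unchecked")
    public T get() {
        return (T) mapOfCurrentThread().get(this);
    }

    public void remove() {
        mapOfCurrentThread().remove(this);
    }

    // Sets a value on a second thread and returns what that thread read back.
    static String setAndGetOnOtherThread(ToyThreadLocal<String> local, String value) {
        String[] seen = new String[1];
        Thread other = new Thread(() -> {
            local.set(value);
            seen[0] = local.get();
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        ToyThreadLocal<String> name = new ToyThreadLocal<>();
        name.set("jun");
        System.out.println(setAndGetOnOtherThread(name, "ge")); // ge
        System.out.println(name.get());                         // jun
    }
}
```

The same ToyThreadLocal object is used as the key on both threads, yet each thread reads only the value it stored in its own map.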

ThreadLocalMap underlying structure:

Each thread holds a ThreadLocalMap to store its thread local variables. ThreadLocalMap is a map class written specifically for ThreadLocal. Why not use the existing HashMap?
Reading the source code of ThreadLocalMap, we can find several differences:
1. The key of an Entry in ThreadLocalMap is a weak reference.
This prevents the key from leaking memory. The ThreadLocal memory leak is discussed in detail in a later section.

    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

2. ThreadLocalMap resolves hash conflicts by linear probing (open addressing) instead of chaining.
The hash algorithm of ThreadLocalMap is threadLocalHashCode & (table.length - 1). Because table.length is always a power of 2, this is equivalent to taking threadLocalHashCode modulo table.length (with a non-negative result).
The set method locates the array index through the hash algorithm and then checks the entry there: if the k of the entry is the given key, the value is updated directly; if k is null, the key has been garbage collected, so the stale entry is replaced by calling replaceStaleEntry; otherwise, the loop moves on to the next index until it reaches an empty slot, exits, and a new entry is stored there. In other words, ThreadLocalMap resolves a hash conflict by moving the located index forward to the next free slot, wrapping around at the end of the array.

    private void set(ThreadLocal<?> key, Object value) {

        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);

        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();

            if (k == key) {     // If k is the given key, update value directly
                e.value = value;
                return;
            }

            if (k == null) {    // If k is null, the key has been garbage collected, so the stale entry is replaced via replaceStaleEntry
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        tab[i] = new Entry(key, value);
        int sz = ++size;
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }
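
The probing rule can be tried in isolation. The sketch below is a simplified illustration, not JDK code: it keeps only the masking step and the "next index" rule from the set method and drops keys, values, stale entries and resizing. The class name LinearProbingDemo and the method place are hypothetical, and the caller is assumed to insert fewer hashes than the table length.

```java
import java.util.Arrays;

public class LinearProbingDemo {
    // Returns the slot chosen for each hash, using the masking and next-index rule.
    // Assumes hashes.length < length, otherwise the probe loop would never end.
    static int[] place(int length, int... hashes) {
        boolean[] used = new boolean[length];
        int[] slots = new int[hashes.length];
        for (int n = 0; n < hashes.length; n++) {
            int i = hashes[n] & (length - 1);          // home slot
            while (used[i]) {
                i = (i + 1 < length) ? i + 1 : 0;      // move on, wrapping at the end
            }
            used[i] = true;
            slots[n] = i;
        }
        return slots;
    }

    public static void main(String[] args) {
        // 5, 21 and 37 all map to home slot 5 in a table of length 16.
        System.out.println(Arrays.toString(place(16, 5, 21, 37)));  // [5, 6, 7]
        // 15 and 31 both map to slot 15; the probe wraps around to slot 0.
        System.out.println(Arrays.toString(place(16, 15, 31)));     // [15, 0]
    }
}
```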

3. threadLocalHashCode is an integer multiple of 0x61c88647 (with int overflow wrapping around). Why this magic value?

    private final int threadLocalHashCode = nextHashCode();

    private static AtomicInteger nextHashCode =
        new AtomicInteger();

    private static final int HASH_INCREMENT = 0x61c88647;

    private static int nextHashCode() {
        return nextHashCode.getAndAdd(HASH_INCREMENT);
    }

We know that table.length is a power of 2. Next, take the array lengths 16, 32 and 64 as examples to see whether ThreadLocal hash values that are integer multiples of the magic value produce hash conflicts:

import java.util.Arrays;

public class HashTest {

    public static void main(String[] args){
        hash(16);
        hash(32);
        hash(64);
    }

    public static void hash(int length){
        final int HASH_INCREMENT = 0x61c88647;
        int[] table = new int[length];
        int hash = 0;

        // Compute the slot of `length` consecutive hash codes
        for (int i = 0; i < length; i++) {
            hash += HASH_INCREMENT;
            table[i] = hash & (length-1);
        }

        // Sort so that repeated slots would be easy to spot
        Arrays.sort(table);
        System.out.println(Arrays.toString(table));
    }
}

The results are as follows:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
The same holds when the length is extended to 128, 256, ...: no slot is repeated.
Therefore, we can conclude that making threadLocalHashCode an integer multiple of the magic value greatly reduces the probability of hash conflicts in ThreadLocalMap. One has to admire the author's deep mathematical foundation!
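
Where might the magic value come from? Arithmetically, 0x61c88647 equals 2^32 minus 0x9e3779b9, and 0x9e3779b9 is 2^32 divided by the golden ratio, truncated to an integer. This matches the well-known multiplicative (Fibonacci) hashing technique, although the text above does not state the JDK authors' reasoning, so treat that link as an assumption. The sketch below checks the arithmetic and also tests larger table lengths; the class name MagicNumberDemo and its methods are hypothetical.

```java
public class MagicNumberDemo {
    // 2^32 divided by the golden ratio, truncated to an integer.
    static long goldenRatioConstant() {
        double phi = (1 + Math.sqrt(5)) / 2;
        return (long) ((1L << 32) / phi);
    }

    // True if `length` consecutive hash codes all land in different slots.
    static boolean allSlotsDistinct(int length) {
        boolean[] seen = new boolean[length];
        int hash = 0;
        for (int i = 0; i < length; i++) {
            int slot = hash & (length - 1);
            if (seen[slot]) {
                return false;
            }
            seen[slot] = true;
            hash += 0x61c88647;   // int overflow wraps around, as in ThreadLocal
        }
        return true;
    }

    public static void main(String[] args) {
        long golden = goldenRatioConstant();
        System.out.println(Long.toHexString(golden));                 // 9e3779b9
        System.out.println(Long.toHexString((1L << 32) - golden));    // 61c88647
        for (int length = 16; length <= 1 << 20; length <<= 1) {
            System.out.println(length + " -> " + allSlotsDistinct(length));
        }
    }
}
```

Every length reports true. This is expected for any odd increment: multiplying by an odd number is a one-to-one mapping modulo a power of 2, so a full round of consecutive hash codes never repeats a slot.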

ThreadLocal memory leak

We just mentioned that the custom Entry of ThreadLocalMap extends WeakReference, which means the key in the map is only a weak reference to the threadLocal.

Introduction to weak references: weak references are often used to avoid memory leaks, because a weak reference does not keep its referent alive. If an object is reachable only through weak references, the JVM reclaims it when garbage collection occurs.

Scenario:

    A a = new A();
    B b = new B();
    b.a = a;
    a = null;    // Only the local reference is cleared. b.a still holds a strong reference, so the A object stays in memory and is not garbage collected.

Solution:

    WeakReference<A> wr = new WeakReference<>(a);
    b.wr = wr;   // Change b's reference to a into a weak reference

The Entry class of ThreadLocalMap applies the same idea to its key:

    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

The reference relationship is as shown in the figure: because the key is a weak reference, once the threadLocal reference is set to null, the only remaining references from map keys to the threadLocal object are weak, so the JVM can garbage collect the threadLocal object. This prevents the threadLocal object itself from leaking.
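
The behaviour of a weak reference can be observed with a small experiment. This is a sketch with hypothetical names (WeakReferenceDemo, reachableWhileStronglyHeld). Note that System.gc() is only a request to the JVM, so the second result is typical but not guaranteed, and no fixed output is claimed for it.

```java
import java.lang.ref.WeakReference;

public class WeakReferenceDemo {
    // While a strong reference exists, the weak reference still returns the object.
    static boolean reachableWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();
        // `strong` is still used here, so the object cannot have been collected.
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(reachableWhileStronglyHeld());  // true

        Object key = new Object();
        WeakReference<Object> weak = new WeakReference<>(key);
        key = null;      // drop the only strong reference
        System.gc();     // only a request; collection is not guaranteed
        System.out.println(weak.get() == null ? "collected" : "still reachable");
    }
}
```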

But! The value is held by a strong reference. If the thread ends normally, this is not a problem: when the thread ends, the strong references to the map, entry and value are released and everything can be reclaimed by gc. In practice, however, creating and destroying threads is expensive, so we usually reuse threads through a thread pool. Since pooled threads are not destroyed, the value of a stale entry can stay reachable, and a memory leak is likely.

Solution

The developers of ThreadLocal also noticed the leak of value. Therefore, when the get and set methods of ThreadLocal encounter an entry whose key is null, they clean up the stale entry (for example, set calls the replaceStaleEntry() method). For memory leaks caused by thread reuse, call threadLocal.remove() after the work is finished to clean up manually.
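
The effect of thread reuse, and of the manual cleanup, can be shown with a single-thread pool. The sketch below is an illustration under assumptions: the class name PooledThreadLocalDemo, the field CURRENT_USER and the method valueSeenBySecondTask are hypothetical. Two tasks run one after the other on the same pooled thread, and the second task reports what it finds in the ThreadLocal.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledThreadLocalDemo {
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two tasks on one pooled thread and returns what the second task sees.
    static String valueSeenBySecondTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable firstTask = () -> {
            CURRENT_USER.set("jun");
            try {
                CURRENT_USER.get();          // stand-in for the real business logic
            } finally {
                if (cleanUp) {
                    CURRENT_USER.remove();   // clean up before the thread is reused
                }
            }
        };
        Callable<String> secondTask = CURRENT_USER::get;
        try {
            pool.submit(firstTask).get();    // wait until the first task has finished
            return pool.submit(secondTask).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenBySecondTask(false));  // jun  (left over from the first task)
        System.out.println(valueSeenBySecondTask(true));   // null (removed in finally)
    }
}
```

Without remove(), the value set by the first task is still attached to the pooled thread when the second task runs, which is both a leak and a source of wrong data. Calling remove() in a finally block ensures the cleanup happens even if the business logic throws.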

Keywords: Java Back-end Multithreading

Added by ttroutmpr on Wed, 09 Feb 2022 04:08:27 +0200