A ThreadLocal fights the interviewer for 30 rounds

start

In an office building in Hangzhou, a battle between a job seeker and an interviewer is under way.

Interviewer: introduce yourself first.

Angela: Hello, interviewer. I'm Angela: one of the three ambushers in the grass, the strongest solo laner (Daji refuses to accept that), the grass motorcycle driver, promoter of the 21st set of radio gymnastics, and successor of fire. This is my resume. Please have a look.

Interviewer: according to your resume, how familiar are you with multithreaded programming?

Angela: proficient.

Yes, you read that right. When asked, the answer is "proficient". Type 666 in the comment area.

Interviewer: [thinking] Who on earth calls himself proficient? Nobody sensible claims that. Is this person a fool?

Interviewer: let's start. Have you used ThreadLocal?

Angela: Yes.

Interviewer: then tell me about the use of ThreadLocal in your project.

Angela: our project is confidential. No comment. You'd better change your question!

Interviewer: then talk about a non-confidential project, or just tell me how ThreadLocal is implemented.

Topic

Angela: show time...

Angela: for example, at Alipay we receive a huge number of user requests every second, and each request carries user information. A thread usually handles one user request at a time, so we can put the user information into a ThreadLocal. Each thread then works with its own user information, and the threads do not interfere with each other.

Interviewer: wait a minute, a personal question first. Why are you leaving Alipay to interview elsewhere? Couldn't stand the PUA?

Angela: PUA doesn't work on me. The person who can PUA me hasn't been born yet! I'm just tired of the company canteen and want a change of taste.

Interviewer: can you tell me what ThreadLocal does?

Angela: ThreadLocal is mainly used to isolate variables per thread. Put that way, it may not sound very intuitive.

Take the example above again. When our program handles user requests, the back-end server usually has a thread pool, and each request is handed to one thread. We need to prevent data from getting mixed up when several threads handle requests concurrently. Say threads A and B handle Angela's and Daji's requests respectively: if thread A, while handling Angela's request, ended up reading Daji's data, it could transfer money out of Daji's Alipay account.

Therefore, Angela's data can be bound to thread A and unbound once the thread has finished processing the request.

Interviewer: then sketch the scenario you just described in pseudocode. Here you go!

Angela: ok

//ThreadLocal for storing user information
private static final ThreadLocal<UserInfo> userInfoThreadLocal = new ThreadLocal<>();

public Response handleRequest(UserInfo userInfo) {
  Response response = new Response();
  try {
    // 1. set the user information into the thread local variable
    userInfoThreadLocal.set(userInfo);
    doHandle();
  } finally {
    // 3. Remove after use
    userInfoThreadLocal.remove();
  }

  return response;
}
    
//Business logic processing
private void doHandle () {
  // 2. Take it out when you actually use it
  UserInfo userInfo = userInfoThreadLocal.get();
  //Query user assets
  queryUserAsset(userInfo);
}

Steps 1, 2 and 3 are marked in the comments.
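To see the isolation for yourself, here is a minimal runnable sketch of the same set / get / remove pattern. The class name IsolationDemo, the helper handleAll and the "worker-" thread names are made up for this illustration; a plain String stands in for UserInfo.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class IsolationDemo {
    // One ThreadLocal shared by all threads; each thread only ever sees its own value.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Handles one "request" per user on its own thread and records what each thread read back.
    static Map<String, String> handleAll(String... users) {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[users.length];
        for (int i = 0; i < users.length; i++) {
            String user = users[i];
            workers[i] = new Thread(() -> {
                try {
                    CURRENT_USER.set(user);                                         // 1. bind to this thread
                    seen.put(Thread.currentThread().getName(), CURRENT_USER.get()); // 2. read it back
                } finally {
                    CURRENT_USER.remove();                                          // 3. unbind
                }
            }, "worker-" + user);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Each worker reads back exactly the user it stored, never the other thread's.
        System.out.println(handleAll("Angela", "Daji"));
        // The main thread never called set(), so it sees the default value.
        System.out.println(CURRENT_USER.get()); // null
    }
}
```

Both workers use the same static ThreadLocal object, yet neither can observe the other's value.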

Interviewer: then tell me how ThreadLocal implements the isolation of thread variables.

Angela: Oh, getting to the point so quickly. Let me draw you a picture first, as follows.

Interviewer: I've seen the diagram. Now walk me through the flow in the diagram, using the code you wrote earlier.

Angela: no problem

First, we initialize a ThreadLocal object with ThreadLocal<UserInfo> userInfoThreadLocal = new ThreadLocal<>(). This is the ThreadLocal reference in the figure above, which points to the ThreadLocal object on the heap.

Then we call userInfoThreadLocal.set(userInfo). What happens here?

Let's take out the source code and look at it closely.

We know that the Thread class has a ThreadLocalMap member variable. The map's key is the ThreadLocal object, and the value is the thread-local variable you want to store.

ThreadLocal class

public void set(T value) {
  // Get the current thread (the Thread reference in the figure above)
  Thread t = Thread.currentThread();
  // The Thread class has a ThreadLocalMap member variable; get that map
  ThreadLocalMap map = getMap(t);
  if (map != null)
    // this refers to the ThreadLocal object
    map.set(this, value);
  else
    createMap(t, value);
}

ThreadLocalMap getMap(Thread t) {
  //Get ThreadLocalMap of thread
  return t.threadLocals;
}

void createMap(Thread t, T firstValue) {
  //initialization
  t.threadLocals = new ThreadLocalMap(this, firstValue);
}

Thread class

public class Thread implements Runnable {
  // Each thread has its own ThreadLocalMap member variable
  ThreadLocal.ThreadLocalMap threadLocals = null;
}

What happens here is that an element (an Entry) is put into the current thread's ThreadLocalMap: the key is the ThreadLocal object and the value is userInfo.

Two things need to be understood here (keeping in mind that the ThreadLocalMap class is defined inside ThreadLocal):

First, a Thread object is the carrier of a running thread in Java. Every thread has a corresponding Thread object that stores thread-related information.
Second, the Thread class has a member variable of type ThreadLocalMap. You can treat it as an ordinary Map: the key is the ThreadLocal object, and the value is what you want to bind to the thread (the thread-isolated variable), for example the user information object (UserInfo) here.

Interviewer: you just said that the Thread class has a member variable of type ThreadLocalMap, but ThreadLocalMap is defined inside ThreadLocal. Why?

Angela: let's take a look at the Javadoc of ThreadLocalMap:

class ThreadLocalMap
* ThreadLocalMap is a customized hash map suitable only for
* maintaining thread local values. No operations are exported
* outside of the ThreadLocal class. The class is package private to
* allow declaration of fields in class Thread.  To help deal with
* very large and long-lived usages, the hash table entries use
* WeakReferences for keys. However, since reference queues are not
* used, stale entries are guaranteed to be removed only when
* the table starts running out of space.

ThreadLocalMap is designed to maintain thread-local variables, and that is all it does.

This is also why ThreadLocalMap is held as a member variable of Thread but declared as a nested class of ThreadLocal (non-public, package-private; Thread and ThreadLocal both live in the java.lang package): it tells users that ThreadLocalMap exists only to hold thread-local variables.

Interviewer: since these are thread-local variables, why not use the thread (the Thread object) as the key? Wouldn't it be clearer to look up the thread's variable directly by thread?

Angela: that design has a problem. Say I have stored the user information as a thread variable, and now I need to add another one, such as the user's geographic location. If ThreadLocalMap used the thread as its key, storing the location would use the same thread as the key, so it would simply overwrite the user information. Everyone knows how Map.put(key, value) behaves. So the articles online claiming that ThreadLocalMap uses the thread as its key are talking nonsense.

Interviewer: then what should I do to add the geographic information?

Angela: just create another ThreadLocal object, because the key of ThreadLocalMap is the ThreadLocal object. To add a geographic location, I declare ThreadLocal<Geo> geo = new ThreadLocal<>() and store the location in it. The thread's ThreadLocalMap then holds two elements: one for the user information and one for the geographic location.
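The difference between the two designs can be sketched in a few lines. Everything here (the class TwoLocalsDemo, the string values, the Map<Thread, Object> standing in for the "thread as key" idea) is made up for illustration and is not how the JDK stores anything.

```java
import java.util.HashMap;
import java.util.Map;

final class TwoLocalsDemo {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();
    private static final ThreadLocal<String> GEO = new ThreadLocal<>();

    // Hypothetical "thread as key" design: the second put replaces the first value.
    static Object threadAsKey() {
        Map<Thread, Object> byThread = new HashMap<>();
        byThread.put(Thread.currentThread(), "user:Angela");
        byThread.put(Thread.currentThread(), "geo:Hangzhou"); // same key, so this overwrites
        return byThread.get(Thread.currentThread());
    }

    // Actual design: each ThreadLocal object is its own key in the current thread's map.
    static String threadLocalAsKey() {
        try {
            USER.set("user:Angela");
            GEO.set("geo:Hangzhou");
            return USER.get() + " | " + GEO.get();
        } finally {
            USER.remove();
            GEO.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(threadAsKey());      // geo:Hangzhou
        System.out.println(threadLocalAsKey()); // user:Angela | geo:Hangzhou
    }
}
```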

Interviewer: what data structure is ThreadLocalMap built on?

Angela: like HashMap, it is backed by an array.

The code is as follows:

class ThreadLocalMap {
 //Initial capacity
 private static final int INITIAL_CAPACITY = 16;
 //An array of elements
 private Entry[] table;
 //Number of elements
 private int size = 0;
}

table is the array that stores the thread-local variables. Each array element is an Entry, which consists of a key and a value: the key is the ThreadLocal object and the value is the corresponding thread variable.

For the example we gave earlier, the array storage structure is shown in the figure below:

Interviewer: what happens on a hash collision in ThreadLocalMap? How does that differ from HashMap?

Angela: [thinking] this is the first time anyone has asked me about hash collisions in ThreadLocalMap. This interview is getting more and more interesting.

Angela: there is a difference. To resolve hash collisions, HashMap uses linked lists plus red-black trees. As shown in the figure below, when a bucket's linked list grows too long (more than 8 nodes), it is converted to a red-black tree:

For HashMap details, see "A HashMap talked with the interviewer for half an hour" (official account: Angela's blog).

ThreadLocalMap has neither linked lists nor red-black trees. It uses open addressing with linear probing: on a collision, ThreadLocalMap looks at the next adjacent slot. If that slot is empty, the element is stored there; if not, it keeps moving on until it finds an empty slot. When the number of elements reaches a threshold derived from the array length, the table is expanded.

As shown in the figure below, the ThreadLocalMap array has length 4. Storing the geographic location causes a hash collision (slot 1 already holds data). Probing onward finds slot 2 empty, so the entry is stored in slot 2.

Source code (if it is hard to follow, come back to it after reading the rest):
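The length-4 example can be reproduced with a toy open-addressing table. This is only a sketch of linear probing, not the JDK's ThreadLocalMap: the class LinearProbingDemo is invented here, and the hash is passed in explicitly so that a collision can be forced.

```java
final class LinearProbingDemo {
    private final Object[] keys;
    private final Object[] values;

    // Capacity is assumed to be a power of two, as in ThreadLocalMap.
    LinearProbingDemo(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    // Stores the pair and returns the slot that was actually used.
    int put(Object key, int hash, Object value) {
        int len = keys.length;
        int i = hash & (len - 1);
        for (int probes = 0; probes < len; probes++) {
            if (keys[i] == null || keys[i].equals(key)) {
                keys[i] = key;
                values[i] = value;
                return i;
            }
            i = (i + 1 < len) ? i + 1 : 0; // step to the next slot, wrapping to 0 at the end
        }
        throw new IllegalStateException("table is full; a real map would have resized earlier");
    }

    public static void main(String[] args) {
        LinearProbingDemo table = new LinearProbingDemo(4);
        System.out.println(table.put("userInfo", 1, "Angela")); // 1
        System.out.println(table.put("geo", 1, "Hangzhou"));    // 2, because slot 1 is taken
        System.out.println(table.put("userInfo", 1, "Daji"));   // 1, same key so it is an update
    }
}
```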

private void set(ThreadLocal<?> key, Object value) {
  Entry[] tab = table;
  int len = tab.length;
  // hashCode & (len - 1) equals hashCode % len because len is a power of two.
  // For example, with an array length of 4, hashCode & 3 gives the slot index for the element
  int i = key.threadLocalHashCode & (len-1);

  // Probe forward until an empty slot (null) is found. A ThreadLocalMap usually holds few elements
  for (Entry e = tab[i];
       e != null; // stop at the first empty slot
       e = tab[i = nextIndex(i, len)]) {
    ThreadLocal<?> k = e.get();

    // Same key: this is an update, so just replace the value
    if (k == key) {
      e.value = value;
      return;
    }
    // Key is null (stale entry): reuse the slot and clean up. More on this when we get to WeakReference
    if (k == null) {
      replaceStaleEntry(key, value, i);
      return;
    }
  }
  // Empty slot: create a new Entry
  tab[i] = new Entry(key, value);
  // Number of elements + 1
  int sz = ++size;
  // If no stale entries were cleaned up and the element count has reached the threshold,
  // rehash (which cleans up again and expands the table if still needed)
  if (!cleanSomeSlots(i, sz) && sz >= threshold)
    rehash();
}

// Step to the next index; after the last slot, wrap around to the head of the array (index 0)
private static int nextIndex(int i, int len) {
  return ((i + 1 < len) ? i + 1 : 0);
}

// Used by get(): look up the Entry for the given ThreadLocal key
private Entry getEntry(ThreadLocal<?> key) {
  // hashCode & (length - 1) equals hashCode % length because length is a power of two.
  // For example, with an array length of 4, hashCode & 3 gives the slot index to look at
  int i = key.threadLocalHashCode & (table.length - 1);
  Entry e = table[i];
  // Fast path: the slot holds an entry and the key matches, so return it directly
  if (e != null && e.get() == key)
    return e;
  else
    return getEntryAfterMiss(key, i, e); // On a miss, probe the following slots in order (linear probing)
}
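The index calculation hash & (len - 1) is only a substitute for the remainder when the array length is a power of two. A small check, with an invented class name and an arbitrary sample range of hash values:

```java
final class IndexMaskDemo {
    // Slot index computed the way ThreadLocalMap does it: mask with (length - 1).
    static int maskIndex(int hash, int length) {
        return hash & (length - 1);
    }

    // True when the mask agrees with the non-negative remainder for every sampled hash.
    static boolean maskMatchesModulo(int length) {
        for (int hash = -100_000; hash <= 100_000; hash++) {
            if (maskIndex(hash, length) != Math.floorMod(hash, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(maskIndex(13, 4));      // 1, the same as 13 % 4
        System.out.println(maskMatchesModulo(16)); // true, 16 is a power of two
        System.out.println(maskMatchesModulo(12)); // false, 12 is not
    }
}
```

This is why the table length starts at 16 and must stay a power of two: the mask then always yields a valid index, even for negative hash codes.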

Interviewer: I see that in the figure you drew earlier, the key in ThreadLocalMap is a WeakReference. Can you tell me about the kinds of references in Java and how they differ?

Angela: Yes

Strong reference: the most commonly used kind. As long as an object is strongly referenced, the garbage collector will never reclaim it. When memory runs short, the Java virtual machine would rather throw an OutOfMemoryError and let the program terminate abnormally than free memory by reclaiming strongly referenced objects.
Soft reference: if an object has only soft references, the garbage collector leaves it alone while memory is sufficient and reclaims it when memory runs short.
Weak reference: compared with a soft reference, an object with only weak references has a shorter life cycle. Once the garbage collector finds an object that is only weakly referenced, it reclaims it whether or not memory is sufficient. However, because garbage collection runs at times the JVM chooses, such an object will not necessarily be found and reclaimed right away.
Phantom reference (sometimes translated as "virtual reference"): as the name suggests, it exists in name only. Unlike the other kinds, a phantom reference does not affect an object's life cycle. If an object is held only by phantom references, it can be reclaimed at any time, just as if it had no references at all.
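A short sketch of how the three non-strong kinds are created with java.lang.ref. The class ReferenceKindsDemo is made up for illustration, and it deliberately checks only what is deterministic: while a strong reference is still held, soft and weak references return the object, and PhantomReference.get() always returns null.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceKindsDemo {
    // Returns what each reference type hands back while the caller still holds a strong reference.
    static Object[] readThroughReferences(Object strong) {
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        // A phantom reference must be registered with a queue; it is only useful for post-GC notification.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return new Object[] { soft.get(), weak.get(), phantom.get() };
    }

    public static void main(String[] args) {
        Object strong = new Object();
        Object[] seen = readThroughReferences(strong);
        System.out.println(seen[0] == strong); // true
        System.out.println(seen[1] == strong); // true
        System.out.println(seen[2]);           // null, PhantomReference.get() always returns null
    }
}
```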

A proper textbook answer, recited by rote! Awkward (-. - |).

Interviewer: can you explain why the key in ThreadLocalMap is designed as a weak reference?

Angela: sure. It is to avoid memory leaks as far as possible.

Interviewer: can you elaborate? Why only "as far as possible"? As you said earlier, objects referenced only by a WeakReference are reclaimed directly by the GC (garbage collector). Why can't memory leaks be avoided outright?

Angela: let's look at the picture below

private static final ThreadLocal<UserInfo> userInfoThreadLocal = new ThreadLocal<>();
userInfoThreadLocal.set(userInfo);

The reference relationship here is: userInfoThreadLocal references the ThreadLocal object, and that is a strong reference. The ThreadLocal object is also referenced by the key in ThreadLocalMap, and that is a WeakReference. As we said earlier, the GC can reclaim the ThreadLocal object only if it is referenced solely by WeakReferences, with no strong references left.

To make weak references easier to understand, I wrote a demo program:

import java.lang.ref.WeakReference;

public class WeakReferenceDemo {
  public static void main(String[] args) {
    Object angela = new Object();
    // Weak reference to the same object
    WeakReference<Object> weakReference = new WeakReference<>(angela);
    // angela and the weak reference point to the same object
    System.out.println(angela); // e.g. java.lang.Object@4550017c
    System.out.println(weakReference.get()); // the same object, e.g. java.lang.Object@4550017c
    // Drop the strong reference: the object is now only weakly referenced and
    // can be reclaimed at the next GC even when memory is plentiful
    angela = null;
    System.gc(); // With plenty of memory GC may not run on its own, so request one (only a hint to the JVM)
    System.out.println(angela); // null
    System.out.println(weakReference.get()); // typically null once the GC has run
  }
}

You can see that once an object is only weakly referenced, it will be recycled during GC.

Therefore, as long as the ThreadLocal object is still referenced by userInfoThreadLocal (strong reference), GC will not recycle the object referenced by WeakReference.

Interviewer: since ThreadLocal objects have strong references and cannot be recycled, why should they be designed as WeakReference?

Angela: the designers of ThreadLocal took into account that threads often live a long time. With thread pools, for example, the threads stay alive indefinitely. Under the JVM's reachability analysis (tracing from GC roots), there is always a reference chain Thread -> ThreadLocalMap -> Entry. As shown in the figure below, if the key were a strong reference instead of a WeakReference, the ThreadLocal object could never be reclaimed, the key would never become null, and an Entry whose key is not null is never cleaned up (ThreadLocalMap decides whether to clean up an Entry by checking whether its key is null).

So the designers' idea is this: once the ThreadLocal goes out of scope and no strong reference to it remains, the object the key refers to is reclaimed at the next GC and the key becomes null. ThreadLocalMap then does its best to clean up such entries, avoiding memory leaks as far as possible. It is only "as far as possible" because stale entries are removed lazily, when a later set, get or remove happens to come across them.

Look at the code

//Element class
static class Entry extends WeakReference<ThreadLocal<?>> {
  /** The value associated with this ThreadLocal. */
  Object value; //key is inherited from the parent class, so there is only value here

  Entry(ThreadLocal<?> k, Object v) {
    super(k);
    value = v;
  }
}

//WeakReference extends Reference; the key is the generic referent field inherited from Reference
public abstract class Reference<T> {
  //This is the inherited key
  private T referent; 
  Reference(T referent) {
    this(referent, null);
  }
}

Entry extends WeakReference, so the key held by an Entry is a weak reference. In Java, when an object is referenced only through a WeakReference and has no strong references, it is reclaimed when GC runs.

Interviewer: what if the ThreadLocal object always has a strong reference? Then there is still a risk of memory leakage.

Angela: the best practice is to call remove() manually.

Let's look at the source code:

class ThreadLocal<T> {
  public void remove() {
    // Get the current thread's ThreadLocalMap
    ThreadLocalMap m = getMap(Thread.currentThread());
    if (m != null)
      m.remove(this); // this is the ThreadLocal object; the map's remove method is shown below
  }
}

class ThreadLocalMap {
  private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    // Compute the starting slot
    int i = key.threadLocalHashCode & (len-1);
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
      // Found the entry: clear its weak reference to the key
      if (e.get() == key) {
        e.clear();
        expungeStaleEntry(i); // Clean up the now-stale slot
        return;
      }
    }
  }
}

//This method is to do element cleaning
private int expungeStaleEntry(int staleSlot) {
  Entry[] tab = table;
  int len = tab.length;

  //Set the value of staleSlot to null, and then set the array element to null
  tab[staleSlot].value = null;
  tab[staleSlot] = null;
  size--; //Number of elements - 1

  // Rehash until we encounter null
  Entry e;
  int i;
  for (i = nextIndex(staleSlot, len);
       (e = tab[i]) != null;
       i = nextIndex(i, len)) {
    ThreadLocal<?> k = e.get();
    //k is null, which means that the reference object has been recycled by GC
    if (k == null) {
      e.value = null;
      tab[i] = null;
      size--;
    } else {
      //Because an element was removed, the entries that follow are rehashed
      int h = k.threadLocalHashCode & (len - 1);
      //If the hash slot differs from the current slot, this element hit a collision earlier (it belongs in slot h but was stored further along),
      //Now that some elements have been removed, its original slot may be free again. Try to move it back
      if (h != i) {
        tab[i] = null;

        //Keep using linear probing (open addressing) to find a slot for the element
        while (tab[h] != null)
          h = nextIndex(h, len);
        tab[h] = e;
      }
    }
  }
  return i;
}
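Why calling remove() matters is easiest to see on a pooled thread. The sketch below uses an invented class PooledThreadDemo and a single-thread executor, so two consecutive tasks are guaranteed to run on the same worker thread; without remove(), the second task sees the value left behind by the first.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PooledThreadDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two tasks back to back on one pooled thread and returns what the second task sees.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                try {
                    CURRENT_USER.set("Angela");
                } finally {
                    if (cleanUp) {
                        CURRENT_USER.remove(); // the best practice from the text
                    }
                }
            };
            // A different "request" that happens to run on the same worker thread.
            Callable<String> secondRequest = CURRENT_USER::get;

            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSees(false)); // Angela, the stale value is still bound to the thread
        System.out.println(secondTaskSees(true));  // null
    }
}
```

Besides the memory held by the stale Entry, this is also how one user's data can end up visible to another user's request, which is exactly the mix-up ThreadLocal was supposed to prevent.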

Interviewer: do you have any hands-on engineering experience with ThreadLocal? Tell me about it.

Angela: Yes!

I told one of your interviewers before how I improved the performance of Alipay's 40-plus core RPC interfaces. This is the result for one of those interfaces after the traffic switch, and ThreadLocal was used there.

Interviewer: Well, talk about it.

Angela: as I said, more than 40 interfaces needed technical rework and optimization. The risk was very high: I had to guarantee that the business would not be affected after the interfaces were switched over, which is known as an equivalent switch.

The process is as follows:

Define a constant for the name of each of the 40-plus interfaces according to its business meaning, for example the interface name alipay.quickquick.follow.angela;
Switch traffic interface by interface, from low-traffic interfaces to high-traffic ones, with each interface's switch ratio and user whitelist configured in advance;
The order of switching also matters. First switch by userId whitelist, then by percentage based on the trailing digits of the userId, and switch over completely only when no problems appear;
At the entry of the top-level abstract template method, store the interface name with ThreadLocal.set;
At the point where traffic is switched, read the interface name back from the ThreadLocal and use it to decide how that interface's traffic is routed.
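The steps above might be sketched roughly as follows. This is not Alipay's code: the class TrafficSwitchDemo, the interface name demo.follow.query, the whitelisted userId and the 10 percent setting are all hypothetical, and the sketch assumes numeric userIds with at least two digits.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class TrafficSwitchDemo {
    // Holds the name of the interface that is handling the current request.
    private static final ThreadLocal<String> INTERFACE_NAME = new ThreadLocal<>();

    // Hypothetical switch configuration: a user whitelist plus a percentage (0-100) per interface.
    private static final Set<String> WHITELIST = Collections.singleton("2088000000000042");
    private static final Map<String, Integer> PERCENT_BY_INTERFACE = new HashMap<>();
    static {
        PERCENT_BY_INTERFACE.put("demo.follow.query", 10);
    }

    // Top-level template method: bind the interface name, run the business logic, always unbind.
    static String invoke(String interfaceName, String userId) {
        try {
            INTERFACE_NAME.set(interfaceName);
            return handle(userId);
        } finally {
            INTERFACE_NAME.remove();
        }
    }

    // Business logic somewhere down the call stack; it takes no interface-name parameter.
    private static String handle(String userId) {
        return useNewPath(userId) ? "new" : "old";
    }

    // Switch point: whitelist first, then the last two digits of the userId against the percentage.
    private static boolean useNewPath(String userId) {
        String interfaceName = INTERFACE_NAME.get();
        if (WHITELIST.contains(userId)) {
            return true;
        }
        int percent = PERCENT_BY_INTERFACE.getOrDefault(interfaceName, 0);
        int tail = Integer.parseInt(userId.substring(userId.length() - 2));
        return tail < percent;
    }

    public static void main(String[] args) {
        System.out.println(invoke("demo.follow.query", "1000000000000005")); // new, tail 05 is below 10
        System.out.println(invoke("demo.follow.query", "1000000000000099")); // old
        System.out.println(invoke("demo.follow.query", "2088000000000042")); // new, whitelisted
    }
}
```

The point of the ThreadLocal here is that the switch logic can find out which interface it is serving without the name being threaded through every method signature in between.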

Interviewer: last question. If I have many variables to keep in the ThreadLocalMap, don't I have to declare many ThreadLocal objects? Is there a better way?

Angela: our best practice is to add a layer of encapsulation: make the value stored in the ThreadLocalMap a Map. That way a single ThreadLocal object is enough.

Interviewer: can you elaborate?

Angela: I can't go on. I'm too tired.

Interviewer: tell me.

Angela: I really don't want to talk.

Interviewer: let's stop here for today. Go out the door and turn right. Go home and wait for our call!
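Since Angela would not elaborate, here is one way the "single ThreadLocal holding a Map" idea might look. The class RequestContext and its method names are invented for this sketch; it is not taken from any particular framework.

```java
import java.util.HashMap;
import java.util.Map;

final class RequestContext {
    // The only ThreadLocal: its value is a per-thread map that holds every context variable.
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    private RequestContext() {
    }

    static void put(String key, Object value) {
        CONTEXT.get().put(key, value);
    }

    // The caller chooses the expected type; a wrong choice fails with ClassCastException at the call site.
    @SuppressWarnings("unchecked")
    static <T> T get(String key) {
        return (T) CONTEXT.get().get(key);
    }

    // Drops the whole map for the current thread; call this in a finally block per request.
    static void clear() {
        CONTEXT.remove();
    }

    public static void main(String[] args) {
        try {
            RequestContext.put("user", "Angela");
            RequestContext.put("geo", "Hangzhou");
            String user = RequestContext.get("user");
            String geo = RequestContext.get("geo");
            System.out.println(user + " @ " + geo); // Angela @ Hangzhou
        } finally {
            RequestContext.clear();
        }
    }
}
```

The trade-off is that string keys and unchecked casts replace the compile-time typing that separate ThreadLocal<UserInfo> and ThreadLocal<Geo> fields would give.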

Keywords: Java Interview

Added by WebbieDave on Thu, 17 Feb 2022 16:24:48 +0200
