Some thoughts on Java Record - the use of default methods and the underlying implementation of generating relevant bytecode based on precompiling

Quick start Record class

Let's take a simple example to declare a user Record.

public record User(long id, String name, int age) {}

After the code is written in this way, the default elements and method implementations of the Record class include:

  1. The record header specifies the constituent elements (int id, String name, int age), and these elements are final.
  2. By default, record has only one constructor, which is a constructor containing all elements.
  3. Each element of record has a corresponding getter (but this getter is not getxxx(), but is directly named with the variable name, so the serialization framework is used, and the DAO framework should pay attention to this point)
  4. The implemented hashCode(), equals(), toString() method (implemented by automatically generating the bytecode implemented by hashCode(), equals(), toString() method in the compilation stage).

Let's use this Record:

User zhx = new User(1, "zhx", 29);
User ttj = new User(2, "ttj", 25);
System.out.println(zhx.id());//1
System.out.println(zhx.name());//zhx
System.out.println(zhx.age());//29
System.out.println(zhx.equals(ttj));//false
System.out.println(zhx.toString());//User[id=1, name=zhx, age=29]
System.out.println(zhx.hashCode());//3739156

How is the structure of Record implemented

Insert bytecode of related fields and methods after compilation

There are two ways to view the bytecode of the above example: one is through javap - V user Class command to view the bytecode of the text version and intercept the important bytecode as follows:

//Omit file header, file constant pool
{
  //The public constructor takes all attributes as parameters and assigns values to each Field
  public com.github.hashzhang.basetest.User(long, java.lang.String, int);
    descriptor: (JLjava/lang/String;I)V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=3, locals=5, args_size=4
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Record."<init>":()V
         4: aload_0
         5: lload_1
         6: putfield      #7                  // Field id:J
         9: aload_0
        10: aload_3
        11: putfield      #13                 // Field name:Ljava/lang/String;
        14: aload_0
        15: iload         4
        17: putfield      #17                 // Field age:I
        20: return
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      21     0  this   Lcom/github/hashzhang/basetest/User;
            0      21     1    id   J
            0      21     3  name   Ljava/lang/String;
            0      21     4   age   I
    MethodParameters:
      Name                           Flags
      id
      name
      age

  //toString method modified by public final
  public final java.lang.String toString();
    descriptor: ()Ljava/lang/String;
    flags: (0x0011) ACC_PUBLIC, ACC_FINAL
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         //The core implementation is the invokedynamic, which we will analyze later
         1: invokedynamic #21,  0             // InvokeDynamic #0:toString:(Lcom/github/hashzhang/basetest/User;)Ljava/lang/String;
         6: areturn
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       7     0  this   Lcom/github/hashzhang/basetest/User;
  //hashCode method modified by public final
  public final int hashCode();
    descriptor: ()I
    flags: (0x0011) ACC_PUBLIC, ACC_FINAL
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         //The core implementation is the invokedynamic, which we will analyze later
         1: invokedynamic #25,  0             // InvokeDynamic #0:hashCode:(Lcom/github/hashzhang/basetest/User;)I
         6: ireturn
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       7     0  this   Lcom/github/hashzhang/basetest/User;
  //equals method modified by public final
  public final boolean equals(java.lang.Object);
    descriptor: (Ljava/lang/Object;)Z
    flags: (0x0011) ACC_PUBLIC, ACC_FINAL
    Code:
      stack=2, locals=2, args_size=2
         0: aload_0
         1: aload_1
         //The core implementation is the invokedynamic, which we will analyze later
         2: invokedynamic #29,  0             // InvokeDynamic #0:equals:(Lcom/github/hashzhang/basetest/User;Ljava/lang/Object;)Z
         7: ireturn
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       8     0  this   Lcom/github/hashzhang/basetest/User;
            0       8     1     o   Ljava/lang/Object;
  //getter of public modified id
  public long id();
    descriptor: ()J
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: getfield      #7                  // Field id:J
         4: lreturn
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/github/hashzhang/basetest/User;
  //getter of public modified name
  public java.lang.String name();
    descriptor: ()Ljava/lang/String;
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: getfield      #13                 // Field name:Ljava/lang/String;
         4: areturn
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/github/hashzhang/basetest/User;
  //getter of public modified age
  public int age();
    descriptor: ()I
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: getfield      #17                 // Field age:I
         4: ireturn
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lcom/github/hashzhang/basetest/User;
}
SourceFile: "User.java"
Record:
  long id;
    descriptor: J

  java.lang.String name;
    descriptor: Ljava/lang/String;

  int age;
    descriptor: I

//The following are the methods and parameter information that invokedynamic will call, which we will analyze later
BootstrapMethods:
  0: #50 REF_invokeStatic java/lang/runtime/ObjectMethods.bootstrap:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/TypeDescriptor;Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/invoke/MethodHandle;)Ljava
/lang/Object;
    Method arguments:
      #8 com/github/hashzhang/basetest/User
      #57 id;name;age
      #59 REF_getField com/github/hashzhang/basetest/User.id:J
      #60 REF_getField com/github/hashzhang/basetest/User.name:Ljava/lang/String;
      #61 REF_getField com/github/hashzhang/basetest/User.age:I
InnerClasses:
  public static final #67= #63 of #65;    // Lookup=class java/lang/invoke/MethodHandles$Lookup of class java/lang/invoke/MethodHandles

The other is through the jclasslib plug-in of IDE. I recommend this method. The viewing results are as follows:

Automatically generated private final field

Auto generated full attribute constructor

Auto generated public getter method

Automatically generated hashCode(), equals(), toString() method

The core of these methods is invokedynamic:

It looks like calling another method. Isn't this indirect call a performance loss problem? JVM developers have thought of this. Let's first learn about invokedynamic.

Background of invokedynamic

Java was originally a statically typed language, that is, the main process of its type checking was mainly at compile time rather than run time. In order to be compatible with the dynamic type syntax and the JVM can be compatible with the dynamic language (the JVM is not designed to only run Java), the bytecode instruction invokedynamic is introduced in Java 7, which is also the basis for the implementation of ramda expression and var syntax in Java 8.

invokedynamic and MethodHandle

invokedynamic is inseparable from Java Use of lang.invoke package. The main purpose of this package is to provide a new mechanism for dynamically determining the target method, called MethodHandle, in addition to relying solely on symbolic references to determine the called target method.

Through the MethodHandle, you can dynamically obtain the method you want to call for calling, which is similar to Java Reflection. However, in order to pursue performance efficiency, you need to use the MethodHandle. The main reason is that Reflection is only a supplementary implementation for Reflection in the Java language, without considering the problem of efficiency, In particular, JIT can hardly effectively optimize this kind of Reflection call. MethodHandle is more like the simulation of method instruction call of bytecode. If it is used properly, JIT can also optimize it. For example, declare the method reference related to MethodHandle as static final:

private static final MutableCallSite callSite = new MutableCallSite(
        MethodType.methodType(int.class, int.class, int.class));
private static final MethodHandle invoker = callSite.dynamicInvoker();

Implementation of automatically generated toString(), hashcode(), equals()

It can be seen from the bytecode that incokedynamic actually calls the #0 method in bootstrappmethods:

0 aload_0
1 invokedynamic #24 <hashCode, BootstrapMethods #0>
6 ireturn

The bootstrap method table includes:

BootstrapMethods:
  //The actual call is Java lang.runtime. Boost method of objectmethods
  0: #50 REF_invokeStatic java/lang/runtime/ObjectMethods.bootstrap:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/TypeDescriptor;Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/invoke/MethodHandle;)Ljava
/lang/Object;
    Method arguments:
      #8 com/github/hashzhang/basetest/User
      #57 id;name;age
      #59 REF_getField com/github/hashzhang/basetest/User.id:J
      #60 REF_getField com/github/hashzhang/basetest/User.name:Ljava/lang/String;
      #61 REF_getField com/github/hashzhang/basetest/User.age:I
InnerClasses:
  //Declare methodhandles Lookup is final, which speeds up the calling performance. In this way, calling the methods in bootstrappmethods can achieve performance similar to direct calling 
  public static final #67= #63 of #65;    // Lookup=class java/lang/invoke/MethodHandles$Lookup of class java/lang/invoke/MethodHandles

From here, we can see that toString() actually calls Java lang.runtime. The bootstrap() method of objectmethods. Its core code is:
ObjectMethods.java

public static Object bootstrap(MethodHandles.Lookup lookup, String methodName, TypeDescriptor type,
                                   Class<?> recordClass,
                                   String names,
                                   MethodHandle... getters) throws Throwable {
        MethodType methodType;
        if (type instanceof MethodType)
            methodType = (MethodType) type;
        else {
            methodType = null;
            if (!MethodHandle.class.equals(type))
                throw new IllegalArgumentException(type.toString());
        }
        List<MethodHandle> getterList = List.of(getters);
        MethodHandle handle;
        //Process the corresponding logic according to the method name, corresponding to the implementation of equals(), hashCode(), toString()
        switch (methodName) {
            case "equals":
                if (methodType != null && !methodType.equals(MethodType.methodType(boolean.class, recordClass, Object.class)))
                    throw new IllegalArgumentException("Bad method type: " + methodType);
                handle = makeEquals(recordClass, getterList);
                return methodType != null ? new ConstantCallSite(handle) : handle;
            case "hashCode":
                if (methodType != null && !methodType.equals(MethodType.methodType(int.class, recordClass)))
                    throw new IllegalArgumentException("Bad method type: " + methodType);
                handle = makeHashCode(recordClass, getterList);
                return methodType != null ? new ConstantCallSite(handle) : handle;
            case "toString":
                if (methodType != null && !methodType.equals(MethodType.methodType(String.class, recordClass)))
                    throw new IllegalArgumentException("Bad method type: " + methodType);
                List<String> nameList = "".equals(names) ? List.of() : List.of(names.split(";"));
                if (nameList.size() != getterList.size())
                    throw new IllegalArgumentException("Name list and accessor list do not match");
                handle = makeToString(recordClass, getterList, nameList);
                return methodType != null ? new ConstantCallSite(handle) : handle;
            default:
                throw new IllegalArgumentException(methodName);
        }
    }

The core implementation logic of toString() method depends on the branch of case "toString". The core logic is makeToString(recordClass, getterList, nameList):

private static MethodHandle makeToString(Class<?> receiverClass,
                                            //All getter methods
                                            List<MethodHandle> getters,
                                            //All field names
                                            List<String> names) {
    assert getters.size() == names.size();
    int[] invArgs = new int[getters.size()];
    Arrays.fill(invArgs, 0);
    MethodHandle[] filters = new MethodHandle[getters.size()];
    StringBuilder sb = new StringBuilder();
    //Splice class name first[
    sb.append(receiverClass.getSimpleName()).append("[");
    for (int i=0; i<getters.size(); i++) {
        MethodHandle getter = getters.get(i); // (R)T
        MethodHandle stringify = stringifier(getter.type().returnType()); // (T)String
        MethodHandle stringifyThisField = MethodHandles.filterArguments(stringify, 0, getter);    // (R)String
        filters[i] = stringifyThisField;
        //After splicing, field name = value
        sb.append(names.get(i)).append("=%s");
        if (i != getters.size() - 1)
            sb.append(", ");
    }
    sb.append(']');
    String formatString = sb.toString();
    MethodHandle formatter = MethodHandles.insertArguments(STRING_FORMAT, 0, formatString)
                                          .asCollector(String[].class, getters.size()); // (R*)String
    if (getters.size() == 0) {
        // Add back extra R
        formatter = MethodHandles.dropArguments(formatter, 0, receiverClass);
    }
    else {
        MethodHandle filtered = MethodHandles.filterArguments(formatter, 0, filters);
        formatter = MethodHandles.permuteArguments(filtered, MethodType.methodType(String.class, receiverClass), invArgs);
    }
    return formatter;
}

Similarly, the implementation of hashcode() is:

private static MethodHandle makeHashCode(Class<?> receiverClass,
                                            List<MethodHandle> getters) {
    MethodHandle accumulator = MethodHandles.dropArguments(ZERO, 0, receiverClass); // (R)I

    // For each field, find the corresponding hashcode method, take the hash value, and finally combine them together
    for (MethodHandle getter : getters) {
        MethodHandle hasher = hasher(getter.type().returnType()); // (T)I
        MethodHandle hashThisField = MethodHandles.filterArguments(hasher, 0, getter);    // (R)I
        MethodHandle combineHashes = MethodHandles.filterArguments(HASH_COMBINER, 0, accumulator, hashThisField); // (RR)I
        accumulator = MethodHandles.permuteArguments(combineHashes, accumulator.type(), 0, 0); // adapt (R)I to (RR)I
    }

    return accumulator;
}

Similarly, the equals() implementation is:

private static MethodHandle makeEquals(Class<?> receiverClass,
                                          List<MethodHandle> getters) {
        MethodType rr = MethodType.methodType(boolean.class, receiverClass, receiverClass);
        MethodType ro = MethodType.methodType(boolean.class, receiverClass, Object.class);
        MethodHandle instanceFalse = MethodHandles.dropArguments(FALSE, 0, receiverClass, Object.class); // (RO)Z
        MethodHandle instanceTrue = MethodHandles.dropArguments(TRUE, 0, receiverClass, Object.class); // (RO)Z
        MethodHandle isSameObject = OBJECT_EQ.asType(ro); // (RO)Z
        MethodHandle isInstance = MethodHandles.dropArguments(CLASS_IS_INSTANCE.bindTo(receiverClass), 0, receiverClass); // (RO)Z
        MethodHandle accumulator = MethodHandles.dropArguments(TRUE, 0, receiverClass, receiverClass); // (RR)Z
        //Compare whether the getter of each field of the two objects obtains the same value. For the reference type, use objects Equals method, for the original type, directly through== 
        for (MethodHandle getter : getters) {
            MethodHandle equalator = equalator(getter.type().returnType()); // (TT)Z
            MethodHandle thisFieldEqual = MethodHandles.filterArguments(equalator, 0, getter, getter); // (RR)Z
            accumulator = MethodHandles.guardWithTest(thisFieldEqual, accumulator, instanceFalse.asType(rr));
        }

        return MethodHandles.guardWithTest(isSameObject,
                                           instanceTrue,
                                           MethodHandles.guardWithTest(isInstance, accumulator.asType(ro), instanceFalse));
    }

I'm involved Nuggets' popularity list in 2021 , please vote for me. Thank you

Keywords: Java Back-end

Added by kikki on Thu, 23 Dec 2021 17:47:20 +0200