Serialization of Lambda expressions and clever uses of SerializedLambda in the JDK

Premise

In my spare time I have been thinking about writing a lightweight ORM framework based on JDBC, with Javassist at its core. While studying the source code of mybatis, TK mapper, mybatis-plus and spring-boot-starter-jdbc, I found that LambdaQueryWrapper in mybatis-plus can obtain the method information (actually CallSite information) of the Lambda expression passed to it, so I am making a complete record here. This article is based on JDK 11 and does not necessarily apply to other JDK versions.

The magic of Lambda expression serialization

When I previously read the source code of the Lambda expression implementation, I did not look closely at the comments of LambdaMetafactory. One of the comments at the top of this class reads as follows:

Serializability. In general, the generated function objects (here this refers specifically to the function objects that implement Lambda expressions) do not need to support serialization. If this feature is needed, FLAG_SERIALIZABLE (a static int constant of LambdaMetafactory with the value 1 << 0) can be used to indicate that the function objects should be serializable. Serializable function objects use instances of the SerializedLambda class as their serialized form, and these SerializedLambda instances need additional assistance from the "capture class" (the class described by the MethodHandles.Lookup parameter caller); see SerializedLambda for details.

Searching for FLAG_SERIALIZABLE in the comments of LambdaMetafactory turns up this note:

When FLAG_SERIALIZABLE is set, the generated function objects implement the Serializable interface and have a method named writeReplace whose return type is SerializedLambda. The class that calls these function objects (the "capture class" mentioned earlier) must have a method named $deserializeLambda$, as described in the SerializedLambda class.
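One way to observe the effect of this flag without calling LambdaMetafactory directly is to compare a plain Lambda with one cast to an intersection type that includes Serializable. The following is only a minimal sketch: the class name FlagSerializableDemo and its helper methods are made up for illustration, and it assumes the HotSpot behaviour described above, where only serializable function objects get a writeReplace method.

```java
import java.io.Serializable;
import java.util.Arrays;

class FlagSerializableDemo {

    // Returns true if the runtime class of the given object declares a method named writeReplace.
    static boolean declaresWriteReplace(Object functionObject) {
        return Arrays.stream(functionObject.getClass().getDeclaredMethods())
                .anyMatch(m -> m.getName().equals("writeReplace"));
    }

    // An ordinary Lambda: Runnable is not Serializable, so no serialization support is requested.
    static Runnable plainLambda() {
        return () -> { };
    }

    // The intersection cast makes the compiler request a serializable function object.
    static Runnable serializableLambda() {
        return (Runnable & Serializable) () -> { };
    }

    public static void main(String[] args) {
        System.out.println("plain:        " + declaresWriteReplace(plainLambda()));
        System.out.println("serializable: " + declaresWriteReplace(serializableLambda()));
    }
}
```

The same effect is obtained when the functional interface itself extends Serializable, which is the approach used in the rest of this article.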

Finally, look at the description of SerializedLambda. Its class comment has four paragraphs; the main idea of each paragraph is as follows:

  • Paragraph 1: SerializedLambda is the serialized form of a Lambda expression; it stores the runtime information of the Lambda expression
  • Paragraph 2: to make sure serializable Lambda expressions are implemented correctly, one option for compilers and language libraries is to ensure that the writeReplace method returns a SerializedLambda instance
  • Paragraph 3: SerializedLambda provides a readResolve method, which looks up the static method $deserializeLambda$(SerializedLambda) in the "capture class" and calls it with the SerializedLambda instance itself as the argument. This is the deserialization process
  • Paragraph 4: the identity of function objects produced by serialization and deserialization is unpredictable, so the results of identity-sensitive operations (such as System.identityHashCode(), object locking, etc.) are unpredictable as well

The final conclusion is: if a functional interface extends Serializable, the class generated for each of its Lambda instances gets a writeReplace method that returns a SerializedLambda instance, and the runtime information of the Lambda can be read from that instance. This runtime information is exposed as the properties of SerializedLambda:

| Attribute | Meaning |
|---|---|
| capturingClass | Capture class: the class in which the Lambda expression appears |
| functionalInterfaceClass | Name, separated by "/", of the functional interface, i.e. the static type of the returned Lambda object |
| functionalInterfaceMethodName | Name of the functional interface method |
| functionalInterfaceMethodSignature | Signature of the functional interface method (parameter types and return type; the erased types if generics are used) |
| implClass | Name, separated by "/", of the class that holds the implementation method of the functional interface method |
| implMethodName | Name of the implementation method of the functional interface method |
| implMethodSignature | Signature of the implementation method (parameter types and return type) |
| instantiatedMethodType | Signature of the functional interface method after its type variables have been replaced with the types they are instantiated with |
| capturedArgs | Arguments captured by the Lambda expression |
| implMethodKind | MethodHandle kind of the implementation method |

As a practical example, define a functional interface that extends Serializable and call it:

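import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class App {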

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    public static void main(String[] args) throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        Long result = function.convert("123");
        System.out.println(result);
        Method method = function.getClass().getDeclaredMethod("writeReplace");
        method.setAccessible(true);
        SerializedLambda serializedLambda = (SerializedLambda)method.invoke(function);
        System.out.println(serializedLambda.getCapturingClass());
    }
}

The information shown in the debugger is as follows:
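The same runtime information can also be printed instead of being inspected in a debugger. The following is a minimal sketch that is not part of the original example: the class name SerializedLambdaFieldsDemo and the helpers serializedLambdaOf and sample are made up for illustration, while the getters are those of java.lang.invoke.SerializedLambda.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

class SerializedLambdaFieldsDemo {

    @FunctionalInterface
    interface CustomerFunction<S, T> extends Serializable {
        T convert(S source);
    }

    // Calls the generated writeReplace method reflectively and returns its result.
    static SerializedLambda serializedLambdaOf(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda: " + lambda, e);
        }
    }

    // Same method reference as in the example above.
    static SerializedLambda sample() {
        CustomerFunction<String, Long> function = Long::parseLong;
        return serializedLambdaOf(function);
    }

    public static void main(String[] args) {
        SerializedLambda sl = sample();
        System.out.println("capturingClass                     = " + sl.getCapturingClass());
        System.out.println("functionalInterfaceClass           = " + sl.getFunctionalInterfaceClass());
        System.out.println("functionalInterfaceMethodName      = " + sl.getFunctionalInterfaceMethodName());
        System.out.println("functionalInterfaceMethodSignature = " + sl.getFunctionalInterfaceMethodSignature());
        System.out.println("implClass                          = " + sl.getImplClass());
        System.out.println("implMethodName                     = " + sl.getImplMethodName());
        System.out.println("implMethodSignature                = " + sl.getImplMethodSignature());
        System.out.println("implMethodKind                     = " + sl.getImplMethodKind());
        System.out.println("instantiatedMethodType             = " + sl.getInstantiatedMethodType());
        System.out.println("capturedArgCount                   = " + sl.getCapturedArgCount());
    }
}
```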

In this way, the runtime information of the call site can be obtained from a functional interface instance, including the types that the generic parameters were instantiated with before erasure (see instantiatedMethodType), and many tricks can be derived from it. For example:

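import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

import lombok.Data; // @Data comes from Lombok

public class ConditionApp {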

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    @Data
    public static class User {

        private String name;
        private String site;
    }

    public static void main(String[] args) throws Exception {
        Condition c1 = addCondition(User::getName, "=", "throwable");
        System.out.println("c1 = " + c1);
        Condition c2 = addCondition(User::getSite, "IN", "('throwx.cn','vlts.cn')");
        System.out.println("c2 = " + c2);
    }

    private static <S> Condition addCondition(CustomerFunction<S, String> function,
                                              String operation,
                                              Object value) throws Exception {
        Condition condition = new Condition();
        Method method = function.getClass().getDeclaredMethod("writeReplace");
        method.setAccessible(true);
        SerializedLambda serializedLambda = (SerializedLambda) method.invoke(function);
        String implMethodName = serializedLambda.getImplMethodName();
        // Derive the field name from a getter: strip the leading "get" and lower-case the next character.
        // (lastIndexOf("get") would fail for names such as "getTarget", so only a prefix is accepted.)
        if (implMethodName.startsWith("get") && implMethodName.length() > 3) {
            condition.setField(Character.toLowerCase(implMethodName.charAt(3)) + implMethodName.substring(4));
        }
        // implClass uses "/" as the separator, so convert it to a binary class name first
        condition.setEntityKlass(Class.forName(serializedLambda.getImplClass().replace("/", ".")));
        condition.setOperation(operation);
        condition.setValue(value);
        return condition;
    }

    @Data
    private static class Condition {

        private Class<?> entityKlass;
        private String field;
        private String operation;
        private Object value;
    }
}

// Output
c1 = ConditionApp.Condition(entityKlass=class club.throwable.lambda.ConditionApp$User, field=name, operation==, value=throwable)
c2 = ConditionApp.Condition(entityKlass=class club.throwable.lambda.ConditionApp$User, field=site, operation=IN, value=('throwx.cn','vlts.cn'))
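The remark about recovering the types before generic erasure can be illustrated separately. The sketch below is an assumption-laden illustration rather than part of the original example: the class name InstantiatedTypeDemo and its helpers are made up, and it relies on instantiatedMethodType holding a JVM method descriptor that MethodType.fromMethodDescriptorString can parse.

```java
import java.io.Serializable;
import java.lang.invoke.MethodType;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

class InstantiatedTypeDemo {

    @FunctionalInterface
    interface CustomerFunction<S, T> extends Serializable {
        T convert(S source);
    }

    // Parses the instantiatedMethodType descriptor of a serializable lambda into a MethodType.
    static MethodType instantiatedType(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda sl = (SerializedLambda) writeReplace.invoke(lambda);
            return MethodType.fromMethodDescriptorString(
                    sl.getInstantiatedMethodType(), lambda.getClass().getClassLoader());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda: " + lambda, e);
        }
    }

    static MethodType sample() {
        CustomerFunction<String, Long> function = Long::parseLong;
        return instantiatedType(function);
    }

    public static void main(String[] args) {
        MethodType type = sample();
        // The interface method itself is erased to (Object)Object; the instantiated type keeps S and T.
        System.out.println("S = " + type.parameterType(0).getName());
        System.out.println("T = " + type.returnType().getName());
    }
}
```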

Many people worry about the performance of reflective calls. In recent JDK versions reflection has been heavily optimized, and once warmed up it is much closer to the performance of direct calls than it used to be. Moreover, scenarios like this one only make a small number of reflective calls, so it can be used safely.

Having spent a lot of time on what SerializedLambda does and how to use it, let us now look at the serialization and deserialization of a Lambda expression:
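If the reflective call is still a concern, one option is to resolve each Lambda class only once and cache the result. The following is a hedged sketch, not something the original article or mybatis-plus is claimed to do in this form: the names CachedLambdaNames, Getter, implMethodName and lengthGetter are hypothetical, and only the implementation method name is cached because captured arguments can differ between instances of the same Lambda class.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CachedLambdaNames {

    @FunctionalInterface
    interface Getter<S, T> extends Serializable {
        T get(S source);
    }

    // Lambda class -> implementation method name; the key keeps the lambda class strongly reachable.
    private static final Map<Class<?>, String> CACHE = new ConcurrentHashMap<>();

    // Returns the implementation method name, using reflection only on the first call per lambda class.
    static String implMethodName(Serializable lambda) {
        return CACHE.computeIfAbsent(lambda.getClass(), k -> resolve(lambda));
    }

    private static String resolve(Serializable lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return ((SerializedLambda) writeReplace.invoke(lambda)).getImplMethodName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda: " + lambda, e);
        }
    }

    static int cacheSize() {
        return CACHE.size();
    }

    // A single call site, so repeated calls yield function objects of the same class.
    static Getter<String, Integer> lengthGetter() {
        return String::length;
    }

    public static void main(String[] args) {
        System.out.println(implMethodName(lengthGetter()));
        System.out.println(implMethodName(lengthGetter()));
        System.out.println("cached entries: " + cacheSize());
    }
}
```

Because the map holds Lambda classes as keys, a long-running application with many dynamically loaded classes might prefer weak keys or a bounded cache.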

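import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializedLambdaApp {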

    @FunctionalInterface
    public interface CustomRunnable extends Serializable {

        void run();
    }

    public static void main(String[] args) throws Exception {
        invoke(() -> {
        });
    }

    private static void invoke(CustomRunnable customRunnable) throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(customRunnable);
        oos.close();
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()));
        Object target = ois.readObject();
        System.out.println(target);
    }
}

The results are as follows:

Lambda expression serialization principle

For the principle behind Lambda expression serialization, you can refer directly to the source code of ObjectStreamClass, ObjectOutputStream and ObjectInputStream. Here are the conclusions:

  • Prerequisite: the object to be serialized must implement the Serializable interface
  • If the object to be serialized has a writeReplace method, that method is invoked reflectively on the instance and the object it returns is serialized in place of the original. For a Lambda expression, the returned object is a SerializedLambda instance
  • Deserialization is simply the reverse process. The method called is readResolve; as mentioned earlier, SerializedLambda has a private method with exactly this name
  • The implementation type of a Lambda expression is a template class generated by the VM. Judging from the result, the instance before serialization and the instance after deserialization belong to different template classes. For the example in the previous section, the template class before serialization is club.throwable.lambda.SerializedLambdaApp$$Lambda$14/0x000000080065840, and the template class after deserialization is club.throwable.lambda.SerializedLambdaApp$$Lambda$26/0x00000008000a4040

ObjectStreamClass is the class descriptor used by the serialization and deserialization implementation. The class-level information used during serialization and deserialization can be found in its member fields, for example the writeReplace and readResolve methods mentioned here.
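The writeReplace and readResolve hooks described in the list above are not specific to Lambda expressions. The following minimal sketch shows the same substitution with two ordinary classes; the names ReplaceResolveDemo, Temperature and TemperatureProxy are invented for illustration, and TemperatureProxy plays the role that SerializedLambda plays for a Lambda.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ReplaceResolveDemo {

    // The "real" object: it is never written to the stream itself.
    static class Temperature implements Serializable {
        final double celsius;

        Temperature(double celsius) {
            this.celsius = celsius;
        }

        // Found via ObjectStreamClass and called by ObjectOutputStream; the result is written instead.
        private Object writeReplace() {
            return new TemperatureProxy(String.valueOf(celsius));
        }
    }

    // The serialized form: this is what actually ends up in the byte stream.
    static class TemperatureProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String text;

        TemperatureProxy(String text) {
            this.text = text;
        }

        // Called by ObjectInputStream after the proxy is read; the result is returned to the caller.
        private Object readResolve() {
            return new Temperature(Double.parseDouble(text));
        }
    }

    // Serializes the value to a byte array and reads it back.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature before = new Temperature(21.5);
        Object after = roundTrip(before);
        System.out.println(after.getClass().getSimpleName() + ", same instance: " + (after == before));
    }
}
```

As with Lambda expressions, the object that comes back is a new instance rebuilt from the substituted form, not the original one.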

The graphical process is as follows:

How to get SerializedLambda

From the previous analysis, we know that there are two ways to obtain the SerializedLambda instance of a Lambda expression:

  • Method 1: reflectively call the writeReplace method of the Lambda expression's template class on the Lambda expression instance; the return value is the SerializedLambda instance
  • Method 2: obtain the SerializedLambda instance through serialization and deserialization

An example can be written for each of the two methods. The reflection approach is as follows:

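// Reflection approach
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class ReflectionSolution {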

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    public static void main(String[] args) throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        SerializedLambda serializedLambda = getSerializedLambda(function);
        System.out.println(serializedLambda.getCapturingClass());
    }

    public static SerializedLambda getSerializedLambda(Serializable serializable) throws Exception {
        Method writeReplaceMethod = serializable.getClass().getDeclaredMethod("writeReplace");
        writeReplaceMethod.setAccessible(true);
        return (SerializedLambda) writeReplaceMethod.invoke(serializable);
    }
}

The serialization and deserialization approach is slightly more complicated, because ObjectInputStream.readObject() eventually calls back SerializedLambda.readResolve(), and the result it returns is a new Lambda expression instance backed by a new template class. We therefore need a way to interrupt this call and return the result early. The solution is to construct a shadow type that mirrors java.lang.invoke.SerializedLambda but has no readResolve() method:

package cn.vlts;

import java.io.Serializable;

/**
 * Note: this class must have the same simple name as java.lang.invoke.SerializedLambda, although
 * the package may differ. This is to "trick" the class name check in ObjectStreamClass (the classNamesEqual() method).
 */
@SuppressWarnings("ALL")
public class SerializedLambda implements Serializable {
    private static final long serialVersionUID = 8025925345765570181L;
    private Class<?> capturingClass;
    private String functionalInterfaceClass;
    private String functionalInterfaceMethodName;
    private String functionalInterfaceMethodSignature;
    private String implClass;
    private String implMethodName;
    private String implMethodSignature;
    private int implMethodKind;
    private String instantiatedMethodType;
    private Object[] capturedArgs;

    public String getCapturingClass() {
        return capturingClass.getName().replace('.', '/');
    }
    public String getFunctionalInterfaceClass() {
        return functionalInterfaceClass;
    }
    public String getFunctionalInterfaceMethodName() {
        return functionalInterfaceMethodName;
    }
    public String getFunctionalInterfaceMethodSignature() {
        return functionalInterfaceMethodSignature;
    }
    public String getImplClass() {
        return implClass;
    }
    public String getImplMethodName() {
        return implMethodName;
    }
    public String getImplMethodSignature() {
        return implMethodSignature;
    }
    public int getImplMethodKind() {
        return implMethodKind;
    }
    public final String getInstantiatedMethodType() {
        return instantiatedMethodType;
    }
    public int getCapturedArgCount() {
        return capturedArgs.length;
    }
    public Object getCapturedArg(int i) {
        return capturedArgs[i];
    }
}


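import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerializationSolution {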

    @FunctionalInterface
    public interface CustomerFunction<S, T> extends Serializable {

        T convert(S source);
    }

    public static void main(String[] args) throws Exception {
        CustomerFunction<String, Long> function = Long::parseLong;
        cn.vlts.SerializedLambda serializedLambda = getSerializedLambda(function);
        System.out.println(serializedLambda.getCapturingClass());
    }

    private static cn.vlts.SerializedLambda getSerializedLambda(Serializable serializable) throws Exception {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(serializable);
            oos.flush();
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray())) {
                @Override
                protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
                    Class<?> klass = super.resolveClass(desc);
                    return klass == java.lang.invoke.SerializedLambda.class ? cn.vlts.SerializedLambda.class : klass;
                }
            }) {
                return (cn.vlts.SerializedLambda) ois.readObject();
            }
        }
    }
}

The forgotten $deserializeLambda$ method

As mentioned earlier, deserializing a Lambda expression instance calls the java.lang.invoke.SerializedLambda.readResolve() method. The magic part is the source code of this method, which is as follows:

private Object readResolve() throws ReflectiveOperationException {
    try {
        Method deserialize = AccessController.doPrivileged(new PrivilegedExceptionAction<>() {
            @Override
            public Method run() throws Exception {
                Method m = capturingClass.getDeclaredMethod("$deserializeLambda$", SerializedLambda.class);
                m.setAccessible(true);
                return m;
            }
        });

        return deserialize.invoke(null, this);
    }
    catch (PrivilegedActionException e) {
        Exception cause = e.getException();
        if (cause instanceof ReflectiveOperationException)
            throw (ReflectiveOperationException) cause;
        else if (cause instanceof RuntimeException)
            throw (RuntimeException) cause;
        else
            throw new RuntimeException("Exception in SerializedLambda.readResolve", e);
    }
}

So it appears that the "capture class" contains a static method like this:

class CapturingClass {

    // Pseudocode: maps a SerializedLambda back to a Lambda expression instance
    private static Object $deserializeLambda$(SerializedLambda serializedLambda){
        return [serializedLambda] => Lambda Expression instance;
    }
}

You can try to retrieve the list of methods in the capture class:

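import java.io.Serializable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Objects;

import org.springframework.util.ReflectionUtils; // from Spring (spring-core)

public class CapturingClassApp {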

    @FunctionalInterface
    public interface CustomRunnable extends Serializable {

        void run();
    }

    public static void main(String[] args) throws Exception {
        invoke(() -> {
        });
    }

    private static void invoke(CustomRunnable customRunnable) throws Exception {
        Method writeReplaceMethod = customRunnable.getClass().getDeclaredMethod("writeReplace");
        writeReplaceMethod.setAccessible(true);
        java.lang.invoke.SerializedLambda serializedLambda = (java.lang.invoke.SerializedLambda)
                writeReplaceMethod.invoke(customRunnable);
        Class<?> capturingClass = Class.forName(serializedLambda.getCapturingClass().replace("/", "."));
        ReflectionUtils.doWithMethods(capturingClass, method -> {
                    System.out.printf("Method name:%s,Modifier :%s,Method parameter list:%s,Method return value type:%s\n", method.getName(),
                            Modifier.toString(method.getModifiers()),
                            Arrays.toString(method.getParameterTypes()),
                            method.getReturnType().getName());
                },
                method -> Objects.equals(method.getName(), "$deserializeLambda$"));
    }
}

// Output
Method name:$deserializeLambda$,Modifier :private static,Method parameter list:[class java.lang.invoke.SerializedLambda],Method return value type:java.lang.Object

So the method does exist. It is the method, described in the class comment of java.lang.invoke.SerializedLambda, that converts a SerializedLambda instance with a matching "capture class" back into a Lambda expression instance. No trace of it can be found in the source code because it is a synthetic method: the compiler (javac) adds $deserializeLambda$ to the capture class when that class contains serializable Lambda expressions, so in practice it can only be called through reflection. It is a rather hidden trick.
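To make the role of $deserializeLambda$ more concrete, the call that SerializedLambda.readResolve() performs can be repeated by hand, without any byte stream. The following is only a sketch under assumptions: the names DeserializeLambdaDemo, Greeter, original and rebuild are invented for illustration, and it assumes the capture class is accessible to reflection from the calling code (for example, both live in the unnamed module).

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

class DeserializeLambdaDemo {

    @FunctionalInterface
    interface Greeter extends Serializable {
        String greet(String name);
    }

    // The lambda appears in this class, so this class is its capture class.
    static Greeter original() {
        return name -> "Hello, " + name;
    }

    // Rebuilds a lambda from its SerializedLambda by calling $deserializeLambda$ reflectively.
    static Greeter rebuild(Greeter lambda) {
        try {
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda serialized = (SerializedLambda) writeReplace.invoke(lambda);

            // getCapturingClass() returns a "/"-separated name, so convert it before loading.
            Class<?> capturingClass = Class.forName(
                    serialized.getCapturingClass().replace('/', '.'),
                    true, lambda.getClass().getClassLoader());
            Method deserialize = capturingClass.getDeclaredMethod("$deserializeLambda$", SerializedLambda.class);
            deserialize.setAccessible(true);
            return (Greeter) deserialize.invoke(null, serialized);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not rebuild lambda", e);
        }
    }

    public static void main(String[] args) {
        Greeter first = original();
        Greeter copy = rebuild(first);
        System.out.println(copy.greet("lambda"));
        System.out.println("same class: " + (copy.getClass() == first.getClass()));
    }
}
```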

Summary

Lambda expressions have been part of the JDK for many years. I did not expect that it would take me until now to understand how they are serialized and deserialized. Although this is not a complex problem, it is one of the more interesting things I have come across recently.

Keywords: Java

Added by epicalex on Fri, 18 Feb 2022 18:21:35 +0200