Learning the Java ASM framework: generating class bytecode from scratch

Tip: ASM is built on the visitor pattern. Learning that pattern first makes ASM much easier to understand.
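For readers who have not met the visitor pattern before, here is a minimal sketch of the idea in plain Java. Every name in it (ShapeVisitor, Circle, Square and so on) is made up for illustration and has nothing to do with ASM's own classes. The point is only that the element calls a visitXxx method on whatever visitor it is given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// All names below are hypothetical; this is not ASM's API.
interface ShapeVisitor {
    void visitCircle(double radius);
    void visitSquare(double side);
}

interface Shape {
    // The element decides which visitXxx method to call on the visitor.
    void accept(ShapeVisitor visitor);
}

final class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override public void accept(ShapeVisitor visitor) { visitor.visitCircle(radius); }
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override public void accept(ShapeVisitor visitor) { visitor.visitSquare(side); }
}

// A visitor that collects a short description of each shape it is shown.
final class DescribingVisitor implements ShapeVisitor {
    final List<String> lines = new ArrayList<>();
    @Override public void visitCircle(double radius) { lines.add("circle r=" + radius); }
    @Override public void visitSquare(double side) { lines.add("square a=" + side); }
}

final class VisitorDemo {
    static List<String> describe(List<Shape> shapes) {
        DescribingVisitor visitor = new DescribingVisitor();
        for (Shape shape : shapes) {
            shape.accept(visitor);
        }
        return visitor.lines;
    }

    public static void main(String[] args) {
        List<Shape> shapes = Arrays.asList(new Circle(1.0), new Square(2.0));
        System.out.println(describe(shapes));
    }
}
```

In ASM the roles are similar: something produces a stream of visitXxx calls that describe a class, and a visitor such as ClassWriter consumes them.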

What is ASM

ASM is a library for reading, generating, and modifying Java bytecode.

Before studying it, you should have a basic understanding of Java I/O and bytecode.

Each ASM release can process the bytecode of the highest Java version it supports and of every earlier version:

ASM version    Java version
2.0            5
3.2            6
4.0            7
5.0            8
6.0            9
6.1            10
7.0            11
7.1            13
8.0            14
9.0            16
9.1            17
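To decide which row of the table applies to an existing class file, you can read the version stored in its header. The following sketch uses only the JDK. The helper class ClassFileVersion is hypothetical, and the "major version minus 44" rule is the standard mapping between class file major versions and Java releases (49 is Java 5, 52 is Java 8, 61 is Java 17).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of ASM.
final class ClassFileVersion {

    // Reads the 8-byte header of a class file and returns its major version.
    static int readMajorVersion(InputStream classFile) {
        try (DataInputStream in = new DataInputStream(classFile)) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();        // minor_version, not needed here
            return in.readUnsignedShort(); // major_version
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Maps a class file major version to the Java release that introduced it.
    static int javaVersion(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        // Inspect a class that ships with the running JDK.
        int major = readMajorVersion(Object.class.getResourceAsStream("Object.class"));
        System.out.println("major " + major + " = Java " + javaVersion(major));
    }
}
```

For example, a class file whose major version is 61 targets Java 17, so according to the table it needs ASM 9.1 or later.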

Functions of ASM

  1. Generate the bytecode of a class from scratch
  2. Analyze existing bytecode
  3. Modify existing bytecode

This is only a simplified summary. ASM can do much more, and it is worth exploring gradually. The Spring framework, for example, uses ASM internally.

How does ASM generate the bytecode of a class

Generating the bytecode of a class from scratch mainly involves two classes: ClassVisitor and ClassWriter.

ClassVisitor

ClassVisitor is an abstract class. Its common subclasses include ClassWriter (Core API) and ClassNode (Tree API).

fields

public abstract class ClassVisitor {
    protected final int api;
    protected ClassVisitor cv;
}
  • api: the ASM API version in use. Opcodes provides constants for it, such as Opcodes.ASM9
  • cv: the next ClassVisitor that calls are delegated to, which lets visitors be chained like a linked list
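The linked-list idea behind cv can be sketched with a stripped-down stand-in. MiniVisitor, Recorder and UpperCaser are hypothetical names and the class has a single method, but the forwarding logic ("if there is a next visitor, pass the call on") follows the same shape as ClassVisitor's default methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for ClassVisitor, reduced to one method.
abstract class MiniVisitor {
    protected final MiniVisitor next; // plays the role of the cv field

    MiniVisitor(MiniVisitor next) {
        this.next = next;
    }

    // Default behaviour: forward the event to the next visitor, if any.
    void visitField(String name) {
        if (next != null) {
            next.visitField(name);
        }
    }
}

// Last link: records what arrives (a real ClassWriter would emit bytes here).
final class Recorder extends MiniVisitor {
    final List<String> seen = new ArrayList<>();

    Recorder() {
        super(null);
    }

    @Override
    void visitField(String name) {
        seen.add(name);
    }
}

// Middle link: rewrites the event, then passes it on through super.
final class UpperCaser extends MiniVisitor {
    UpperCaser(MiniVisitor next) {
        super(next);
    }

    @Override
    void visitField(String name) {
        super.visitField(name.toUpperCase(Locale.ROOT));
    }
}

final class ChainDemo {
    static List<String> run() {
        Recorder recorder = new Recorder();
        MiniVisitor chain = new UpperCaser(recorder);
        chain.visitField("content");
        chain.visitField("result");
        return recorder.seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

This is why a transformation in ASM is usually written as a ClassVisitor subclass that wraps another visitor: it changes only the calls it cares about and lets the rest flow through unchanged.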

methods

ASM uses the visitor pattern, so this class has many visitXxx methods. We focus on four of them (visit/visitField/visitMethod/visitEnd), which correspond to the core parts of a class file.

These methods build the bytecode and must be called in the following order (square brackets mark an optional call; a parenthesized group followed by * may repeat any number of times):

visit
[visitSource][visitModule][visitNestHost][visitPermittedSubclass][visitOuterClass]
(
 visitAnnotation |
 visitTypeAnnotation |
 visitAttribute
)*
(
 visitNestMember |
 visitInnerClass |
 visitRecordComponent |
 visitField |
 visitMethod
)* 
visitEnd
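The ordering rule can be pictured as a small state machine. The sketch below is a deliberately reduced model covering only visit, visitField, visitMethod and visitEnd. OrderChecker is a hypothetical class written for this article. ASM itself ships a more complete checker, CheckClassAdapter, in its asm-util module.

```java
// Hypothetical, simplified model of the call-order contract; not ASM code.
final class OrderChecker {
    private enum State { NEW, OPEN, ENDED }

    private State state = State.NEW;

    void visit() {
        require(state == State.NEW, "visit must be the first call");
        state = State.OPEN;
    }

    void visitField() {
        require(state == State.OPEN, "visitField must come after visit and before visitEnd");
    }

    void visitMethod() {
        require(state == State.OPEN, "visitMethod must come after visit and before visitEnd");
    }

    void visitEnd() {
        require(state == State.OPEN, "visitEnd must come after visit, exactly once");
        state = State.ENDED;
    }

    private static void require(boolean ok, String message) {
        if (!ok) {
            throw new IllegalStateException(message);
        }
    }

    // Replays a sequence of call names; true only if it is legal and complete.
    static boolean accepts(String... calls) {
        OrderChecker checker = new OrderChecker();
        try {
            for (String call : calls) {
                switch (call) {
                    case "visit": checker.visit(); break;
                    case "visitField": checker.visitField(); break;
                    case "visitMethod": checker.visitMethod(); break;
                    case "visitEnd": checker.visitEnd(); break;
                    default: throw new IllegalArgumentException("unknown call: " + call);
                }
            }
            return checker.state == State.ENDED;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

For instance, accepts("visit", "visitField", "visitMethod", "visitEnd") is true, while starting with visitField or calling visitMethod after visitEnd is rejected.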

visit() is used to fill in the basic information of the class

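public void visit(
    int version, // Class file version, e.g. Opcodes.V1_8
    int access, // Access flags of the class
    String name, // Internal name of the class, e.g. java/lang/String
    String signature, // Generic signature; null if the class is not generic
    String superName, // Internal name of the direct superclass
    String[] interfaces) // Internal names of the implemented interfaces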

visitField() is used to populate the field table information of the class

public FieldVisitor visitField(
    final int access, // Access flags of the field
    final String name, // Simple name of the field
    final String descriptor, // Field descriptor
    final String signature, // Generic signature of the field; null if it has none
    final Object value) // Initial constant value, or null. ASM uses it only for static fields; non-static fields must be initialized in a constructor

visitMethod() is used to populate the method table information of the class

public MethodVisitor visitMethod(
    final int access, // Access flags of the method
    final String name, // Simple name of the method
    final String descriptor, // Method descriptor
    final String signature, // Generic signature of the method; null if it has none
    final String[] exceptions) // Internal names of the exception classes in the throws clause; may be null
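The name and descriptor parameters use the JVM's internal notation rather than Java source syntax. As a rough aid, the sketch below derives those strings from Class objects using only the JDK. Descriptors is a hypothetical helper class. ASM offers similar conversions in org.objectweb.asm.Type (for example Type.getDescriptor and Type.getInternalName), which you would normally prefer when ASM is already on the classpath.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper, built on the JDK only.
final class Descriptors {

    // java.lang.String -> java/lang/String
    static String internalName(Class<?> type) {
        return type.getName().replace('.', '/');
    }

    // (parameter descriptors) followed by the return descriptor, e.g. "()V"
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    // A no-argument method descriptor is "()" plus the type's descriptor,
    // so dropping the first two characters leaves the field descriptor.
    static String fieldDescriptor(Class<?> type) {
        return methodDescriptor(type).substring(2);
    }

    public static void main(String[] args) {
        System.out.println(internalName(String.class));
        System.out.println(fieldDescriptor(int.class));
        System.out.println(fieldDescriptor(String.class));
        System.out.println(methodDescriptor(void.class));
        System.out.println(methodDescriptor(String.class, int.class, long[].class));
    }
}
```

Note the trailing semicolon in object descriptors such as Ljava/lang/String; and the [ prefix for arrays.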

visitEnd() signals that the class is complete. No other visitXxx method may be called after it.

public void visitEnd()

ClassWriter

fields

The fields closely mirror the structure of a class file:

public class ClassWriter extends ClassVisitor {
    public static final int COMPUTE_MAXS = 1;
    public static final int COMPUTE_FRAMES = 2;
    private int version;
    private final SymbolTable symbolTable;
    private int accessFlags;
    private int thisClass;
    private int superClass;
    private int interfaceCount;
    private int[] interfaces;
    private FieldWriter firstField;
    private FieldWriter lastField;
    private MethodWriter firstMethod;
    private MethodWriter lastMethod;
    ...
}
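To make the correspondence with the class file concrete, the sketch below writes the same structure by hand with a DataOutputStream, without ASM, and loads the result. It produces an empty public class with no fields and no methods (so it can be loaded but not instantiated). HandWrittenClass and the class name example/sample/Empty are hypothetical, and version 52 (Java 8) is an arbitrary choice. This is an illustration of the layout, not a description of how ClassWriter is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration: the smallest class file, written field by field.
final class HandWrittenClass {

    static byte[] emptyClassBytes(String internalName) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);       // magic
            out.writeShort(0);              // minor_version
            out.writeShort(52);             // major_version (Java 8)
            out.writeShort(5);              // constant_pool_count = entries + 1
            out.writeByte(7);               // #1 CONSTANT_Class ...
            out.writeShort(2);              //    ... named by #2
            out.writeByte(1);               // #2 CONSTANT_Utf8
            out.writeUTF(internalName);     //    u2 length + bytes
            out.writeByte(7);               // #3 CONSTANT_Class ...
            out.writeShort(4);              //    ... named by #4
            out.writeByte(1);               // #4 CONSTANT_Utf8
            out.writeUTF("java/lang/Object");
            out.writeShort(0x0021);         // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);              // this_class  -> #1
            out.writeShort(3);              // super_class -> #3
            out.writeShort(0);              // interfaces_count
            out.writeShort(0);              // fields_count
            out.writeShort(0);              // methods_count
            out.writeShort(0);              // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Defines the class in a throwaway ClassLoader so it can be inspected.
    static Class<?> define(String binaryName, byte[] bytes) {
        return new ClassLoader() {
            Class<?> load() {
                return defineClass(binaryName, bytes, 0, bytes.length);
            }
        }.load();
    }

    public static void main(String[] args) {
        Class<?> cls = define("example.sample.Empty", emptyClassBytes("example/sample/Empty"));
        System.out.println(cls.getName() + " extends " + cls.getSuperclass().getName());
    }
}
```

The version, access flags, this/super class indexes and the counts written here line up with the version, accessFlags, thisClass, superClass and interfaceCount fields listed above, while the constant pool is what SymbolTable manages.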

methods

Constructor

public ClassWriter(int flags) { // 0, COMPUTE_MAXS or COMPUTE_FRAMES
    this((ClassReader) null, flags);
}

Parameter explanation:

  • ClassWriter.COMPUTE_MAXS: ASM automatically computes max stack and max locals
  • ClassWriter.COMPUTE_FRAMES (the usual choice): ASM automatically computes max stack, max locals, and the stack map frames

The other visitXxx methods are inherited from the parent class ClassVisitor. The one additional method that matters here is:

public byte[] toByteArray()   // Returns the generated class file as a byte array
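The array returned by toByteArray() is a complete class file, so one simple thing to do with it is save it to disk, where tools such as javap can inspect it. The sketch below uses only java.nio. ClassFileSaver and its parameters are hypothetical names, and the convention of storing a class under its internal name plus ".class" is the usual classpath layout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for writing generated bytes to a classpath-style directory.
final class ClassFileSaver {

    // internalName uses '/' separators, e.g. "example/sample/TestClass".
    static Path save(Path outputRoot, String internalName, byte[] classBytes) {
        Path target = outputRoot.resolve(internalName + ".class");
        try {
            Files.createDirectories(target.getParent()); // create package directories
            Files.write(target, classBytes);             // create or overwrite the file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

If outputRoot is a directory on the classpath, the saved class can then be found by the regular class loaders.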

Simple demonstration

/**
 * Outline of generating bytecode. The visitXxx arguments are omitted here;
 * the signatures above show what each call takes.
 *
 * @return the generated class file as a byte array
 */
public static byte[] generateClass() {
    // Create the ClassWriter
    ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);

    // Describe the class by calling the visitXxx methods in order
    cw.visit();
    cw.visitField();
    cw.visitMethod();
    cw.visitEnd();

    // Retrieve the result from the ClassWriter
    return cw.toByteArray();
}

Configuring fields and methods in detail

If a class only needs declarations, the methods above are enough. To go further, for example to add annotations to a field or give a method a body, we have to work with the FieldVisitor returned by visitField and the MethodVisitor returned by visitMethod. Both classes resemble ClassVisitor.

FieldVisitor

FieldVisitor is also an abstract class. Its fields mirror those of ClassVisitor, and its methods are likewise visitXxx methods that must be called in a fixed order:

(
 visitAnnotation |
 visitTypeAnnotation |
 visitAttribute
)*
visitEnd

You will rarely need these methods, but if you keep the returned FieldVisitor, remember to call visitEnd() on it:

{
    FieldVisitor fv = cw.visitField(
            Opcodes.ACC_PUBLIC + Opcodes.ACC_FINAL + Opcodes.ACC_STATIC,
            "TEST",
            "I",
            null,
            1   // constant value
    );
    fv.visitEnd();
}
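The access argument is a bit mask, which is why the flags can be combined with either + or | as long as each flag appears once. The values for public, static and final (0x0001, 0x0008, 0x0010) are defined by the class file format, and java.lang.reflect.Modifier uses the same numbers, so the idea can be tried without ASM. AccessFlags is a hypothetical helper class.

```java
import java.lang.reflect.Modifier;

// Hypothetical helper; uses JDK constants that share their values with the
// ACC_PUBLIC, ACC_STATIC and ACC_FINAL flags of the class file format.
final class AccessFlags {

    static int publicStaticFinal() {
        return Modifier.PUBLIC | Modifier.STATIC | Modifier.FINAL;
    }

    static String describe(int flags) {
        return Modifier.toString(flags);
    }

    public static void main(String[] args) {
        int flags = publicStaticFinal();
        System.out.println(Integer.toHexString(flags) + " = " + describe(flags));
    }
}
```

Prefer | in your own code: adding the same flag twice with + silently produces a different flag, while | is idempotent. Be aware that some bit values are reused with different meanings depending on context. For example 0x0020 is ACC_SUPER on a class but the synchronized flag on a method.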

MethodVisitor

MethodVisitor is also an abstract class. Its fields mirror those of ClassVisitor, and its methods are likewise visitXxx methods that must be called in a fixed order:

(visitParameter)*
[visitAnnotationDefault]
(visitAnnotation | visitAnnotableParameterCount | visitParameterAnnotation | visitTypeAnnotation | visitAttribute)*
[
    visitCode
    (
        visitFrame |
        visitXxxInsn |
        visitLabel |
        visitInsnAnnotation |
        visitTryCatchBlock |
        visitTryCatchAnnotation |
        visitLocalVariable |
        visitLocalVariableAnnotation |
        visitLineNumber
    )*
    visitMaxs
]
visitEnd

Here is an example that defines a no-argument constructor:

{
    MethodVisitor mv1 = cw.visitMethod(
            Opcodes.ACC_PUBLIC,
            "<init>",
            "()V",
            null,
            null
    );
    // Mark the start of the method body
    mv1.visitCode();

    // Emit the instructions of the method body
    mv1.visitVarInsn(Opcodes.ALOAD, 0);
    mv1.visitMethodInsn(Opcodes.INVOKESPECIAL, "java/lang/Object", "<init>", "()V", false);
    mv1.visitInsn(Opcodes.RETURN);

    // Mark the end of the method body and set max stack and max locals.
    // With ClassWriter.COMPUTE_FRAMES or ClassWriter.COMPUTE_MAXS the arguments
    // are ignored and recomputed, but the call itself must still be made.
    mv1.visitMaxs(1, 1);

    // Mark the end of the method
    mv1.visitEnd();
}
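Where do the two 1s in visitMaxs(1, 1) come from? Max locals is 1 because the only local variable slot in use is slot 0, which holds this. Max stack is the deepest the operand stack ever gets. The sketch below computes that for straight-line code from the net stack effect of each instruction. MaxStackSketch is a hypothetical helper and a strong simplification: it ignores branches and two-slot values (long and double), which ASM's COMPUTE_MAXS has to take into account.

```java
// Hypothetical, simplified model: linear code only, one stack slot per value.
final class MaxStackSketch {

    // Each argument is the net change in operand-stack depth of one instruction.
    static int maxStack(int... stackDeltas) {
        int depth = 0;
        int max = 0;
        for (int delta : stackDeltas) {
            depth += delta;
            if (depth < 0) {
                throw new IllegalArgumentException("stack underflow");
            }
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        // ALOAD 0 pushes this (+1), INVOKESPECIAL Object.<init>()V pops it (-1),
        // RETURN leaves the stack unchanged (0).
        System.out.println(maxStack(+1, -1, 0));
    }
}
```

For the constructor above the deepest point is reached right after ALOAD 0, which is why a max stack of 1 is sufficient.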

Hands-on: generating a simple class

The target class is as follows:

package example.sample;

public class TestClass {
    public int content;
    public String result;
    public static final boolean flag = true;
    public TestClass() {}
}

The ASM code is as follows:

package example.generate;

import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.FieldVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class Test {
    public static byte[] generateClass() {
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_FRAMES);

        cw.visit(
                Opcodes.V1_8,   // Class file version for Java 8
                Opcodes.ACC_PUBLIC | Opcodes.ACC_SUPER, // Access flags
                "example/sample/TestClass", // Internal name: fully qualified, with '/' separators
                null,
                "java/lang/Object", // Internal name of the direct superclass
                null
        );

        {
            FieldVisitor fv = cw.visitField(    // Ignoring the returned FieldVisitor would produce the same class
                    Opcodes.ACC_PUBLIC,
                    "content",
                    "I",
                    null,
                    null    // No initial value: ASM only uses this argument for static fields
            );
            fv.visitEnd();
        }

        {
            FieldVisitor fv = cw.visitField(
                    Opcodes.ACC_PUBLIC,
                    "result",
                    "Ljava/lang/String;",  // Don't forget the semicolon
                    null,
                    null    // No initial value for a non-static field
            );
            fv.visitEnd();
        }

        {
            FieldVisitor fv = cw.visitField(
                    Opcodes.ACC_PUBLIC | Opcodes.ACC_STATIC | Opcodes.ACC_FINAL,
                    "flag",
                    "Z",
                    null,
                    true
            );
            fv.visitEnd();
        }

        {
            MethodVisitor mv = cw.visitMethod(
                    Opcodes.ACC_PUBLIC,
                    "<init>", // The name of the constructor in bytecode
                    "()V",
                    null,
                    null
            );
            mv.visitCode();
            mv.visitVarInsn(Opcodes.ALOAD, 0);
            mv.visitMethodInsn(Opcodes.INVOKESPECIAL, "java/lang/Object", "<init>", "()V", false);
            mv.visitInsn(Opcodes.RETURN);
            mv.visitMaxs(1, 1);
            mv.visitEnd();
        }

        cw.visitEnd();

        return cw.toByteArray();
    }
}

Test code:

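@Test
void fun4() throws Exception {
    byte[] bytes = example.generate.Test.generateClass();

    // Class.forName cannot find a class that exists only as a byte array,
    // so define it through a small ClassLoader first
    ClassLoader loader = new ClassLoader() {
        @Override
        protected Class<?> findClass(String name) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    };

    Class<?> cls = loader.loadClass("example.sample.TestClass");
    Object obj = cls.getDeclaredConstructor().newInstance();
    Field f1 = cls.getField("content");
    f1.set(obj, 10);
    System.out.println(f1.get(obj));   // 10
    System.out.println(f1.getName());  // content
}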

Keywords: Java Back-end

Added by jsschmitt on Tue, 25 Jan 2022 03:44:38 +0200