Java Agent for bytecode instrumentation

This article explains Java Agent in detail, demystifies how it works, and shows how developers can use it to meet more business requirements.

What is Java Agent

Java Agent, also known as a Java probe, lets you add bytecode to already compiled Java classes, which makes it the entry point for bytecode instrumentation. Through the Java Instrumentation API it can hook into an application running on the JVM and modify the bytecode of that application's classes.

In a Java runtime environment, instrumentation is a technique for changing an existing application by adding code to it, either at compile time or at run time. Its advantage is that we can change a program's behavior without editing its source files. This is very powerful, but also very dangerous.

It can be used to implement many features, such as AOP.

Java Agent is part of the Java Instrumentation API. The API provides a mechanism for modifying bytecode dynamically or statically, which means we can add code to Java classes without touching the source code. This has far-reaching consequences for Java applications.

Put simply, a Java Agent is a special jar file. It contains a Java class that follows special conventions, and its MANIFEST.MF file (usually stored in the src/main/resources/META-INF folder) names that class. The class can declare two kinds of entry methods:

  • premain: runs when the agent is loaded at JVM startup. For example, when starting a jar file with the java -jar command, add the -javaagent option to load the Java Agent. The agent runs after the JVM has initialized and before the main method.

    Its method signature is as follows:

    public static void premain(String agentArgs, Instrumentation inst) 
    

    If the JVM does not find the method above, it falls back to the following optional signature:

    public static void premain(String agentArgs) 
    
  • agentmain: runs when the agent is loaded dynamically into the JVM through the Java Attach API, after the JVM has started. For example, we can load the agent through the Attach API from our own main method.

    Its method signatures are as follows (the first one is tried first):

    public static void agentmain(String agentArgs, Instrumentation inst) 
    public static void agentmain(String agentArgs) 
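
Both entry points receive their options as a single agentArgs string, and the format of that string is entirely up to the agent. The following is a minimal sketch, assuming a hypothetical comma-separated key=value convention (the class name AgentArgsSketch and the option names are made up for illustration, they are not part of the Instrumentation API):

```java
import java.lang.instrument.Instrumentation;
import java.util.LinkedHashMap;
import java.util.Map;

// A real agent class is normally declared public in its own file;
// it is package-private here only so the snippet compiles under any file name.
final class AgentArgsSketch {

    // Called when the agent is loaded with -javaagent at JVM startup.
    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("premain options=" + parse(agentArgs));
    }

    // Called when the agent is loaded later through the Attach API.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        System.out.println("agentmain options=" + parse(agentArgs));
    }

    // Hypothetical convention: "key1=value1,key2=value2"; a bare key maps to "".
    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.isEmpty()) {
            return options; // agentArgs is null when no options were given
        }
        for (String pair : agentArgs.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                options.put(pair.trim(), "");
            } else {
                options.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return options;
    }
}
```

Keeping the parsing in one helper means premain and agentmain interpret the options identically, whichever way the agent is loaded.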
    

How does the JVM recognize a Java Agent? The answer is the MANIFEST.MF file, which is part of the jar file and holds the jar's metadata. The agent-related attributes are as follows:

  • Premain-Class: specifies the agent class used when the agent is loaded at JVM startup, that is, the class containing the premain method. Its value is the fully qualified class name. This attribute is required if the agent is specified at JVM startup.
  • Agent-Class: if the implementation supports starting an agent some time after the JVM has started, this attribute specifies the agent class, that is, the class containing the agentmain method. Its value is the fully qualified class name.
  • Can-Redefine-Classes: whether the Java Agent can redefine Java classes. The value is true or false, and the default is false.
  • Can-Retransform-Classes: whether the Java Agent can retransform Java classes. The value is true or false, and the default is false.
  • Can-Set-Native-Method-Prefix: whether the Java Agent can set a native method prefix. The value is true or false, and the default is false.
  • Boot-Class-Path: a list of paths to be searched by the bootstrap class loader. The bootstrap class loader searches these paths after the platform-specific mechanism for locating a class has failed. Paths are searched in the order listed and are separated by one or more spaces. A path uses the path component syntax of a hierarchical URI: if it starts with a slash character ("/") it is absolute, otherwise it is relative. A relative path is resolved against the absolute path of the agent JAR file. Malformed and non-existent paths are ignored. If the agent is started some time after VM startup, paths that do not represent a JAR file are ignored as well.
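
To make the attribute names concrete, here is a small sketch that builds such a manifest in memory with the standard java.util.jar API and renders it as text. The class name AgentManifestSketch is an assumption for illustration; in a real project the build tool normally generates the file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class AgentManifestSketch {

    // Builds a manifest that declares the same class for both entry points.
    static Manifest build(String agentClassName) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise write() emits no main attributes.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Premain-Class", agentClassName);
        main.putValue("Agent-Class", agentClassName);
        main.putValue("Can-Redefine-Classes", "true");
        main.putValue("Can-Retransform-Classes", "true");
        return manifest;
    }

    // Renders the manifest in the textual form stored as META-INF/MANIFEST.MF.
    static String render(Manifest manifest) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        return out.toString(StandardCharsets.UTF_8.name());
    }
}
```

Calling AgentManifestSketch.render(AgentManifestSketch.build("cn.wygandwdn.agent.MyAgent")) yields one "Name: value" line per attribute, which is the layout the JVM reads from the agent jar.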

When building the project with Maven, we can configure maven-jar-plugin in the build section of pom.xml to generate the MANIFEST.MF file automatically, for example:

	<build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.2.0</version>
                <configuration>
                    <archive>
                        <!-- true: add a Class-Path entry to the generated MANIFEST.MF -->
                        <manifest>
                            <addClasspath>true</addClasspath>
                        </manifest>
                        <!-- Attributes written to MANIFEST.MF; classes are given by fully qualified name -->
                        <manifestEntries>
                            <Premain-Class>cn.wygandwdn.agent.TestAgent</Premain-Class>
                            <Agent-Class>cn.wygandwdn.agent.TestAgent</Agent-Class>
                            <Can-Redefine-Classes>true</Can-Redefine-Classes>
                            <Can-Retransform-Classes>true</Can-Retransform-Classes>
                        </manifestEntries>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>

Java Instrumentation API in detail

Official documents:

java.instrument (Java SE 9 & JDK 9 )

java.lang.instrument (Java Platform SE 8 )

The Java Instrumentation API mainly provides two interfaces, Instrumentation and ClassFileTransformer:

  • Instrumentation: this interface provides the services needed to instrument Java code
  • ClassFileTransformer: as the name suggests, this is the class file transformer interface. A Java Agent provides an implementation of it in order to transform class files

Instrumentation provides methods for adding and removing class transformers, obtaining the classes loaded by the JVM, redefining classes, and checking whether retransformation and redefinition are supported. The details are as follows:

  • addTransformer(ClassFileTransformer transformer, boolean canRetransform): registers a class transformer. All future class definitions pass through the transformer, except the definitions of classes that the registered transformers themselves depend on. The transformer is called when a class is loaded or redefined and, if canRetransform is true, when it is retransformed. If one transformer throws an exception, the JVM still calls the other transformers in order. The same transformer instance can be added more than once, but this is strongly discouraged; create a new instance of the transformer class instead
  • addTransformer(ClassFileTransformer transformer): registers a class transformer; equivalent to addTransformer(transformer, false)
  • removeTransformer(ClassFileTransformer transformer): unregisters the transformer, so that future class definitions no longer pass through it. The most recently added matching instance is removed. Because class loading is multithreaded, a transformer may still receive calls after it has been removed, so our transformer should be written defensively for that case
  • isRetransformClassesSupported(): reports whether the current JVM configuration supports retransforming classes. This requires the Can-Retransform-Classes attribute to be set to true in the MANIFEST.MF file of the agent jar
  • retransformClasses(Class<?>... classes): retransforms the supplied classes, which makes it possible to instrument classes that are already loaded. The original class file bytes are passed through the registered ClassFileTransformer instances again. If a retransformed method is running during the retransformation, the active invocation continues to run the original bytecode, and the new bytecode is used on subsequent invocations. The method does not trigger any initialization: retransforming a class does not run its initializers, and static variables keep the values they had before the call. Instances of the retransformed class are not affected. Retransformation may change method bodies, the constant pool and attributes, but it must not add, remove or rename fields or methods, change method signatures, or change inheritance (these restrictions may be lifted in future versions)
  • isRedefineClassesSupported(): reports whether the current JVM configuration supports redefining classes. This requires the Can-Redefine-Classes attribute to be set to true in the MANIFEST.MF file of the agent jar
  • redefineClasses(ClassDefinition... definitions): redefines the supplied classes using the supplied class file bytes. This method replaces a class definition without reference to the existing class file bytes, as you would when recompiling from source for fix-and-continue debugging. When the existing class file bytes are to be transformed, use retransformClasses instead. The other properties are similar to retransformClasses
  • isModifiableClass(Class<?> theClass): tests whether the class can be modified by retransformation or redefinition, and returns true if it can, otherwise false. Primitive classes and array classes are never modifiable
  • getAllLoadedClasses(): returns an array of all classes currently loaded by the JVM
  • getInitiatedClasses(ClassLoader loader): returns an array of all classes for which the given class loader is an initiating loader. If the loader is null, the classes initiated by the bootstrap class loader are returned
  • getObjectSize(Object objectToSize): returns an implementation-specific approximation of the amount of storage, in bytes, consumed by the specified object

The implementation class of Instrumentation is sun.instrument.InstrumentationImpl. Refer to that class for the details of the implementation, which are not repeated here. Knowing how to use most of the methods that Instrumentation provides is very helpful when implementing specific features.

ClassFileTransformer centers on the transform method, shown here with its default implementation, which transforms a class file and returns the result as a byte array:

	/*
	Transforms the given class file and returns the new class file bytes, or null if no transformation is performed
	Parameters:
	loader - the defining class loader of the class to be transformed; null if it is the bootstrap class loader
	className - the name of the class in internal form, for example java/util/List
	classBeingRedefined - if triggered by a redefinition or retransformation, the class being redefined or retransformed; null if this is a class load
	protectionDomain - the protection domain of the class being defined or redefined
	classfileBuffer - the input byte buffer in class file format; must not be modified
	*/
	default byte[]
    transform(  ClassLoader         loader,
                String              className,
                Class<?>            classBeingRedefined,
                ProtectionDomain    protectionDomain,
                byte[]              classfileBuffer)
        throws IllegalClassFormatException {
        return null;
    }

In our transform implementation we modify the class through bytecode enhancement, for example by using javassist or ASM to rewrite the class definition so that it does what we want. The ClassFileTransformer implementation is then registered with Instrumentation. From then on the JVM calls our transform method for every class it loads and, if retransformation is allowed, for classes that are retransformed.
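
Before reaching for a bytecode library, it can help to see the contract of transform in isolation. The sketch below is a transformer that changes nothing: it only records which classes under a given package prefix pass through it and returns null, which tells the JVM to keep the original bytes. The class name PrefixLoggingTransformer is hypothetical.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class PrefixLoggingTransformer implements ClassFileTransformer {

    private final String prefix; // dotted package prefix, e.g. "cn.wygandwdn"
    private final List<String> seen = new CopyOnWriteArrayList<>(); // transform may run on many threads

    PrefixLoggingTransformer(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (className == null) {
            return null; // the name can be absent, so guard before using it
        }
        // className arrives in internal form (slashes); convert it before comparing
        String dotted = className.replace('/', '.');
        if (dotted.startsWith(prefix)) {
            seen.add(dotted);
        }
        return null; // null means "no change", the JVM keeps the original class file
    }

    List<String> seen() {
        return seen;
    }
}
```

An agent would register it with inst.addTransformer(new PrefixLoggingTransformer("cn.wygandwdn")). A transformer that really rewrites bytecode returns a new byte array instead of null and leaves classfileBuffer untouched.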

Principle of Instrumentation API

The implementation of the java.lang.instrument package depends on JVMTI (Java Virtual Machine Tool Interface), a set of native interfaces that the Java virtual machine exposes to JVM-related tools. JVMTI was introduced in Java SE 5, where it integrated and replaced the previously used Java Virtual Machine Profiler Interface (JVMPI) and Java Virtual Machine Debug Interface (JVMDI). By Java SE 6, JVMPI and JVMDI had disappeared. JVMTI provides an agent mechanism that lets third-party tools connect to and access the JVM as agents and use the rich programming interface of JVMTI to implement many JVM-related functions.

In fact, the java.lang.instrument package is built on this mechanism: inside the Instrumentation implementation there is a JVMTI agent, which performs the dynamic manipulation of Java classes by calling the class-related functions of JVMTI. Besides the Instrumentation functionality, JVMTI also provides many valuable functions for virtual machine memory management, thread control, method and variable manipulation, and so on.

Official JVMTI documentation: JVMTI

Process of loading the instrument agent at startup:

  1. Create and initialize JPLISAgent;
  2. Listen for the VMInit event and do the following once the JVM has initialized:
    1. Create an InstrumentationImpl object;
    2. Listen for ClassFileLoadHook events;
    3. Call the loadClassAndCallPremain method of InstrumentationImpl, which in turn calls the premain method of the Premain-Class specified in the MANIFEST.MF of the agent jar;
  3. Parse the attributes in the MANIFEST.MF file of the agent jar and configure JPLISAgent accordingly.

Process of loading the instrument agent at runtime:

The target JVM is asked to load the agent through the JVM's attach mechanism. The process is roughly as follows:

  1. Create and initialize JPLISAgent;
  2. Parse the attributes in the MANIFEST.MF of the agent jar;
  3. Create an InstrumentationImpl object;
  4. Listen for ClassFileLoadHook events;
  5. Call the loadClassAndCallAgentmain method of InstrumentationImpl, which in turn calls the agentmain method of the Agent-Class specified in the MANIFEST.MF of the agent jar.

Example

The general process of implementing a Java Agent is as follows:

  1. Write the Java Agent class and implement the required functionality with a ClassFileTransformer
  2. Write the resources/META-INF/MANIFEST.MF file by hand, or have it generated automatically through the pom.xml configuration
  3. Build the jar and note its absolute path for later use
  4. Start the main method or jar we want to run with the JVM option -javaagent:<absolute path of the agent jar>=<args>, so that the specified Java Agent runs

This is the basic process for implementing a Java Agent; adjust it to your actual needs.
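
The shape of the -javaagent option in step 4 is easy to get wrong, so here is a small sketch that assembles the launch command programmatically. The helper name AgentLaunchSketch and the argument values used with it are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class AgentLaunchSketch {

    // Builds: java -javaagent:<agentJar>[=<agentArgs>] -cp <classPath> <mainClass>
    static List<String> command(String agentJar, String agentArgs, String classPath, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        String option = "-javaagent:" + agentJar; // no space after the colon
        if (agentArgs != null && !agentArgs.isEmpty()) {
            option += "=" + agentArgs;            // everything after '=' is passed to premain as agentArgs
        }
        cmd.add(option);
        cmd.add("-cp");
        cmd.add(classPath);
        cmd.add(mainClass);
        return cmd;
    }
}
```

The resulting list can be handed to new ProcessBuilder(cmd).inheritIO().start(), or simply printed and pasted into a terminal.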

premain

A custom transformer rewrites classes so that the execution time of each method is measured. The implementation class is as follows:

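import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;

import javassist.CannotCompileException;
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;
import javassist.CtNewMethod;
import javassist.NotFoundException;

public class CalculateTime implements ClassFileTransformer {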

    private String config;
    private ClassPool pool;

    public CalculateTime(String config, ClassPool pool) {
        this.config = config;
        this.pool = pool;
    }

    private final static String source = "{\n"
            + "    long begin = System.currentTimeMillis();\n"
            + "    Object result;\n"
            + "    try {\n"
            + "        result = ($w) %s$agent($$);\n"
            + "    } finally {\n"
            + "        long end = System.currentTimeMillis();\n"
            + "        System.out.println(\"%s Method execution time is: \" + (end - begin) + \"ms\");"
            + "    }\n"
            + "    return ($r) result;"
            + "}\n";

    private final static String voidSource = "{\n" +
            "            long begin = System.currentTimeMillis();\n" +
            "            try {\n" +
            "                %s$agent($$);\n" +
            "            } finally {\n" +
            "                long end = System.currentTimeMillis();\n" +
            "                System.out.println(\"%s Method execution time is: \" + (end - begin) + \"ms\");\n" +
            "            }\n" +
            "        }";

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException {
        // className uses the internal form (cn/wygandwdn/...), so convert it before comparing
        if (className == null || !className.replace('/', '.').startsWith(this.config)) {
            return null;
        }
        try {
            className = className.replace('/', '.');
            CtClass ctClass = pool.get(className);
            // Get all methods declared by this class (including private ones)
            CtMethod[] methods = ctClass.getDeclaredMethods();
            for (CtMethod method : methods) {
                newMethod(method);
            }
            return ctClass.toBytecode();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    private static CtMethod newMethod(CtMethod oldMethod) throws CannotCompileException, NotFoundException {
        CtMethod copy = CtNewMethod.copy(oldMethod, oldMethod.getDeclaringClass(), null);
        copy.setName(oldMethod.getName() + "$agent");
        oldMethod.getDeclaringClass().addMethod(copy);
        if (oldMethod.getReturnType().equals(CtClass.voidType)) {
            oldMethod.setBody(String.format(voidSource, oldMethod.getName(), oldMethod.getName()));
        } else {
            oldMethod.setBody(String.format(source, oldMethod.getName(), oldMethod.getName()));
        }
        return copy;
    }

}

Note that the original method is not modified directly in the following way:

method.insertBefore("long start = System.currentTimeMillis();");
method.insertAfter("System.out.println(System.currentTimeMillis() - start);");

The reason is that javassist compiles each inserted snippet as a separate code block, so a local variable declared in one block is not visible in the other, and the approach above fails with an error. Instead, the transformer copies the original method under the name (original method name + $agent), adds the copy to the CtClass, and then replaces the body of the original method with code that calls the copy and measures the time. Methods with and without a return value are handled separately; see the newMethod method for details.

Define the agent class with the premain method:
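
To picture the result of this copy-and-wrap approach, the following hand-written class shows roughly what a transformed class behaves like. It is an illustration written in plain Java, not output produced by javassist, and the Greeter-style class and its greet method are hypothetical.

```java
final class GreeterAfterTransform {

    // Renamed copy of the original method body (the method used to be called greet).
    String greet$agent(String name) {
        return "hello " + name;
    }

    // The original method name now holds only the timing wrapper.
    String greet(String name) {
        long begin = System.currentTimeMillis();
        Object result;
        try {
            result = greet$agent(name); // delegate to the copied original
        } finally {
            long end = System.currentTimeMillis();
            System.out.println("greet Method execution time is: " + (end - begin) + "ms");
        }
        return (String) result;
    }
}
```

Because begin and end live in the same method body, the scoping problem of two separately compiled snippets does not arise, and callers of greet still get the original return value.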

import java.lang.instrument.Instrumentation;

import javassist.ClassPool;

public class MyAgent {

    /**
     * The method is executed before the main method of the main program is executed
     * @param agentArgs
     * @param inst
     */
    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("premain start");
        final String config = agentArgs;
        final ClassPool pool = ClassPool.getDefault();
        inst.addTransformer(new CalculateTime(config, pool));
    }

    /**
     * When the main method of the main program is executed, the method can be called through the attach api
     * @param args
     * @param inst
     */
    public static void agentmain(String args, Instrumentation inst) {
        System.out.println("load agent after main run.args=" + args);
        // Print all classes currently loaded by the JVM
        Class<?>[] allLoadedClasses = inst.getAllLoadedClasses();
        for (Class<?> allLoadedClass : allLoadedClasses) {
            System.out.println(allLoadedClass.getName());
        }
        System.out.println("agent run completely.");
    }
}

Here the resources/META-INF/MANIFEST.MF file is not written by hand; instead, pom.xml is configured as follows:

	<dependencies>
        <dependency>
            <groupId>org.javassist</groupId>
            <artifactId>javassist</artifactId>
        </dependency>
    </dependencies>
	<build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.2.0</version>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                        </manifest>
                        <manifestEntries>
                            <Premain-Class>cn.wygandwdn.agent.MyAgent</Premain-Class>
                            <Agent-Class>cn.wygandwdn.agent.MyAgent</Agent-Class>
                            <Can-Redefine-Classes>true</Can-Redefine-Classes>
                            <Can-Retransform-Classes>true</Can-Retransform-Classes>
                        </manifestEntries>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>

After the above configuration is complete, build the jar. Then test the agent with the following classes:

public class BlogService {
    public String getBlog() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "blog's content";
    }
}

public class TestAgentMain {
    public static void main(String[] args) {
        BlogService service = new BlogService();
        service.getBlog();
    }
}

Configure the startup parameters in IDEA:

1. Open the run configuration of the main method:

2. Add VM options

3. Set the VM options and specify the -javaagent option

Then run the main method, and the results are as follows:

agentmain

attach api

agentmain is the entry point of a Java Agent that is loaded after the JVM has started, which is done mainly through the Attach API. The Attach API loads a Java Agent by means of the VirtualMachine and VirtualMachineDescriptor classes. Official API documentation: com.sun.tools.attach (Java SE 9 & JDK 9 )

  • VirtualMachine: this class represents the target JVM process that the current JVM has attached to. An application uses VirtualMachine to load a Java Agent into the target JVM. For example, a profiler written in the Java language might attach to a running application and load its profiler agent to profile that application. In addition, VirtualMachine provides access to the system properties of the target JVM. See the source code of the class for details.

    This class lets us connect to another JVM process by passing its PID (process id) to the attach method. Injecting an agent is only one of its many functions. The loadAgent method registers an agent with the target JVM, and the agent receives an Instrumentation instance. With that instance the agent can change the bytecode of a class before it is loaded, or retransform or redefine a class after it has been loaded. The actual changes are made by the ClassFileTransformer implementations registered with the Instrumentation instance.

  • VirtualMachineDescriptor: a container class that describes a JVM. It works together with the VirtualMachine class

The principle of dynamic injection through attach is very simple:

  1. Connect to a running JVM process with the VirtualMachine.attach(pid) method
  2. Inject the Java Agent jar into that process with loadAgent(agent); the target process then calls the agentmain method

With the Attach API and the agentmain method, we can load an agent dynamically while the program is running, so the agent runs after the main method has started and can still instrument the program. Compared with premain, which can only run before the main method by way of the -javaagent option, this is much more flexible and makes it possible to modify a program that is already running.
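
The two steps above can be sketched with the com.sun.tools.attach classes as follows. The class name AttachSketch and its helper methods are hypothetical, and the sketch assumes it runs on a JDK (the jdk.attach module is not part of every runtime image).

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.Optional;

final class AttachSketch {

    // Looks up the PID of a local JVM whose display name ends with the given suffix,
    // typically the main class name.
    static Optional<String> findPid(String displayNameSuffix) {
        for (VirtualMachineDescriptor descriptor : VirtualMachine.list()) {
            if (descriptor.displayName().endsWith(displayNameSuffix)) {
                return Optional.of(descriptor.id());
            }
        }
        return Optional.empty();
    }

    // Step 1: attach to the target JVM. Step 2: load the agent jar, which makes
    // the target JVM call the agentmain method of the jar's Agent-Class.
    static void loadAgent(String pid, String agentJarPath, String agentArgs) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgent(agentJarPath, agentArgs);
        } finally {
            vm.detach(); // always release the connection, even if loading fails
        }
    }
}
```

The detach call sits in a finally block so that a failure while loading the agent does not leave the connection to the target JVM open.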

Simple case

The agent from the premain example also has an agentmain method, which simply prints all classes loaded by the application. To run it, we load the Java Agent jar through the Attach API; once the jar is loaded, the agentmain method runs. For simplicity, the main method below looks up the current JVM and loads the agent into it:

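import com.sun.tools.attach.AgentInitializationException;
import com.sun.tools.attach.AgentLoadException;
import com.sun.tools.attach.AttachNotSupportedException;
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;

import java.io.IOException;
import java.util.List;

public class TestAgentMain {
    public static void main(String[] args) throws IOException, AttachNotSupportedException, AgentLoadException, AgentInitializationException {
        // test agentmain
        System.out.println("====start test agentmain====");
        List<VirtualMachineDescriptor> list = VirtualMachine.list();
        for (VirtualMachineDescriptor virtualMachineDescriptor : list) {
            // To attach to the current JVM, add the JVM option -Djdk.attach.allowAttachSelf=true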
            if (virtualMachineDescriptor.displayName().endsWith("cn.wygandwdn.TestAgentMain")) {
                VirtualMachine virtualMachine = VirtualMachine.attach(virtualMachineDescriptor.id());
                virtualMachine.loadAgent("D:/java_project/trace/learn-javaagent/target/learn-javaagent-1.0-SNAPSHOT.jar");
                virtualMachine.detach();
            }
        }
    }
}

If the previous startup parameters are still used at this point, an error is reported:

This is because, by default, a JVM is not allowed to attach to itself. We can lift this restriction with a VM option. The settings are as follows:

-javaagent:learn-javaagent-1.0-SNAPSHOT.jar
-Djdk.attach.allowAttachSelf=true

The VM option that makes this work is -Djdk.attach.allowAttachSelf=true
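
When the target is the current JVM, the PID can also be obtained directly instead of searching the list of JVMs by display name. The following is a minimal sketch, assuming Java 9 or later (for ProcessHandle) and a JVM started with -Djdk.attach.allowAttachSelf=true; the class name SelfAttachSketch is hypothetical.

```java
import com.sun.tools.attach.VirtualMachine;

final class SelfAttachSketch {

    // PID of the JVM that runs this code, in the string form expected by attach().
    static String currentPid() {
        return Long.toString(ProcessHandle.current().pid());
    }

    // Loads the agent jar into the current JVM. Without the allowAttachSelf option
    // the attach call is expected to fail with an exception.
    static void loadIntoCurrentJvm(String agentJarPath, String agentArgs) throws Exception {
        VirtualMachine vm = VirtualMachine.attach(currentPid());
        try {
            vm.loadAgent(agentJarPath, agentArgs);
        } finally {
            vm.detach();
        }
    }
}
```

This avoids depending on the display name of the process, which changes with the way the program is launched.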

The operation results of the main method are as follows:

References

Special thanks:

JVM bytecode plug-in magic, you can monitor the java online system without changing a line of code. Alibaba eagle eye and Huawei SkyWalking are implemented with it

Java agent user guide

Function and principle of Java Instrument

Java programmers must know: in-depth understanding of Instrument

Javassist explanation

Welcome to my personal blog: Who is Feng on the blog

Added by php_jord on Mon, 14 Feb 2022 03:17:53 +0200