Deep understanding of Instrument

I Premise

I learned a long time ago that the current mainstream open-source APM frameworks, such as Pinpoint and SkyWalking, all rely on the bytecode enhancement provided by the java.lang.instrument package. While my enthusiasm for the topic has not subsided, I am taking the time to analyze how the java.lang.instrument package is used and to write it up as a series of articles. This series of blog posts targets JDK 11 and may not apply to other JDK versions.

II Introduction to instrument

The structure of the java.lang.instrument package is as follows:

 

java.lang.instrument
    - ClassDefinition
    - ClassFileTransformer
    - IllegalClassFormatException
    - Instrumentation
    - UnmodifiableClassException
    - UnmodifiableModuleException

Among them, the core functionality is provided by the interface java.lang.instrument.Instrumentation. The API documentation of the Instrumentation class explains what instrumentation is:

This class provides services needed to instrument Java programming language code. Instrumentation is the addition of bytecode to methods for the purpose of gathering data to be used by tools. Since the changes are purely additive, these tools do not modify application state or behavior. Examples of such benign tools include monitoring agents, profilers, coverage analyzers, and event loggers.

That is, the main function of the java.lang.instrument package is to add to (or modify) the bytecode of existing classes in order to implement enhancement logic. Used benignly, it does not affect the normal behavior of the program. Used maliciously, it can have negative effects (in fact, cracking the license of many commercial Java programs, such as IntelliJ IDEA, can be implemented with Instrumentation, provided that the entry point of the program's license check is found).

2.1 Principle of instrument

The underlying implementation of instrument depends on JVMTI, the JVM Tool Interface, a set of interfaces that the JVM exposes for users to extend. JVMTI is event driven: whenever the JVM executes certain logic, it invokes the corresponding event callbacks (if any are registered), which developers can use to add their own logic. A JVMTIAgent is a dynamic library that uses the interfaces exposed by JVMTI to provide the agent on load, agent on attach and agent on unload functions. The instrument agent can be understood as one kind of JVMTIAgent dynamic library. It is also known as JPLISAgent (Java Programming Language Instrumentation Services Agent), an agent dedicated to supporting instrumentation services written in the Java language. Analyzing it requires reading the JVM source code, which the author is not yet able to do; for details, see the articles on the JVM source code implementation listed in the resources.

 

The command-line parameter -javaagent:yourAgent.jar can be used to load the agent jar when the VM starts.

III Detailed explanation of Instrumentation interface

  • void addTransformer(ClassFileTransformer transformer, boolean canRetransform)
    Registers a ClassFileTransformer instance. When several transformers are registered, they are called in the order of registration. A registered transformer is called whenever a class is loaded or redefined through the redefineClasses method. The boolean parameter canRetransform determines whether the transformer is also called when a class is retransformed through the retransformClasses method.

  • void addTransformer(ClassFileTransformer transformer)
    Equivalent to addTransformer(transformer, false), that is, a transformer registered this way does not take part in retransformation.

  • boolean removeTransformer(ClassFileTransformer transformer)
    Remove (unregister) the ClassFileTransformer instance.

  • boolean isRetransformClassesSupported()
    Returns whether the current JVM configuration supports class retransformation.

  • void retransformClasses(Class<?>... classes) throws UnmodifiableClassException
    Retransforms the given classes, which have already been loaded. The class file bytes of each class are passed again through the list of registered ClassFileTransformer instances that were added with canRetransform set to true. For an in-depth understanding, it is recommended to read the API notes.

  • boolean isRedefineClassesSupported()
    Returns whether the current JVM configuration supports the feature of redefining the class (modifying the bytecode of the class).

  • void redefineClasses(ClassDefinition... definitions) throws ClassNotFoundException, UnmodifiableClassException
    Redefines classes that have already been loaded. Each ClassDefinition argument holds the Class<?> object to redefine and the bytes of the new class file.
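As a rough illustration of the ClassDefinition input to redefineClasses, the sketch below reads the class file bytes of an already loaded class and wraps them in a ClassDefinition. The names ClassDefinitionSketch, readClassBytes and definitionOf are made up for this example and do not come from the JDK or from the original samples. In a real agent the bytes would be modified bytecode rather than the unchanged class file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.instrument.ClassDefinition;

// Hypothetical helper for illustration only.
final class ClassDefinitionSketch {

    // Reads the class file bytes of a loaded class through the system class loader.
    static byte[] readClassBytes(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Class file not found: " + resource);
            }
            return in.readAllBytes(); // requires Java 9 or later
        }
    }

    // Pairs the Class object with the bytes that should replace its definition.
    static ClassDefinition definitionOf(Class<?> type, byte[] newClassFile) {
        return new ClassDefinition(type, newClassFile);
    }
}
```

Inside an agent, the result could then be handed to inst.redefineClasses(...), where inst is the Instrumentation instance received by the agent.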

Other functions:

  • boolean isModifiableClass(Class<?> theClass): tests whether the given class can be modified by retransformation or redefinition.
  • Class[] getAllLoadedClasses(): returns all classes currently loaded by the JVM.
  • Class[] getInitiatedClasses(ClassLoader loader): returns all classes for which the given loader is an initiating loader.
  • long getObjectSize(Object objectToSize): returns the approximate size (in bytes) of an object. Note that nested objects and objects referenced by its fields are not included and need to be calculated separately.
  • void appendToBootstrapClassLoaderSearch(JarFile jarfile): appends a jar to the search path of the bootstrap class loader, so that its classes can be defined by the bootstrap class loader.
  • void appendToSystemClassLoaderSearch(JarFile jarfile): appends a jar to the search path of the system class loader (the AppClassLoader).
  • void setNativeMethodPrefix(ClassFileTransformer transformer, String prefix): sets a prefix for native method names, which is used for matching when native methods are resolved.
  • boolean isNativeMethodPrefixSupported(): returns whether setting a native method prefix is supported.
  • void redefineModule(...): redefines a module.
  • boolean isModifiableModule(Module module): tests whether the specified module can be modified with redefineModule.
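Several of the methods above are typically combined. The following is a minimal sketch, under the assumption that an agent wants to retransform only those classes the JVM reports as modifiable. RetransformSketch and retransformModifiable are hypothetical names chosen for this example.

```java
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class RetransformSketch {

    // Retransforms only the candidates that the JVM reports as modifiable and
    // returns the classes that were actually submitted for retransformation.
    static List<Class<?>> retransformModifiable(Instrumentation inst, Class<?>... candidates)
            throws UnmodifiableClassException {
        List<Class<?>> accepted = new ArrayList<>();
        if (!inst.isRetransformClassesSupported()) {
            return accepted; // retransformation is not available in this JVM configuration
        }
        for (Class<?> candidate : candidates) {
            if (inst.isModifiableClass(candidate)) {
                accepted.add(candidate);
            }
        }
        if (!accepted.isEmpty()) {
            inst.retransformClasses(accepted.toArray(new Class<?>[0]));
        }
        return accepted;
    }
}
```

Checking isModifiableClass first avoids the UnmodifiableClassException that retransformClasses throws for classes that cannot be modified, such as primitive and array classes.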

IV How to use Instrumentation

The Instrumentation class has a very concise description of its usage in the API notes:

There are two ways to obtain an instance of the Instrumentation interface:

  1. The JVM is started with an agent specified. In this case, the Instrumentation instance is passed to the premain method of the agent class.
  2. The JVM provides a mechanism to start an agent at some point after startup. In this case, the Instrumentation instance is passed to the agentmain method of the agent code.

First of all, the implementation class of Instrumentation is sun.instrument.InstrumentationImpl. Since JDK 9, module access control makes it impossible to construct an instance of it through reflection, and what reflection cannot do can generally only be done by the JVM itself. Moreover, the concise API comments above do not tell us how to actually use Instrumentation. In fact, premain corresponds to loading the instrument agent when the VM starts, that is, the agent on load mentioned above, while agentmain corresponds to loading the instrument agent while the VM is running, that is, the agent on attach mentioned above. Agents loaded in either way listen for the same JVMTI event, the ClassFileLoadHook event, which is fired after the bytes of a class file have been read. In other words, the ClassFileTransformer callbacks registered from premain or agentmain run after the bytecode of a class file has been read (or, for retransformation, after the class has been loaded).

In fact, the ultimate purpose of premain and agentmain is to hand over the Instrumentation instance and to activate sun.instrument.InstrumentationImpl#transform(), which calls back the ClassFileTransformer instances registered with Instrumentation so that they can modify the bytecode. In essence, the two do not differ much. Their non-essential differences are as follows:

  • premain requires the external agent jar to be specified on the command line. agentmain can load the agent by attaching directly to the target VM through the attach mechanism; that is, in agentmain mode, the program performing the attach and the program being instrumented can be two completely different programs.
  • In premain mode, the classes passed to ClassFileTransformer are all the classes loaded by the virtual machine after the agent is installed. This is determined by the loading order of the agent. From the developer's point of view, the premain method runs before the program's main() method is entered and before the application classes are loaded for the first time, and every class loaded afterwards goes through the callbacks in the ClassFileTransformer list.
  • Because agentmain mode uses the attach mechanism, the VM of the target program may have been started long ago, and the classes it needs have already been loaded. In this case, the relevant classes must be retransformed with Instrumentation#retransformClasses(Class<?>... classes), which triggers the callbacks in the ClassFileTransformer list for those classes.
  • premain mode was introduced in JDK 1.5 and agentmain mode in JDK 1.6; from JDK 1.6 onward, you can choose either premain or agentmain.
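Since both entry points only differ in how the Instrumentation instance arrives, one agent class can support both modes. The following is a minimal sketch of that idea; DualModeAgent and install are hypothetical names, and the transformer deliberately changes nothing. The class is package-private only to keep the sketch in a single snippet; a real agent class is normally public.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class for illustration only.
final class DualModeAgent {

    // Entry point when the agent is given with -javaagent (agent on load).
    public static void premain(String agentArgs, Instrumentation inst) {
        install(inst);
    }

    // Entry point when the agent is loaded through the attach mechanism (agent on attach).
    public static void agentmain(String agentArgs, Instrumentation inst) {
        install(inst);
    }

    // Shared registration logic used by both entry points.
    private static void install(Instrumentation inst) {
        ClassFileTransformer transformer = new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                return null; // null tells the JVM that no transformation was performed
            }
        };
        // canRetransform = true so that the transformer also sees retransformed classes
        inst.addTransformer(transformer, true);
    }
}
```

To be usable in both modes, the manifest of the agent jar would need to name this class in both the Premain-Class and the Agent-Class attribute.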

4.1 usage of premain

The premain method relies on an independent Java agent, that is, a separate project is established, the code is written, and then it is made into a jar package for another user program to introduce in the form of an agent. The simple steps are as follows:

① Write the premain function, that is, write an ordinary Java class that contains one of the following two methods.

 

public static void premain(String agentArgs, Instrumentation inst);  [1]
public static void premain(String agentArgs); [2]

② Run through the specified Agent.

 

java -javaagent:<path to agent jar>[=<arguments passed to premain>] -jar yourTarget.jar
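Everything after the "=" in the -javaagent option reaches premain as the single agentArgs string, and how that string is interpreted is left to the agent. The sketch below assumes a comma-separated key=value format, which is only a convention chosen for this example; AgentArgsSketch and parse are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class AgentArgsSketch {

    // Parses "k1=v1,k2=v2" into an ordered map; a key without "=" maps to "".
    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return options; // no arguments were supplied
        }
        for (String pair : agentArgs.split(",")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty segments such as "a=1,,b=2"
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                options.put(trimmed, "");
            } else {
                options.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
            }
        }
        return options;
    }
}
```

With this convention, an option such as -javaagent:agent.jar=debug,prefix=club/throwable would yield the keys debug and prefix inside premain.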

Simple examples are as follows:

Create a premain-agent project with a new class club.throwable.permain.PermainAgent as follows:

 

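package club.throwable.permain;

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class PermainAgent {
    private static Instrumentation INST;

    // Called by the JVM before the application's main() method
    public static void premain(String agentArgs, Instrumentation inst) {
        INST = inst;
        process();
    }

    private static void process() {
        INST.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> clazz,
                                    ProtectionDomain protectionDomain,
                                    byte[] byteCode) throws IllegalClassFormatException {
                // Only log the class name; the bytecode is returned unchanged
                System.out.println(String.format("Process by ClassFileTransformer,target class = %s", className));
                return byteCode;
            }
        });
    }
}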

Add the maven-jar-plugin:

 

<plugins>
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>3.1.2</version>
        <configuration>
            <archive>
                <manifestEntries>
                    <Premain-Class>club.throwable.permain.PermainAgent</Premain-Class>
                    <Can-Redefine-Classes>true</Can-Redefine-Classes>
                    <Can-Retransform-Classes>true</Can-Retransform-Classes>
                </manifestEntries>
            </archive>
        </configuration>
    </plugin>
</plugins>

Running the mvn package command produces premain-agent.jar (the author found that the plug-in did not support JDK 11, so the build was downgraded to JDK 8). You can then use the agent jar:

 

// This is a sample class
public class HelloSample {
    public void sayHello(String name) {
        System.out.println(String.format("%s say hello!", name));
    }
}
// main function, VM parameter: -javaagent:I:\J-Projects\instrument-sample\premain-agent\target\premain-agent.jar
public class PermainMain {
    public static void main(String[] args) throws Exception{
    }
}
// Output results
Process by ClassFileTransformer,target class = sun/nio/cs/ThreadLocalCoders
Process by ClassFileTransformer,target class = sun/nio/cs/ThreadLocalCoders$1
Process by ClassFileTransformer,target class = sun/nio/cs/ThreadLocalCoders$Cache
Process by ClassFileTransformer,target class = sun/nio/cs/ThreadLocalCoders$2
Process by ClassFileTransformer,target class = com/intellij/rt/execution/application/AppMainV2$Agent
Process by ClassFileTransformer,target class = com/intellij/rt/execution/application/AppMainV2
Process by ClassFileTransformer,target class = com/intellij/rt/execution/application/AppMainV2$1
Process by ClassFileTransformer,target class = java/lang/reflect/InvocationTargetException
Process by ClassFileTransformer,target class = java/net/InetAddress$1
Process by ClassFileTransformer,target class = java/lang/ClassValue
// ...  Omit a large number of other outputs

In practice, a transformer with custom logic needs to exclude classes in packages such as java.lang and sun. This example is only a demonstration, so processing them here is harmless.
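One way to exclude such classes is to check the class name prefix at the top of transform. The following is a minimal sketch; FilteringTransformer, isExcluded and the prefix list are assumptions made for this example, and a real agent would choose its own list.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

// Hypothetical transformer for illustration only.
final class FilteringTransformer implements ClassFileTransformer {

    // Class names arrive in internal form, with '/' as the package separator.
    private static final String[] EXCLUDED_PREFIXES = {"java/", "javax/", "jdk/", "sun/", "com/sun/"};

    static boolean isExcluded(String className) {
        if (className == null) {
            return true; // defensive: skip classes that come without a name
        }
        for (String prefix : EXCLUDED_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (isExcluded(className)) {
            return null; // null tells the JVM that this transformer made no change
        }
        System.out.println("Process by ClassFileTransformer,target class = " + className);
        // A real agent would return modified bytes; this sketch returns the input unchanged.
        return classfileBuffer;
    }
}
```

With this filter, the sun/nio/cs and java/lang entries in the output above would no longer be printed, while application classes still would be.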

4.2 agentmain usage

agentmain is used in a very similar way to premain, including writing MANIFEST.MF and generating the agent jar. However, the agent jar is not introduced through the -javaagent command-line option; instead, the specified agent is activated at runtime through the attach tool. The simple steps are as follows:

① Write the agentmain function, that is, write an ordinary Java class that contains one of the following two methods.

 

public static void agentmain(String agentArgs, Instrumentation inst);  [1]
public static void agentmain(String agentArgs); [2]

[1] has a higher callback priority than [2]; that is, when [1] and [2] exist at the same time, only [1] is called back. agentArgs is the argument received by the agentmain function. It is passed in as var2 of com.sun.tools.attach.VirtualMachine#loadAgent(var1, var2), where var1 is the absolute path of the agent jar.

② Package the agent as a jar.

The agent is generally an ordinary Java program, but it must contain the agentmain function, and the Agent-Class attribute must be added to the manifest of the jar (the MANIFEST.MF file) to specify the Java class in which the agentmain function from step ① is written.

③ Load the agent directly through the attach tool. The program performing the attach and the program being instrumented can be two completely different programs.

 

// List all VM instances
List<VirtualMachineDescriptor> list = VirtualMachine.list();
// Attach to the target VM (descriptor is the matching element of list)
VirtualMachine virtualMachine = VirtualMachine.attach(descriptor.id());
// Have the target VM load the agent
virtualMachine.loadAgent("agent Jar path", "command parameter");

A simple example: the class containing the agentmain function is as follows:

 

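import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class AgentmainAgent {
    private static Instrumentation INST;

    // Called by the JVM when the agent is loaded through the attach mechanism
    public static void agentmain(String agentArgs, Instrumentation inst) {
        INST = inst;
        process();
    }

    private static void process() {
        // canRetransform = true, so this transformer takes part in retransformation
        INST.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> clazz,
                                    ProtectionDomain protectionDomain,
                                    byte[] byteCode) throws IllegalClassFormatException {
                System.out.println(String.format("Agentmain process by ClassFileTransformer,target class = %s", className));
                return byteCode;
            }
        }, true);
        try {
            // The target class is already loaded, so it has to be retransformed explicitly
            INST.retransformClasses(Class.forName("club.throwable.instrument.AgentTargetSample"));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}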

Change the configuration of the maven-jar-plugin and package it with mvn package:

 

<plugins>
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>3.1.2</version>
        <configuration>
            <archive>
                <manifestEntries>
                    <!-- Mainly change this configuration item: Premain-Class becomes Agent-Class -->
                    <Agent-Class>club.throwable.permain.PermainAgent</Agent-Class>
                    <Can-Redefine-Classes>true</Can-Redefine-Classes>
                    <Can-Retransform-Classes>true</Can-Retransform-Classes>
                </manifestEntries>
            </archive>
        </configuration>
    </plugin>
</plugins>
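The manifest entries configured above do not depend on Maven. As a rough sketch, the same attributes can be built with the java.util.jar API; AgentManifestSketch and agentManifest are hypothetical names used only for this example.

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for illustration only.
final class AgentManifestSketch {

    // Builds a manifest with the same entries as the maven-jar-plugin configuration.
    static Manifest agentManifest(String agentClassName) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written out
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(new Attributes.Name("Agent-Class"), agentClassName);
        main.put(new Attributes.Name("Can-Redefine-Classes"), "true");
        main.put(new Attributes.Name("Can-Retransform-Classes"), "true");
        return manifest;
    }
}
```

The resulting Manifest can be passed to the java.util.jar.JarOutputStream constructor when the agent jar is written by hand.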

AgentmainAttachMain:

 

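import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.List;

public class AgentmainAttachMain {
    public static void main(String[] args) throws Exception {
        List<VirtualMachineDescriptor> list = VirtualMachine.list();
        for (VirtualMachineDescriptor descriptor : list) {
            // Pick the VM whose main class is AgentTargetSample
            if (descriptor.displayName().endsWith("AgentTargetSample")) {
                VirtualMachine virtualMachine = VirtualMachine.attach(descriptor.id());
                // Backslashes in the Windows path must be escaped in a Java string literal
                virtualMachine.loadAgent("I:\\J-Projects\\instrument-sample\\premain-agent\\target\\premain-agent.jar", "arg1");
                virtualMachine.detach();
            }
        }
    }
}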

AgentTargetSample:

 

public class AgentTargetSample {
    public void sayHello(String name) {
        System.out.println(String.format("%s say hello!", name));
    }
    public static void main(String[] args) throws Exception {
        AgentTargetSample sample = new AgentTargetSample();
        for (; ; ) {
            Thread.sleep(1000);
            sample.sayHello(Thread.currentThread().getName());
        }
    }
}

Start AgentTargetSample first, and then start AgentmainAttachMain:

 

main say hello!
main say hello!
main say hello!
main say hello!
main say hello!
main say hello!
main say hello!
Agentmain process by ClassFileTransformer,target class = club/throwable/instrument/AgentTargetSample
main say hello!
main say hello!
main say hello!

PS: if VirtualMachineDescriptor or VirtualMachine cannot be found, copy ${JAVA_HOME}/lib/tools.jar to the ${JAVA_HOME}/jre/lib directory. This applies to JDK 8 and earlier; from JDK 9 onward, these classes are in the jdk.attach module.

V Limitations of Instrumentation

In most cases, we use Instrumentation for bytecode instrumentation, or more generally for class redefinition, but it has the following limitations:

  • Both premain and agentmain modify bytecode after the Class file has been loaded. That is, a Class-typed argument is required, and a class that does not exist cannot be defined from a bytecode file and a user-chosen class name.
  • Modifying the bytecode of a class is called class transformation. Class transformation ultimately comes down to the class redefinition method Instrumentation#redefineClasses(), which has the following limitations:
    • The new class and the old class must have the same parent class.
    • The new class and the old class must implement the same number of interfaces, and they must be the same interfaces.
    • The access modifiers of the new and old classes must be the same.
    • The number of fields and the field names of the new and old classes must be the same.
    • Methods added to or removed from the new class must be declared private static or private final.
    • Method bodies can be modified.

Apart from the approaches above, if you want to redefine a class, you can consider isolation based on class loaders: create a new custom class loader and use it to define a new class from the new bytecode. This approach has its own limitation: the new class can only be called through reflection.
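The class loader isolation idea can be sketched as follows. A small loader defines a class directly from bytes, so two loader instances can each hold their own version of a class with the same name. IsolationSketch, BytesClassLoader and emptyClass are hypothetical names, and the hand-written class file (an empty public class extending java.lang.Object) merely stands in for bytecode that a real tool would generate.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch for illustration only.
final class IsolationSketch {

    // A loader that defines a class from raw bytes.
    static final class BytesClassLoader extends ClassLoader {
        BytesClassLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> define(String binaryName, byte[] classFile) {
            return defineClass(binaryName, classFile, 0, classFile.length);
        }
    }

    // Builds the bytes of a minimal public class without fields or methods.
    static byte[] emptyClass(String internalName) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0xCAFEBABE);                            // magic number
        out.writeShort(0);                                   // minor version
        out.writeShort(52);                                  // major version (Java 8)
        out.writeShort(5);                                   // constant pool count (entries 1..4)
        out.writeByte(7); out.writeShort(2);                 // #1 Class, name at #2
        out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8, this class name
        out.writeByte(7); out.writeShort(4);                 // #3 Class, name at #4
        out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8, superclass name
        out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
        out.writeShort(1);                                   // this_class = #1
        out.writeShort(3);                                   // super_class = #3
        out.writeShort(0);                                   // interfaces count
        out.writeShort(0);                                   // fields count
        out.writeShort(0);                                   // methods count
        out.writeShort(0);                                   // attributes count
        out.flush();
        return bytes.toByteArray();
    }
}
```

Defining the same bytes through two BytesClassLoader instances yields two distinct Class objects with the same name. Neither is visible to code compiled against the original class, which is why the new class has to be used through reflection.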

VI Summary

This article only briefly analyzes the principle and basic use of instrument. Instrument gives Java stronger dynamic control and introspection capabilities, which makes the Java language more flexible. Since JDK 1.6, developers can use Instrumentation to build an agent that is independent of the application, to monitor and assist programs running on the JVM, and to retransform loaded classes in a specified JVM instance from another process. From a developer's perspective, this is like support for AOP programming at the JVM level.

 

Keywords: Java

Added by coder4Ever on Tue, 08 Feb 2022 09:21:14 +0200