Preface
One day, while browsing Zhihu, I came across this question: why is the final value of i in the code below 8?
public static void main(String[] args) {
    int i = 1;
    i += i += ++i + 2.6 + i;
}
The code is only two lines long. If you ran into this question, how would you explain the result clearly? Would you break the expression down according to Java's operator precedence and evaluation order and compute it step by step, or is there another way?
After thinking for a while, I decided to look at how these two lines execute at the bytecode level.
Copy the code into a class named Test in Test.java, then run the following commands to compile the source and disassemble the resulting bytecode:
javac Test.java
javap -c Test.class
The bytecode output is as follows:
If you have not worked with bytecode before, you can look up a reference on bytecode instructions, or consult "Appendix B: Bytecode Instruction List" in the book "Understanding the Java Virtual Machine".
Next, let's annotate the bytecode line by line (stack contents are listed with the top of the stack first):
public static void main(java.lang.String[]);
  Code:
     0: iconst_1    // Push the int constant 1 onto the operand stack. Stack: 1
     1: istore_1    // Pop the 1 from the top of the stack and store it in slot 1 of the local variable table (i = 1). Stack: empty
     2: iload_1     // Load i from the slot and push it onto the stack. Stack: 1
     3: iload_1     // Load i from the slot again. Stack: 1, 1
     4: i2d         // Convert the int at the top of the stack to double. Stack: 1.0, 1
     5: iinc 1, 1   // ++i increments i directly in the slot. The value of i in the slot is now 2. Remember, it is 2. Stack: 1.0, 1
     8: iload_1     // Load i from the slot and push it onto the stack. Stack: 2, 1.0, 1
     9: i2d         // Convert the int at the top of the stack to double. Stack: 2.0, 1.0, 1
    10: ldc2_w      // Push the double constant 2.6 onto the stack. Stack: 2.6, 2.0, 1.0, 1
    13: dadd        // Add the two doubles at the top of the stack and push the result. Stack: 4.6, 1.0, 1
    14: iload_1     // Load i from the slot and push it onto the stack. Stack: 2, 4.6, 1.0, 1
    15: i2d         // Convert the int at the top of the stack to double. Stack: 2.0, 4.6, 1.0, 1
    16: dadd        // Add the two doubles at the top of the stack and push the result. Stack: 6.6, 1.0, 1
    17: dadd        // Add the two doubles at the top of the stack and push the result. Stack: 7.6, 1
    18: d2i         // Convert the double at the top of the stack to int; 7.6 becomes 7. Stack: 7, 1
    19: dup         // Duplicate the value at the top of the stack and push the copy. Stack: 7, 7, 1
    20: istore_1    // Pop the result of the inner i = i + (++i + 2.6 + i), which is 7, and store it in the slot (i = 7). Stack: 7, 1
    21: iadd        // Add the two ints at the top of the stack and push the result. Stack: 8
    22: istore_1    // Pop the result of the outer i = i + (i + (++i + 2.6 + i)), which is 8, and store it in the slot (i = 8). Stack: empty
    23: return      // Return from main. The method is void, so no value is returned; the final value of i is 8
The annotated bytecode above is my answer: it breaks the computation down one step at a time.
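The same steps can also be replayed in plain Java. The following is a minimal sketch, not part of the original question: the class name and the temporaries outerLeft, innerLeft and sum are made up for illustration, and each one mirrors a value that the bytecode keeps on the operand stack.

```java
// Replays "i += i += ++i + 2.6 + i" with explicit temporaries.
// The names below are illustrative and do not come from the original code.
final class CompoundAssignmentTrace {

    static int evaluate() {
        int i = 1;

        int outerLeft = i;            // outer "i +=" saves the current i first: 1
        double innerLeft = i;         // inner "i +=" saves i as well, widened to double: 1.0
        ++i;                          // i in the slot becomes 2
        double sum = i + 2.6 + i;     // 2 + 2.6 + 2 = 6.6
        i = (int) (innerLeft + sum);  // inner assignment: (int) 7.6 = 7
        i = outerLeft + i;            // outer assignment: 1 + 7 = 8
        return i;
    }

    static int direct() {
        int i = 1;
        i += i += ++i + 2.6 + i;      // the original one-liner
        return i;
    }

    public static void main(String[] args) {
        System.out.println(evaluate()); // prints 8
        System.out.println(direct());   // prints 8
    }
}
```

The key point is that both left-hand operands are read before ++i runs, so they keep the old value 1 even though the slot already holds 2.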
Stack frame
What are the local variable table and the slots mentioned above?
To answer that, I need to introduce the stack frame. When a method is executed, the virtual machine creates a stack frame for it at the top of the thread-private virtual machine stack. The stack frame is therefore the data structure that supports method invocation and execution; it includes the local variable table, the operand stack, dynamic linking information, and so on.
From the moment a method is called until it finishes executing, one stack frame is pushed onto and then popped off the virtual machine stack.
Local variable table
The local variable table is the storage area for method parameters and the local variables declared in the method, and it is made up of slots. Its size is fixed when the source is compiled into a class file. The 64-bit long and double types occupy two slots each; every other data type occupies one slot. In the static main method above, slot 0 holds the args parameter, which is why i is stored in slot 1 (istore_1).
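To make the slot rule concrete, here is a small sketch that only does the arithmetic: it adds up the slots a list of parameter and local variable types would need. The class SlotCounter and its methods are hypothetical helpers, not a JVM API, and the count is an upper bound because a compiler may reuse slots for variables whose scopes do not overlap.

```java
// Hypothetical helper: counts local variable slots by type.
// long and double take two slots; every other type takes one.
final class SlotCounter {

    static int slotsFor(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    static int totalSlots(Class<?>... types) {
        int total = 0;
        for (Class<?> type : types) {
            total += slotsFor(type);
        }
        return total;
    }

    public static void main(String[] args) {
        // static main(String[] args) with one int local: args + i
        System.out.println(totalSlots(String[].class, int.class));                          // prints 2
        // a static method with a String[] parameter and int, long, double locals
        System.out.println(totalSlots(String[].class, int.class, long.class, double.class)); // prints 6
    }
}
```

For an instance method, one more slot would be needed at index 0 for the this reference.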
Operand stack
While a method executes, bytecode instructions push data onto the operand stack and pop data off it. All computation is carried out on the operand stack. For example, iadd pops the two int values at the top of the stack, adds them, and pushes the result.
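The push and pop behaviour can be imitated with an ordinary stack. The sketch below is a toy model only: ToyOperandStack and its methods borrow the instruction names iconst and iadd for readability, and it says nothing about how a real JVM implements its operand stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy operand stack that imitates what iconst and iadd do.
final class ToyOperandStack {
    private final Deque<Integer> stack = new ArrayDeque<>();

    // Like iconst_<n> / bipush: push an int constant.
    void iconst(int value) {
        stack.push(value);
    }

    // Like iadd: pop two ints, add them, push the sum.
    void iadd() {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
    }

    int top() {
        return stack.peek();
    }

    int depth() {
        return stack.size();
    }

    static int addOnStack(int a, int b) {
        ToyOperandStack s = new ToyOperandStack();
        s.iconst(a);   // stack: a
        s.iconst(b);   // stack: b, a
        s.iadd();      // stack: a + b
        return s.top();
    }

    public static void main(String[] args) {
        System.out.println(addOnStack(1, 7)); // prints 8
    }
}
```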
Dynamic linking
Each stack frame holds a reference, in the runtime constant pool, to the method that the frame belongs to. This reference supports dynamic linking during method calls. Resolving symbolic references into direct references at run time is called dynamic linking.
Method return address
A method can exit in two ways. In a normal exit, the method executes one of the return bytecode instructions and, depending on its return type, may pass a return value to the caller. In an abnormal exit, an exception is thrown inside the method and no try block in that method catches it.
Whichever way the method exits, execution must resume at the point where the method was called. The stack frame therefore saves some information that is used to restore the execution state of the calling method.
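Both exit paths can be observed from the caller's side. In this minimal sketch (all names are invented for illustration), normalExit returns a value, abruptExit leaves through an uncaught exception, and the caller continues at the call site in both cases.

```java
// Illustrative only: shows a normal exit and an abnormal exit from the caller's point of view.
final class ExitPaths {

    // Normal exit: ends with a return instruction and hands a value to the caller.
    static int normalExit(int x) {
        return x * 2;
    }

    // Abnormal exit: the exception is not caught inside this method.
    static int abruptExit(int x) {
        throw new IllegalStateException("no handler in this method");
    }

    static String callBoth() {
        StringBuilder log = new StringBuilder();
        log.append(normalExit(21));          // caller resumes here with the return value
        try {
            abruptExit(1);
            log.append(" not reached");      // skipped because abruptExit throws
        } catch (IllegalStateException e) {
            log.append(" caught");           // caller resumes in its own handler
        }
        log.append(" resumed");
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(callBoth()); // prints "42 caught resumed"
    }
}
```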
Extended application
Recently, a popular question has been circulating online: with Integer objects, why does 100 == 100 return true while 200 == 200 returns false? As we all know, for objects == compares references (addresses). So how can two objects have the same address? Let's explore:
The source code is as follows:
public static void main(String[] args) {
    Integer a = 100;
    Integer b = 100;
    Integer c = 200;
    Integer d = 200;
    System.out.println(a == b);
    System.out.println(c == d);
}
Output:
true
false
The bytecode is as follows:
public static void main(java.lang.String[]);
  Code:
     0: bipush        100
     2: invokestatic  #2    // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
     5: astore_1
     6: bipush        100
     8: invokestatic  #2    // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
    11: astore_2
    12: sipush        200
    15: invokestatic  #2    // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
    18: astore_3
    19: sipush        200
    22: invokestatic  #2    // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
    25: astore        4
    27: getstatic     #3    // Field java/lang/System.out:Ljava/io/PrintStream;
    30: aload_1
    31: aload_2
    32: if_acmpne     39
    35: iconst_1
    36: goto          40
    39: iconst_0
    40: invokevirtual #4    // Method java/io/PrintStream.println:(Z)V
    43: getstatic     #3    // Field java/lang/System.out:Ljava/io/PrintStream;
    46: aload_3
    47: aload         4
    49: if_acmpne     56
    52: iconst_1
    53: goto          57
    56: iconst_0
    57: invokevirtual #4    // Method java/io/PrintStream.println:(Z)V
    60: return
From the bytecode, we can see that each assignment to a, b, c and d calls the Integer.valueOf() method through the invokestatic bytecode instruction.
The difference is the instruction that pushes the constant. For a and b it is bipush, which pushes a single-byte integer constant (-128 to 127) onto the operand stack. For c and d it is sipush, which pushes a two-byte short integer constant (-32768 to 32767). In both cases the value is extended to an int on the stack.
Both variables are of the same Integer type, so why is one constant encoded in 1 byte and the other in 2 bytes? The instruction only reflects the smallest encoding that fits the constant, so it cannot explain the different results by itself.
Let's explore the valueOf() method of Integer:
This method calls the overloaded valueOf(), and the code is as follows:
As shown above, IntegerCache is a static nested class of Integer. valueOf() checks the value being boxed: when it lies between low and high, which is -128 to 127 by default, no new Integer object is allocated on the heap and the Integer object already held in the cache array is returned instead, so a == b is true. The value 200 falls outside this range, so each call creates a new Integer object and c == d is false.
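The effect is easy to reproduce, and it also shows why boxed values should be compared with equals() rather than ==. The sketch below uses only standard autoboxing; the class and method names are invented for illustration. The result for 200 with == is described for default JVM settings, since the upper bound of the cache can be raised with a JVM option.

```java
// Compares boxed Integers by reference and by value.
final class BoxedComparison {

    // Autoboxing compiles to Integer.valueOf(value), so the cache is involved.
    static boolean sameReference(int value) {
        Integer first = value;
        Integer second = value;
        return first == second;      // reference comparison
    }

    static boolean sameValue(int value) {
        Integer first = value;
        Integer second = value;
        return first.equals(second); // value comparison
    }

    public static void main(String[] args) {
        System.out.println(sameReference(100)); // true: 100 is inside the cached range
        System.out.println(sameReference(200)); // false with default settings: 200 is outside the cached range
        System.out.println(sameValue(200));     // true: equals() compares the int values
    }
}
```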
The source code of IntegerCache is as follows:
It can be seen that the cache array is filled by a for loop in a static initializer block.
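The caching idea can be sketched in a few lines. The code below is a simplified model written for this article, not the JDK source: SmallIntBoxCache, Box, LOW and HIGH are made-up names, and the range is hard-coded to -128 to 127.

```java
// A simplified model of a small-value box cache. Not the JDK implementation.
final class SmallIntBoxCache {
    static final int LOW = -128;
    static final int HIGH = 127;

    // Stand-in for Integer in this model.
    static final class Box {
        final int value;

        Box(int value) {
            this.value = value;
        }
    }

    private static final Box[] CACHE = new Box[HIGH - LOW + 1];

    // Fill the cache once, when the class is initialized.
    static {
        for (int k = 0; k < CACHE.length; k++) {
            CACHE[k] = new Box(LOW + k);
        }
    }

    static Box valueOf(int i) {
        if (i >= LOW && i <= HIGH) {
            return CACHE[i - LOW];   // reuse the cached object
        }
        return new Box(i);           // outside the range: allocate a new object
    }

    public static void main(String[] args) {
        System.out.println(valueOf(100) == valueOf(100)); // prints true
        System.out.println(valueOf(200) == valueOf(200)); // prints false
    }
}
```

Because every call inside the range returns the same array element, a reference comparison succeeds there and fails everywhere else.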
Epilogue
This article does not describe the stack frame in great detail. Its main goal is to give you a rough understanding of what the stack frame does and to show how useful bytecode can be. When a piece of code is hard to understand, looking at it from another angle may make it clear.