[JVM source code analysis] the template interpreter interprets and executes Java bytecode instructions

This article is compiled and published by jiumo (Ma Zhi), chief lecturer of HeapDump performance community

Part 22 - virtual machine bytecode arithmetic instructions

The arithmetic bytecode instructions defined in the Java virtual machine specification are shown in the following table.


0x60	iadd	Pop the two int values at the top of the stack, add them, and push the result
0x61	ladd	Pop the two long values at the top of the stack, add them, and push the result
0x62	fadd	Pop the two float values at the top of the stack, add them, and push the result
0x63	dadd	Pop the two double values at the top of the stack, add them, and push the result
0x64	isub	Pop the two int values at the top of the stack, subtract the top value from the one below it, and push the result
0x65	lsub	Pop the two long values at the top of the stack, subtract the top value from the one below it, and push the result
0x66	fsub	Pop the two float values at the top of the stack, subtract the top value from the one below it, and push the result
0x67	dsub	Pop the two double values at the top of the stack, subtract the top value from the one below it, and push the result
0x68	imul	Pop the two int values at the top of the stack, multiply them, and push the result
0x69	lmul	Pop the two long values at the top of the stack, multiply them, and push the result
0x6a	fmul	Pop the two float values at the top of the stack, multiply them, and push the result
0x6b	dmul	Pop the two double values at the top of the stack, multiply them, and push the result
0x6c	idiv	Pop the two int values at the top of the stack, divide the lower value by the top value, and push the result
0x6d	ldiv	Pop the two long values at the top of the stack, divide the lower value by the top value, and push the result
0x6e	fdiv	Pop the two float values at the top of the stack, divide the lower value by the top value, and push the result
0x6f	ddiv	Pop the two double values at the top of the stack, divide the lower value by the top value, and push the result
0x70	irem	Pop the two int values at the top of the stack, compute the remainder of the lower value divided by the top value, and push the result
0x71	lrem	Pop the two long values at the top of the stack, compute the remainder of the lower value divided by the top value, and push the result
0x72	frem	Pop the two float values at the top of the stack, compute the remainder of the lower value divided by the top value, and push the result
0x73	drem	Pop the two double values at the top of the stack, compute the remainder of the lower value divided by the top value, and push the result
0x74	ineg	Negate the int value at the top of the stack and push the result
0x75	lneg	Negate the long value at the top of the stack and push the result
0x76	fneg	Negate the float value at the top of the stack and push the result
0x77	dneg	Negate the double value at the top of the stack and push the result
0x78	ishl	Shift the int value left by the specified number of bits and push the result
0x79	lshl	Shift the long value left by the specified number of bits and push the result
0x7a	ishr	Shift the int value right (arithmetic, sign-extending) by the specified number of bits and push the result
0x7b	lshr	Shift the long value right (arithmetic, sign-extending) by the specified number of bits and push the result
0x7c	iushr	Shift the int value right (logical, zero-filling) by the specified number of bits and push the result
0x7d	lushr	Shift the long value right (logical, zero-filling) by the specified number of bits and push the result
0x7e	iand	Pop the two int values at the top of the stack, compute their bitwise AND, and push the result
0x7f	land	Pop the two long values at the top of the stack, compute their bitwise AND, and push the result
0x80	ior	Pop the two int values at the top of the stack, compute their bitwise OR, and push the result
0x81	lor	Pop the two long values at the top of the stack, compute their bitwise OR, and push the result
0x82	ixor	Pop the two int values at the top of the stack, compute their bitwise XOR, and push the result
0x83	lxor	Pop the two long values at the top of the stack, compute their bitwise XOR, and push the result
0x84	iinc	Increase the specified int local variable by the specified constant (i++, i--, i += 2)
0x94	lcmp	Compare the two long values at the top of the stack and push the result (1, 0 or -1)
0x95	fcmpl	Compare the two float values at the top of the stack and push the result (1, 0 or -1); if either value is NaN, push -1
0x96	fcmpg	Compare the two float values at the top of the stack and push the result (1, 0 or -1); if either value is NaN, push 1
0x97	dcmpl	Compare the two double values at the top of the stack and push the result (1, 0 or -1); if either value is NaN, push -1
0x98	dcmpg	Compare the two double values at the top of the stack and push the result (1, 0 or -1); if either value is NaN, push 1

1. Basic add, subtract, multiply and divide instructions

1. iadd instruction

The iadd instruction adds two integers at the top of the stack, and then pushes the addition result into the top of the stack. The format of the instruction is as follows:

iadd  val1,val2

val1 and val2 are two int values. When the instruction executes, val1 and val2 are popped from the operand stack, the two values are added to produce an int result, and the result is pushed onto the operand stack.

The template of the iadd instruction is defined as follows:

def(Bytecodes::_iadd , ____|____|____|____, itos, itos, iop2 , add);

The generation function is TemplateTable::iop2(), which is implemented as follows:

void TemplateTable::iop2(Operation op) {
  switch (op) {
  case add  :                    __ pop_i(rdx); __ addl (rax, rdx); break;
  case sub  : __ movl(rdx, rax); __ pop_i(rax); __ subl (rax, rdx); break;
  case mul  :                    __ pop_i(rdx); __ imull(rax, rdx); break;
  case _and :                    __ pop_i(rdx); __ andl (rax, rdx); break;
  case _or  :                    __ pop_i(rdx); __ orl  (rax, rdx); break;
  case _xor :                    __ pop_i(rdx); __ xorl (rax, rdx); break;
  case shl  : __ movl(rcx, rax); __ pop_i(rax); __ shll (rax);      break;
  case shr  : __ movl(rcx, rax); __ pop_i(rax); __ sarl (rax);      break;
  case ushr : __ movl(rcx, rax); __ pop_i(rax); __ shrl (rax);      break;
  default   : ShouldNotReachHere();
  }
}

As you can see, this function is the generation function of many instructions, such as iadd, isub, imul, iand, ior, ixor, ishl, ishr and iushr.

The assembly code generated for the iadd instruction is as follows:

mov    (%rsp),%edx
add    $0x8,%rsp
add    %edx,%eax

The value popped into %edx is added to %eax, which caches the top of the stack, and the sum stays in %eax as the new cached top-of-stack value.

2. isub instruction

The assembly code generated by the isub instruction is as follows:

mov    %eax,%edx
mov    (%rsp),%eax
add    $0x8,%rsp
sub    %edx,%eax

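Because subtraction is not commutative, the cached top-of-stack value (val2) is first moved to %edx, val1 is popped into %eax, and %edx is then subtracted from %eax, so the result val1 - val2 ends up in %eax.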
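The operand handling of iadd and isub can be illustrated with a small Java model. This is only a sketch: the class name TosCacheModel, the array used as the expression stack, and the field named rax are assumptions made for illustration and are not HotSpot code. It shows why isub has to save the cached value before popping, while iadd does not.

```java
// Toy model of top-of-stack caching for two int instructions.
// `rax` stands in for the cached top of stack, `memory` for the expression stack.
final class TosCacheModel {
    private final int[] memory = new int[8]; // expression stack slots in memory
    private int sp = 0;                      // number of occupied memory slots
    private int rax;                         // cached top-of-stack value (itos state)

    // State just before a binary int instruction: val1 in memory, val2 cached.
    TosCacheModel(int val1, int val2) {
        memory[sp++] = val1;
        rax = val2;
    }

    private int popI() {
        return memory[--sp];
    }

    // iadd: pop_i(rdx); addl(rax, rdx)
    int iadd() {
        int rdx = popI();
        rax += rdx;
        return rax;
    }

    // isub: movl(rdx, rax); pop_i(rax); subl(rax, rdx)
    int isub() {
        int rdx = rax;  // save val2
        rax = popI();   // val1 becomes the minuend
        rax -= rdx;     // val1 - val2
        return rax;
    }

    public static void main(String[] args) {
        System.out.println(new TosCacheModel(7, 5).iadd());
        System.out.println(new TosCacheModel(7, 5).isub());
    }
}
```

With val1 = 7 and val2 = 5 the model yields 12 for iadd and 2 for isub; like addl, the Java int addition wraps around on overflow.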

3. idiv instruction

idiv is a bytecode division instruction. The format of this instruction is as follows:

idiv val1,val2

Both val1 and val2 must be int values. When the instruction executes, val1 and val2 are popped from the operand stack and divided (val1 ÷ val2). The quotient is an int value (rounded toward zero), and it is pushed onto the operand stack.

The template of the idiv instruction is defined as follows:

def(Bytecodes::_idiv , ____|____|____|____, itos, itos, idiv ,  _  );

The generation function is TemplateTable::idiv(), and the generated assembly is as follows:

0x00007fffe1019707: mov    %eax,%ecx
0x00007fffe1019709: mov    (%rsp),%eax
0x00007fffe101970c: add    $0x8,%rsp

// Test whether the dividend is 0x80000000. If not, jump to normal_case
0x00007fffe1019710: cmp    $0x80000000,%eax
0x00007fffe1019716: jne    0x00007fffe1019727

// The dividend is 0x80000000; if the divisor is -1, jump to special_case
0x00007fffe101971c: xor    %edx,%edx
0x00007fffe101971e: cmp    $0xffffffff,%ecx
0x00007fffe1019721: je     0x00007fffe101972a

// -- normal_case --

// cltd sign-extends the value in the eax register into edx:eax, that is,
// the 32-bit integer in eax is extended to 64 bits, and the high 32 bits, filled with the sign bit of eax, are stored in edx
0x00007fffe1019727: cltd   
0x00007fffe1019728: idiv   %ecx

// -- special_case --

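The x86 idiv instruction works on fixed registers (the figure in the original article illustrates this): it divides the 64-bit value in edx:eax by its operand, leaving the quotient in eax and the remainder in edx.

The assembly code special-cases the division 0x80000000 / -1, whose true quotient does not fit in an int. In that case idiv is skipped: %eax already holds 0x80000000 as the quotient, and %edx has been cleared to 0 as the remainder. Reference: "Using disassembly debugging and two's complement to explain the abnormal integer output of 0x80000000 / -1".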
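The special case can be mirrored in a few lines of Java. The sketch below is an illustration only (the class name IdivModel is made up here); it follows the shape of the generated code, checking for 0x80000000 / -1 before falling back to ordinary division.

```java
// Sketch of the control flow of the generated idiv code.
final class IdivModel {
    static int idiv(int dividend, int divisor) {
        // special_case: the true quotient 2^31 does not fit in an int,
        // so the result is the dividend itself (0x80000000).
        if (dividend == 0x80000000 && divisor == -1) {
            return dividend;
        }
        // normal_case: truncating division (Java throws ArithmeticException
        // if divisor is 0).
        return dividend / divisor;
    }

    public static void main(String[] args) {
        System.out.println(idiv(Integer.MIN_VALUE, -1));
        System.out.println(idiv(7, -2));
    }
}
```

Java's own `/` operator already behaves this way for int operands, so `Integer.MIN_VALUE / -1` evaluates to `Integer.MIN_VALUE` without throwing.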

2. Comparison instruction

The lcmp instruction compares the two long values at the top of the stack and pushes the result (1, 0 or -1) onto the stack. The format of the instruction is as follows:

lcmp val1,val2

Both val1 and val2 must be long values. When the instruction executes, val1 and val2 are popped from the operand stack and an int value is produced as the comparison result:

If val1 is greater than val2, the result is 1;
If val1 equals val2, the result is 0;
If val1 is less than val2, the result is -1.

Finally, the comparison result is pushed into the operand stack.

The template of lcmp bytecode instruction is defined as follows:

def(Bytecodes::_lcmp , ____|____|____|____, ltos, itos, lcmp ,  _ );

The generation function is TemplateTable::lcmp(), and the generated assembly is as follows:

0x00007fffe101a6c8: mov     (%rsp),%rdx
0x00007fffe101a6cc: add     $0x10,%rsp

// cmp %rax,%rdx computes val1 (%rdx) - val2 (%rax) and only sets the flags:
// When val1 < val2, ZF=0
// When val1 = val2, ZF=1
// When val1 > val2, ZF=0
0x00007fffe101a6d0: cmp     %rax,%rdx
0x00007fffe101a6d3: mov     $0xffffffff,%eax // Move -1 into %eax

// If val1 is less than val2, jump to done
0x00007fffe101a6d8: jl      0x00007fffe101a6e0

// The flags set by cmp are still valid, so setne can read the comparison result:
// it sets the target operand to 1 if ZF=0 (not equal) and to 0 otherwise
0x00007fffe101a6da: setne   %al
0x00007fffe101a6dd: movzbl  %al,%eax

//  -- done --

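The logic is straightforward: -1 is loaded into %eax up front and kept if val1 < val2; otherwise setne yields 1 when the values differ (val1 > val2) and 0 when they are equal, and movzbl zero-extends that byte into %eax.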
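The same branch structure can be written out in Java. This is a sketch for illustration, with a made-up class name LcmpModel; the local variable eax only imitates the register used in the listing.

```java
// Sketch of the lcmp control flow: preload -1, then refine to 0 or 1.
final class LcmpModel {
    static int lcmp(long val1, long val2) {
        int eax = -1;                     // mov $0xffffffff,%eax
        if (val1 < val2) {                // jl done (signed comparison)
            return eax;
        }
        boolean notEqual = val1 != val2;  // setne %al
        eax = notEqual ? 1 : 0;           // movzbl %al,%eax
        return eax;
    }

    public static void main(String[] args) {
        System.out.println(lcmp(1L, 2L));
        System.out.println(lcmp(2L, 2L));
        System.out.println(lcmp(3L, 2L));
    }
}
```

The three calls in main produce -1, 0 and 1 respectively.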

The logic of other bytecode instructions is also relatively simple, and those interested can study it by themselves.

Part 23 - type conversion of virtual machine bytecode instructions

The bytecode instructions related to type conversion defined in the Java virtual machine specification are shown in the following table.


0x85	i2l	Cast an int value at the top of the stack to a long value and push the result into the top of the stack
0x86	i2f	Cast the int value at the top of the stack to a float value and push the result into the top of the stack
0x87	i2d	Cast an int value at the top of the stack to a double value and push the result into the top of the stack
0x88	l2i	Cast the long value at the top of the stack to an int value and push the result into the top of the stack
0x89	l2f	Cast the long value at the top of the stack into a float value and push the result into the top of the stack
0x8a	l2d	Cast the long value at the top of the stack into a double value and push the result into the top of the stack
0x8b	f2i	Cast the float type value at the top of the stack to an int type value and push the result into the top of the stack
0x8c	f2l	Cast the float type value at the top of the stack to a long type value and push the result into the top of the stack
0x8d	f2d	Cast the float type value at the top of the stack into a double type value and push the result into the top of the stack
0x8e	d2i	Cast the double value at the top of the stack to an int value and push the result into the top of the stack
0x8f	d2l	Cast the double value at the top of the stack into a long value and push the result into the top of the stack
0x90	d2f	Cast the double value at the top of the stack into a float value and push the result into the top of the stack
0x91	i2b	Cast the int value at the top of the stack to a byte value and push the result into the top of the stack
0x92	i2c	Cast the int value at the top of the stack to a char value and push the result into the top of the stack
0x93	i2s	Cast the int value at the top of the stack to a short value and push the result into the top of the stack

The templates of the bytecode instructions in the above table are defined as follows:

def(Bytecodes::_i2l   , ____|____|____|____, itos, ltos, convert ,  _           );
def(Bytecodes::_i2f   , ____|____|____|____, itos, ftos, convert ,  _           );
def(Bytecodes::_i2d   , ____|____|____|____, itos, dtos, convert ,  _           );
def(Bytecodes::_l2i   , ____|____|____|____, ltos, itos, convert ,  _           );
def(Bytecodes::_l2f   , ____|____|____|____, ltos, ftos, convert ,  _           );
def(Bytecodes::_l2d   , ____|____|____|____, ltos, dtos, convert ,  _           );
def(Bytecodes::_f2i   , ____|____|____|____, ftos, itos, convert ,  _           );
def(Bytecodes::_f2l   , ____|____|____|____, ftos, ltos, convert ,  _           );
def(Bytecodes::_f2d   , ____|____|____|____, ftos, dtos, convert ,  _           );
def(Bytecodes::_d2i   , ____|____|____|____, dtos, itos, convert ,  _           );
def(Bytecodes::_d2l   , ____|____|____|____, dtos, ltos, convert ,  _           );
def(Bytecodes::_d2f   , ____|____|____|____, dtos, ftos, convert ,  _           );
def(Bytecodes::_i2b   , ____|____|____|____, itos, itos, convert ,  _           );
def(Bytecodes::_i2c   , ____|____|____|____, itos, itos, convert ,  _           );
def(Bytecodes::_i2s   , ____|____|____|____, itos, itos, convert ,  _           );

The generation function of relevant bytecode conversion instructions is TemplateTable::convert(). The implementation of this function is as follows:

void TemplateTable::convert() {
  static const int64_t is_nan = 0x8000000000000000L;

  // Conversion
  switch (bytecode()) {
  case Bytecodes::_i2l:
    __ movslq(rax, rax);
    break;
  case Bytecodes::_i2f:
    __ cvtsi2ssl(xmm0, rax);
    break;
  case Bytecodes::_i2d:
    __ cvtsi2sdl(xmm0, rax);
    break;
  case Bytecodes::_i2b:
    __ movsbl(rax, rax);
    break;
  case Bytecodes::_i2c:
    __ movzwl(rax, rax);
    break;
  case Bytecodes::_i2s:
    __ movswl(rax, rax);
    break;
  case Bytecodes::_l2i:
    __ movl(rax, rax);
    break;
  case Bytecodes::_l2f:
    __ cvtsi2ssq(xmm0, rax);
    break;
  case Bytecodes::_l2d:
    __ cvtsi2sdq(xmm0, rax);
    break;
  case Bytecodes::_f2i:
  {
    Label L;
    __ cvttss2sil(rax, xmm0);
    __ cmpl(rax, 0x80000000); // NaN or overflow/underflow?
    __ jcc(Assembler::notEqual, L);
    __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::f2i), 1);
    __ bind(L);
  }
    break;
  case Bytecodes::_f2l:
  {
    Label L;
    __ cvttss2siq(rax, xmm0);
    // NaN or overflow/underflow?
    __ cmp64(rax, ExternalAddress((address) &is_nan));
    __ jcc(Assembler::notEqual, L);
    __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::f2l), 1);
    __ bind(L);
  }
    break;
  case Bytecodes::_f2d:
    __ cvtss2sd(xmm0, xmm0);
    break;
  case Bytecodes::_d2i:
  {
    Label L;
    __ cvttsd2sil(rax, xmm0);
    __ cmpl(rax, 0x80000000); // NaN or overflow/underflow?
    __ jcc(Assembler::notEqual, L);
    __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::d2i), 1);
    __ bind(L);
  }
    break;
  case Bytecodes::_d2l:
  {
    Label L;
    __ cvttsd2siq(rax, xmm0);
    // NaN or overflow/underflow?
    __ cmp64(rax, ExternalAddress((address) &is_nan));
    __ jcc(Assembler::notEqual, L);
    __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::d2l), 1);
    __ bind(L);
  }
    break;
  case Bytecodes::_d2f:
    __ cvtsd2ss(xmm0, xmm0);
    break;
  default:
    ShouldNotReachHere();
  }
}

For example, the i2l instruction converts the int value at the top of the stack into a long value and pushes the result onto the stack. The corresponding assembly code is as follows:

movslq %eax,%rax  // Sign-extends a doubleword into a quadword
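The integer conversions in convert() are all sign extensions, zero extensions or truncations, and Java casts have the same semantics. The sketch below pairs each cast with the x86 instruction emitted by convert(); the class name IntConversions is an assumption made for this example.

```java
// Java casts that correspond to the integer conversion bytecodes.
final class IntConversions {
    static long i2l(int v) { return v; }          // movslq: sign-extend 32 -> 64 bits
    static int i2b(int v) { return (byte) v; }    // movsbl: sign-extend the low 8 bits
    static int i2c(int v) { return (char) v; }    // movzwl: zero-extend the low 16 bits
    static int i2s(int v) { return (short) v; }   // movswl: sign-extend the low 16 bits
    static int l2i(long v) { return (int) v; }    // movl: keep the low 32 bits

    public static void main(String[] args) {
        System.out.println(i2l(-1));
        System.out.println(i2b(0x1FF));
        System.out.println(i2c(-1));
        System.out.println(i2s(0x18000));
        System.out.println(l2i(0x100000001L));
    }
}
```

Note that i2c is the only zero-extending conversion, because char is an unsigned 16-bit type.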

Converting a floating-point number (float or double) to int or long is more complex. Let's look at the f2i instruction, which converts float to int.

// Converts a scalar single precision number to a scalar integer occupying a doubleword
0x00007fffe1019189: vcvttss2si %xmm0,%eax
// Compare with 0x80000000. If it is not equal, jump to L
0x00007fffe101918d: cmp    $0x80000000,%eax
0x00007fffe1019193: jne    0x00007fffe10191bc

// If the stack pointer is already 16-byte aligned, SharedRuntime::f2i() can be called directly. Otherwise
// the stack pointer is aligned to 16 bytes before the call

0x00007fffe1019199: test   $0xf,%esp
0x00007fffe101919f: je     0x00007fffe10191b7
0x00007fffe10191a5: sub    $0x8,%rsp
// Call the SharedRuntime::f2i() function
0x00007fffe10191a9: callq  0x00007ffff6a0f946
0x00007fffe10191ae: add    $0x8,%rsp
0x00007fffe10191b2: jmpq   0x00007fffe10191bc
// Call the SharedRuntime::f2i() function
0x00007fffe10191b7: callq  0x00007ffff6a0f946 

---- L ----

The generated assembly instruction vcvttss2si converts a scalar single-precision number into a scalar integer that occupies a doubleword. The name breaks down as follows:

cvt: convert;

t: truncation;

ss: scalar single-precision number;

2: to;

si: scalar integer.

The implementation of the called SharedRuntime::f2i() function is as follows:

JRT_LEAF(jint, SharedRuntime::f2i(jfloat  x))
  if (g_isnan(x))  // NaN is converted to 0
    return 0;
  if (x >= (jfloat) max_jint)
    return max_jint;
  if (x <= (jfloat) min_jint)
    return min_jint;
  return (jint) x;
JRT_END

SharedRuntime::f2i() is a C++ function that takes one parameter, x, and the call follows the System V AMD64 ABI calling convention used on GNU/Linux. Registers RDI, RSI, RDX, RCX, R8 and R9 pass integer and memory-address arguments, and XMM0 through XMM7 pass floating-point arguments. xmm0 therefore carries the first parameter. It is also the register that caches a floating-point top-of-stack value, so no extra move is needed.

The return value arrives in %rax. Because tos_out is itos and %rax is the top-of-stack cache register for int values, no additional operation is required either.
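The fast path and slow path of f2i can be modelled in Java as follows. This is a sketch under stated assumptions: cvttss2si() here is a hand-written approximation of the hardware instruction (which yields 0x80000000 for NaN or out-of-range input), slowPath() is a transcription of the SharedRuntime::f2i() logic shown above, and the class and method names are invented for this example.

```java
// Sketch of f2i: try the truncating conversion, fall back on the sentinel value.
final class F2iModel {
    static final int INDEFINITE = 0x80000000;

    // Approximation of cvttss2si: truncate toward zero, or return 0x80000000
    // when the input is NaN or does not fit in an int.
    static int cvttss2si(float x) {
        if (Float.isNaN(x) || x >= 2147483648.0f || x < -2147483648.0f) {
            return INDEFINITE;
        }
        return (int) x;
    }

    // Transcription of the SharedRuntime::f2i() checks.
    static int slowPath(float x) {
        if (Float.isNaN(x)) return 0;
        if (x >= (float) Integer.MAX_VALUE) return Integer.MAX_VALUE;
        if (x <= (float) Integer.MIN_VALUE) return Integer.MIN_VALUE;
        return (int) x;
    }

    static int f2i(float x) {
        int eax = cvttss2si(x);
        if (eax != INDEFINITE) {   // jne L: the common case needs no call
            return eax;
        }
        return slowPath(x);        // NaN, overflow, or exactly -2^31
    }

    public static void main(String[] args) {
        System.out.println(f2i(Float.NaN));
        System.out.println(f2i(1e20f));
        System.out.println(f2i(-3.9f));
    }
}
```

The sentinel check is deliberately conservative: an input of exactly -2147483648.0f also produces 0x80000000 and takes the slow path, which still returns the correct result.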

Part 24 - getstatic of virtual machine object operation instructions

The bytecode instructions related to object operations defined in the Java virtual machine specification are shown in the following table.


0xb2	getstatic	Gets the static field of the specified class and pushes its value to the top of the stack
0xb3	putstatic	Assigns a value to the static field of the specified class
0xb4	getfield	Gets the instance field of the specified class and pushes its value to the top of the stack
0xb5	putfield	Assigns a value to the instance field of the specified class
0xbb	new	Create an object and push its reference value to the top of the stack
0xbc	newarray	Create an array of specified primitive types (such as int, float, char, etc.) and push its reference value to the top of the stack
0xbd	anewarray	Create an array of reference type (such as class, interface or array) and push its reference value to the top of the stack
0xbe	arraylength	Get the length value of the array and push it into the top of the stack
0xc0	checkcast	Verify the type conversion. If the verification fails, ClassCastException will be thrown
0xc1	instanceof	Check whether the object is an instance of the specified class. If yes, push 1 onto the top of the stack, otherwise push 0
0xc5	multianewarray	Create a multidimensional array of the specified type and dimension (when executing this instruction, the operation stack must contain the length value of each dimension), and push its reference value into the top of the stack

The bytecode instruction template is defined as follows:

def(Bytecodes::_getstatic           , ubcp|____|clvm|____, vtos, vtos, getstatic           , f1_byte      );
def(Bytecodes::_putstatic           , ubcp|____|clvm|____, vtos, vtos, putstatic           , f2_byte      );
def(Bytecodes::_getfield            , ubcp|____|clvm|____, vtos, vtos, getfield            , f1_byte      );
def(Bytecodes::_putfield            , ubcp|____|clvm|____, vtos, vtos, putfield            , f2_byte      );

def(Bytecodes::_new                 , ubcp|____|clvm|____, vtos, atos, _new                ,  _           );
def(Bytecodes::_newarray            , ubcp|____|clvm|____, itos, atos, newarray            ,  _           );
def(Bytecodes::_anewarray           , ubcp|____|clvm|____, itos, atos, anewarray           ,  _           );
def(Bytecodes::_multianewarray      , ubcp|____|clvm|____, vtos, atos, multianewarray      ,  _           );

def(Bytecodes::_arraylength         , ____|____|____|____, atos, itos, arraylength         ,  _           );

def(Bytecodes::_checkcast           , ubcp|____|clvm|____, atos, atos, checkcast           ,  _           );

def(Bytecodes::_instanceof          , ubcp|____|clvm|____, atos, itos, instanceof          ,  _           );

The generation function of the new bytecode instruction is TemplateTable::_new(). It was introduced in detail in Chapter 9 (class object creation) of "In-depth Analysis of the Java Virtual Machine: Source Code Analysis and Detailed Examples (Basic Volume)" and will not be repeated here.

The getstatic bytecode instruction obtains the static field of the specified class and pushes its value to the top of the stack. The format is as follows:

getstatic indexbyte1 indexbyte2

The unsigned bytes indexbyte1 and indexbyte2 are combined as (indexbyte1 << 8) | indexbyte2. This value is an index into the runtime constant pool of the current class, and the runtime constant pool item at that index is the symbolic reference to a field.
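As a small illustration of the operand encoding, the following Java sketch builds the index from the two operand bytes as they appear in a class file. The class name GetstaticOperand and the sample byte array are assumptions for this example. Java bytes are signed, so each byte is masked with 0xff first.

```java
// Sketch: decoding the two-byte operand of getstatic from a bytecode array.
final class GetstaticOperand {
    static int constantPoolIndex(int indexbyte1, int indexbyte2) {
        return ((indexbyte1 & 0xff) << 8) | (indexbyte2 & 0xff);
    }

    // bci is the position of the getstatic opcode; the operand follows it.
    static int indexAt(byte[] code, int bci) {
        return constantPoolIndex(code[bci + 1], code[bci + 2]);
    }

    public static void main(String[] args) {
        byte[] code = { (byte) 0xb2, 0x01, 0x02 }; // getstatic #258
        System.out.println(indexAt(code, 0));
    }
}
```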

The generating function of getstatic bytecode instruction is TemplateTable::getstatic(), and there is a similar getfield instruction. These generating functions are as follows:

void TemplateTable::getfield(int byte_no) {
  getfield_or_static(byte_no, false); // byte_no is 1 for getfield
}

void TemplateTable::getstatic(int byte_no) {
  getfield_or_static(byte_no, true); // byte_no is 1 for getstatic
}

Both functions eventually call getfield_or_static() to generate the machine instruction fragment. The assembly code corresponding to the generated fragment is as follows:

// Get the index of the ConstantPoolCacheEntry in the ConstantPoolCache
0x00007fffe101fd10: movzwl 0x1(%r13),%edx
// Load the base address of the ConstantPoolCache from the stack frame
0x00007fffe101fd15: mov    -0x28(%rbp),%rcx
// %edx holds the ConstantPoolCacheEntry index; shift it left by 2 bits
// because each ConstantPoolCacheEntry occupies 4 words
0x00007fffe101fd19: shl    $0x2,%edx
// Load the value at %rcx+%rdx*8+0x10, i.e. the _indices field of the ConstantPoolCacheEntry [_indices, _f1, _f2, _flags]
// The ConstantPoolCache header occupies 0x10 bytes, so %rcx+0x10 is the start of the first ConstantPoolCacheEntry
// %rdx*8 is the byte offset relative to the first ConstantPoolCacheEntry
0x00007fffe101fd1c: mov    0x10(%rcx,%rdx,8),%ebx
// Shifting _indices right by 16 bits leaves [put bytecode, get bytecode] out of [put bytecode, get bytecode, original constant pool index]
0x00007fffe101fd20: shr    $0x10,%ebx
// Keep only the get bytecode field
0x00007fffe101fd23: and    $0xff,%ebx
// 0xb2 is the opcode of getstatic. If the values are equal, the field has already been resolved, so jump to resolved
0x00007fffe101fd29: cmp    $0xb2,%ebx
0x00007fffe101fd2f: je     0x00007fffe101fdce


// Store the opcode of getstatic in %ebx
0x00007fffe101fd35: mov    $0xb2,%ebx

// The assembly code that calls InterpreterRuntime::resolve_get_put() through MacroAssembler::call_VM() is omitted here
// ...
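The address arithmetic and the resolved check in the listing above can be restated in Java. The sketch is illustrative only: the class name CpCacheLookupModel and its constants are assumptions read off the listing (a 0x10-byte header, 4-word entries, 8-byte words), not definitions taken from HotSpot headers.

```java
// Sketch of the ConstantPoolCacheEntry lookup performed by the listing.
final class CpCacheLookupModel {
    static final int WORD_SIZE = 8;       // bytes per word on x86-64
    static final int HEADER_SIZE = 0x10;  // offset of the first entry, per the listing

    // Address of the entry's _indices field.
    static long entryAddress(long cacheBase, int index) {
        long rdx = (long) index << 2;                      // shl $0x2,%edx (4 words per entry)
        return cacheBase + rdx * WORD_SIZE + HEADER_SIZE;  // 0x10(%rcx,%rdx,8)
    }

    // Extract the byte that the listing compares with the opcode.
    static int bytecode1(int indices) {
        return (indices >>> 16) & 0xff;   // shr $0x10,%ebx ; and $0xff,%ebx
    }

    static boolean isResolvedFor(int indices, int opcode) {
        return bytecode1(indices) == opcode;  // cmp $0xb2,%ebx ; je resolved
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(entryAddress(0x1000L, 3)));
        System.out.println(isResolvedFor(0xb3b20005, 0xb2));
        System.out.println(isResolvedFor(0x00000005, 0xb2));
    }
}
```

An entry whose _indices still holds only the constant pool index (both bytecode bytes zero) fails the check and goes through the resolution call.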

The code generated by MacroAssembler::call_VM() to execute InterpreterRuntime::resolve_get_put() is shown below. The assembly produced by MacroAssembler::call_VM() has been described in detail before, so it is given here without further explanation:

0x00007fffe101fd3a: callq  0x00007fffe101fd44
0x00007fffe101fd3f: jmpq   0x00007fffe101fdc2

0x00007fffe101fd44: mov    %rbx,%rsi
0x00007fffe101fd47: lea    0x8(%rsp),%rax
0x00007fffe101fd4c: mov    %r13,-0x38(%rbp)
0x00007fffe101fd50: mov    %r15,%rdi
0x00007fffe101fd53: mov    %rbp,0x200(%r15)
0x00007fffe101fd5a: mov    %rax,0x1f0(%r15)
0x00007fffe101fd61: test   $0xf,%esp
0x00007fffe101fd67: je     0x00007fffe101fd7f
0x00007fffe101fd6d: sub    $0x8,%rsp
0x00007fffe101fd71: callq  0x00007ffff66b567c
0x00007fffe101fd76: add    $0x8,%rsp
0x00007fffe101fd7a: jmpq   0x00007fffe101fd84
0x00007fffe101fd7f: callq  0x00007ffff66b567c
0x00007fffe101fd84: movabs $0x0,%r10
0x00007fffe101fd8e: mov    %r10,0x1f0(%r15)
0x00007fffe101fd95: movabs $0x0,%r10
0x00007fffe101fd9f: mov    %r10,0x200(%r15)
0x00007fffe101fda6: cmpq   $0x0,0x8(%r15)
0x00007fffe101fdae: je     0x00007fffe101fdb9
0x00007fffe101fdb4: jmpq   0x00007fffe1000420
0x00007fffe101fdb9: mov    -0x38(%rbp),%r13
0x00007fffe101fdbd: mov    -0x30(%rbp),%r14
0x00007fffe101fdc1: retq

The above code is straightforward: it calls the C++ function InterpreterRuntime::resolve_get_put(), which fills in the ConstantPoolCacheEntry in the constant pool cache. The meaning of ConstantPoolCache, ConstantPoolCacheEntry, and the fields of ConstantPoolCacheEntry was described in detail in "In-depth Analysis of the Java Virtual Machine: Source Code Analysis and Detailed Examples (Basic Volume)" and will not be repeated here.

InterpreterRuntime::resolve_get_put() is fairly long. Let's look at the first part of the implementation:

IRT_ENTRY(void, InterpreterRuntime::resolve_get_put(JavaThread* thread, Bytecodes::Code bytecode))
  // resolve field
  fieldDescriptor      info;
  constantPoolHandle   pool(thread, method(thread)->constants());
  bool  is_put    = (bytecode == Bytecodes::_putfield  || bytecode == Bytecodes::_putstatic);
  bool  is_static = (bytecode == Bytecodes::_getstatic || bytecode == Bytecodes::_putstatic);

  {
    JvmtiHideSingleStepping jhss(thread);
    int x = get_index_u2_cpcache(thread, bytecode); // Get the constant pool cache index via the bcp in the thread's stack frame
    LinkResolver::resolve_field_access(info, pool, x ,bytecode, CHECK); // Resolve the field and store its information in info
  } 

  // check if link resolution caused cpCache to be updated
  if (already_resolved(thread)){
      return;
  }

   ...
}

get_index_u2_cpcache() obtains the bcp from the stack frame of the current method and then reads the operand of the bytecode instruction through the bcp, that is, the constant pool index. With this index, LinkResolver::resolve_field_access() is called; it may link the class and the field, and it stores the field information it finds in the fieldDescriptor. resolve_field_access() is implemented as follows:

void LinkResolver::resolve_field_access(
 fieldDescriptor&     result,
 constantPoolHandle   pool,
 int                  index, // Constant pool index
 Bytecodes::Code      byte,
 TRAPS
) { 
  Symbol* field = pool->name_ref_at(index);
  Symbol* sig   = pool->signature_ref_at(index);

  // resolve specified klass
  KlassHandle resolved_klass;
  resolve_klass(resolved_klass, pool, index, CHECK);

  KlassHandle  current_klass(THREAD, pool->pool_holder());
  resolve_field(result, resolved_klass, field, sig, current_klass, byte, true, true, CHECK);
}

name_ref_at() and signature_ref_at() read the field's name and descriptor through the CONSTANT_NameAndType_info item that the constant pool entry at index refers to. Its format is as follows:

CONSTANT_NameAndType_info {
   u1 tag;
   u2 name_index;       // Occupied 16 bits
   u2 descriptor_index; // Occupied 16 bits
}

A CONSTANT_NameAndType_info item in the constant pool can be regarded as an instance of the CONSTANT_NameAndType type. As its name suggests, it carries two pieces of information: a name and a type. The name is the name of a method or a field, while the type, in a broad sense, is the descriptor of that field or method. If the name part is a field name, the type part is the field's descriptor; if the name part is a method name, the type part is the method's descriptor. In other words, a CONSTANT_NameAndType_info item represents a method or a field.
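To make the name and descriptor pair concrete, here is a minimal Java sketch. The class NameAndTypeModel and the sample member names are hypothetical; the only fact relied on is that a method descriptor begins with "(" while a field descriptor does not.

```java
// Sketch: a name-and-descriptor pair for a field or a method.
final class NameAndTypeModel {
    final String name;        // name of the field or method
    final String descriptor;  // field descriptor or method descriptor

    NameAndTypeModel(String name, String descriptor) {
        this.name = name;
        this.descriptor = descriptor;
    }

    // Method descriptors start with '(' (the parameter list).
    boolean describesMethod() {
        return descriptor.startsWith("(");
    }

    public static void main(String[] args) {
        // e.g. a field declared as "static int count;"
        NameAndTypeModel field = new NameAndTypeModel("count", "I");
        // e.g. a method declared as "void run(int n)"
        NameAndTypeModel method = new NameAndTypeModel("run", "(I)V");
        System.out.println(field.name + " is a method: " + field.describesMethod());
        System.out.println(method.name + " is a method: " + method.describesMethod());
    }
}
```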

resolve_klass() links the class and resolve_field() links the field. resolve_field() contains the following code:

InstanceKlass* tmp = InstanceKlass::cast(resolved_klass());
KlassHandle    sel_klass(THREAD, tmp->find_field(field, sig, &fd));

The key step is calling InstanceKlass's find_field() function to locate the field; the information found is stored in fd, which is of type fieldDescriptor. How fields are stored and laid out in InstanceKlass is described in detail in "In-depth Analysis of the Java Virtual Machine: Source Code Analysis and Detailed Examples (Basic Volume)" and will not be repeated here.

The fieldDescriptor class and important attributes are defined as follows:

class fieldDescriptor VALUE_OBJ_CLASS_SPEC {
 private:
  AccessFlags          _access_flags;
  int                  _index; // the field index
  constantPoolHandle   _cp;
  ...
}

Here, _access_flags indicates whether the field is declared with modifiers such as volatile or final, _index indicates which tuple of the field array in InstanceKlass stores the field, and _cp is the constant pool of the class that defines the field.

After resolve_klass() and resolve_field() have been called, the field information is available. Returning to InterpreterRuntime::resolve_get_put(), the rest of the implementation is as follows:

  TosState state  = as_TosState(info.field_type());

  Bytecodes::Code put_code = (Bytecodes::Code)0;


  InstanceKlass* klass = InstanceKlass::cast(info.field_holder());
  bool uninitialized_static = (  (bytecode == Bytecodes::_getstatic || bytecode == Bytecodes::_putstatic) &&
                                 !klass->is_initialized()    );
  Bytecodes::Code get_code = (Bytecodes::Code)0;

  if (!uninitialized_static) {
    get_code = ((is_static) ? Bytecodes::_getstatic : Bytecodes::_getfield);
    // 1. Is a putfield or putstatic instruction
    // 2. Is a getstatic or getfield instruction and does not get the value of the final variable
    if (is_put || !info.access_flags().is_final()) {
      put_code = ((is_static) ? Bytecodes::_putstatic : Bytecodes::_putfield);
    }
  }

  ConstantPoolCacheEntry* cpce = cache_entry(thread);
  cpce->set_field(
    get_code,            // Stored in b1 of _indices; holds the opcode when the instruction is getstatic or getfield
    put_code,            // Stored in b2 of _indices; holds the opcode when the instruction is putstatic or putfield. get_code and put_code are nonzero only once the field has been resolved
    info.field_holder(), // Stored in _f1; the owner of the field
    info.index(),                      // field_index, stored in _flags
    info.offset(),                     // field_offset, stored in _f2: offset (in words) of field from start of instanceOop / Klass*
    state,                             // field_type, stored in _flags
    info.access_flags().is_final(),    // Stored in _flags
    info.access_flags().is_volatile(), // Stored in _flags
    pool->pool_holder()
  );

The information in info describes the field in full and is used to fill in the ConstantPoolCacheEntry. The next time the instruction executes, the field does not need to be resolved again and its information does not need to be looked up in InstanceKlass. Everything needed can be read directly from the ConstantPoolCacheEntry.
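
The choice of get_code and put_code can be reproduced outside the VM. The following is a minimal sketch, not HotSpot code: the names Opcode, ResolvedCodes and select_codes are hypothetical, and only the opcode values (from the JVM specification) and the branch structure of the excerpt above are taken as given.

```cpp
#include <cassert>
#include <cstdint>

// Opcode values from the JVM specification. kNone (0) stands for
// "not resolved", matching the (Bytecodes::Code)0 initial value above.
enum Opcode : std::uint8_t {
  kNone      = 0x00,
  kGetstatic = 0xb2,
  kPutstatic = 0xb3,
  kGetfield  = 0xb4,
  kPutfield  = 0xb5
};

// Hypothetical pair of values that would be written to b1 and b2.
struct ResolvedCodes {
  Opcode get_code;
  Opcode put_code;
};

// Hypothetical helper that mirrors the branch structure of the excerpt.
ResolvedCodes select_codes(bool is_static, bool is_put, bool is_final,
                           bool holder_initialized) {
  ResolvedCodes codes = {kNone, kNone};
  // A static access to a class that is not yet initialized stays unresolved,
  // so the next execution enters the runtime again.
  bool uninitialized_static = is_static && !holder_initialized;
  if (!uninitialized_static) {
    codes.get_code = is_static ? kGetstatic : kGetfield;
    // put_code is set for put instructions, and for get instructions
    // as long as the field is not final.
    if (is_put || !is_final) {
      codes.put_code = is_static ? kPutstatic : kPutfield;
    }
  }
  // put_code is never set without get_code in this scheme.
  assert(codes.put_code == kNone || codes.get_code != kNone);
  return codes;
}
```

For example, a getstatic on a final field of an initialized class yields get_code = 0xb2 and leaves put_code at 0, while any static access to an uninitialized class leaves both at 0.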

The above figure is described in detail in the book "In-depth Analysis of Java Virtual Machine: Source Code Analysis and Detailed Explanation of Examples (Basic Volume)". Walking through the interpretation of the getstatic bytecode makes the role of constant pool cache entries clear. getstatic first checks whether the bytecode byte b1 in _indices holds the opcode of getstatic. If it does not, the field has not been resolved yet, so InterpreterRuntime::resolve_get_put() must be called to resolve it.
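
The resolved check and the entry addressing can be written out as plain arithmetic. This is a sketch under assumptions read off the assembly listings in this article: a 16-byte ConstantPoolCache header, 32-byte entries laid out as [_indices, _f1, _f2, _flags], and the two bytecode bytes in bits 16 to 23 and 24 to 31 of _indices. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Layout assumptions taken from the offsets used in the listings.
const std::size_t kHeaderBytes = 16;  // ConstantPoolCache header
const std::size_t kEntryBytes  = 32;  // four 8-byte fields per entry

// Byte offsets measured from the start of the ConstantPoolCache.
std::size_t entry_offset(std::size_t index) {
  return kHeaderBytes + index * kEntryBytes;  // same as 0x10 + (index << 2) * 8
}
std::size_t f1_offset(std::size_t index)    { return entry_offset(index) + 8; }
std::size_t f2_offset(std::size_t index)    { return entry_offset(index) + 16; }
std::size_t flags_offset(std::size_t index) { return entry_offset(index) + 24; }

// b1 and b2 as extracted by "shr $0x10 / and $0xff" and "shr $0x18 / and $0xff".
std::uint8_t bytecode_1(std::uint64_t indices) {
  return static_cast<std::uint8_t>((indices >> 16) & 0xff);
}
std::uint8_t bytecode_2(std::uint64_t indices) {
  return static_cast<std::uint8_t>((indices >> 24) & 0xff);
}

// True if the byte selected by byte_no (1 or 2) already holds `opcode`.
bool is_resolved(std::uint64_t indices, int byte_no, std::uint8_t opcode) {
  assert(byte_no == 1 || byte_no == 2);
  std::uint8_t stored = (byte_no == 1) ? bytecode_1(indices) : bytecode_2(indices);
  return stored == opcode;
}
```

With index 0 the offsets come out as 0x10, 0x18, 0x20 and 0x28, which are the displacements that appear in the mov instructions of the listings.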

Once the field has been resolved, or if it was already resolved, execution continues with the following assembly code:

// Load the index of the ConstantPoolCacheEntry into %edx
0x00007fffe101fdc2: movzwl 0x1(%r13),%edx
// Load the base address of the ConstantPoolCache into %rcx
0x00007fffe101fdc7: mov    -0x28(%rbp),%rcx
// Multiply the index by 4, because each ConstantPoolCacheEntry occupies 4 words
0x00007fffe101fdcb: shl    $0x2,%edx

// --resolved --

// Load _f2 from [_indices,_f1,_f2,_flags]. The ConstantPoolCache header occupies 16 bytes and
// _indices and _f1 each occupy 8 bytes, so the offset of _f2 is 32 bytes, that is, 0x20.
// _f2 holds the byte offset of the field within the java.lang.Class instance. With this offset
// the value of the field stored in the java.lang.Class instance can be read
0x00007fffe101fdce: mov    0x20(%rcx,%rdx,8),%rbx
// Load _flags from [_indices,_f1,_f2,_flags]
0x00007fffe101fdd3: mov 0x28(%rcx,%rdx,8),%eax
// Load _f1 from [_indices,_f1,_f2,_flags]. _f1 holds the owner of the field,
// that is, the class whose java.lang.Class object stores the static field
0x00007fffe101fdd7: mov 0x18(%rcx,%rdx,8),%rcx

// Load the value of the _java_mirror field from _f1
0x00007fffe101fddc: mov    0x70(%rcx),%rcx
// Shift _flags right by 28 bits, leaving the TosState
0x00007fffe101fde0: shr    $0x1c,%eax
0x00007fffe101fde3: and    $0xf,%eax
// If the result is nonzero, the TosState is not btos (0), so jump to notByte
0x00007fffe101fde6: jne    0x00007fffe101fdf6

// btos
// btos has the value 0. Reaching this point means the TosState of the field is btos.
// %rcx holds _java_mirror and %rbx holds _f2. Static variables are stored in _java_mirror,
// so load the value at that address and push it onto the stack
0x00007fffe101fdec: movsbl (%rcx,%rbx,1),%eax
0x00007fffe101fdf0: push   %rax
// Jump to Done
0x00007fffe101fdf1: jmpq   0x00007fffe101ff0c
// -- notByte --
// %eax holds the TosState. If it is not atos, jump to notObj
0x00007fffe101fdf6: cmp    $0x7,%eax
0x00007fffe101fdf9: jne    0x00007fffe101fe90

// atos
// %rcx holds _java_mirror and %rbx holds _f2,
// so load the value of the static variable and push it onto the stack
0x00007fffe101fdff: mov    (%rcx,%rbx,1),%eax
0x00007fffe101fe02: push   %r10
0x00007fffe101fe04: cmp    0x163a8d45(%rip),%r12   # 0x00007ffff73c8b50 
0x00007fffe101fe0b: je     0x00007fffe101fe88
0x00007fffe101fe11: mov    %rsp,-0x28(%rsp)
0x00007fffe101fe16: sub    $0x80,%rsp
0x00007fffe101fe1d: mov    %rax,0x78(%rsp)
0x00007fffe101fe22: mov    %rcx,0x70(%rsp)
0x00007fffe101fe27: mov    %rdx,0x68(%rsp)
0x00007fffe101fe2c: mov    %rbx,0x60(%rsp)
0x00007fffe101fe31: mov    %rbp,0x50(%rsp)
0x00007fffe101fe36: mov    %rsi,0x48(%rsp)
0x00007fffe101fe3b: mov    %rdi,0x40(%rsp)
0x00007fffe101fe40: mov    %r8,0x38(%rsp)
0x00007fffe101fe45: mov    %r9,0x30(%rsp)
0x00007fffe101fe4a: mov    %r10,0x28(%rsp)
0x00007fffe101fe4f: mov    %r11,0x20(%rsp)
0x00007fffe101fe54: mov    %r12,0x18(%rsp)
0x00007fffe101fe59: mov    %r13,0x10(%rsp)
0x00007fffe101fe5e: mov    %r14,0x8(%rsp)
0x00007fffe101fe63: mov    %r15,(%rsp)
0x00007fffe101fe67: movabs $0x7ffff6d4d828,%rdi
0x00007fffe101fe71: movabs $0x7fffe101fe11,%rsi
0x00007fffe101fe7b: mov    %rsp,%rdx
0x00007fffe101fe7e: and    $0xfffffffffffffff0,%rsp
0x00007fffe101fe82: callq  0x00007ffff6872e3a
0x00007fffe101fe87: hlt 
0x00007fffe101fe88: pop    %r10
0x00007fffe101fe8a: push   %rax
0x00007fffe101fe8b: jmpq   0x00007fffe101ff0c

// -- notObj --
0x00007fffe101fe90: cmp    $0x3,%eax
// If it is not itos, jump to notInt
0x00007fffe101fe93: jne    0x00007fffe101fea2

// itos
0x00007fffe101fe99: mov    (%rcx,%rbx,1),%eax
0x00007fffe101fe9c: push   %rax
// Jump to Done
0x00007fffe101fe9d: jmpq   0x00007fffe101ff0c
// -- notInt --
// If not ctos, jump to notChar
0x00007fffe101fea2: cmp    $0x1,%eax
0x00007fffe101fea5: jne    0x00007fffe101feb5

// ctos
0x00007fffe101feab: movzwl (%rcx,%rbx,1),%eax
0x00007fffe101feaf: push   %rax
// Jump to Done
0x00007fffe101feb0: jmpq   0x00007fffe101ff0c
// -- notChar --
// If not stos, jump to notShort
0x00007fffe101feb5: cmp    $0x2,%eax
0x00007fffe101feb8: jne    0x00007fffe101fec8

// stos
0x00007fffe101febe: movswl (%rcx,%rbx,1),%eax
0x00007fffe101fec2: push   %rax
// Jump to done
0x00007fffe101fec3: jmpq   0x00007fffe101ff0c
// -- notShort --
// If not ltos, jump to notLong
0x00007fffe101fec8: cmp    $0x4,%eax
0x00007fffe101fecb: jne    0x00007fffe101fee2

// ltos
0x00007fffe101fed1: mov    (%rcx,%rbx,1),%rax
0x00007fffe101fed5: sub    $0x10,%rsp
0x00007fffe101fed9: mov    %rax,(%rsp)
// Jump to Done
0x00007fffe101fedd: jmpq   0x00007fffe101ff0c
// -- notLong --
// If not ftos, jump to notFloat
0x00007fffe101fee2: cmp    $0x5,%eax
0x00007fffe101fee5: jne    0x00007fffe101fefe

// ftos
0x00007fffe101feeb: vmovss (%rcx,%rbx,1),%xmm0
0x00007fffe101fef0: sub    $0x8,%rsp
0x00007fffe101fef4: vmovss %xmm0,(%rsp)
// Jump to Done
0x00007fffe101fef9: jmpq   0x00007fffe101ff0c
// -- notFloat --
0x00007fffe101fefe: vmovsd (%rcx,%rbx,1),%xmm0
0x00007fffe101ff03: sub    $0x10,%rsp
0x00007fffe101ff07: vmovsd %xmm0,(%rsp)
// -- Done --

Although there is a lot of assembly code above, the logic is simple: it pushes the field value onto the stack using the information stored in the ConstantPoolCacheEntry (a bytecode counts as resolved once the corresponding constant pool cache entry has been filled in). Because the value of a static field is stored in the java.lang.Class instance, the code reads the value from there and then pushes it onto the expression stack according to the TosState recorded for the field.
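
The type dispatch in the listing can be summarized in a few lines. The sketch below assumes the TosState numbering visible in the cmp instructions (btos = 0, ctos = 1, stos = 2, itos = 3, ltos = 4, ftos = 5, atos = 7; dtos = 6 is inferred from the fall-through branch). The helper names are hypothetical, and only the integral cases are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// TosState values as compared against %eax in the listing.
enum TosState { btos = 0, ctos = 1, stos = 2, itos = 3,
                ltos = 4, ftos = 5, dtos = 6, atos = 7 };

// Equivalent of "shr $0x1c,%eax / and $0xf,%eax".
TosState tos_state(std::uint32_t flags) {
  return static_cast<TosState>((flags >> 28) & 0xf);
}

// Reads an integral field at base + offset and widens it to 32 bits,
// the way the listing widens into %eax before pushing.
std::int32_t load_int_like(const unsigned char* base, std::size_t offset,
                           TosState state) {
  const unsigned char* addr = base + offset;
  switch (state) {
    case btos: {  // movsbl: sign-extend one byte
      std::int8_t v;
      std::memcpy(&v, addr, sizeof v);
      return v;
    }
    case ctos: {  // movzwl: zero-extend two bytes (char is unsigned)
      std::uint16_t v;
      std::memcpy(&v, addr, sizeof v);
      return v;
    }
    case stos: {  // movswl: sign-extend two bytes
      std::int16_t v;
      std::memcpy(&v, addr, sizeof v);
      return v;
    }
    case itos: {  // mov: plain 32-bit load
      std::int32_t v;
      std::memcpy(&v, addr, sizeof v);
      return v;
    }
    default:
      assert(false && "only integral states are handled in this sketch");
      return 0;
  }
}
```

The difference between ctos and stos is the interesting part: the same two bytes 0xff 0xff read as 65535 for a char field and as -1 for a short field.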

Part 25 - getfield of virtual machine object operation instruction

The getfield instruction gets an instance field of an object and pushes its value onto the top of the stack. The format is as follows:

getfield indexbyte1 indexbyte2

The generating function of the getfield bytecode instruction is TemplateTable::getfield(), which is implemented as follows:

void TemplateTable::getfield(int byte_no) {
  getfield_or_static(byte_no, false); // byte_no is 1 for getfield
}

0x00007fffe10202d0: movzwl 0x1(%r13),%edx
0x00007fffe10202d5: mov    -0x28(%rbp),%rcx
0x00007fffe10202d9: shl    $0x2,%edx
0x00007fffe10202dc: mov    0x10(%rcx,%rdx,8),%ebx
0x00007fffe10202e0: shr    $0x10,%ebx
0x00007fffe10202e3: and    $0xff,%ebx
// 0xb4 is the opcode of the getfield instruction. If they are equal, the field is already resolved, so jump directly to resolved
0x00007fffe10202e9: cmp    $0xb4,%ebx
0x00007fffe10202ef: je     0x00007fffe102038e

0x00007fffe10202f5: mov    $0xb4,%ebx
// Some of the assembly code generated by MacroAssembler::call_VM() to call
// InterpreterRuntime::resolve_get_put() is omitted
// ...

0x00007fffe10202fa: callq  0x00007fffe1020304
0x00007fffe10202ff: jmpq   0x00007fffe1020382

0x00007fffe1020304: mov    %rbx,%rsi
0x00007fffe1020307: lea    0x8(%rsp),%rax
0x00007fffe102030c: mov    %r13,-0x38(%rbp)
0x00007fffe1020310: mov    %r15,%rdi
0x00007fffe1020313: mov    %rbp,0x200(%r15)
0x00007fffe102031a: mov    %rax,0x1f0(%r15)
0x00007fffe1020321: test   $0xf,%esp
0x00007fffe1020327: je     0x00007fffe102033f
0x00007fffe102032d: sub    $0x8,%rsp
0x00007fffe1020331: callq  0x00007ffff66b567c
0x00007fffe1020336: add    $0x8,%rsp
0x00007fffe102033a: jmpq   0x00007fffe1020344
0x00007fffe102033f: callq  0x00007ffff66b567c
0x00007fffe1020344: movabs $0x0,%r10
0x00007fffe102034e: mov    %r10,0x1f0(%r15)
0x00007fffe1020355: movabs $0x0,%r10
0x00007fffe102035f: mov    %r10,0x200(%r15)
0x00007fffe1020366: cmpq   $0x0,0x8(%r15)
0x00007fffe102036e: je     0x00007fffe1020379
0x00007fffe1020374: jmpq   0x00007fffe1000420
0x00007fffe1020379: mov    -0x38(%rbp),%r13
0x00007fffe102037d: mov    -0x30(%rbp),%r14
0x00007fffe1020381: retq

0x00007fffe1020382: movzwl 0x1(%r13),%edx
0x00007fffe1020387: mov    -0x28(%rbp),%rcx
0x00007fffe102038b: shl    $0x2,%edx

---- resolved ---- 

// Load _f2 from [_indices,_f1,_f2,_flags]. The ConstantPoolCache header occupies 16 bytes and
// _indices and _f1 each occupy 8 bytes, so the offset of _f2 is 32 bytes, that is, 0x20.
// _f2 holds the byte offset of the field within the oop instance. With this offset
// the value of the field stored in the oop can be read
0x00007fffe102038e: mov    0x20(%rcx,%rdx,8),%rbx

// Load _flags from [_indices,_f1,_f2,_flags]
0x00007fffe1020393: mov    0x28(%rcx,%rdx,8),%eax

// Pop objectref from the stack into %rcx
0x00007fffe1020397: pop    %rcx

// Evoke OS NULL exception if reg = NULL by
// accessing M[reg] w/o changing any (non-CC) registers
// NOTE: cmpl is plenty (enough) here to provoke a segv
0x00007fffe1020398: cmp    (%rcx),%rax

// Shift _flags right by 28 bits, leaving the TosState
0x00007fffe102039b: shr    $0x1c,%eax
0x00007fffe102039e: and    $0xf,%eax
// If the result is nonzero, the TosState is not btos (0), so jump to notByte
0x00007fffe10203a1: jne    0x00007fffe10203ba

// btos

// btos has the value 0. Reaching this point means the TosState of the field is btos.
// %rcx holds objectref and %rbx holds _f2. Load the value of the field and store it in %rax
0x00007fffe10203a7: movsbl (%rcx,%rbx,1),%eax
0x00007fffe10203ab: push   %rax

// Rewrite the bytecode instruction: store the opcode of Bytecodes::_fast_bgetfield in %ecx
0x00007fffe10203ac: mov    $0xcc,%ecx
// Overwrite the opcode of the current bytecode instruction with that of Bytecodes::_fast_bgetfield
0x00007fffe10203b1: mov    %cl,0x0(%r13)
// Jump to Done
0x00007fffe10203b5: jmpq   0x00007fffe102050f
---- notByte ----
0x00007fffe10203ba: cmp    $0x7,%eax
0x00007fffe10203bd: jne    0x00007fffe102045d  // Jump to notObj


// atos

// MacroAssembler::load_heap_oop() generates the following code
0x00007fffe10203c3: mov    (%rcx,%rbx,1),%eax
// ... omit some code
// End of the code generated by MacroAssembler::load_heap_oop()
0x00007fffe102044e: push   %rax
// Rewrite bytecode instruction as Bytecodes::_fast_agetfield
0x00007fffe102044f: mov    $0xcb,%ecx
0x00007fffe1020454: mov    %cl,0x0(%r13)
0x00007fffe1020458: jmpq   0x00007fffe102050f
// -- notObj --
0x00007fffe102045d: cmp    $0x3,%eax
0x00007fffe1020460: jne    0x00007fffe1020478 // Jump to notInt

// itos

0x00007fffe1020466: mov    (%rcx,%rbx,1),%eax
0x00007fffe1020469: push   %rax
// Rewrite bytecode instruction as Bytecodes::_fast_igetfield
0x00007fffe102046a: mov    $0xd0,%ecx
0x00007fffe102046f: mov    %cl,0x0(%r13)
0x00007fffe1020473: jmpq   0x00007fffe102050f
// --- notInt ----
0x00007fffe1020478: cmp    $0x1,%eax
0x00007fffe102047b: jne    0x00007fffe1020494 // Jump to notChar


// ctos

0x00007fffe1020481: movzwl (%rcx,%rbx,1),%eax
0x00007fffe1020485: push   %rax
// Rewrite bytecode instruction as Bytecodes::_fast_cgetfield
0x00007fffe1020486: mov    $0xcd,%ecx
0x00007fffe102048b: mov    %cl,0x0(%r13)
0x00007fffe102048f: jmpq   0x00007fffe102050f
// ---- notChar ----
0x00007fffe1020494: cmp    $0x2,%eax
0x00007fffe1020497: jne    0x00007fffe10204b0 // Jump to notShort

// stos

0x00007fffe102049d: movswl (%rcx,%rbx,1),%eax
0x00007fffe10204a1: push   %rax
// Rewrite bytecode instruction as Bytecodes::_fast_sgetfield
0x00007fffe10204a2: mov    $0xd2,%ecx
0x00007fffe10204a7: mov    %cl,0x0(%r13)
0x00007fffe10204ab: jmpq   0x00007fffe102050f
// ---- notShort ----
0x00007fffe10204b0: cmp    $0x4,%eax
0x00007fffe10204b3: jne    0x00007fffe10204d3 // Jump to notLong

// ltos

0x00007fffe10204b9: mov    (%rcx,%rbx,1),%rax
0x00007fffe10204bd: sub    $0x10,%rsp
0x00007fffe10204c1: mov    %rax,(%rsp)
// Rewrite bytecode instruction as Bytecodes::_fast_lgetfield
0x00007fffe10204c5: mov    $0xd1,%ecx
0x00007fffe10204ca: mov    %cl,0x0(%r13)
0x00007fffe10204ce: jmpq   0x00007fffe102050f
// ---- notLong ----
0x00007fffe10204d3: cmp    $0x5,%eax
0x00007fffe10204d6: jne    0x00007fffe10204f8 // Jump to notFloat


// ftos
0x00007fffe10204dc: vmovss (%rcx,%rbx,1),%xmm0
0x00007fffe10204e1: sub    $0x8,%rsp
0x00007fffe10204e5: vmovss %xmm0,(%rsp)
// Rewrite bytecode instruction as Bytecodes::_fast_fgetfield
0x00007fffe10204ea: mov    $0xcf,%ecx
0x00007fffe10204ef: mov    %cl,0x0(%r13)
0x00007fffe10204f3: jmpq   0x00007fffe102050f
// ---- notFloat ----
0x00007fffe10204f8: vmovsd (%rcx,%rbx,1),%xmm0
0x00007fffe10204fd: sub    $0x10,%rsp
0x00007fffe1020501: vmovsd %xmm0,(%rsp)
0x00007fffe1020506: mov    $0xce,%ecx
0x00007fffe102050b: mov    %cl,0x0(%r13)

// -- Done --

The opcodes written by the rewriting steps above are custom instructions defined inside the virtual machine. The templates of these custom instructions are as follows:

// JVM bytecodes
def(Bytecodes::_fast_agetfield      , ubcp|____|____|____, atos, atos, fast_accessfield    ,  atos        );
def(Bytecodes::_fast_bgetfield      , ubcp|____|____|____, atos, itos, fast_accessfield    ,  itos        );
def(Bytecodes::_fast_cgetfield      , ubcp|____|____|____, atos, itos, fast_accessfield    ,  itos        );
def(Bytecodes::_fast_dgetfield      , ubcp|____|____|____, atos, dtos, fast_accessfield    ,  dtos        );
def(Bytecodes::_fast_fgetfield      , ubcp|____|____|____, atos, ftos, fast_accessfield    ,  ftos        );
def(Bytecodes::_fast_igetfield      , ubcp|____|____|____, atos, itos, fast_accessfield    ,  itos        );
def(Bytecodes::_fast_lgetfield      , ubcp|____|____|____, atos, ltos, fast_accessfield    ,  ltos        );
def(Bytecodes::_fast_sgetfield      , ubcp|____|____|____, atos, itos, fast_accessfield    ,  itos        );

Taking _fast_agetfield as an example, the generating function for these internally defined bytecode instructions is TemplateTable::fast_accessfield(). The assembly code is as follows:

0x00007fffe101e4e1: movzwl 0x1(%r13),%ebx
0x00007fffe101e4e6: mov    -0x28(%rbp),%rcx
0x00007fffe101e4ea: shl    $0x2,%ebx
// Compute %rcx+%rbx*8+0x20 to load _f2 from ConstantPoolCacheEntry [_indices,_f1,_f2,_flags].
// The ConstantPoolCache header is 16 (0x10) bytes, so %rcx+0x10 is the start of the first
// ConstantPoolCacheEntry, and another 0x10 bytes skip _indices and _f1.
// %rbx*8 is the byte offset relative to the first ConstantPoolCacheEntry
0x00007fffe101e4ed: mov    0x20(%rcx,%rbx,8),%rbx

// Null check
0x00007fffe101e4f2: cmp    (%rax),%rax
// %rax holds objectref, the instance to read the field from, and %rbx holds the offset.
// Load the value at that offset into %eax
0x00007fffe101e4f5: mov    (%rax,%rbx,1),%eax

Other bytecode instructions are similar and will not be introduced here. As this shows, once the bytecode has been rewritten, the interpreter no longer executes the assembly instructions for getfield but only those for the corresponding fast instruction, such as _fast_agetfield. These are much simpler than the getfield instructions and greatly improve the speed of interpretation.
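
The rewriting step itself is small: one byte in the bytecode stream is overwritten in place. The following sketch models it on a std::vector of bytes. The fast opcode values are the immediates loaded into %ecx in the getfield listing (0xce for the double case is inferred from the final fall-through branch); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::uint8_t kGetfield = 0xb4;

// Maps a TosState value (0..7) to the fast opcode used in the listing.
std::uint8_t fast_getfield_opcode(int tos_state) {
  switch (tos_state) {
    case 0: return 0xcc;  // btos -> _fast_bgetfield
    case 1: return 0xcd;  // ctos -> _fast_cgetfield
    case 2: return 0xd2;  // stos -> _fast_sgetfield
    case 3: return 0xd0;  // itos -> _fast_igetfield
    case 4: return 0xd1;  // ltos -> _fast_lgetfield
    case 5: return 0xcf;  // ftos -> _fast_fgetfield
    case 6: return 0xce;  // dtos -> _fast_dgetfield (inferred)
    case 7: return 0xcb;  // atos -> _fast_agetfield
    default: return kGetfield;  // unknown state: leave the opcode unchanged
  }
}

// Overwrites the getfield opcode at bci, like "mov %cl,0x0(%r13)".
// The two operand bytes that follow are left untouched.
// Returns true if the opcode was rewritten.
bool quicken_getfield(std::vector<std::uint8_t>& code, std::size_t bci,
                      int tos_state) {
  assert(bci < code.size());
  if (code[bci] != kGetfield) {
    return false;  // already rewritten, or a different instruction
  }
  code[bci] = fast_getfield_opcode(tos_state);
  return true;
}
```

After the first execution of a getfield on an int field, the stream holds 0xd0 instead of 0xb4, so later executions dispatch straight to the shorter fast template.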

Part 26 - putstatic of virtual machine object operation instructions

The assembly code execution logic of the getstatic and getfield instructions has been introduced above. This part introduces the execution logic of the putstatic instruction. putfield is not covered here and is left for readers to study on their own.

The putstatic instruction assigns a value to a static field of the specified class. The format of the bytecode instruction is as follows:

putstatic indexbyte1 indexbyte2

The unsigned bytes indexbyte1 and indexbyte2 are combined into the index (indexbyte1 << 8) | indexbyte2. The runtime constant pool item at that index should be a symbolic reference to a field.
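
As a small illustration of the index formula, the sketch below decodes the two operand bytes that follow an opcode in a class file's bytecode array. The function name field_ref_index and the explicit length parameter are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns (indexbyte1 << 8) | indexbyte2 for the instruction at bci.
// code[bci] is the opcode, and the two operand bytes follow it.
std::uint16_t field_ref_index(const std::uint8_t* code, std::size_t length,
                              std::size_t bci) {
  assert(bci + 2 < length);  // both operand bytes must be inside the array
  unsigned indexbyte1 = code[bci + 1];
  unsigned indexbyte2 = code[bci + 2];
  return static_cast<std::uint16_t>((indexbyte1 << 8) | indexbyte2);
}
```

For the byte sequence b3 01 02 (putstatic followed by its operands), the function returns 0x0102, that is, constant pool index 258.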

The template of the instruction is defined as follows:

def(Bytecodes::_putstatic           , ubcp|____|clvm|____, vtos, vtos, putstatic           , f2_byte      );

The generating function is putstatic(), and the implementation of the function is as follows:

void TemplateTable::putstatic(int byte_no) {
  putfield_or_static(byte_no, true); // is_static is true for putstatic
}

The assembly code corresponding to the machine instructions generated by TemplateTable::putfield_or_static() is as follows:

0x00007fffe101ff90: movzwl 0x1(%r13),%edx
0x00007fffe101ff95: mov    -0x28(%rbp),%rcx
0x00007fffe101ff99: shl    $0x2,%edx
0x00007fffe101ff9c: mov    0x10(%rcx,%rdx,8),%ebx
0x00007fffe101ffa0: shr    $0x18,%ebx
0x00007fffe101ffa3: and    $0xff,%ebx
// Check whether the putstatic instruction has already been resolved. If so, jump to resolved
0x00007fffe101ffa9: cmp    $0xb3,%ebx
0x00007fffe101ffaf: je     0x00007fffe102004e

TemplateTable::resolve_cache_and_index() generates the following assembly code:

// Reaching this point means the field has not been resolved yet
0x00007fffe101ffb5: mov    $0xb3,%ebx

// MacroAssembler::call_VM() generates the following code,
// which calls InterpreterRuntime::resolve_get_put()
0x00007fffe101ffba: callq  0x00007fffe101ffc4
0x00007fffe101ffbf: jmpq   0x00007fffe1020042
0x00007fffe101ffc4: mov    %rbx,%rsi
0x00007fffe101ffc7: lea    0x8(%rsp),%rax
0x00007fffe101ffcc: mov    %r13,-0x38(%rbp)
0x00007fffe101ffd0: mov    %r15,%rdi
0x00007fffe101ffd3: mov    %rbp,0x200(%r15)
0x00007fffe101ffda: mov    %rax,0x1f0(%r15)
0x00007fffe101ffe1: test   $0xf,%esp
0x00007fffe101ffe7: je     0x00007fffe101ffff
0x00007fffe101ffed: sub    $0x8,%rsp
0x00007fffe101fff1: callq  0x00007ffff66b567c
0x00007fffe101fff6: add    $0x8,%rsp
0x00007fffe101fffa: jmpq   0x00007fffe1020004
0x00007fffe101ffff: callq  0x00007ffff66b567c
0x00007fffe1020004: movabs $0x0,%r10
0x00007fffe102000e: mov    %r10,0x1f0(%r15)
0x00007fffe1020015: movabs $0x0,%r10
0x00007fffe102001f: mov    %r10,0x200(%r15)
0x00007fffe1020026: cmpq   $0x0,0x8(%r15)
0x00007fffe102002e: je     0x00007fffe1020039
0x00007fffe1020034: jmpq   0x00007fffe1000420
0x00007fffe1020039: mov    -0x38(%rbp),%r13
0x00007fffe102003d: mov    -0x30(%rbp),%r14
0x00007fffe1020041: retq   


0x00007fffe1020042: movzwl 0x1(%r13),%edx
0x00007fffe1020047: mov    -0x28(%rbp),%rcx
0x00007fffe102004b: shl    $0x2,%edx

The assembly code generated next is as follows:

// ---- resolved ----

// Reaching the following code means the field has been resolved

0x00007fffe102004e: mov    0x20(%rcx,%rdx,8),%rbx
0x00007fffe1020053: mov    0x28(%rcx,%rdx,8),%eax
0x00007fffe1020057: mov    0x18(%rcx,%rdx,8),%rcx
0x00007fffe102005c: mov    0x70(%rcx),%rcx
0x00007fffe1020060: mov    %eax,%edx
// Shift _flags right by 21 bits to test whether the field is volatile
0x00007fffe1020062: shr    $0x15,%edx
0x00007fffe1020065: and    $0x1,%edx
// Shift _flags right by 28 bits, leaving the TosState
0x00007fffe1020068: shr    $0x1c,%eax

// If not btos, jump to notByte
0x00007fffe102006b: and    $0xf,%eax
0x00007fffe102006e: jne    0x00007fffe1020083

// btos

// Store the value at the top of the stack in% eax, and this value will be written to the corresponding field
0x00007fffe1020074: mov    (%rsp),%eax
0x00007fffe1020077: add    $0x8,%rsp
// %rcx holds _java_mirror and %rbx holds _f2, the offset of the field in the class
0x00007fffe102007b: mov    %al,(%rcx,%rbx,1)
0x00007fffe102007e: jmpq   0x00007fffe10201be  // Jump to Done
// -- notByte --
// If not atos, jump to notObj
0x00007fffe1020083: cmp    $0x7,%eax
0x00007fffe1020086: jne    0x00007fffe1020130

// atos
// Pop the value at the top of the stack into% rax, which will be used to update the value of the corresponding field
0x00007fffe102008c: pop    %rax
// ...
// Update the value to the corresponding field
0x00007fffe1020115: mov    %eax,(%rcx,%rbx,1)
// 0x9 is CardTableModRefBS::card_shift and shr is a logical right shift. %rcx points to
// the start address of the java.lang.Class instance, so shifting %rcx right yields the card table index
0x00007fffe1020118: shr    $0x9,%rcx
// The address constant $0x7fffe07ff000 is the base address of the card table
0x00007fffe102011c: movabs $0x7fffe07ff000,%r10 
// Mark the corresponding card table entry as dirty. The constant 0x0 denotes a dirty card
0x00007fffe1020126: movb   $0x0,(%r10,%rcx,1)
0x00007fffe102012b: jmpq   0x00007fffe10201be // Jump to Done
// ---- notObj ----
// If it is not itos, jump to notInt
0x00007fffe1020130: cmp    $0x3,%eax
0x00007fffe1020133: jne    0x00007fffe1020148

// itos
0x00007fffe1020139: mov    (%rsp),%eax
0x00007fffe102013c: add    $0x8,%rsp
0x00007fffe1020140: mov    %eax,(%rcx,%rbx,1)
0x00007fffe1020143: jmpq   0x00007fffe10201be   // Jump to Done
// If not ctos, jump to notChar
0x00007fffe1020148: cmp    $0x1,%eax
0x00007fffe102014b: jne    0x00007fffe1020161

// ctos
0x00007fffe1020151: mov    (%rsp),%eax
0x00007fffe1020154: add    $0x8,%rsp
0x00007fffe1020158: mov    %ax,(%rcx,%rbx,1)
0x00007fffe102015c: jmpq   0x00007fffe10201be  // Jump to Done
0x00007fffe1020161: cmp    $0x2,%eax
0x00007fffe1020164: jne    0x00007fffe102017a

// stos
0x00007fffe102016a: mov    (%rsp),%eax
0x00007fffe102016d: add    $0x8,%rsp
0x00007fffe1020171: mov    %ax,(%rcx,%rbx,1)
0x00007fffe1020175: jmpq   0x00007fffe10201be  // Jump to Done
0x00007fffe102017a: cmp    $0x4,%eax
0x00007fffe102017d: jne    0x00007fffe1020194

// ltos
0x00007fffe1020183: mov    (%rsp),%rax
0x00007fffe1020187: add    $0x10,%rsp
0x00007fffe102018b: mov    %rax,(%rcx,%rbx,1)
0x00007fffe102018f: jmpq   0x00007fffe10201be  // Jump to Done
0x00007fffe1020194: cmp    $0x5,%eax
0x00007fffe1020197: jne    0x00007fffe10201b0

// ftos
0x00007fffe102019d: vmovss (%rsp),%xmm0
0x00007fffe10201a2: add    $0x8,%rsp
0x00007fffe10201a6: vmovss %xmm0,(%rcx,%rbx,1)
0x00007fffe10201ab: jmpq   0x00007fffe10201be   // Jump to Done

// dtos
0x00007fffe10201b0: vmovsd (%rsp),%xmm0
0x00007fffe10201b5: add    $0x10,%rsp
0x00007fffe10201b9: vmovsd %xmm0,(%rcx,%rbx,1)


// ---- Done ----

0x00007fffe10201be: test   %edx,%edx
0x00007fffe10201c0: je     0x00007fffe10201cb
0x00007fffe10201c6: lock addl $0x0,(%rsp)

// ---- notVolatile ----

In the above code, the two most noteworthy points are as follows:

(1) When a reference field is updated, a write barrier marks the corresponding card table entry as dirty, so that the GC can scan dirty cards and mark live objects without missing any;

(2) When the field is declared volatile, an instruction with the lock prefix is emitted after the store. This prefix was not covered in the earlier introduction to x86-64 machine instructions, so here is an excerpt from another author's description of it:

The Intel manual describes the lock prefix as follows:

Ensure the atomicity of the modified instruction execution;
It is forbidden to reorder the instruction with the previous and subsequent read-write instructions;
After the instruction is executed, all data in the write buffer is flushed to memory (so that other modifications before the instruction are visible to all processors).

All x86 CPUs can lock a specific memory address. While the address is locked, other processors on the system bus cannot read or modify it. This capability is provided by the lock instruction prefix, which is placed before an assembly instruction. When the lock prefix is used, the CPU asserts the LOCK# signal, which guarantees exclusive use of the memory address in a multiprocessor system or under multi-threaded contention. When the instruction completes, the lock is released.
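
The two points can be sketched in portable C++. This is an illustration under stated assumptions, not the HotSpot implementation: the CardTable type is hypothetical and subtracts the heap start explicitly (the listing instead uses a pre-adjusted base constant), the clean-card value 0xff is an assumption, and std::atomic_thread_fence(std::memory_order_seq_cst) is used as the portable counterpart of the lock addl $0x0,(%rsp) barrier. Only the shift of 9, the dirty value 0 and the volatile bit at position 21 come from the listing.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const int kCardShift = 9;              // each card covers 2^9 = 512 bytes
const std::uint8_t kDirtyCard = 0x00;  // value written by movb $0x0 in the listing
const std::uint8_t kCleanCard = 0xff;  // assumption made for this sketch

// Hypothetical card table covering [heap_start, heap_start + heap_bytes).
struct CardTable {
  std::uintptr_t heap_start;
  std::vector<std::uint8_t> cards;

  CardTable(std::uintptr_t start, std::size_t heap_bytes)
      : heap_start(start),
        cards((heap_bytes >> kCardShift) + 1, kCleanCard) {}

  // Marks the card that covers addr as dirty after a reference store.
  void mark(std::uintptr_t addr) {
    std::size_t index = (addr - heap_start) >> kCardShift;
    assert(index < cards.size());
    cards[index] = kDirtyCard;
  }
};

// Equivalent of "shr $0x15,%edx / and $0x1,%edx".
bool is_volatile(std::uint32_t flags) {
  return ((flags >> 21) & 0x1) != 0;
}

// Emits a full fence after the store only when the field is volatile,
// matching the "test %edx,%edx / je notVolatile" structure.
void after_store(std::uint32_t flags) {
  if (is_volatile(flags)) {
    std::atomic_thread_fence(std::memory_order_seq_cst);
  }
}
```

Storing a reference at an address 513 bytes into the heap dirties card 1 and leaves card 0 clean, so a later GC pass only needs to scan the 512-byte region covered by that card.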

Part 27 - operand stack management instructions for virtual machine bytecode instructions

The bytecode instructions related to operand stack management are shown in the following table.


0x57	pop	Pop the value at the top of the stack (the value cannot be of type long or double)
0x58	pop2	Pop one (long or double) or two (other types) values from the top of the stack
0x59	dup	Duplicate the value at the top of the stack and push the copy onto the top of the stack
0x5a	dup_x1	Duplicate the value at the top of the stack and insert the copy two values down
0x5b	dup_x2	Duplicate the value at the top of the stack and insert the copy two or three values down
0x5c	dup2	Duplicate one (long or double) or two (other types) values at the top of the stack and push the copies onto the top of the stack
0x5d	dup2_x1	Version of the dup_x1 instruction that duplicates two slots
0x5e	dup2_x2	Version of the dup_x2 instruction that duplicates two slots
0x5f	swap	Swap the two values at the top of the stack (the values cannot be of type long or double)

The template corresponding to bytecode instruction is defined as follows:

def(Bytecodes::_pop         , ____|____|____|____, vtos, vtos, pop         ,  _           );
def(Bytecodes::_pop2        , ____|____|____|____, vtos, vtos, pop2        ,  _           );
def(Bytecodes::_dup         , ____|____|____|____, vtos, vtos, dup         ,  _           );
def(Bytecodes::_dup_x1      , ____|____|____|____, vtos, vtos, dup_x1      ,  _           );
def(Bytecodes::_dup_x2      , ____|____|____|____, vtos, vtos, dup_x2      ,  _           );
def(Bytecodes::_dup2        , ____|____|____|____, vtos, vtos, dup2        ,  _           );
def(Bytecodes::_dup2_x1     , ____|____|____|____, vtos, vtos, dup2_x1     ,  _           );
def(Bytecodes::_dup2_x2     , ____|____|____|____, vtos, vtos, dup2_x2     ,  _           );
def(Bytecodes::_swap        , ____|____|____|____, vtos, vtos, swap        ,  _           );

The pop instruction pops the value at the top of the stack. The corresponding assembly code is as follows:

add    $0x8,%rsp

The pop2 instruction pops one (long or double) or two (other types) values from the top of the stack. The corresponding assembly code is as follows:

add    $0x10,%rsp

The dup instruction copies the stack top value and pushes the copied value into the stack top. The corresponding assembly code is as follows:

mov    (%rsp),%rax
push   %rax

The swap instruction interchanges the two values at the top of the stack (the value cannot be of type long or double). The corresponding assembly code is as follows:

mov    0x8(%rsp),%rcx
mov    (%rsp),%rax
mov    %rcx,(%rsp)
mov    %rax,0x8(%rsp)

The execution logic of the instruction is relatively simple and will not be introduced here.
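
For readers who prefer to see the stack effects rather than the %rsp arithmetic, here is a minimal model of these instructions. It assumes one 8-byte slot per vector element with back() as the top of the stack. The op_ function names are hypothetical, and dup_x1 follows the stack effect given in the JVM specification (..., value2, value1 becomes ..., value1, value2, value1).

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// One element per 8-byte expression stack slot; back() is the top of the stack.
typedef std::vector<std::uint64_t> Stack;

// pop: add $0x8,%rsp
void op_pop(Stack& s) {
  assert(s.size() >= 1);
  s.pop_back();
}

// pop2: add $0x10,%rsp (two slots)
void op_pop2(Stack& s) {
  assert(s.size() >= 2);
  s.pop_back();
  s.pop_back();
}

// dup: mov (%rsp),%rax / push %rax
void op_dup(Stack& s) {
  assert(s.size() >= 1);
  std::uint64_t top = s.back();  // copy first, then push
  s.push_back(top);
}

// swap: exchange the two top slots
void op_swap(Stack& s) {
  assert(s.size() >= 2);
  std::swap(s[s.size() - 1], s[s.size() - 2]);
}

// dup_x1: insert a copy of the top value below the second value
void op_dup_x1(Stack& s) {
  assert(s.size() >= 2);
  std::uint64_t top = s.back();
  s.insert(s.end() - 2, top);
}
```

Starting from a stack holding 1 and 2 (2 on top), dup gives 1 2 2, and dup_x1 on the original stack gives 2 1 2.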

Part 28 - control transfer instructions for virtual machine bytecode instructions

The bytecode instructions related to control transfer are shown in the table below.


0x99	ifeq	Jump when the int value at the top of the stack is equal to 0
0x9a	ifne	Jump when the int value at the top of the stack is not equal to 0
0x9b	iflt	Jump when the int value at the top of the stack is less than 0
0x9c	ifge	Jump when the int value at the top of the stack is greater than or equal to 0
0x9d	ifgt	Jump when the int value at the top of the stack is greater than 0
0x9e	ifle	Jump when the int value at the top of the stack is less than or equal to 0
0x9f	if_icmpeq	Compare the size of two int values at the top of the stack. Jump when the result is equal to 0
0xa0	if_icmpne	Compare the size of two int values at the top of the stack. Jump when the result is not equal to 0
0xa1	if_icmplt	Compare the size of two int values at the top of the stack. Jump when the result is less than 0
0xa2	if_icmpge	Compare the size of two int values at the top of the stack. Jump when the result is greater than or equal to 0
0xa3	if_icmpgt	Compare the size of two int values at the top of the stack. Jump when the result is greater than 0
0xa4	if_icmple	Compare the size of two int values at the top of the stack. Jump when the result is less than or equal to 0
0xa5	if_acmpeq	Compare the two reference values at the top of the stack, and jump when the results are equal
0xa6	if_acmpne	Compare the two reference values at the top of the stack, and jump when the results are not equal
0xa7	goto	Unconditional jump
0xa8	jsr	Jump to the specified 16 bit offset position and push the address of the next instruction of jsr into the top of the stack
0xa9	ret	Return to the instruction address stored in the local variable at the given index (generally used together with jsr or jsr_w)
0xaa	tableswitch	For switch conditional jump, case value is continuous (variable length instruction)
0xab	lookupswitch	For switch conditional jump, case value is discontinuous (variable length instruction)
0xac	ireturn	Returns int from the current method
0xad	lreturn	Returns long from the current method
0xae	freturn	Returns float from the current method
0xaf	dreturn	Returns double from the current method
0xb0	areturn	Returns an object reference from the current method
0xb1	return	Returns void from the current method
0xc6	ifnull	Jump when null
0xc7	ifnonnull	Jump when not null
0xc8	goto_w	Unconditional jump (wide index)
0xc9	jsr_w	Jump to the specified 32-bit offset position and push the address of the instruction following jsr_w onto the top of the stack
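
The conditional branches in the table share one pattern: evaluate a condition, then either add a signed 16-bit offset to the address of the branch instruction or fall through to the next instruction three bytes later. The sketch below follows the branch semantics of the JVM specification. The Condition enum, holds() and next_bci() are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The six comparison conditions used by the if<cond> and if_icmp<cond> families.
enum Condition { equal, not_equal, less, greater_equal, greater, less_equal };

// Evaluates lhs <cond> rhs. For ifeq..ifle, rhs is 0.
bool holds(Condition c, std::int32_t lhs, std::int32_t rhs) {
  switch (c) {
    case equal:         return lhs == rhs;
    case not_equal:     return lhs != rhs;
    case less:          return lhs <  rhs;
    case greater_equal: return lhs >= rhs;
    case greater:       return lhs >  rhs;
    case less_equal:    return lhs <= rhs;
  }
  return false;
}

// Returns the bytecode index executed after a 3-byte conditional branch at bci.
// The offset is the signed 16-bit value (branchbyte1 << 8) | branchbyte2 and is
// relative to the address of the branch opcode itself.
std::size_t next_bci(const std::uint8_t* code, std::size_t bci, Condition c,
                     std::int32_t value1, std::int32_t value2) {
  std::int32_t raw = (code[bci + 1] << 8) | code[bci + 2];
  std::int32_t offset = (raw >= 0x8000) ? raw - 0x10000 : raw;  // sign-extend
  if (holds(c, value1, value2)) {
    std::int64_t target = static_cast<std::int64_t>(bci) + offset;
    assert(target >= 0);
    return static_cast<std::size_t>(target);
  }
  return bci + 3;  // fall through: opcode plus two operand bytes
}
```

An if_icmpeq at index 3 with operand bytes ff fd (offset -3) jumps back to index 0 when the two values are equal and continues at index 6 otherwise.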

The template is defined as follows:

def(Bytecodes::_ifeq                , ubcp|____|clvm|____, itos, vtos, if_0cmp             , equal        );
def(Bytecodes::_ifne                , ubcp|____|clvm|____, itos, vtos, if_0cmp             , not_equal    );
def(Bytecodes::_iflt                , ubcp|____|clvm|____, itos, vtos, if_0cmp             , less         );
def(Bytecodes::_ifge                , ubcp|____|clvm|____, itos, vtos, if_0cmp             , greater_equal);
def(Bytecodes::_ifgt                , ubcp|____|clvm|____, itos, vtos, if_0cmp             , greater      );
def(Bytecodes::_ifle                , ubcp|____|clvm|____, itos, vtos, if_0cmp             , less_equal   );
def(Bytecodes::_if_icmpeq           , ubcp|____|clvm|____, itos, vtos, if_icmp             , equal        );
def(Bytecodes::_if_icmpne           , ubcp|____|clvm|____, itos, vtos, if_icmp             , not_equal    );
def(Bytecodes::_if_icmplt           , ubcp|____|clvm|____, itos, vtos, if_icmp             , less         );
def(Bytecodes::_if_icmpge           , ubcp|____|clvm|____, itos, vtos, if_icmp             , greater_equal);
def(Bytecodes::_if_icmpgt           , ubcp|____|clvm|____, itos, vtos, if_icmp             , greater      );
def(Bytecodes::_if_icmple           , ubcp|____|clvm|____, itos, vtos, if_icmp             , less_equal   );
def(Bytecodes::_if_acmpeq           , ubcp|____|clvm|____, atos, vtos, if_acmp             , equal        );
def(Bytecodes::_if_acmpne           , ubcp|____|clvm|____, atos, vtos, if_acmp             , not_equal    );
def(Bytecodes::_goto                , ubcp|disp|clvm|____, vtos, vtos, _goto               ,  _           );
def(Bytecodes::_jsr                 , ubcp|disp|____|____, vtos, vtos, jsr                 ,  _           ); // result is not an oop, so do not transition to atos
def(Bytecodes::_ret                 , ubcp|disp|____|____, vtos, vtos, ret                 ,  _           );
def(Bytecodes::_tableswitch         , ubcp|disp|____|____, itos, vtos, tableswitch         ,  _           );
def(Bytecodes::_lookupswitch        , ubcp|disp|____|____, itos, itos, lookupswitch        ,  _           );
def(Bytecodes::_ireturn             , ____|disp|clvm|____, itos, itos, _return             , itos         );
def(Bytecodes::_lreturn             , ____|disp|clvm|____, ltos, ltos, _return             , ltos         );
def(Bytecodes::_freturn             , ____|disp|clvm|____, ftos, ftos, _return             , ftos         );
def(Bytecodes::_dreturn             , ____|disp|clvm|____, dtos, dtos, _return             , dtos         );
def(Bytecodes::_areturn             , ____|disp|clvm|____, atos, atos, _return             , atos         );
def(Bytecodes::_return              , ____|disp|clvm|____, vtos, vtos, _return             , vtos         );

def(Bytecodes::_ifnull              , ubcp|____|clvm|____, atos, vtos, if_nullcmp          , equal        );
def(Bytecodes::_ifnonnull           , ubcp|____|clvm|____, atos, vtos, if_nullcmp          , not_equal    );
def(Bytecodes::_goto_w              , ubcp|____|clvm|____, vtos, vtos, goto_w              ,  _           );
def(Bytecodes::_jsr_w               , ubcp|____|____|____, vtos, vtos, jsr_w               ,  _           );

The assembly implementation of several typical instructions is introduced below.

1. goto instruction

The generation function for the goto bytecode instruction is TemplateTable::_goto(). The generated assembly code is as follows (the code was generated with the options -Xint -XX:-ProfileInterpreter, which suppress the generation of unneeded instructions):

// Load the Method* into %rcx
0x00007fffe1019df0: mov    -0x18(%rbp),%rcx
// Load the 2-byte branch offset that follows the goto opcode into %edx
0x00007fffe1019df4: movswl 0x1(%r13),%edx
// Reverse the byte order of the value in %edx
0x00007fffe1019df9: bswap  %edx
// Arithmetic right shift by 16 bits, leaving the signed 16-bit offset in %edx
0x00007fffe1019dfb: sar    $0x10,%edx
// Sign-extend the doubleword in %edx to the quadword in %rdx
0x00007fffe1019dfe: movslq %edx,%rdx
// Add the offset in %rdx to the current bytecode address to get the jump target address
0x00007fffe1019e01: add    %rdx,%r13
// %r13 now holds the jump target address. Load the first bytecode at the target into %ebx
0x00007fffe1019e04: movzbl 0x0(%r13),%ebx

// continue with the bytecode @ target
// eax: return bci for jsr's, unused otherwise
// ebx: target bytecode
// r13: target bcp
// Start executing the bytecode at the jump target. The constant address is the
// first address of the TemplateInterpreter::_active_table entries for the vtos stack top cache state
0x00007fffe1019e09: movabs $0x7ffff73ba4a0,%r10
0x00007fffe1019e13: jmpq   *(%r10,%rbx,8)

In fact, the goto instruction generates more assembly code than shown above. Because goto is a branch instruction, the interpreter also collects profiling statistics at branches to assist compiler optimization, and if the goto is inside a loop, on-stack replacement (OSR) may be involved as well. The remaining assembly logic of the goto instruction will be covered in detail later, when the corresponding techniques are introduced.
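
As a rough illustration of what the movswl/bswap/sar sequence above computes, the following C++ sketch decodes the signed 16-bit big-endian offset that follows the opcode and derives the branch target. The names read_branch_offset16 and goto_target are hypothetical helpers made up for this example, not HotSpot functions, and the bytecode is assumed to be available as a plain byte array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decode the signed 16-bit big-endian branch offset
// stored in the two bytes that follow the opcode at bcp.
inline std::int32_t read_branch_offset16(const std::uint8_t* bcp) {
    assert(bcp != nullptr);
    const std::uint16_t raw =
        static_cast<std::uint16_t>((bcp[1] << 8) | bcp[2]);
    // Reinterpreting as int16_t restores the sign, like sar $0x10 does.
    return static_cast<std::int16_t>(raw);
}

// Hypothetical helper: bytecode index of the branch target for the
// goto located at index bci of the array code.
inline std::size_t goto_target(const std::uint8_t* code, std::size_t bci) {
    const std::ptrdiff_t target =
        static_cast<std::ptrdiff_t>(bci) + read_branch_offset16(code + bci);
    assert(target >= 0);
    return static_cast<std::size_t>(target);
}
```

The offset is relative to the address of the goto opcode itself, which is why the assembly adds %rdx directly to %r13 without first skipping the operand bytes.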

2. ifeq, ifne and other instructions

The generation function for ifeq, ifne and the related instructions is TemplateTable::if_0cmp(). The ifeq bytecode instruction compares the value at the top of the stack with zero: the branch is taken if and only if the int value at the top of the stack is 0. The corresponding assembly code is as follows:

0x00007fffe10196c7: test   %eax,%eax
// If the stack top cache %eax is not 0, jump directly to not_taken
0x00007fffe10196c9: jne    0x00007fffe10196f6

// Assembly code generated by calling TemplateTable::branch(false,false) function

// Copy the Method* saved in the current stack frame to %rcx
0x00007fffe10196cf: mov    -0x18(%rbp),%rcx
// Load the 2 bytes starting 1 byte past the current bytecode position into %edx
0x00007fffe10196d3: movswl 0x1(%r13),%edx
// Reverse the byte order of the value in %edx
0x00007fffe10196d8: bswap  %edx
// Arithmetic right shift of %edx by 16 bits. Together, the two steps above produce the signed branch offset
0x00007fffe10196da: sar    $0x10,%edx
// Sign-extend the 4-byte value in %edx to 8 bytes in %rdx
0x00007fffe10196dd: movslq %edx,%rdx
// Add the offset in %rdx to the current bytecode address to get the jump target address
0x00007fffe10196e0: add    %rdx,%r13
// %r13 now holds the jump target address. Load the first bytecode at the target into %ebx
0x00007fffe10196e3: movzbl 0x0(%r13),%ebx

// Start executing the bytecode at the jump target. The constant address is the
// first address of the TemplateInterpreter::_active_table entries for the vtos stack top cache state
0x00007fffe10196e8: movabs $0x7ffff73ba4a0,%r10
0x00007fffe10196f2: jmpq   *(%r10,%rbx,8)

// -- not_taken --

The other conditional branch instructions are implemented with very similar logic. If you are interested, you can study them yourself.
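
To make the shared structure of this instruction family concrete, here is a minimal C++ sketch of the decision that if_0cmp encodes. The Condition values mirror the last argument of the template definitions, while zero_compare and next_bci_if0 are hypothetical names used only for this illustration. The sketch assumes the branch offset has already been decoded and that the instruction is 3 bytes long (opcode plus 2-byte offset).

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the condition argument used in the template definitions.
enum class Condition { equal, not_equal, less, greater_equal, greater, less_equal };

// Hypothetical helper: does the int at the top of the stack satisfy the
// condition when compared with zero?
inline bool zero_compare(Condition cc, std::int32_t value) {
    switch (cc) {
        case Condition::equal:         return value == 0;
        case Condition::not_equal:     return value != 0;
        case Condition::less:          return value < 0;
        case Condition::greater_equal: return value >= 0;
        case Condition::greater:       return value > 0;
        case Condition::less_equal:    return value <= 0;
    }
    assert(false && "unknown condition");
    return false;
}

// Hypothetical helper: next bytecode index after an if<cond> at bci.
// Taken: bci + offset. Not taken: fall through past the 3-byte instruction.
inline std::int64_t next_bci_if0(Condition cc, std::int32_t tos,
                                 std::int64_t bci, std::int16_t offset) {
    return zero_compare(cc, tos) ? bci + offset : bci + 3;
}
```

The generated assembly tests the inverse condition (jne for ifeq) so that the not-taken case can skip over the branch code, which is the same decision expressed the other way around.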

3. lookupswitch, tableswitch and other instructions

The lookupswitch instruction finds the paired branch in the jump table according to the key value and jumps. The specific format is shown in the following figure.

This is a variable-length instruction whose operands must be 4-byte aligned, so the lookupswitch opcode may be followed by 0 to 3 padding bytes. The alignment is computed relative to the start of the current method (the address of its first bytecode instruction). The padding is followed by a series of signed 32-bit values: the default jump offset default, the number of match-offset pairs npairs, and then npairs match-offset pairs. The value of npairs must be greater than or equal to 0. Each pair consists of an int value match and a signed 32-bit offset. Each of these signed 32-bit values is assembled from four bytes as follows:

(byte1<<24)|(byte2<<16)|(byte3<<8)|byte4
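
The layout described above can be sketched in C++ as follows. This is only an illustration under stated assumptions: read_be32, aligned_operand_start and lookupswitch_offset are hypothetical names, the bytecode is assumed to be a byte array indexed from the start of the method, and a linear scan is used for simplicity rather than as a description of how HotSpot searches the pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assemble a signed 32-bit value from four big-endian bytes:
// (byte1<<24)|(byte2<<16)|(byte3<<8)|byte4.
inline std::int32_t read_be32(const std::uint8_t* p) {
    const std::uint32_t v = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                            (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
    return static_cast<std::int32_t>(v);
}

// Index of the first operand: skip the opcode, then round up to a multiple
// of 4 measured from the start of the method (0 to 3 padding bytes).
inline std::size_t aligned_operand_start(std::size_t bci) {
    return (bci + 4) & ~std::size_t{3};
}

// Hypothetical helper: branch offset selected by key for the lookupswitch
// at index bci. Falls back to the default offset when no pair matches.
inline std::int32_t lookupswitch_offset(const std::uint8_t* code,
                                        std::size_t bci, std::int32_t key) {
    const std::uint8_t* p = code + aligned_operand_start(bci);
    const std::int32_t default_offset = read_be32(p);
    const std::int32_t npairs = read_be32(p + 4);
    assert(npairs >= 0);
    for (std::int32_t i = 0; i < npairs; ++i) {
        const std::uint8_t* pair = p + 8 + 8 * static_cast<std::size_t>(i);
        if (read_be32(pair) == key) {
            return read_be32(pair + 4);  // offset paired with the match value
        }
    }
    return default_offset;
}
```

The returned offset is relative to the address of the lookupswitch opcode, just like the 16-bit offsets of goto and the if instructions.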

The tableswitch instruction uses the key value as an index into a jump table and jumps to the corresponding branch. The specific format is shown in the following figure.

This is a variable-length instruction whose operands must be 4-byte aligned, so the tableswitch opcode may be followed by 0 to 3 padding bytes. The alignment is computed relative to the start of the current method (the address of its first bytecode instruction). The padding is followed by a series of signed 32-bit values: the default jump offset default, the low value low and the high value high, followed by high-low+1 signed 32-bit offsets. Each of these signed 32-bit values is assembled from four bytes as follows:

(byte1<<24)|(byte2<<16)|(byte3<<8)|byte4

The generating function is TemplateTable::tableswitch(). The generated assembly is as follows:

// Align %r13 to a 4-byte boundary and store the aligned address in %rbx
0x00007fffe1019fa7: lea    0x4(%r13),%rbx
0x00007fffe1019fab: and    $0xfffffffffffffffc,%rbx
// load lo & hi
0x00007fffe1019faf: mov    0x4(%rbx),%ecx
0x00007fffe1019fb2: mov    0x8(%rbx),%edx
0x00007fffe1019fb5: bswap  %ecx
0x00007fffe1019fb7: bswap  %edx

// check against lo & hi
// %ecx stores the low value
0x00007fffe1019fb9: cmp    %ecx,%eax
// If the key is lower than the low value, jump to default_case
0x00007fffe1019fbb: jl     0x00007fffe1019feb
// %edx stores the high value
0x00007fffe1019fc1: cmp    %edx,%eax
// If the key is higher than the high value, jump to default_case
0x00007fffe1019fc3: jg     0x00007fffe1019feb

// lookup dispatch offset
0x00007fffe1019fc9: sub    %ecx,%eax
// %rbx stores the aligned operand address, and %rax stores the key minus the low value (the jump table index)
0x00007fffe1019fcb: mov    0xc(%rbx,%rax,4),%edx
// -- continue_execution --
// continue execution
0x00007fffe1019fcf: bswap  %edx
0x00007fffe1019fd1: movslq %edx,%rdx
0x00007fffe1019fd4: movzbl 0x0(%r13,%rdx,1),%ebx
0x00007fffe1019fda: add    %rdx,%r13

0x00007fffe1019fdd: movabs $0x7ffff73ba4a0,%r10
0x00007fffe1019fe7: jmpq   *(%r10,%rbx,8)

// -- default_case --
// handle default
0x00007fffe1019feb: mov (%rbx),%edx 
// Jump to continue_execution
0x00007fffe1019fed: jmp 0x00007fffe1019fcf
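
The control flow of the assembly above can be mirrored in a short C++ sketch. The names load_be32 and tableswitch_offset are hypothetical and exist only for this illustration. The bytecode is assumed to be a byte array indexed from the start of the method, and the sketch models only the offset lookup, not the dispatch to the next template.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read a signed 32-bit big-endian value (what mov + bswap achieve above).
inline std::int32_t load_be32(const std::uint8_t* p) {
    const std::uint32_t v = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                            (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
    return static_cast<std::int32_t>(v);
}

// Hypothetical helper: branch offset selected by key for the tableswitch
// at index bci of the array code.
inline std::int32_t tableswitch_offset(const std::uint8_t* code,
                                       std::size_t bci, std::int32_t key) {
    // lea 0x4(%r13),%rbx ; and $-4,%rbx : aligned start of the operands
    const std::uint8_t* base = code + ((bci + 4) & ~std::size_t{3});
    const std::int32_t low  = load_be32(base + 4);
    const std::int32_t high = load_be32(base + 8);
    assert(low <= high);
    // check against lo & hi: out-of-range keys use the default offset
    if (key < low || key > high) {
        return load_be32(base);
    }
    // lookup dispatch offset: the table starts 12 bytes after base
    const std::size_t index =
        static_cast<std::size_t>(static_cast<std::int64_t>(key) - low);
    return load_be32(base + 12 + 4 * index);
}
```

Because the key indexes the table directly, the lookup takes constant time, whereas lookupswitch has to search its match-offset pairs.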

Keywords: Java jvm

Added by deveed on Thu, 25 Nov 2021 06:12:23 +0200
