Viewing the memory layout of an inheritance chain with gdb: demo 1

Experiment 1

Let me introduce two classes: a base class Person and a derived class Student that inherits from Person. Note that both declare a method who() with the same signature.

  • If who() is virtual, the call is dispatched dynamically: the vtable is consulted at run time and the method matching the object's actual type is called.
  • If who() is not virtual, the method matching the static type of the pointer is called: Student *m calls Student::who(), and Person *p calls Person::who(). Let's verify this with the assembly code.
#include <stdio.h>

class Person {
public:
    Person() {}
    void who() {
        printf("I am a human!!\n");
    }

};

class Student: public Person{
public:
    Student() {}
    void who() {
        printf("I am a student!!\n");
    }
};

int main() {
    Student *m = new Student();
    m->who();

    Person *p = m;
    p->who();
}

Compile with g++ -g 1.cpp, then load ./a.out in gdb. The disassembly of main is as follows:

(gdb) disas main
Dump of assembler code for function main():
   0x00000000000006ca <+0>:	push   %rbp
   0x00000000000006cb <+1>:	mov    %rsp,%rbp
   0x00000000000006ce <+4>:	push   %rbx
   0x00000000000006cf <+5>:	sub    $0x18,%rsp
   0x00000000000006d3 <+9>:	mov    $0x1,%edi
   0x00000000000006d8 <+14>:	callq  0x590 <_Znwm@plt>
   0x00000000000006dd <+19>:	mov    %rax,%rbx
   0x00000000000006e0 <+22>:	mov    %rbx,%rdi
   0x00000000000006e3 <+25>:	callq  0x740 <Student::Student()>
   0x00000000000006e8 <+30>:	mov    %rbx,-0x20(%rbp)
   0x00000000000006ec <+34>:	mov    -0x20(%rbp),%rax
   0x00000000000006f0 <+38>:	mov    %rax,%rdi
   0x00000000000006f3 <+41>:	callq  0x75c <Student::who()>
   0x00000000000006f8 <+46>:	mov    -0x20(%rbp),%rax
   0x00000000000006fc <+50>:	mov    %rax,-0x18(%rbp)
   0x0000000000000700 <+54>:	mov    -0x18(%rbp),%rax
   0x0000000000000704 <+58>:	mov    %rax,%rdi
   0x0000000000000707 <+61>:	callq  0x724 <Person::who()>
   0x000000000000070c <+66>:	mov    $0x0,%eax
   0x0000000000000711 <+71>:	add    $0x18,%rsp
   0x0000000000000715 <+75>:	pop    %rbx
   0x0000000000000716 <+76>:	pop    %rbp
   0x0000000000000717 <+77>:	retq   
End of assembler dump.

The call at 0x00000000000006f3 <+41>: callq 0x75c <Student::who()> has its target fixed at compile time: the address of Student::who() is encoded directly in the instruction, so nothing is decided at run time. The compiler knows the static type of each pointer and picks the who() method accordingly, which is why 0x0000000000000707 <+61>: callq 0x724 <Person::who()> is likewise a direct call, even though p points to a Student object.

Experiment 2

We can also call the base class version through a derived class pointer by qualifying the method name with the class name, as in m->Person::who(). The test code is as follows.

#include <stdio.h>

class Person {
public:
    Person() {}
    void who() {
        printf("I am a human!!\n");
    }

};

class Student: public Person{
public:
    Student() {}
    void who() {
        printf("I am a student!!\n");
    }
};
int main() {
    Student *m = new Student();
    m->Person::who();
}

Disassembly results

(gdb) disas main
Dump of assembler code for function main():
   0x00000000000006ca <+0>:	push   %rbp
   0x00000000000006cb <+1>:	mov    %rsp,%rbp
   0x00000000000006ce <+4>:	push   %rbx
   0x00000000000006cf <+5>:	sub    $0x18,%rsp
   0x00000000000006d3 <+9>:	mov    $0x1,%edi
   0x00000000000006d8 <+14>:	callq  0x590 <_Znwm@plt>
   0x00000000000006dd <+19>:	mov    %rax,%rbx
   0x00000000000006e0 <+22>:	mov    %rbx,%rdi
   0x00000000000006e3 <+25>:	callq  0x72c <Student::Student()>
   0x00000000000006e8 <+30>:	mov    %rbx,-0x18(%rbp)
   0x00000000000006ec <+34>:	mov    -0x18(%rbp),%rax
   0x00000000000006f0 <+38>:	mov    %rax,%rdi
   0x00000000000006f3 <+41>:	callq  0x710 <Person::who()>
   0x00000000000006f8 <+46>:	mov    $0x0,%eax
   0x00000000000006fd <+51>:	add    $0x18,%rsp
   0x0000000000000701 <+55>:	pop    %rbx
   0x0000000000000702 <+56>:	pop    %rbp
   0x0000000000000703 <+57>:	retq   
End of assembler dump.

The line 0x00000000000006f3 <+41>: callq 0x710 <Person::who()> shows that the who() method of the base class is the one actually called.

Experiment 3

To keep things simple, I'll change the code slightly: the who() methods are removed and each class gets one data member instead:

#include <stdio.h>
class Person {
public:
    int age = 6;
    Person() { }
};

class Student : public Person {
public:
    int idNo = 1000;
    Student() { }
};

int main()
{
    Student* m = new Student();
    Person* p = new Person();
    delete m;
    delete p;
}

We use the print command in gdb to output the address of the relevant variable in the code, as shown below.

(gdb) b main
Breakpoint 1 at 0x6f3: file 1.cpp, line 16.
(gdb) r
Starting program: /home/m/t/a.out 

Breakpoint 1, main () at 1.cpp:16
16	    Student* m = new Student();
(gdb) n
17	    Person* p = new Person();
(gdb) n
18	    delete m;
(gdb) p &m
$1 = (Student **) 0x7fffffffdfc0
(gdb) p &p
$2 = (Person **) 0x7fffffffdfc8
(gdb) p *m
$3 = {<Person> = {age = 6}, idNo = 1000}
(gdb) p *p
$4 = {age = 6}
(gdb) p m
$5 = (Student *) 0x555555767e70
(gdb) p p
$6 = (Person *) 0x555555767e90

The following layouts are visible

  • Person *p is stored at 0x7fffffffdfc8;
  • Student *m is stored at 0x7fffffffdfc0;
  • the Person data with age=6 is at 0x555555767e90;
  • the Student data with age=6 and idNo=1000 is at 0x555555767e70.

The Student object is allocated on the heap at the lower address, 0x555555767e70. It is pointed to by the pointer variable m, which itself lives at 0x7fffffffdfc0 in main's stack frame. The Person object is on the heap at 0x555555767e90 and is pointed to by p, stored at 0x7fffffffdfc8. As is well known, on x86-64 Linux the heap typically grows up and the stack grows down. How objects are arranged on the heap and in stack frames largely depends on the compiler, the allocator, and the operating system.

Some values can be optimized away entirely, or kept in registers without ever being stored on the stack. No optimization options were used in this example, however, so m and p are kept in stack slots in the usual unoptimized x86-64 way.

From the memory layout above, we can draw some conclusions:

  • The base class pointer p and the derived class pointer m each hold the address of the first byte of their object. Inside a Student object, the Person subobject comes first and the members added by Student follow at higher addresses. Knowing this simple layout will help us locate the virtual pointer later.
  • A derived class object is generally larger than a base class object, because it contains a complete base class subobject: every non-static data member of the base class is part of it, whatever its access specifier (access control only affects which names the derived class may use). In this example, the Student object contains its own Person::age, so sizeof(*m) is 8 while sizeof(*p) is 4.

Construction order in inheritance chain

This is the multiple inheritance RAII convention discussed in the previous article; looking at the disassembly gives us a better understanding of the process.

Take a look at the disassembly again:

(gdb) disas main
Dump of assembler code for function main():
   0x00005555555546ea <+0>:	push   %rbp
   0x00005555555546eb <+1>:	mov    %rsp,%rbp
   0x00005555555546ee <+4>:	push   %rbx
   0x00005555555546ef <+5>:	sub    $0x18,%rsp
   0x00005555555546f3 <+9>:	mov    $0x8,%edi
   0x00005555555546f8 <+14>:	callq  0x5555555545b0 <_Znwm@plt>
   0x00005555555546fd <+19>:	mov    %rax,%rbx
   0x0000555555554700 <+22>:	mov    %rbx,%rdi
   0x0000555555554703 <+25>:	callq  0x55555555476a <Student::Student()>
   0x0000555555554708 <+30>:	mov    %rbx,-0x20(%rbp)
   0x000055555555470c <+34>:	mov    $0x4,%edi
   0x0000555555554711 <+39>:	callq  0x5555555545b0 <_Znwm@plt>
   0x0000555555554716 <+44>:	mov    %rax,%rbx
   0x0000555555554719 <+47>:	mov    %rbx,%rdi
   0x000055555555471c <+50>:	callq  0x555555554754 <Person::Person()>
   0x0000555555554721 <+55>:	mov    %rbx,-0x18(%rbp)
=> 0x0000555555554725 <+59>:	mov    -0x20(%rbp),%rax
   0x0000555555554729 <+63>:	mov    $0x8,%esi
   0x000055555555472e <+68>:	mov    %rax,%rdi
   0x0000555555554731 <+71>:	callq  0x5555555545c0 <_ZdlPvm@plt>
   0x0000555555554736 <+76>:	mov    -0x18(%rbp),%rax
   0x000055555555473a <+80>:	mov    $0x4,%esi
   0x000055555555473f <+85>:	mov    %rax,%rdi
   0x0000555555554742 <+88>:	callq  0x5555555545c0 <_ZdlPvm@plt>
   0x0000555555554747 <+93>:	mov    $0x0,%eax
   0x000055555555474c <+98>:	add    $0x18,%rsp
   0x0000555555554750 <+102>:	pop    %rbx
   0x0000555555554751 <+103>:	pop    %rbp
   0x0000555555554752 <+104>:	retq   
End of assembler dump.

Among these instructions,

0x0000555555554703 <+25>:	callq  0x55555555476a <Student::Student()>

is the call to the Student constructor; the address 0x55555555476a after callq is the address of that function. Similarly,

 0x000055555555471c <+50>:	callq  0x555555554754 <Person::Person()>

is the call to the Person constructor, and 0x555555554754 is the address of that function.

Disassembly of the derived class constructor

(gdb) disas 0x55555555476a
Dump of assembler code for function Student::Student():
   0x000055555555476a <+0>:	push   %rbp
   0x000055555555476b <+1>:	mov    %rsp,%rbp
   0x000055555555476e <+4>:	sub    $0x10,%rsp
   0x0000555555554772 <+8>:	mov    %rdi,-0x8(%rbp)
   0x0000555555554776 <+12>:	mov    -0x8(%rbp),%rax
   0x000055555555477a <+16>:	mov    %rax,%rdi
   0x000055555555477d <+19>:	callq  0x555555554754 <Person::Person()>
   0x0000555555554782 <+24>:	mov    -0x8(%rbp),%rax
   0x0000555555554786 <+28>:	movl   $0x3e8,0x4(%rax)
   0x000055555555478d <+35>:	nop
   0x000055555555478e <+36>:	leaveq 
   0x000055555555478f <+37>:	retq   
End of assembler dump.

Disassembly of the parent class constructor

(gdb) disas 0x555555554754 
Dump of assembler code for function Person::Person():
   0x0000555555554754 <+0>:	push   %rbp
   0x0000555555554755 <+1>:	mov    %rsp,%rbp
   0x0000555555554758 <+4>:	mov    %rdi,-0x8(%rbp)
   0x000055555555475c <+8>:	mov    -0x8(%rbp),%rax
   0x0000555555554760 <+12>:	movl   $0x6,(%rax)
   0x0000555555554766 <+18>:	nop
   0x0000555555554767 <+19>:	pop    %rbp
   0x0000555555554768 <+20>:	retq   
End of assembler dump.

Initialization of derived classes

  • Inside the derived class constructor, the constructors of the base classes are called first, in the order in which the base classes are listed in the inheritance list. This example goes through the following steps.
    • (1) After main executes callq 0x55555555476a, Student::Student() sets up its stack frame (saving %rbp and reserving stack space) and stores the this pointer it received in %rdi at -0x8(%rbp).
    • (2) It then executes callq 0x555555554754, the Person constructor, passing the same this pointer in %rdi. Before returning, Person::Person() executes movl $0x6,(%rax). Here %rax holds the this pointer, so the instruction writes age = 6 to offset 0 of the object, which is the start of the heap block allocated in main. Nothing is passed back as a return value; the base class constructor initializes its part of the Student object in place.
  • Control then returns to the derived class constructor, which executes its remaining instructions: movl $0x3e8,0x4(%rax) writes idNo = 1000 at offset 4 of the same object.

Destruction runs in the opposite order: destructors are called in the reverse of the order in which the base classes are listed in the inheritance list.

  • First, the destructor of the derived class is executed; it runs implicitly when the object goes out of scope or is deleted.
  • The destructors of the base classes in the inheritance list are then executed in reverse order.
    In this example the destructors are trivial, so the disassembly of main shows no destructor calls: delete only calls operator delete (_ZdlPvm) with the object size, 8 or 4. As for the constructors, the assembly shows that each one writes its data members through the this pointer, which it keeps at -0x8(%rbp) and loads into %rax, before executing retq.

Summary

The construction order of the inheritance chain is important for virtual tables, because it determines which functions are visible at which stage of construction.

Added by blindeddie on Fri, 21 Jan 2022 12:50:57 +0200