In the last article, we analyzed why coredumps occur in production services and touched on gdb, the tool used to analyze them. Since then, readers have been asking whether I could write an article on debugging with gdb. This article shares some debugging experience from our work, in the hope that it helps.
Preface
For the first few years of my career I developed on Windows and debugged with Visual Studio, a superb tool in which every kind of breakpoint can be set with a mouse click. About 12 years ago I switched to Linux development, and since then my debugging has been based on GDB. I originally wanted to cover debugging on Windows as well, but I have not used it for many years and work has been busy, so this article only covers GDB debugging on Linux. My apologies to Windows developers 😃.
This article is fairly comprehensive. It summarizes my gdb debugging experience from recent years (all quite basic 😁) along with some frequently used debugging techniques, and I hope it is helpful to anyone doing Linux development.
Background
For a C/C++ developer, making sure the program runs correctly is the most basic goal, and debugging is the most basic means of achieving it. Being familiar with debugging techniques helps us locate problems faster and improves development efficiency.
During development, if the program's results do not match expectations, the first step is to open GDB, set breakpoints in the relevant places, and analyze the cause. When an online service has a problem, first check whether the process is still alive. If it is not, check whether a coredump file was generated. If so, debug the coredump file with GDB; otherwise, look through the kernel log with dmesg to find the cause.
Concept
GDB is a powerful command-line debugger for UNIX/Linux systems, released by the GNU project.
GDB supports breakpoints, single-step execution, printing variables, watching variables, and viewing registers and the stack. In Linux software development, GDB is the main tool for debugging C and C++ programs (it also supports Go and other languages).
Common commands
breakpoint
Breakpoints are the feature we use most often when debugging. Once a breakpoint is set at a given location, the program pauses when execution reaches it, and we can then inspect variables, the stack and other state to help us debug the program.
Commands to set breakpoints are divided into the following categories:
- breakpoint
- watchpoint
- catchpoint
breakpoint
Breakpoints can be generated according to line numbers, functions and conditions. The following are related commands and corresponding function descriptions:
command | effect |
---|---|
break [file:]function | Set a breakpoint at the entry of function (optionally in file) |
break [file:]line | Set a breakpoint at line (optionally in file) |
info breakpoints | View breakpoint list |
break [+-]offset | Set a breakpoint offset lines after (+) or before (-) the current position |
break *addr | Set a breakpoint at address addr |
break ... if expr | Set a conditional breakpoint that stops only when expr is true |
ignore n count | Ignore the next count hits of breakpoint number n |
clear | Delete breakpoints at the next instruction to be executed in the selected stack frame |
clear function | Delete breakpoints set at the entry of function |
delete | Delete all breakpoints |
delete n | Delete breakpoint number n |
enable n | Enable breakpoint number n |
disable n | Disable breakpoint number n |
save breakpoints file | Save breakpoint information to the specified file |
source file | Import the breakpoint information saved in file |
break | Set a breakpoint at the next instruction to be executed |
clear [file:]line | Delete the breakpoints on line |
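Conditional breakpoints and ignore counts are most useful on code that runs many times. The following is a minimal sketch of such a target; the names process_item and process_all are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical worker; stands in for any function that is called many times.
int process_item(int id) {
    assert(id >= 0);
    return id * 2;
}

// Calls process_item for ids 0 .. n-1 and returns the sum of the results.
// In gdb, `break process_item if id == 42` stops only on the one call of
// interest. With an unconditional breakpoint 1 on process_item instead,
// `ignore 1 42` skips the calls for ids 0..41 and stops on the call for id 42.
int process_all(int n) {
    int total = 0;
    for (int id = 0; id < n; ++id) {
        total += process_item(id);
    }
    return total;
}
```

Calling process_all(100) from main and running the program under gdb gives 100 hits on process_item to experiment with.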
watchpoint
A watchpoint is a special kind of breakpoint. Like an ordinary breakpoint, it asks GDB to pause program execution; the difference is that a watchpoint is not tied to a line of source code. Instead, it tells GDB to pause whenever the value of an expression changes.
Watchpoints can be implemented in hardware or in software. The former needs hardware support; the latter works by checking after every single step whether the value has changed, which is much slower. When GDB creates a watchpoint it tries the hardware implementation first and falls back to the software implementation if that fails.
command | effect |
---|---|
watch variable | Set variable data breakpoints |
watch var1 + var2 | Set expression data breakpoints |
rwatch variable | Set the read breakpoint. Only hardware implementation is supported |
awatch variable | Set the read / write breakpoint. Only hardware implementation is supported |
info watchpoints | View a list of data breakpoints |
set can-use-hw-watchpoints 0 | Mandatory software based implementation |
When using data breakpoints (watchpoints), note the following:
- When the watched variable is a local variable, the watchpoint is deleted as soon as the variable goes out of scope
- If you watch a pointer variable p, watch *p monitors changes to the memory that p points to, while watch p monitors whether the pointer itself changes
The most common use of data breakpoints is to find out when a member of a heap-allocated structure is modified. Because the pointers to such structures are usually local variables, the watchpoint is lost when they go out of scope. The commands below address this in two ways: watch the memory address directly, or use watch -l.
command | effect |
---|---|
print &variable | View the memory address of the variable |
watch *(type *)address | Set a watchpoint indirectly through a memory address |
watch -l variable | Watch the memory location that the expression refers to (-l is short for -location), independent of scope |
watch variable thread 1 | Stop only when thread 1 modifies the value of variable |
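The scope problem is easiest to see on a small target. The sketch below is a hypothetical example (Config, tune, bump_retries and run_demo are names invented for this illustration) in which a heap object is only reachable through local pointer variables.

```cpp
#include <cassert>
#include <memory>

// Hypothetical heap-allocated structure whose member we want to watch.
struct Config {
    int retries;
    int timeout_ms;
};

// `cfg` is a local variable. A plain `watch cfg->retries` set while stopped
// here is deleted when this frame is popped. `watch -l cfg->retries` watches
// the address the expression refers to right now, so it survives the return.
void tune(Config *cfg) {
    assert(cfg != nullptr);
    cfg->timeout_ms = 50;
}

// Some other code path modifies the member we care about.
void bump_retries(Config *cfg) {
    assert(cfg != nullptr);
    ++cfg->retries;  // a watchpoint on the member's address triggers here
}

// Allocates the structure, runs both functions, returns the final value.
int run_demo() {
    std::unique_ptr<Config> cfg(new Config{3, 20});
    tune(cfg.get());
    bump_retries(cfg.get());
    return cfg->retries;
}
```

A possible session, assuming run_demo is called from main: break tune, run, watch -l cfg->retries, continue. GDB should then stop inside bump_retries and report the old and new values.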
catchpoint
As the name suggests, a catchpoint "catches" an event: it stops the program when a particular kind of event occurs, for example when a C++ exception is thrown, when a shared library is loaded, or when the program calls fork or a given system call.
command | meaning |
---|---|
catch fork | Interrupt when program calls fork |
tcatch fork | Set a temporary catchpoint that triggers only once and is then deleted automatically |
catch syscall ptrace | Set a catchpoint on the ptrace system call |
Following the commands command with a breakpoint number lets you define operations to run automatically each time that breakpoint is triggered. This can be useful in more advanced, automated debugging scenarios.
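Catchpoints for C++ exceptions (catch throw) are handy when an exception is swallowed far from where it was raised. The following sketch shows that situation; parse_port and parse_port_or_default are hypothetical helpers written only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical parser: throws if the text is not a valid port number.
int parse_port(const std::string &text) {
    int port = std::stoi(text);  // throws std::invalid_argument on non-numeric text
    if (port < 1 || port > 65535) {
        throw std::out_of_range("port out of range");  // `catch throw` stops here
    }
    return port;
}

// Swallows every exception and returns a fallback, so without a catchpoint
// the place where the failure originated is easy to miss.
int parse_port_or_default(const std::string &text, int fallback) {
    assert(fallback >= 1 && fallback <= 65535);
    try {
        return parse_port(text);
    } catch (const std::exception &) {
        return fallback;
    }
}
```

With catch throw set before run, gdb stops at the point where the exception is raised, and backtrace then shows which call produced it, even though the program itself never reports an error.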
Command line
command | effect |
---|---|
run arglist | Run the program with arglist as the argument list |
set args arglist | Specify the command-line arguments used at startup |
set args | Specify an empty argument list |
show args | Show the command-line arguments |
Program stack
command | effect |
---|---|
backtrace [n] | Print the stack frames (only the innermost n frames if n is given) |
frame [n] | Select stack frame n. If n is omitted, print the currently selected stack frame |
up n | Select the stack frame n levels up (current frame number + n) |
down n | Select the stack frame n levels down (current frame number - n) |
info frame [addr] | Describes the currently selected stack frame |
info args | Parameter list of current stack frame |
info locals | Local variable of current stack frame |
Multi process, multi thread
Multi process
By default, GDB only follows the parent process when debugging multi-process programs (programs that call fork). You can use the settings below to follow only the parent, only the child, or to debug both at the same time.
command | effect |
---|---|
info inferiors | View process list |
attach pid | Attach to the process with the given id |
inferior num | Switch to the specified process for debugging |
print $_exitcode | Display the exit code of the program |
set follow-fork-mode child | Follow the child process after fork |
set follow-fork-mode parent | Follow the parent process after fork (default) |
set detach-on-fork on | After fork, debug only one process and detach from the other (default) |
set detach-on-fork off | After fork, keep both parent and child under GDB control |
When debugging multi-process programs, by default only the current process runs when you resume execution; the other processes stay suspended. If you want the other processes to run normally while you debug the current one, use set schedule-multiple on.
Multithreading
Multithreaded development is very common in daily development work, so it is necessary to master multithreaded debugging skills.
By default, when debugging multiple threads, all threads will be suspended once the program is interrupted. If you continue to execute the current thread at this time, other threads will also execute at the same time.
command | effect |
---|---|
info threads | View thread list |
print $_thread | Display the number of the thread currently being debugged |
set scheduler-locking on | While debugging one thread, all other threads stay paused |
set scheduler-locking off | While debugging one thread, other threads run freely (default behavior of all-stop mode) |
set scheduler-locking step | Other threads stay paused while single-stepping with step; they can run again with commands such as continue, until or finish |
If you only care about the current thread, it is recommended to temporarily set scheduler-locking to on, so that other threads do not run at the same time, hit other breakpoints and distract you.
Printout
Usually, during debugging, we need to check the value of a variable to analyze whether it meets the expectations. At this time, we need to print out the variable value.
command | effect |
---|---|
whatis variable | View the type of variable |
ptype variable | View detailed type information of variables |
info variables var | View the file that defines this variable. Local variables are not supported |
Print string
Use the x/s command to print an ASCII string. For a wide-character string, first check the size of the wide character type, for example with print sizeof(wchar_t).
If the size is 2, print with x/hs; if the size is 4, print with x/ws.
command | effect |
---|---|
x/s str | Print string |
set print elements 0 | Remove the limit on how many string characters or array elements are printed |
call printf("%s\n",xxx) | Print the string through the program's own printf, so the output contains no extra escape characters |
printf "%s\n",xxx | Same as above, using GDB's built-in printf command |
Print array
command | effect |
---|---|
print *array@10 | Prints the values of 10 consecutive elements from the beginning of the array |
print array[60]@10 | Print the 10 elements of the array subscript starting from 60, i.e. the 60th to 69th elements |
set print array-indexes on | When printing array elements, the subscripts of the array are also printed |
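The @ operator matters most when all you have is a pointer, because print on a pointer shows only an address. The following is a small hypothetical target (sum_window and demo_sum are illustrative names) for trying it out.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sums `count` elements starting at data[first].
// Stopped inside this function, `print data` shows only a pointer value,
// while `print data[first]@count` shows the elements the loop reads.
long sum_window(const int *data, std::size_t first, std::size_t count) {
    assert(data != nullptr);
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[first + i];
    }
    return total;
}

// Builds the values 0, 1, ..., 99 and sums the elements with subscripts 60..69.
long demo_sum() {
    std::vector<int> values(100);
    std::iota(values.begin(), values.end(), 0);
    return sum_window(values.data(), 60, 10);
}
```

With a breakpoint on sum_window, print data[60]@10 and print *data@10 correspond to the two table rows above.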
Print pointer
command | effect |
---|---|
print ptr | View the type and address of the pointer |
print *(struct xxx *)ptr | View the contents of the structure pointed to |
Prints the value of the specified memory address
Use the x command to print the memory value in the format of x/nfu addr, and print the memory value of n length units starting from addr in the format of f.
- n: Number of output units
- f: Output format: for example, x indicates hexadecimal output, o indicates octal output, and the default is x
- u: The length of a unit, b represents 1 byte, h represents 2 bytes (half word), w represents 4 bytes, and g represents 8 bytes (giant word)
command | effect |
---|---|
x/8xb array | Print the first 8 bytes of array in hexadecimal |
x/8xw array | Print the first 8 words (4 bytes each, 32 bytes in total) of array in hexadecimal |
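To make the roles of n and u concrete, here is a rough sketch of what x/nxu computes, written as an ordinary function. hex_units is a hypothetical helper, not part of gdb, and gdb's real output additionally prefixes each row with the address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Formats `n` units of `unit` bytes each, starting at `addr`, as hexadecimal.
// unit = 1 corresponds to x/<n>xb, 2 to x/<n>xh, 4 to x/<n>xw, 8 to x/<n>xg.
std::string hex_units(const void *addr, std::size_t n, std::size_t unit) {
    assert(unit == 1 || unit == 2 || unit == 4 || unit == 8);
    const unsigned char *p = static_cast<const unsigned char *>(addr);
    std::string out;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t value = 0;
        // Read one unit with the machine's native byte order, as gdb does.
        switch (unit) {
            case 1: { std::uint8_t v;  std::memcpy(&v, p + i * unit, unit); value = v; break; }
            case 2: { std::uint16_t v; std::memcpy(&v, p + i * unit, unit); value = v; break; }
            case 4: { std::uint32_t v; std::memcpy(&v, p + i * unit, unit); value = v; break; }
            default: { std::uint64_t v; std::memcpy(&v, p + i * unit, unit); value = v; break; }
        }
        char buf[32];
        std::snprintf(buf, sizeof buf, "0x%0*llx", static_cast<int>(unit * 2),
                      static_cast<unsigned long long>(value));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

The count n is always a number of units, not bytes: eight units of size w cover 32 bytes, while eight units of size b cover only 8.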
Print local variables
command | effect |
---|---|
info locals | Prints the value of the local variable of the current function |
backtrace full | Print the local variables of every frame in the call stack. backtrace can be abbreviated as bt |
bt full n | Display the innermost n stack frames and their local variables |
bt full -n | Display the outermost n stack frames and their local variables |
Print structure
command | effect |
---|---|
set print pretty on | Display each member of a structure on its own line |
set print null-stop | Stop printing a character array at the first null character ('\000') |
Function jump
command | effect |
---|---|
set step-mode on | Make step stop at the first instruction of functions that have no debugging information instead of stepping over them, so their assembly code can be examined |
finish | Run until the current function returns, print the return value, then stop |
return 0 | Return from the current function immediately without executing the rest of it. A return value can be specified |
call printf("%s\n", str) | Call the printf function to print a string (either call or print can be used to call a function) |
print func() | Call the function func (either call or print can be used) |
set var variable=xxx | Set the value of variable to xxx |
set {type}address = xxx | Store xxx, as a value of type type, at memory address address |
info frame | Display information about the selected stack frame (frame address, saved instruction pointer, etc.) |
Other
Graphical interface
TUI is short for text user interface. You can start gdb in TUI mode with the -tui option, or toggle the interface during debugging by pressing Ctrl+x followed by a.
command | meaning |
---|---|
layout src | Display source code window |
layout asm | Show assembly window |
layout split | Display source code + assembly window |
layout regs | Display register + source code or assembly window |
winheight src +5 | Source window height increased by 5 lines |
winheight asm -5 | Reduce the assembly window height by 5 lines |
winheight cmd +5 | Increase console window height by 5 lines |
winheight regs -5 | Reduce the height of the register window by 5 lines |
assembly
command | meaning |
---|---|
disassemble function | View the assembly code of the function |
disassemble /mr function | Show the assembly code of the function interleaved with its source lines (/m) and with the raw instruction bytes (/r) |
Debug and save core files
command | meaning |
---|---|
file exec_file | Load the symbol table information of the executable file |
core core_file | Load core dump file |
gcore core_file | Generate a core dump file to record the status of the current process |
Start mode
gdb debugging can be started in the following ways:
- gdb filename: debug an executable
- gdb -p pid: debug a running process by attaching to its process ID (equivalent to running attach pid inside gdb)
- gdb filename -c coredump_file: debug a coredump file together with the executable that produced it
The following sections walk through each of these with examples, so that the techniques are easier to master.
Debugging
Executable file
Single thread
First, let's look at a piece of code:
#include<stdio.h>

void print(int xx, int *xxptr) {
  printf("In print():\n");
  printf(" xx is %d and is stored at %p.\n", xx, (void *)&xx);
  printf(" xxptr points to %p which holds %d.\n", (void *)xxptr, *xxptr);
}

int main(void) {
  int x = 10;
  int *ptr = &x;
  printf("In main():\n");
  printf(" x is %d and is stored at %p.\n", x, (void *)&x);
  printf(" ptr points to %p which holds %d.\n", (void *)ptr, *ptr);
  print(x, ptr);
  return 0;
}
This code is relatively simple. Let's start debugging:
gdb ./test_main
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/test_main...done.
(gdb) r
Starting program: /root/./test_main
In main():
 x is 10 and is stored at 0x7fffffffe424.
 ptr points to 0x7fffffffe424 which holds 10.
In print():
 xx is 10 and is stored at 0x7fffffffe40c.
 xxptr points to 0x7fffffffe424 which holds 10.
[Inferior 1 (process 31518) exited normally]
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64
In the session above, we start debugging with gdb ./test_main and then run the program to completion with r (short for run). In other words, this is the complete process of running an executable under gdb using only the r command. Next, we use the same program to introduce several common commands.
breakpoint
(gdb) b 15
Breakpoint 1 at 0x400601: file test_main.cc, line 15.
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000400601 in main() at test_main.cc:15
(gdb) r
Starting program: /root/./test_main
In main():
 x is 10 and is stored at 0x7fffffffe424.
 ptr points to 0x7fffffffe424 which holds 10.
Breakpoint 1, main () at test_main.cc:15
15 print(x, ptr);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64
(gdb)
backtrace
(gdb) backtrace
#0 main () at test_main.cc:15
(gdb)
The backtrace command lists all frames in the current stack. In the above example, there is only one frame on the stack, numbered 0, belonging to the main function.
(gdb) step
print (xx=10, xxptr=0x7fffffffe424) at test_main.cc:4
4 printf("In print():\n");
(gdb)
Here we executed the step command, which enters the called function. Next, we view the stack frames again with the backtrace command.
(gdb) backtrace
#0 print (xx=10, xxptr=0x7fffffffe424) at test_main.cc:4
#1 0x0000000000400612 in main () at test_main.cc:15
(gdb)
From the above output, we can see that there are two stack frames. Frame 1 belongs to the main function and frame 0 belongs to the print function.
Each stack frame lists the arguments of its function. We can see that main has no arguments, while print has arguments and their values are displayed.
One thing that may be confusing is that main was frame 0 in the first backtrace, but in the second backtrace main is frame 1 and print is frame 0. This is consistent with the stack growing downward. We only need to remember that the frame with the smallest number belongs to the most recently called function.
frame
Stack frames are used to store information such as variable values of functions. By default, GDB is always located in the context of the stack frame corresponding to the currently executing function.
In the previous example, GDB is in the context of frame 0 because it is currently executing in the print() function. You can obtain the frame of the currently executing context through the frame command.
(gdb) frame
#0 print (xx=10, xxptr=0x7fffffffe424) at test_main.cc:4
4 printf("In print():\n");
(gdb)
Next, we print the variables of the current stack frame with the print command:
(gdb) print xx
$1 = 10
(gdb) print xxptr
$2 = (int *) 0x7fffffffe424
(gdb)
What if we want to see the contents of other stack frames, for example x and ptr in the main function? Printing them directly gives the following:
(gdb) print x
No symbol "x" in current context.
(gdb) print ptr
No symbol "ptr" in current context.
(gdb)
Here we can use frame num to switch stack frames:
(gdb) frame 1
#1 0x0000000000400612 in main () at test_main.cc:15
15 print(x, ptr);
(gdb) print x
$3 = 10
(gdb) print ptr
$4 = (int *) 0x7fffffffe424
(gdb)
Multithreading
To facilitate the demonstration, we create a simple example with the following code:
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

int fun_int(int n)
{
  std::this_thread::sleep_for(std::chrono::seconds(10));
  std::cout << "in fun_int n = " << n << std::endl;
  return 0;
}

int fun_string(const std::string &s)
{
  std::this_thread::sleep_for(std::chrono::seconds(10));
  std::cout << "in fun_string s = " << s << std::endl;
  return 0;
}

int main()
{
  std::vector<int> v;
  v.emplace_back(1);
  v.emplace_back(2);
  v.emplace_back(3);
  std::cout << v.size() << std::endl;

  std::thread t1(fun_int, 1);
  std::thread t2(fun_string, "test");

  std::cout << "after thread create" << std::endl;
  t1.join();
  t2.join();
  return 0;
}
The above code is relatively simple:
- The function fun_int sleeps for 10s and then prints its parameter
- The function fun_string sleeps for 10s and then prints its parameter
- The main function creates two threads that execute these two functions respectively
The following is a complete debugging process:
(gdb) b 27
Breakpoint 1 at 0x4013d5: file test.cc, line 27.
(gdb) b test.cc:32
Breakpoint 2 at 0x40142d: file test.cc, line 32.
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x00000000004013d5 in main() at test.cc:27
2 breakpoint keep y 0x000000000040142d in main() at test.cc:32
(gdb) r
Starting program: /root/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, main () at test.cc:27
(gdb) c
Continuing.
3
[New Thread 0x7ffff6fd2700 (LWP 44996)]
in fun_int n = 1
[New Thread 0x7ffff67d1700 (LWP 44997)]
Breakpoint 2, main () at test.cc:32
32 std::cout << "after thread create" << std::endl;
(gdb) info threads
Id Target Id Frame
3 Thread 0x7ffff67d1700 (LWP 44997) "test" 0x00007ffff7051fc3 in new_heap () from /lib64/libc.so.6
2 Thread 0x7ffff6fd2700 (LWP 44996) "test" 0x00007ffff7097e2d in nanosleep () from /lib64/libc.so.6
* 1 Thread 0x7ffff7fe7740 (LWP 44987) "test" main () at test.cc:32
(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffff6fd2700 (LWP 44996))]
#0 0x00007ffff7097e2d in nanosleep () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7097e2d in nanosleep () from /lib64/libc.so.6
#1 0x00007ffff7097cc4 in sleep () from /lib64/libc.so.6
#2 0x00007ffff796ceb9 in std::this_thread::__sleep_for(std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /lib64/libstdc++.so.6
#3 0x00000000004018cc in std::this_thread::sleep_for<long, std::ratio<1l, 1l> > (__rtime=...) at /usr/include/c++/4.8.2/thread:281
#4 0x0000000000401307 in fun_int (n=1) at test.cc:9
#5 0x0000000000404696 in std::_Bind_simple<int (*(int))(int)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0x609080) at /usr/include/c++/4.8.2/functional:1732
#6 0x000000000040443d in std::_Bind_simple<int (*(int))(int)>::operator()() (this=0x609080) at /usr/include/c++/4.8.2/functional:1720
#7 0x000000000040436e in std::thread::_Impl<std::_Bind_simple<int (*(int))(int)> >::_M_run() (this=0x609068) at /usr/include/c++/4.8.2/thread:115
#8 0x00007ffff796d070 in ?? () from /lib64/libstdc++.so.6
#9 0x00007ffff7bc6dd5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff70d0ead in clone () from /lib64/libc.so.6
(gdb) c
Continuing.
after thread create
in fun_int n = 1
[Thread 0x7ffff6fd2700 (LWP 45234) exited]
in fun_string s = test
[Thread 0x7ffff67d1700 (LWP 45235) exited]
[Inferior 1 (process 45230) exited normally]
(gdb) q
In the debugging session above:
- b 27: add a breakpoint at line 27
- b test.cc:32: add a breakpoint at line 32 (the effect is the same as b 32)
- info b: list all breakpoints
- r: start the program, which pauses at the first breakpoint
- c: continue; the program pauses at the second breakpoint, and the two threads t1 and t2 are created between the first and second breakpoints
- info threads: list all threads. The output shows three threads in total: the main thread, t1 and t2
- thread 2: switch to thread 2
- bt: print the stack of thread 2
- c: continue until the program ends
- q: quit gdb
Multi process
As above, we still use an example to simulate multi process debugging. The code is as follows:
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
  pid_t pid = fork();
  if (pid == -1) {
    perror("fork error");
    return -1;
  }
  if (pid == 0) {  // Child process
    int num = 10;
    // Loop until a debugger changes num, e.g. (gdb) set num = 1
    while (num == 10) {
      sleep(10);
    }
    printf("this is child,pid = %d\n", getpid());
  } else {  // Parent process
    printf("this is parent,pid = %d\n", getpid());
    wait(NULL);  // Wait for the child process to exit
  }
  return 0;
}
In the above code, there are two processes, one is the parent process (that is, the main process), and the other is the child process created by the fork() function.
By default, in multi process programs, GDB only debugs the main process, that is, no matter how many times the program calls the fork() function and how many child processes are created, GDB only debugs the parent process by default. In order to support multi process debugging, GDB version 7.0 supports separate debugging (debugging parent process or child process) and simultaneous debugging of multiple processes.
So how do we debug child processes? There are several ways to do it.
attach
First, both parent and child processes can be debugged by attaching gdb to them with the attach command. The operating system assigns a unique ID to every running program, the process ID. Once we know the process ID, we can debug the process with attach.
In the above code, the child process created by fork() first sleeps in a while loop and only calls printf after the loop ends. This serves two purposes:
- It gives us time to find the process ID and attach to it
- When gdb attaches, the real code (here, the printf call) has not run yet, so the child process can be debugged from the start
You may wonder how the printf below the loop can ever run, since the loop condition never changes on its own. This is where gdb shows its strength: you can modify the value of num from gdb so that the program leaves the while loop.
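The wait-in-a-loop trick can be packaged as a small helper. The version below is a sketch under my own assumptions, not the article's code: the names g_debugger_attached and wait_for_debugger are hypothetical, the flag is volatile so that an optimizing compiler re-reads it on every iteration, and a timeout is added so the process does not hang forever if nobody attaches.

```cpp
#include <cassert>
#include <unistd.h>

// Flag that a debugger flips after attaching, for example with
//   (gdb) set var g_debugger_attached = 1
// `volatile` keeps the compiler from caching the value in a register.
volatile int g_debugger_attached = 0;

// Sleeps in one-second steps until the flag is set or `max_seconds` elapse.
// Returns true if the flag was set.
bool wait_for_debugger(int max_seconds) {
    assert(max_seconds >= 0);
    for (int waited = 0; waited < max_seconds && !g_debugger_attached; ++waited) {
        sleep(1);
    }
    return g_debugger_attached != 0;
}
```

Calling wait_for_debugger(60) at the start of the child branch gives a one-minute window in which to look up the pid and attach.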
Compile and generate the executable file test_process with the following command:
g++ -g test_process.cc -o test_process
Now let's try to start debugging.
gdb -q ./test_process
Reading symbols from /root/test_process...done.
(gdb)
Note the -q option (short for -quiet), which suppresses gdb's startup banner.
(gdb) r
Starting program: /root/./test_process
Detaching after fork from child process 37482.
this is parent,pid = 37478
[Inferior 1 (process 37478) exited normally]
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libstdc++-4.8.5-36.el7.x86_64
(gdb) attach 37482
// symbol-loading output omitted here
(gdb) n
Single stepping until exit from function __nanosleep_nocancel, which has no line number information.
0x00007ffff72b3cc4 in sleep () from /lib64/libc.so.6
(gdb)
Single stepping until exit from function sleep, which has no line number information.
main () at test_process.cc:8
8 while(num==10){
(gdb)
In the session above, we execute n (short for next) until execution returns to the condition check of the while loop.
(gdb) set num = 1
(gdb) n
12 printf("this is child,pid = %d\n",getpid());
(gdb) c
Continuing.
this is child,pid = 37482
[Inferior 1 (process 37482) exited normally]
(gdb)
To exit the while loop, we use the set command to change num to 1 so that the loop condition no longer holds; the program leaves the loop and calls printf(). Finally, we run c (short for continue) and the program runs to completion.
If a running program appears to be deadlocked, you can find its process ID with ps, attach to it with gdb attach pid, and then examine the stacks.
Specify process
By default, when GDB debugs a multi-process program, only the parent process is debugged. GDB provides two settings, follow-fork-mode and detach-on-fork, to control whether the parent or the child process is debugged.
follow-fork-mode
This command can be used as follows:
(gdb) set follow-fork-mode mode
mode has the following two options:
- parent: follow the parent process. This is the default
- child: follow the child process. It tells gdb to debug the child instead of the parent after the target program calls fork. The choice exists because a successful fork() call returns twice, once in the parent process and once in the child process
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) set follow-fork-mode child
(gdb) r
Starting program: /root/./test_process
[New process 37830]
this is parent,pid = 37826
^C
Program received signal SIGINT, Interrupt.
[Switching to process 37830]
0x00007ffff72b3e10 in __nanosleep_nocancel () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libstdc++-4.8.5-36.el7.x86_64
(gdb) n
Single stepping until exit from function __nanosleep_nocancel, which has no line number information.
0x00007ffff72b3cc4 in sleep () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function sleep, which has no line number information.
main () at test_process.cc:8
8 while(num==10){
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "child".
(gdb)
In the session above, we did the following:
- show follow-fork-mode: view the current mode. The output shows that it is parent
- set follow-fork-mode child: tell gdb to debug the child process after fork
- r: run the program. GDB follows the child process, which enters the while loop
- Ctrl+C: sends SIGINT, which GDB intercepts, pausing the program inside the while loop
- n (next): keep stepping until execution returns to the condition check of the while loop
- show follow-fork-mode: run the command again. The output shows that the mode is now child
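The two return paths of fork() that follow-fork-mode chooses between can be isolated in a few lines. This is a minimal sketch; fork_and_wait is a hypothetical helper written for this illustration, not code from the example above.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks once. The child exits immediately with `child_status`; the parent
// waits for it and returns the exit status it observed, or -1 on failure.
int fork_and_wait(int child_status) {
    assert(child_status >= 0 && child_status <= 255);
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        // Child side of the fork: with `set follow-fork-mode child`,
        // gdb continues debugging here.
        _exit(child_status);
    }
    // Parent side of the fork: with the default mode (parent),
    // gdb continues debugging here and the child runs undisturbed.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Setting a breakpoint on the _exit line and running once with each mode shows the difference: the breakpoint should be hit only when the mode is child.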
detach-on-fork
If you decide at the outset whether to debug the child process or the parent process, follow-fork-mode is all you need. But what if you want to switch back and forth between the parent and the child during debugging?
GDB provides another command:
(gdb) set detach-on-fork mode
mode has the following two values:
- on: the default. Only one process is debugged, either the child or the parent, and GDB detaches from the other
- off: GDB keeps every forked process under its control, so all of them can be debugged
With detach-on-fork turned off, GDB retains control of all forked processes. In GDB 7.0 and later each process appears as an inferior: use info inferiors to list the processes GDB can debug, and inferior num to switch from one to another (older releases used info forks and fork fork-id for the same purpose).
- info inferiors: print the list of all processes under GDB control. The list includes the inferior number, the process id, and the executable
- inferior num: switch to the given process. num is the inferior number assigned by GDB, which can be obtained from info inferiors
coredump
What we fear most when developing or using a program is an inexplicable crash. To make the cause of a crash analyzable, the operating system dumps the process's memory image (including the stack at the time of the crash) to a file when the program crashes (by default, this file is called core.pid, where pid is the process id). This dump operation is called a coredump (core dump). We can then load this file into a debugger to reconstruct the scene at the moment the program crashed.
Before we analyze how to debug the coredump file with gdb, we need to generate a coredump. For simplicity, we use the following example to generate it:
#include <stdio.h>

void print(int *v, int size) {
  for (int i = 0; i < size; ++i) {
    printf("elem[%d] = %d\n", i, v[i]);
  }
}

int main() {
  int v[] = {0, 1, 2, 3, 4};
  print(v, 1000);
  return 0;
}
Compile and run the program:
g++ -g test_core.cc -o test_core
./test_core
The output is as follows:
elem[775] = 1702113070 elem[776] = 1667200115 elem[777] = 6648431 elem[778] = 0 elem[779] = 0 Segment error(spit out the pips)
As expected, the program crashes, but no coredump file is generated. This is because coredump generation is turned off by default (the core file size limit is 0), so the corresponding option must be set to turn it on.
For a coredump produced by a multithreaded program, the stack information alone is sometimes not enough to determine the cause, and other methods are needed.
In 2018 we had an online failure. Everything was normal in the test environment, but the service would coredump in production. Debugging the coredump with gdb only located the crash inside libcurl, not the reason for it. After about two days we found that the crash occurred only when a request timed out: because the test machines were poorly provisioned, the timeout there was set to 20ms, while the production timeout was 5ms. Once the trigger was known, we narrowed the scope of the code step by step and finally determined that the crash was caused by a bug in libcurl. So when locating production problems, it is often necessary to choose a method that fits the actual situation.
configuration
Coredump generation can be configured temporarily (the setting is lost after exiting the terminal) or permanently.
temporary
Through ulimit -a, you can determine whether coredump generation is currently configured:
ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
From the above output, we can see that the number after core file size is 0, that is, no core dump file will be generated. The limit can be changed with the following command:
ulimit -c size
Where size is the size of the coredump that is allowed to be generated. This is generally set as large as possible to prevent incomplete coredump information. The author generally sets it as unlimited.
ulimit -c unlimited
Note that with this temporary setting the core file is written, by default, to the directory from which the program is run. The path can be changed through configuration.
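The same limit can also be inspected and raised from inside a program. The following is a minimal sketch using the POSIX getrlimit/setrlimit calls with RLIMIT_CORE; the function names raise_core_limit and core_dumps_enabled are hypothetical and not part of the original article. An unprivileged process can raise its soft limit only up to the hard limit, so this matches ulimit -c unlimited only when the hard limit is itself unlimited.

```cpp
#include <sys/resource.h>

// Raises the soft RLIMIT_CORE limit of the calling process to its hard limit.
// Returns true on success.
bool raise_core_limit() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    lim.rlim_cur = lim.rlim_max;  // the soft limit may not exceed the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}

// Returns true if the current soft limit allows a core file of non-zero size.
bool core_dumps_enabled() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    return lim.rlim_cur != 0;
}
```

A service could call raise_core_limit() early in main so that a crash produces a core file even when it was started from a shell whose limit was 0, provided the hard limit permits it.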
permanent
The above settings only enable the core dump function. By default, the core file generated by the kernel is placed in the current working directory of the crashing process, and the file name is fixed as core. Obviously, if multiple programs generate core files, or the same program crashes multiple times, the same core file will be overwritten repeatedly.
By modifying kernel parameters, you can specify the name and location of the coredump file generated by the kernel. The following commands configure coredump generation permanently, together with the storage path and the file name pattern. Note that a value written to /proc/sys/kernel/core_pattern is lost on reboot; to keep it, also set kernel.core_pattern in /etc/sysctl.conf.
mkdir -p /www/coredump/
chmod 777 /www/coredump/

# /etc/profile
ulimit -c unlimited

# /etc/security/limits.conf
* soft core unlimited

echo "/www/coredump/core-%e-%p-%h-%t" > /proc/sys/kernel/core_pattern
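To make the pattern above concrete: in core_pattern, %e is replaced by the executable name, %p by the PID of the dumped process, %h by the host name, and %t by the time of the dump in seconds since the epoch. The sketch below imitates that substitution in user space for illustration only. The function expand_core_pattern and the sample values are hypothetical, and the kernel's own implementation supports more specifiers than the ones handled here.

```cpp
#include <cstddef>
#include <string>

// Expands the specifiers used in the pattern above:
//   %e executable name, %p PID, %h host name,
//   %t dump time in seconds since the epoch, %% a literal '%'.
// Any other specifier is copied through unchanged in this sketch.
std::string expand_core_pattern(const std::string& pattern, const std::string& exe,
                                long pid, const std::string& host, long epoch_seconds) {
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] != '%' || i + 1 == pattern.size()) {
            out += pattern[i];  // ordinary character (or a lone trailing '%')
            continue;
        }
        char spec = pattern[++i];
        switch (spec) {
            case 'e': out += exe; break;
            case 'p': out += std::to_string(pid); break;
            case 'h': out += host; break;
            case 't': out += std::to_string(epoch_seconds); break;
            case '%': out += '%'; break;
            default:  out += '%'; out += spec; break;
        }
    }
    return out;
}
```

For example, with the executable test_core, PID 38924, a host named host1 and timestamp 1640765384, the pattern expands to /www/coredump/core-test_core-38924-host1-1640765384.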
debugging
Now we run the program again, and this time the coredump file is generated as expected:
./test_core
elem[955] = 1702113070
elem[956] = 1667200115
elem[957] = 6648431
elem[958] = 0
elem[959] = 0
Segmentation fault (core dumped)
Then use the following command for coredump debugging:
gdb ./test_core -c /www/coredump/core_test_core_1640765384_38924 -q
The output is as follows:
#0  0x0000000000400569 in print (v=0x7fff3293c100, size=1000) at test_core.cc:5
5           printf("elem[%d] = %d\n", i, v[i]);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.x86_64 libgcc-4.8.5-36.el7.x86_64 libstdc++-4.8.5-36.el7.x86_64
(gdb)
The program crashed at line 5. At this point we can view the stack backtrace with the where command.
Enter where in gdb to get the call stack. This is the most basic and most useful command when debugging a coredump. Its output contains the function names and the corresponding argument values.
From the where output we can see that the program crashed at line 5, so the cause can usually be located by reading the source code there. Here print is called with size=1000 on an array of only 5 elements, so the loop reads far beyond the end of v.
Note that in a multithreaded program the crash is not necessarily in the current thread. This requires some understanding of the code to judge which parts are safe; then switch threads with thread num and inspect each stack with bt or where to locate the cause of the coredump.
principle
In the previous sections, we covered the commands of GDB and their role in debugging, and demonstrated them with examples. As C/C++ developers, we should know not only that something works but also why it works. This section therefore explains the principle behind GDB debugging.
gdb takes over the execution of a process through the ptrace system call. ptrace gives a parent process a way to observe and control the execution of another process, and to inspect and change its memory image and registers. It is mainly used to implement breakpoint debugging and system call tracing.
ptrace system call is defined as follows:
#include <sys/ptrace.h>
long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);
- pid_t pid: the process to be traced by ptrace
- void *addr: an address in the traced process's memory; its exact meaning depends on the request
- void *data: additional data for the request; its meaning also depends on the request
- enum __ptrace_request request: selects the operation performed by the system call. The main options are:
  - PTRACE_TRACEME: indicates that this process is to be traced by its parent. Any signal (except SIGKILL) delivered to the child stops it, and the parent blocked in wait() is woken up. A call to exec() inside the child delivers a SIGTRAP signal to it, which gives the parent full control of the child before the new program starts running
  - PTRACE_ATTACH: attach to the specified process and make it a tracee of the current process. The tracee then behaves as if it had performed PTRACE_TRACEME itself. Note that although the current process becomes the tracing parent, getppid() in the traced process still returns the PID of its original parent
  - PTRACE_CONT: resume a tracee that was stopped. A signal can be delivered to the tracee at the same time
Debugging principle
Run and debug new processes
Run and debug the new process as follows:
- Run gdb exe
- Enter the run command; gdb then performs the following operations:
  - Create a new process through the fork() system call
  - Execute ptrace(PTRACE_TRACEME, 0, 0, 0) in the newly created child process
  - In the child process, load the specified executable through the execv() system call
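The steps above can be sketched in a few lines. This is a simplified illustration under the assumption of a Linux system, not GDB's actual implementation; the function name launch_traced is hypothetical. The parent forks, the child requests tracing and calls execv, and the parent observes the SIGTRAP stop that the kernel generates at exec. A real debugger would insert breakpoints at that point and resume with PTRACE_CONT; this sketch simply kills the child.

```cpp
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Starts `path` under ptrace and returns the signal that stopped the child
// at exec (expected: SIGTRAP), or -1 on error. The child is killed right
// afterwards, so the new program never actually runs.
int launch_traced(const char* path) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: ask to be traced by the parent, then load the new program.
        if (ptrace(PTRACE_TRACEME, 0, nullptr, nullptr) != 0) _exit(126);
        char* const argv[] = {const_cast<char*>(path), nullptr};
        execv(path, argv);  // on success the child stops here with SIGTRAP
        _exit(127);         // only reached if execv failed
    }
    // Parent: wait for the child to stop (or to exit if something failed).
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    if (!WIFSTOPPED(status)) return -1;
    int stop_signal = WSTOPSIG(status);
    // A debugger would set breakpoints now and continue with PTRACE_CONT.
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return stop_signal;
}
```

The important point is the ordering: because the child stops before the first instruction of the new program executes, the parent has a window in which it fully controls the child.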
attach the running process
You can debug a running process with gdb attach pid. gdb performs ptrace(PTRACE_ATTACH, pid, 0, 0) on the specified process.
Note that attaching to a process ID may fail with the following error:
Attaching to process 28849
ptrace: Operation not permitted.
This means you do not have permission for the operation. Run gdb as the user who started the process, or as root. On systems with the Yama security module enabled, the kernel.yama.ptrace_scope setting can additionally restrict attaching to processes that are not descendants of the debugger.
Breakpoint principle
Implementation principle
When we set a breakpoint with b or break, gdb inserts a breakpoint instruction at the specified position. When the debugged program reaches the breakpoint, a SIGTRAP signal is generated. The signal is intercepted by gdb, which then performs the breakpoint hit judgment.
Setting principle
To set a breakpoint in a program, gdb first saves the original instruction at that location and then writes int 3 (opcode 0xCC on x86) there. When int 3 is executed, a software interrupt occurs and the kernel sends a SIGTRAP signal to the child process. Because the child is being traced, it stops and the signal is reported to the parent process (gdb). gdb then replaces int 3 with the saved instruction and waits for the user to resume execution.
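The save-and-patch step can be sketched with the PTRACE_PEEKTEXT and PTRACE_POKETEXT requests, which read and write one machine word in the tracee. This is an illustration that assumes a little-endian x86-64 Linux target and a tracee that is already stopped; the helper names patch_int3, set_breakpoint and clear_breakpoint are hypothetical, and GDB's real implementation is considerably more involved.

```cpp
#include <sys/ptrace.h>
#include <sys/types.h>
#include <cerrno>

// Replaces the lowest-addressed byte of a little-endian word with 0xCC (int 3).
unsigned long patch_int3(unsigned long word) {
    return (word & ~0xFFUL) | 0xCCUL;
}

// Sets a breakpoint at `addr` in a stopped tracee. The original word is stored
// in `*saved` so that it can be written back later. Returns false on error.
bool set_breakpoint(pid_t pid, void* addr, unsigned long* saved) {
    errno = 0;
    long word = ptrace(PTRACE_PEEKTEXT, pid, addr, nullptr);
    if (word == -1 && errno != 0) return false;  // -1 alone may be valid data
    *saved = static_cast<unsigned long>(word);
    return ptrace(PTRACE_POKETEXT, pid, addr,
                  reinterpret_cast<void*>(patch_int3(*saved))) == 0;
}

// Removes the breakpoint by writing the saved word back.
bool clear_breakpoint(pid_t pid, void* addr, unsigned long saved) {
    return ptrace(PTRACE_POKETEXT, pid, addr, reinterpret_cast<void*>(saved)) == 0;
}
```

Only the first byte of the instruction is overwritten, which is why a single saved byte (or, as here, the whole saved word) is enough to restore the original code.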
Hit judgment
gdb stores all breakpoint locations in a linked list. Hit judgment compares the position at which the debugged program stopped with the breakpoint positions in the list, to decide whether the signal was generated by a breakpoint.
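The lookup can be sketched as follows. The Breakpoint struct and find_hit function are hypothetical names for illustration and do not reflect GDB's internal data structures. One x86-specific detail is assumed: after int 3 executes, the reported program counter is one byte past the breakpoint address, so the comparison uses stop_pc - 1.

```cpp
#include <cstdint>
#include <list>

struct Breakpoint {
    std::uintptr_t address;   // where int 3 was written
    std::uint8_t saved_byte;  // original first byte of the instruction
};

// Returns the breakpoint that explains a SIGTRAP stop at `stop_pc`, or nullptr
// if the stop was not caused by one of the recorded breakpoints.
const Breakpoint* find_hit(const std::list<Breakpoint>& breakpoints,
                           std::uintptr_t stop_pc) {
    for (const Breakpoint& bp : breakpoints) {
        if (bp.address == stop_pc - 1) return &bp;
    }
    return nullptr;
}
```

If no entry matches, the SIGTRAP came from something else (for example a single step or the exec stop) and should not be reported as a breakpoint hit.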
Conditional judgment
After a breakpoint is hit, the condition is evaluated; the program is reported as stopped only if the expression is true, otherwise execution simply resumes. Because the condition has to be evaluated every time the location is reached, a conditional breakpoint affects performance whether or not it ever triggers. On x86, the processor also supports hardware breakpoints. Instead of inserting int 3 at the breakpoint location, the address is stored in a dedicated debug register; the processor compares the addresses it reaches with the contents of that register and raises an exception only on a match. Therefore, when a breakpoint location is passed very frequently, try a hardware breakpoint (hbreak in GDB), which can help performance.
Single step principle
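Single stepping is implemented by calling ptrace(PTRACE_SINGLESTEP, pid, ...).
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// pid is the ID of the process to trace (for example, parsed from argv).
printf("attaching to PID %d\n", pid);
if (ptrace(PTRACE_ATTACH, pid, 0, 0) != 0) {
  perror("attach failed");
  exit(1);
}
// Wait until the tracee has stopped in response to the attach.
int waitStat = 0;
int waitRes = waitpid(pid, &waitStat, WUNTRACED);
if (waitRes != pid || !WIFSTOPPED(waitStat)) {
  printf("unexpected waitpid result!\n");
  exit(1);
}
int64_t numSteps = 0;
while (true) {
  // Execute exactly one instruction in the tracee.
  auto res = ptrace(PTRACE_SINGLESTEP, pid, 0, 0);
  if (res != 0) break;
  // Wait for the tracee to stop again; leave the loop once it has exited.
  if (waitpid(pid, &waitStat, 0) != pid || !WIFSTOPPED(waitStat)) break;
  ++numSteps;
}
The above code takes a pid, attaches to that process, and then calls ptrace repeatedly to single-step it, waiting for the tracee to stop again after each step.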
other
To round off this article, here is a brief introduction to some other commands and tools used in the author's work.
pstack
This command displays the stack trace of every thread in a process. pstack must be run by the owner of the process or by root. It can be used to determine where a process is hanging. The only argument the command accepts is the PID of the process to check.
This command is very useful for troubleshooting. For example, if a service appears to be stuck (for instance, spinning in an endless loop), this command makes the problem easy to locate: run pstack several times over a period of time, and if the stack always stops at the same location, that location deserves attention, because it is likely to be the problem.
Taking the multithreaded code as an example, if its process ID is 45707 (on the author's machine), then
the output of pstack 45707 is as follows:
Thread 3 (Thread 0x7f07aaa69700 (LWP 45708)):
#0  0x00007f07aab2ee2d in nanosleep () from /lib64/libc.so.6
#1  0x00007f07aab2ecc4 in sleep () from /lib64/libc.so.6
#2  0x00007f07ab403eb9 in std::this_thread::__sleep_for(std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /lib64/libstdc++.so.6
#3  0x00000000004018cc in void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&) ()
#4  0x00000000004012de in fun_int(int) ()
#5  0x0000000000404696 in int std::_Bind_simple<int (*(int))(int)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) ()
#6  0x000000000040443d in std::_Bind_simple<int (*(int))(int)>::operator()() ()
#7  0x000000000040436e in std::thread::_Impl<std::_Bind_simple<int (*(int))(int)> >::_M_run() ()
#8  0x00007f07ab404070 in ?? () from /lib64/libstdc++.so.6
#9  0x00007f07ab65ddd5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f07aab67ead in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f07aa268700 (LWP 45709)):
#0  0x00007f07aab2ee2d in nanosleep () from /lib64/libc.so.6
#1  0x00007f07aab2ecc4 in sleep () from /lib64/libc.so.6
#2  0x00007f07ab403eb9 in std::this_thread::__sleep_for(std::chrono::duration<long, std::ratio<1l, 1l> >, std::chrono::duration<long, std::ratio<1l, 1000000000l> >) () from /lib64/libstdc++.so.6
#3  0x00000000004018cc in void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&) ()
#4  0x0000000000401340 in fun_string(std::string const&) ()
#5  0x000000000040459f in int std::_Bind_simple<int (*(char const*))(std::string const&)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) ()
#6  0x000000000040441f in std::_Bind_simple<int (*(char const*))(std::string const&)>::operator()() ()
#7  0x0000000000404350 in std::thread::_Impl<std::_Bind_simple<int (*(char const*))(std::string const&)> >::_M_run() ()
#8  0x00007f07ab404070 in ?? () from /lib64/libstdc++.so.6
#9  0x00007f07ab65ddd5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f07aab67ead in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f07aba80740 (LWP 45707)):
#0  0x00007f07ab65ef47 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f07ab403e37 in std::thread::join() () from /lib64/libstdc++.so.6
#2  0x0000000000401455 in main ()
The output above shows the call stack of every thread in the process, which makes the problem easier to analyze.
ldd
A build often fails with an error saying that a function definition cannot be found, or the build succeeds but the program fails at run time (often because it picks up the wrong version of a library). In such cases, ldd shows which libraries the executable depends on and the path each one resolves to.
ldd lists the shared libraries required by a program. It is often used to solve problems where a program cannot run because a library file is missing.
Again taking the executable test_thread, its dependent libraries are listed as follows:
ldd -r ./test_thread
linux-vdso.so.1 => (0x00007ffde43bc000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f8c5e310000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f8c5e009000)
libm.so.6 => /lib64/libm.so.6 (0x00007f8c5dd07000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f8c5daf1000)
libc.so.6 => /lib64/libc.so.6 (0x00007f8c5d724000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8c5e52c000)
In the above output:
- Column 1: the libraries the program depends on
- Column 2: the library on the system that satisfies each dependency
- Column 3: the address at which the library is loaded
Sometimes, when we list the dependencies with ldd, it reports that a library cannot be found, as follows:
ldd -r test_process
linux-vdso.so.1 => (0x00007ffc71b80000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fe4badd5000)
libm.so.6 => /lib64/libm.so.6 (0x00007fe4baad3000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe4ba8bd000)
libc.so.6 => /lib64/libc.so.6 (0x00007fe4ba4f0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe4bb0dc000)
liba.so => not found
In the last line above, liba.so cannot be found. We need to know the path of liba.so, for example /path/to/liba.so, and then there are two ways to make it visible:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/
After this, ldd can find the library. The drawback is that the setting is temporary: after you exit the terminal and run ldd again, it still reports that the library cannot be found. The other way is to modify /etc/ld.so.conf and append the required path to the file, i.e.
include ld.so.conf.d/*.conf
/path/to/
Then run the following command to make the change take effect permanently:
/sbin/ldconfig
c++filt
Because C++ supports function overloading, the compiler uses a name mangling mechanism that encodes each function's name together with its parameter types into a unique symbol.
We use the strings command to view the function information in test_thread (only entries related to fun_ are shown):
strings test_thread | grep fun_
in fun_int n =
in fun_string s =
_GLOBAL__sub_I__Z7fun_inti
_Z10fun_stringRKSs
We can see the symbol _Z10fun_stringRKSs. To find out which function it stands for, use the c++filt command, as follows:
c++filt _Z10fun_stringRKSs
fun_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
Through the above output, we can restore the function name generated by the compiler to the function name in our code, namely fun_string.
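The same demangling is available inside a program through abi::__cxa_demangle from the <cxxabi.h> header shipped with GCC and Clang. The following is a minimal sketch; the wrapper name demangle is hypothetical. This can be handy when a program prints its own stack traces and wants readable function names.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangles an Itanium C++ ABI symbol name.
// Returns the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return mangled;
    std::string result(text);
    std::free(text);  // the returned buffer is allocated with malloc
    return result;
}
```

For the other symbol in the strings output above, demangle("_Z7fun_inti") yields fun_int(int). How std::string is spelled in the result for _Z10fun_stringRKSs can differ between tool versions, so the text may not match the c++filt output character for character.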
epilogue
GDB is an essential debugging tool for Linux development. How it is used depends on the specific requirement or the specific problem encountered. In daily development work, using GDB skillfully lets you get twice the result with half the effort.
Starting from some simple commands, this article used examples to debug executable programs (in single-threaded, multithreaded and multi-process scenarios) and coredump files, so that the use of GDB can be understood more intuitively. GDB is very powerful, and the author uses only some very basic functions in his work. If you want to understand GDB in depth, read the documentation on the official website.
This article took about three weeks from conception to completion. The writing process was painful (materials had to be sorted out, and various scenarios had to be built and reproduced), but it was also very rewarding: writing it further deepened the author's understanding of the underlying principles of GDB.
Author: high performance architecture exploration
This article was first published on the official account [high performance architecture].
Personal technology blog: High performance architecture exploration
Reprinted from: https://www.cnblogs.com/gaoxingnjiagoutansuo/p/15820753.html