Android debuggerd source code analysis

Introduction to debuggerd

Android ships with debuggerd, a daemon for diagnosing abnormal program exits. It detects program crashes and writes the state of the crashed process to a file and to the serial console so that developers can analyze and debug the failure. debuggerd stores its data in the /data/tombstones/ directory, which holds up to 10 files; once 10 files exist, the oldest one is overwritten. On the serial console the information is printed as logcat output under the DEBUG tag. The Linux kernel has its own signal mechanism: when an application crashes, the kernel usually sends a signal to the offending process to notify it of the fault. A process can catch these signals and handle them, and the usual handling for a fatal signal is to exit. Android builds a more useful feature on top of this mechanism: it intercepts these signals and dumps the process information for debugging.

Operation principle of debuggerd

debuggerd creates a socket named "android:debuggerd" and acts as the server, waiting for client processes to connect. It receives the tid and action sent by a client, then dumps the information of the process that the thread tid belongs to, writing it to a file or to the console as the action specifies. There are three main kinds of client processes:

1. Exceptional C/C++ program

The bionic linker installs an exception signal handler in such a program. When the program raises a fatal signal, it enters the signal handler and connects to debuggerd.

2. debuggerd program

debuggerd can be started from the console with the command debuggerd -b [<tid>], in which case it connects to the debuggerd daemon as a client. In this way debuggerd can dump the information of the process specified by tid without terminating that process.

3. dumpstate

The dumpstate command is run from the console with the necessary parameters. It calls dump_backtrace_to_file to interact with debuggerd.
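The request/acknowledge exchange shared by all three clients can be sketched as follows. This is only an illustration, not debuggerd's code: the demo_* names and the exact layout of the message struct are assumptions, and a socketpair stands in for the real "android:debuggerd" socket so that both ends can run in one process.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the request types; the layout is an assumption. */
typedef enum {
    DEMO_ACTION_CRASH,
    DEMO_ACTION_DUMP_TOMBSTONE,
    DEMO_ACTION_DUMP_BACKTRACE
} demo_action_t;

typedef struct {
    demo_action_t action;  /* what the server should do */
    pid_t tid;             /* which thread the request is about */
} demo_msg_t;

/* Client side, part 1: send the request. */
static int demo_client_send(int fd, demo_action_t action, pid_t tid) {
    demo_msg_t msg = { action, tid };
    return write(fd, &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/* Client side, part 2: block until the server acknowledges with one byte. */
static int demo_client_wait_ack(int fd) {
    char ack;
    return read(fd, &ack, 1) == 1 ? 0 : -1;
}

/* Server side: read one request and reply with a single byte. */
static int demo_server_handle(int fd, demo_msg_t* out) {
    if (read(fd, out, sizeof(*out)) != (ssize_t)sizeof(*out)) return -1;
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Runs one full exchange over a socketpair; returns 0 on success. */
static int demo_round_trip(pid_t tid, demo_msg_t* seen) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    int rc = -1;
    if (demo_client_send(sv[0], DEMO_ACTION_CRASH, tid) == 0 &&
        demo_server_handle(sv[1], seen) == 0 &&
        demo_client_wait_ack(sv[0]) == 0) {
        rc = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling demo_round_trip(1234, &seen) should leave seen.tid equal to 1234 and seen.action equal to DEMO_ACTION_CRASH. In the real flow the client blocks in the acknowledgement read, which is what gives the server time to act before the client continues.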

Use of debuggerd

After a C/C++ program that raised a fatal signal connects to debuggerd, debuggerd dumps the process information into a tombstone_XX file in the /data/tombstones/ folder. The stack of the crashed process can then be analyzed by examining tombstone_XX.

debuggerd can also be started from the console with the command debuggerd -b [<tid>]. With the -b parameter, the information of the process specified by tid is dumped to the console; without it, the dump goes to a tombstone file. When callstack or dumpstate is run from the console, the process information is written to the file specified by that command.

Application exception handling process

The application entry point is part of the bionic implementation, so it applies to every Android application. Starting from the entry address _start, __linker_init (through __linker_init_post_relocation) calls debugger_init() to register a signal handler that intercepts seven signals: SIGILL, SIGABRT, SIGBUS, SIGFPE, SIGSEGV, SIGSTKFLT and SIGPIPE:

linker/arch/arm/begin.S

    _start:
            mov     r0, sp
            mov     r1, #0
            bl      __linker_init

bionic/linker/linker.cpp

    extern "C" Elf32_Addr __linker_init(void* raw_args) {
      ...
      Elf32_Addr start_address = __linker_init_post_relocation(args, linker_addr);
      set_soinfo_pool_protection(PROT_READ);
      // Return the address that the calling assembly stub should jump to.
      return start_address;
    }

    static Elf32_Addr __linker_init_post_relocation(KernelArgumentBlock& args, Elf32_Addr linker_base) {
        ...
        debugger_init();
        ...
    }

bionic/linker/debugger.c

    void debugger_init()
    {
        struct sigaction act;
        memset(&act, 0, sizeof(act));
        act.sa_sigaction = debugger_signal_handler;
        act.sa_flags = SA_RESTART | SA_SIGINFO;
        sigemptyset(&act.sa_mask);
        sigaction(SIGILL, &act, NULL);
        sigaction(SIGABRT, &act, NULL);
        sigaction(SIGBUS, &act, NULL);
        sigaction(SIGFPE, &act, NULL);
        sigaction(SIGSEGV, &act, NULL);
        sigaction(SIGSTKFLT, &act, NULL);
        sigaction(SIGPIPE, &act, NULL);
    }

The linker in bionic installs the handler debugger_signal_handler for the following seven signals:

  1. SIGILL (illegal instruction)
  2. SIGABRT (abnormal termination via abort)
  3. SIGBUS (bus error, i.e. a hardware access fault)
  4. SIGFPE (floating-point or arithmetic exception)
  5. SIGSEGV (invalid memory access)
  6. SIGSTKFLT (coprocessor stack fault)
  7. SIGPIPE (write to a pipe with no reader)

The meaning of act.sa_flags = SA_RESTART | SA_SIGINFO in debugger_init:

1) SA_RESTART

If this flag is set and the signal interrupts a system call of the process, the system restarts that system call automatically after the handler returns. If the flag is not set, the interrupted system call fails with the error code EINTR. The flag only matters for slow system calls, that is, system calls that may block. For example, a write to a device may block; if the process catches a signal during that time and the handler returns, the write may fail with EINTR. With SA_RESTART the system call is restarted instead, and the code additionally wraps such calls in the RETRY_ON_EINTR macro to make sure the write completes.

2) SA_SIGINFO

If this flag is set, the handler receives extra information about the signal in a siginfo_t structure.

When a C/C++ program linked against bionic crashes, the kernel sends it the corresponding signal, and on receiving the signal the process enters debugger_signal_handler:

    void debugger_signal_handler(int n, siginfo_t* info, void* unused)
    {
        char msgbuf[128];
        unsigned tid;
        int s;

        logSignalSummary(n, info);

        tid = gettid();
        // "android:debuggerd"
        s = socket_abstract_client(DEBUGGER_SOCKET_NAME, SOCK_STREAM);

        if (s >= 0) {
            /* debugger knows our pid from the credentials on the
             * local socket but we need to tell it our tid.  It
             * is paranoid and will verify that we are giving a tid
             * that's actually in our process
             */
            int  ret;
            debugger_msg_t msg;
            msg.action = DEBUGGER_ACTION_CRASH;
            msg.tid = tid;
            RETRY_ON_EINTR(ret, write(s, &msg, sizeof(msg)));
            if (ret == sizeof(msg)) {
                /* if the write failed, there is no point to read on
                 * the file descriptor. */
                RETRY_ON_EINTR(ret, read(s, &tid, 1));
                int savedErrno = errno;
                notify_gdb_of_libraries();
                errno = savedErrno;
            }

            if (ret < 0) {
                /* read or write failed -- broken connection? */
                format_buffer(msgbuf, sizeof(msgbuf),
                    "Failed while talking to debuggerd: %s", strerror(errno));
                __libc_android_log_write(ANDROID_LOG_FATAL, "libc", msgbuf);
            }

            close(s);
        } else {
            /* socket failed; maybe process ran out of fds */
            format_buffer(msgbuf, sizeof(msgbuf),
                "Unable to open connection to debuggerd: %s", strerror(errno));
            __libc_android_log_write(ANDROID_LOG_FATAL, "libc", msgbuf);
        }

        /* remove our net so we fault for real when we return */
        signal(n, SIG_DFL);

        /*
         * These signals are not re-thrown when we resume.  This means that
         * crashing due to (say) SIGPIPE doesn't work the way you'd expect it
         * to.  We work around this by throwing them manually.  We don't want
         * to do this for *all* signals because it'll screw up the address for
         * faults like SIGSEGV.
         */
        switch (n) {
            case SIGABRT:
            case SIGFPE:
            case SIGPIPE:
            case SIGSTKFLT:
                (void) tgkill(getpid(), gettid(), n);
                break;
            default:    // SIGILL, SIGBUS, SIGSEGV
                break;
        }
    }

debugger_signal_handler processing flow:

1) Call logSignalSummary to write a summary of the signal to the log;

    static void logSignalSummary(int signum, const siginfo_t* info)
    {
        char buffer[128];
        char threadname[MAX_TASK_NAME_LEN + 1]; // one more for termination
        char* signame;
        switch (signum) {
            case SIGILL:    signame = "SIGILL";     break;
            case SIGABRT:   signame = "SIGABRT";    break;
            case SIGBUS:    signame = "SIGBUS";     break;
            case SIGFPE:    signame = "SIGFPE";     break;
            case SIGSEGV:   signame = "SIGSEGV";    break;
            case SIGSTKFLT: signame = "SIGSTKFLT";  break;
            case SIGPIPE:   signame = "SIGPIPE";    break;
            default:        signame = "???";        break;
        }

        if (prctl(PR_GET_NAME, (unsigned long)threadname, 0, 0, 0) != 0) {
            strcpy(threadname, "<name unknown>");
        } else {
            // short names are null terminated by prctl, but the manpage
            // implies that 16 byte names are not.
            threadname[MAX_TASK_NAME_LEN] = 0;
        }
        format_buffer(buffer, sizeof(buffer),
            "Fatal signal %d (%s) at 0x%08x (code=%d), thread %d (%s)",
            signum, signame, info->si_addr, info->si_code, gettid(), threadname);

        __libc_android_log_write(ANDROID_LOG_FATAL, "libc", buffer);
    }

It looks up the name of the signal and the name of the thread, formats them into a string, and calls __libc_android_log_write to write the string to "/dev/log/main".

2) Call socket_abstract_client to connect to the debuggerd socket.

    s = socket_abstract_client(DEBUGGER_SOCKET_NAME, SOCK_STREAM);

3) If the connection succeeds, fill in a debugger_msg_t structure and send it to debuggerd.

    msg.action = DEBUGGER_ACTION_CRASH; // tell debuggerd what to do
    msg.tid = tid;                      // thread id
    RETRY_ON_EINTR(ret, write(s, &msg, sizeof(msg)));

4) Wait for debuggerd's reply. The call below blocks, and execution continues only after the reply arrives;

    RETRY_ON_EINTR(ret, read(s, &tid, 1));

5) Reset the signal handler to SIG_DFL, i.e. the default action;

    signal(n, SIG_DFL);

6) Re-send the signal. SIGABRT, SIGFPE, SIGPIPE and SIGSTKFLT are re-sent explicitly with tgkill, while SIGILL, SIGBUS and SIGSEGV are raised again when the faulting instruction is re-executed. After the process returns from the current signal handler it receives the signal again and takes the default action, i.e. the process is terminated.

Source code analysis of debuggerd

1. Started as a daemon by the init process, from init.rc

    service debuggerd /system/bin/debuggerd
        class main

When started this way, main calls do_server, and debuggerd runs as a server that dumps process information on behalf of other processes.

2. Run the /system/bin/debuggerd executable directly, which requires parameters.

    debuggerd -b [<tid>]    // -b prints the backtrace to the console

When started this way, main calls do_explicit_dump, which talks to the debuggerd daemon and dumps the information of the specified process to a file or to the console.

Service startup mode

    int main(int argc, char** argv) {
        if (argc == 1) {
            return do_server();
        }
        ...
    }

When debuggerd is started without arguments (argc == 1), it runs as a background service that receives abnormal-exit notifications from applications and generates tombstones.

    static int do_server() {
        int s;
        struct sigaction act;
        int logsocket = -1;

        /*
         * debuggerd crashes can't be reported to debuggerd.  Reset all of the
         * crash handlers.
         */
        signal(SIGILL, SIG_DFL);
        signal(SIGABRT, SIG_DFL);
        signal(SIGBUS, SIG_DFL);
        signal(SIGFPE, SIG_DFL);
        signal(SIGSEGV, SIG_DFL);
        signal(SIGPIPE, SIG_IGN);
        signal(SIGSTKFLT, SIG_DFL);

        logsocket = socket_local_client("logd",
                ANDROID_SOCKET_NAMESPACE_ABSTRACT, SOCK_DGRAM);
        if(logsocket < 0) {
            logsocket = -1;
        } else {
            fcntl(logsocket, F_SETFD, FD_CLOEXEC);
        }

        act.sa_handler = SIG_DFL;
        sigemptyset(&act.sa_mask);
        sigaddset(&act.sa_mask,SIGCHLD);
        act.sa_flags = SA_NOCLDWAIT;
        sigaction(SIGCHLD, &act, 0);

        s = socket_local_server(DEBUGGER_SOCKET_NAME,
                ANDROID_SOCKET_NAMESPACE_ABSTRACT, SOCK_STREAM);
        if(s < 0) return 1;
        fcntl(s, F_SETFD, FD_CLOEXEC);

        LOG("debuggerd: " __DATE__ " " __TIME__ "\n");

        // check corefile limit.
        (void)check_corefile_limit();

        for(;;) {
            struct sockaddr addr;
            socklen_t alen;
            int fd;
            alen = sizeof(addr);
            XLOG("waiting for connection\n");
            fd = accept(s, &addr, &alen);
            if(fd < 0) {
                XLOG("accept failed: %s\n", strerror(errno));
                continue;
            }

            fcntl(fd, F_SETFD, FD_CLOEXEC);

            handle_request(fd);
        }
        return 0;
    }

1. Reset debuggerd's own crash handlers, since a debuggerd crash cannot be reported to debuggerd;

2. Create the server side of the socket;

3. Enter an infinite loop that accepts client connections and handles each request through handle_request();
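The server socket lives in the abstract namespace, a Linux-specific feature of AF_UNIX sockets in which the name starts with a NUL byte and no file is created on disk. The sketch below shows that addressing scheme together with one accept iteration. It does not use the Android socket_local_server helper; the demo_* functions and the socket name passed to them are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Builds an abstract-namespace address: sun_path[0] is '\0' and the name
 * follows it. Returns the address length, or 0 if the name is too long. */
static socklen_t demo_abstract_addr(const char* name, struct sockaddr_un* addr) {
    size_t len = strlen(name);
    if (len + 1 > sizeof(addr->sun_path)) return 0;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path + 1, name, len);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}

/* Creates a listening stream socket bound to the abstract name. */
static int demo_abstract_server(const char* name) {
    struct sockaddr_un addr;
    socklen_t alen = demo_abstract_addr(name, &addr);
    if (alen == 0) return -1;
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s < 0) return -1;
    if (bind(s, (struct sockaddr*)&addr, alen) < 0 || listen(s, 4) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Connects a stream socket to the abstract name. */
static int demo_abstract_client(const char* name) {
    struct sockaddr_un addr;
    socklen_t alen = demo_abstract_addr(name, &addr);
    if (alen == 0) return -1;
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s < 0) return -1;
    if (connect(s, (struct sockaddr*)&addr, alen) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* One accept iteration: connect, accept, and pass a single byte across. */
static int demo_accept_once(const char* name) {
    int server = demo_abstract_server(name);
    if (server < 0) return -1;
    int client = demo_abstract_client(name);  /* waits in the listen backlog */
    int conn = client >= 0 ? accept(server, NULL, NULL) : -1;
    char byte = 0;
    int ok = conn >= 0 && write(client, "x", 1) == 1 &&
             read(conn, &byte, 1) == 1 && byte == 'x';
    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    close(server);
    return ok ? 0 : -1;
}
```

demo_accept_once("demo:debuggerd-sketch") returns 0 on a Linux system. An abstract name disappears as soon as the last socket bound to it is closed, so no cleanup of a socket file is needed.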

handle_request

    static void handle_request(int fd) {
        XLOG("handle_request(%d)\n", fd);

        debugger_request_t request;
        int status = read_request(fd, &request);
        if (!status) {
            XLOG("BOOM: pid=%d uid=%d gid=%d tid=%d\n",
                request.pid, request.uid, request.gid, request.tid);

            /* At this point, the thread that made the request is blocked in
             * a read() call.  If the thread has crashed, then this gives us
             * time to PTRACE_ATTACH to it before it has a chance to really fault.
             *
             * The PTRACE_ATTACH sends a SIGSTOP to the target process, but it
             * won't necessarily have stopped by the time ptrace() returns.  (We
             * currently assume it does.)  We write to the file descriptor to
             * ensure that it can run as soon as we call PTRACE_CONT below.
             * See details in bionic/libc/linker/debugger.c, in function
             * debugger_signal_handler().
             */
            if (ptrace(PTRACE_ATTACH, request.tid, 0, 0)) {
                LOG("ptrace attach failed: %s\n", strerror(errno));
            } else {
                bool detach_failed = false;
                bool attach_gdb = should_attach_gdb(&request);
                if (TEMP_FAILURE_RETRY(write(fd, "\0", 1)) != 1) {
                    LOG("failed responding to client: %s\n", strerror(errno));
                } else {
                    char* tombstone_path = NULL;

                    if (request.action == DEBUGGER_ACTION_CRASH) {
                        close(fd);
                        fd = -1;
                    }

                    int total_sleep_time_usec = 0;
                    for (;;) {
                        int signal = wait_for_signal(request.tid, &total_sleep_time_usec);
                        if (signal < 0) {
                            break;
                        }

                        switch (signal) {
                        case SIGSTOP:
                            if (request.action == DEBUGGER_ACTION_DUMP_TOMBSTONE) {
                                XLOG("stopped -- dumping to tombstone\n");
                                tombstone_path = engrave_tombstone(request.pid, request.tid,
                                        signal, true, true, &detach_failed,
                                        &total_sleep_time_usec);
                            } else if (request.action == DEBUGGER_ACTION_DUMP_BACKTRACE) {
                                XLOG("stopped -- dumping to fd\n");
                                dump_backtrace(fd, request.pid, request.tid, &detach_failed,
                                        &total_sleep_time_usec);
                            } else {
                                XLOG("stopped -- continuing\n");
                                status = ptrace(PTRACE_CONT, request.tid, 0, 0);
                                if (status) {
                                    LOG("ptrace continue failed: %s\n", strerror(errno));
                                }
                                continue; /* loop again */
                            }
                            break;

                        case SIGILL:
                        case SIGABRT:
                        case SIGBUS:
                        case SIGFPE:
                        case SIGSEGV:
                        case SIGSTKFLT: {
                            XLOG("stopped -- fatal signal\n");
                            /*
                             * Send a SIGSTOP to the process to make all of
                             * the non-signaled threads stop moving.  Without
                             * this we get a lot of "ptrace detach failed:
                             * No such process".
                             */
                            kill(request.pid, SIGSTOP);
                            /* don't dump sibling threads when attaching to GDB because it
                             * makes the process less reliable, apparently... */
                            tombstone_path = engrave_tombstone(request.pid, request.tid,
                                    signal, !attach_gdb, false, &detach_failed,
                                    &total_sleep_time_usec);
                            break;
                        }

                        case SIGPIPE:
                            LOG("socket-client process stopped due to SIGPIPE! \n");
                            break;

                        default:
                            XLOG("stopped -- unexpected signal\n");
                            LOG("process stopped due to unexpected signal %d\n", signal);
                            break;
                        }
                        break;
                    }

                    if (request.action == DEBUGGER_ACTION_DUMP_TOMBSTONE) {
                        if (tombstone_path) {
                            write(fd, tombstone_path, strlen(tombstone_path));
                        }
                        close(fd);
                        fd = -1;
                    }
                    free(tombstone_path);
                }

                XLOG("detaching\n");
                if (attach_gdb) {
                    /* stop the process so we can debug */
                    kill(request.pid, SIGSTOP);

                    /* detach so we can attach gdbserver */
                    if (ptrace(PTRACE_DETACH, request.tid, 0, 0)) {
                        LOG("ptrace detach from %d failed: %s\n", request.tid, strerror(errno));
                        detach_failed = true;
                    }

                    /*
                     * if debug.db.uid is set, its value indicates if we should wait
                     * for user action for the crashing process.
                     * in this case, we log a message and turn the debug LED on
                     * waiting for a gdb connection (for instance)
                     */
                    wait_for_user_action(request.pid);
                } else {
                    /* just detach */
                    if (ptrace(PTRACE_DETACH, request.tid, 0, 0)) {
                        LOG("ptrace detach from %d failed: %s\n", request.tid, strerror(errno));
                        detach_failed = true;
                    }
                }

                /* resume stopped process (so it can crash in peace). */
                kill(request.pid, SIGCONT);

                /* If we didn't successfully detach, we're still the parent, and the
                 * actual parent won't receive a death notification via wait(2).  At this point
                 * there's not much we can do about that. */
                if (detach_failed) {
                    LOG("debuggerd committing suicide to free the zombie!\n");
                    kill(getpid(), SIGKILL);
                }
            }

        }
        if (fd >= 0) {
            close(fd);
        }
    }

1) Call read_request to read the data sent by the client process:

    static int read_request(int fd, debugger_request_t* out_request) {
        struct ucred cr;
        int len = sizeof(cr);
        int status = getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cr, &len);
        if (status != 0) {
            LOG("cannot get credentials\n");
            return -1;
        }

        XLOG("reading tid\n");
        fcntl(fd, F_SETFL, O_NONBLOCK);

        struct pollfd pollfds[1];
        pollfds[0].fd = fd;
        pollfds[0].events = POLLIN;
        pollfds[0].revents = 0;
        status = TEMP_FAILURE_RETRY(poll(pollfds, 1, 3000));
        if (status != 1) {
            LOG("timed out reading tid\n");
            return -1;
        }

        debugger_msg_t msg;
        status = TEMP_FAILURE_RETRY(read(fd, &msg, sizeof(msg)));
        if (status < 0) {
            LOG("read failure? %s\n", strerror(errno));
            return -1;
        }
        if (status != sizeof(msg)) {
            LOG("invalid crash request of size %d\n", status);
            return -1;
        }

        out_request->action = msg.action;
        out_request->tid = msg.tid;
        out_request->pid = cr.pid;
        out_request->uid = cr.uid;
        out_request->gid = cr.gid;

        if (msg.action == DEBUGGER_ACTION_CRASH) {
            /* Ensure that the tid reported by the crashing process is valid. */
            char buf[64];
            struct stat s;
            snprintf(buf, sizeof buf, "/proc/%d/task/%d", out_request->pid, out_request->tid);
            if(stat(buf, &s)) {
                LOG("tid %d does not exist in pid %d. ignoring debug request\n",
                        out_request->tid, out_request->pid);
                return -1;
            }
        } else if (cr.uid == 0
                || (cr.uid == AID_SYSTEM && msg.action == DEBUGGER_ACTION_DUMP_BACKTRACE)) {
            /* Only root or system can ask us to attach to any process and dump it explicitly.
             * However, system is only allowed to collect backtraces but cannot dump tombstones. */
            status = get_process_info(out_request->tid, &out_request->pid,
                    &out_request->uid, &out_request->gid);
            if (status < 0) {
                LOG("tid %d does not exist. ignoring explicit dump request\n",
                        out_request->tid);
                return -1;
            }
        } else {
            /* No one else is allowed to dump arbitrary processes. */
            return -1;
        }
        return 0;
    }

Read the pid, uid and gid of the client process from the socket credentials:

    getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cr, &len);
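The credentials returned by SO_PEERCRED come from the kernel, which is why the server can trust the pid without the client sending it. A minimal Linux-only sketch follows; the demo_* names are hypothetical, and a socketpair is used so that the peer is the calling process itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes struct ucred on glibc */
#endif
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Queries the kernel-recorded credentials of the peer on a connected
 * AF_UNIX socket. Returns 0 on success, -1 on failure. */
static int demo_peer_credentials(int fd, pid_t* pid, uid_t* uid, gid_t* gid) {
    struct ucred cr;
    socklen_t len = sizeof(cr);
    if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cr, &len) != 0) return -1;
    *pid = cr.pid;
    *uid = cr.uid;
    *gid = cr.gid;
    return 0;
}

/* Returns 1 if the peer of a fresh socketpair reports our own credentials,
 * 0 if it does not, and -1 on error. */
static int demo_peer_is_self(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    pid_t pid;
    uid_t uid;
    gid_t gid;
    int rc = demo_peer_credentials(sv[0], &pid, &uid, &gid);
    close(sv[0]);
    close(sv[1]);
    if (rc != 0) return -1;
    return pid == getpid() && uid == geteuid() && gid == getegid();
}
```

Because both ends of the socketpair were created by the same process, demo_peer_is_self() is expected to return 1. With a real client the same call yields the client's pid, uid and gid as recorded when it connected.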

Poll the socket, waiting up to 3 seconds for data:

    struct pollfd pollfds[1];
    pollfds[0].fd = fd;
    pollfds[0].events = POLLIN;
    pollfds[0].revents = 0;
    status = TEMP_FAILURE_RETRY(poll(pollfds, 1, 3000));

Read the debugger_msg_t structure from the socket:

    debugger_msg_t msg;
    status = TEMP_FAILURE_RETRY(read(fd, &msg, sizeof(msg)));
    if (status < 0) {
        LOG("read failure? %s\n", strerror(errno));
        return -1;
    }
    if (status != sizeof(msg)) {
        LOG("invalid crash request of size %d\n", status);
        return -1;
    }
    out_request->action = msg.action;
    out_request->tid = msg.tid;
    out_request->pid = cr.pid;
    out_request->uid = cr.uid;
    out_request->gid = cr.gid;

If the action in debugger_msg_t is DEBUGGER_ACTION_CRASH, the request comes from a crashing C/C++ process, so check whether the tid it passed is valid.

    if (msg.action == DEBUGGER_ACTION_CRASH) {
        /* Ensure that the tid reported by the crashing process is valid. */
        char buf[64];
        struct stat s;
        snprintf(buf, sizeof buf, "/proc/%d/task/%d", out_request->pid, out_request->tid);
        if(stat(buf, &s)) {
            LOG("tid %d does not exist in pid %d. ignoring debug request\n",
                    out_request->tid, out_request->pid);
            return -1;
        }
    }

If the action is anything else, the request is an explicit dump sent by other means (for example the debuggerd command). The requester must then be root, or system when the action is DEBUGGER_ACTION_DUMP_BACKTRACE, and the validity of the tid is checked afterwards.
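The tid check for crash requests relies on the /proc/<pid>/task/<tid> directory existing only when tid is a thread of pid. As a standalone sketch (the function name is hypothetical, and a mounted /proc is assumed):

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if /proc/<pid>/task/<tid> exists, i.e. tid is a thread of pid,
 * and 0 otherwise. */
static int demo_tid_belongs_to_pid(pid_t pid, pid_t tid) {
    char path[64];
    struct stat st;
    snprintf(path, sizeof(path), "/proc/%d/task/%d", (int)pid, (int)tid);
    return stat(path, &st) == 0;
}
```

In a single-threaded program the main thread's tid equals the pid, so demo_tid_belongs_to_pid(getpid(), getpid()) returns 1, while an invalid tid such as -1 returns 0. The pid comes from the socket credentials, so a client cannot use this path to make the server inspect a thread of another process.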

 

2) After read_request returns, ptrace is called to attach to the thread specified by tid. debuggerd thereby becomes the tracing parent of the attached process, and the attach sends SIGSTOP to stop it. From this point debuggerd can inspect the memory image and registers of the target.

    ptrace(PTRACE_ATTACH, request.tid, 0, 0)

3) Reply to the client process with the following statement so that it can return from its read call.

    TEMP_FAILURE_RETRY(write(fd, "\0", 1))

4) Wait in the for loop for the target process to stop.

    int signal = wait_for_signal(request.tid, &total_sleep_time_usec);

5) Handle the target process according to the signal it stopped with and the requested action

    switch (signal) {
        case SIGSTOP:
            if (request.action == DEBUGGER_ACTION_DUMP_TOMBSTONE) {
                XLOG("stopped -- dumping to tombstone\n");
                tombstone_path = engrave_tombstone(request.pid, request.tid,
                        signal, true, true, &detach_failed,
                        &total_sleep_time_usec);
            } else if (request.action == DEBUGGER_ACTION_DUMP_BACKTRACE) {
                XLOG("stopped -- dumping to fd\n");
                dump_backtrace(fd, request.pid, request.tid, &detach_failed,
                        &total_sleep_time_usec);
            } else {
                XLOG("stopped -- continuing\n");
                status = ptrace(PTRACE_CONT, request.tid, 0, 0);
                if (status) {
                    LOG("ptrace continue failed: %s\n", strerror(errno));
                }
                continue; /* loop again */
            }
            break;
        case SIGILL:
        case SIGABRT:
        case SIGBUS:
        case SIGFPE:
        case SIGSEGV:
        case SIGSTKFLT: {
            XLOG("stopped -- fatal signal\n");
            kill(request.pid, SIGSTOP);
            tombstone_path = engrave_tombstone(request.pid, request.tid,
                    signal, !attach_gdb, false, &detach_failed,
                    &total_sleep_time_usec);
            break;
        }
        case SIGPIPE:
            LOG("socket-client process stopped due to SIGPIPE! \n");
            break;
        default:
            XLOG("stopped -- unexpected signal\n");
            LOG("process stopped due to unexpected signal %d\n", signal);
            break;
    }

If the target stopped with SIGSTOP, no crash has occurred in it, and the process information is written to a tombstone file or to the client's file descriptor according to the action. For a crash request that stops with SIGSTOP, debuggerd lets the thread continue and waits for the next signal.

If the target stopped with one of the six fatal signals (SIGILL, SIGABRT, SIGBUS, SIGFPE, SIGSEGV, SIGSTKFLT), the process has crashed, and engrave_tombstone is called to write the dump directly to a tombstone. SIGPIPE is only logged.

6) Call ptrace(PTRACE_DETACH, request.tid, 0, 0) to stop tracing the target process;

    if (attach_gdb) {
        kill(request.pid, SIGSTOP);
        if (ptrace(PTRACE_DETACH, request.tid, 0, 0)) {
            LOG("ptrace detach from %d failed: %s\n", request.tid, strerror(errno));
            detach_failed = true;
        }
        wait_for_user_action(request.pid);
    } else {
        if (ptrace(PTRACE_DETACH, request.tid, 0, 0)) {
            LOG("ptrace detach from %d failed: %s\n", request.tid, strerror(errno));
            detach_failed = true;
        }
    }

If the command adb shell setprop debug.db.uid 10000 has been run, attach_gdb is true whenever a process with uid < 10000 crashes. In that case the crashed process is stopped, ptrace(PTRACE_DETACH, request.tid, 0, 0) is called to detach from it, and debuggerd waits for gdb to connect:

adb forward tcp:5039 tcp:5039

adb shell gdbserver :5039 --attach pid &

Pressing the HOME or VOLUME DOWN key lets the process continue and crash naturally.

When attach_gdb is false, debuggerd simply detaches from the target process.

7) Call kill(request.pid, SIGCONT) to resume the stopped process so that it can terminate naturally;

engrave_tombstone

    char* engrave_tombstone(pid_t pid, pid_t tid, int signal,
            bool dump_sibling_threads, bool quiet, bool* detach_failed,
            int* total_sleep_time_usec) {
        mkdir(TOMBSTONE_DIR, 0755);
        chown(TOMBSTONE_DIR, AID_SYSTEM, AID_SYSTEM);

        // dump maps & check corefile limit.
        dump_creash_maps(pid);  // create maps file

        int fd;
        char* path = find_and_open_tombstone(&fd);
        if (!path) {
            *detach_failed = false;
            return NULL;
        }

        log_t log;
        log.tfd = fd;
        log.quiet = quiet;
        *detach_failed = dump_crash(&log, pid, tid, signal, dump_sibling_threads,
                total_sleep_time_usec);

        close(fd);
        return path;
    }

For a crashed C/C++ process, this is the main function that dumps process information. It works as follows:

1. Create the /data/tombstones directory and set its owner and group to system (AID_SYSTEM).

2. Call find_and_open_tombstone. At most 10 tombstone_XX files are kept; once all 10 exist, the oldest one is overwritten.

3. Call dump_crash to dump all information to the tombstone file:

☞ dump_build_info(log);

☞ dump_thread_info(log, pid, tid, true);

☞ dump_fault_addr(log, tid, signal);

☞ dump_thread(context, log, tid, true, total_sleep_time_usec); (dumps the context information of the crashing thread)

☞ dump_logs(log, pid, true);

☞ dump_sibling_thread_report(context, log, pid, tid, total_sleep_time_usec);
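The rotation policy from step 2 (take the first free slot, otherwise overwrite the oldest file) can be sketched as follows. This is an illustration under assumptions, not the body of find_and_open_tombstone: the function name choose_tombstone_slot, the dir parameter, and the tombstone_%02d file name pattern are assumed here, and the sketch only picks a path without opening the file.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

#define MAX_TOMBSTONES 10

/* Hypothetical helper: picks the slot to write next. It returns the first
 * slot whose file does not exist, otherwise the slot with the oldest
 * modification time. The chosen path is left in `path`.
 * Returns the slot index, or -1 if `path` is too small. */
int choose_tombstone_slot(const char* dir, char* path, size_t path_size) {
    int oldest = 0;
    int have_oldest = 0;
    time_t oldest_mtime = 0;

    for (int i = 0; i < MAX_TOMBSTONES; i++) {
        int n = snprintf(path, path_size, "%s/tombstone_%02d", dir, i);
        if (n < 0 || (size_t)n >= path_size) return -1;

        struct stat sb;
        if (stat(path, &sb) == 0) {
            /* Slot is in use: remember it if it is the oldest seen so far. */
            if (!have_oldest || sb.st_mtime < oldest_mtime) {
                oldest = i;
                oldest_mtime = sb.st_mtime;
                have_oldest = 1;
            }
            continue;
        }
        if (errno == ENOENT) return i; /* Free slot; `path` already names it. */
    }

    /* Every slot is taken: reuse the oldest one. */
    int n = snprintf(path, path_size, "%s/tombstone_%02d", dir, oldest);
    if (n < 0 || (size_t)n >= path_size) return -1;
    return oldest;
}
```

A caller would then open the returned path for writing, truncating it when the slot is being reused. Comparing modification times rather than slot numbers is what makes the numbering wrap around once all 10 files exist.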

dump_backtrace

    void dump_backtrace(int fd, pid_t pid, pid_t tid, bool* detach_failed,
            int* total_sleep_time_usec) {
        log_t log;
        log.tfd = fd;
        log.quiet = true;

        ptrace_context_t* context = load_ptrace_context(tid);
        dump_process_header(&log, pid);
        /* Dump the requested thread first. */
        dump_thread(&log, tid, context, true, detach_failed, total_sleep_time_usec);

        /* Then walk /proc/<pid>/task and dump every other thread. */
        char task_path[64];
        snprintf(task_path, sizeof(task_path), "/proc/%d/task", pid);
        DIR* d = opendir(task_path);
        if (d) {
            struct dirent debuf;
            struct dirent *de;
            while (!readdir_r(d, &debuf, &de) && de) {
                if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, "..")) {
                    continue;
                }

                char* end;
                pid_t new_tid = strtoul(de->d_name, &end, 10);
                if (*end || new_tid == tid) {
                    continue;
                }

                dump_thread(&log, new_tid, context, false, detach_failed, total_sleep_time_usec);
            }
            closedir(d);
        }

        dump_process_footer(&log, pid);
        free_ptrace_context(context);
    }

☞  dump_process_header(&log, pid);

☞  dump_thread(&log, tid, context, true, detach_failed, total_sleep_time_usec);

☞ dump_process_footer(&log, pid);       
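The thread enumeration inside dump_backtrace relies on the Linux convention that every thread of a process appears as a numeric directory under /proc/<pid>/task. A minimal sketch of just that step, assuming a Linux /proc filesystem, is shown here. The names list_thread_ids and own_thread_count are hypothetical, and readdir is used in place of readdir_r, which glibc has since deprecated.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: collects up to `max` thread ids of process `pid`
 * from /proc/<pid>/task. Returns the number of ids stored in `tids`,
 * or -1 if the task directory cannot be opened. */
int list_thread_ids(pid_t pid, pid_t* tids, int max) {
    char task_path[64];
    snprintf(task_path, sizeof(task_path), "/proc/%d/task", (int)pid);

    DIR* d = opendir(task_path);
    if (!d) return -1;

    int count = 0;
    struct dirent* de;
    while (count < max && (de = readdir(d)) != NULL) {
        char* end;
        long tid = strtol(de->d_name, &end, 10);
        /* Skips ".", ".." and any entry that is not a plain positive number. */
        if (end == de->d_name || *end != '\0' || tid <= 0) continue;
        tids[count++] = (pid_t)tid;
    }
    closedir(d);
    return count;
}

/* Convenience wrapper: number of threads (up to 64) in the calling process. */
int own_thread_count(void) {
    pid_t tids[64];
    return list_thread_ids(getpid(), tids, 64);
}
```

In a single-threaded program own_thread_count() is expected to report one thread, whose id equals the process id. dump_backtrace additionally skips the id it has already dumped, which is why it compares each entry against tid.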

Debugging Tools

    int main(int argc, char** argv) {
        bool dump_backtrace = false;
        bool have_tid = false;
        pid_t tid = 0;
        for (int i = 1; i < argc; i++) {
            if (!strcmp(argv[i], "-b")) {
                dump_backtrace = true;
            } else if (!have_tid) {
                tid = atoi(argv[i]);
                have_tid = true;
            } else {
                usage();
                return 1;
            }
        }
        if (!have_tid) {
            usage();
            return 1;
        }
        return do_explicit_dump(tid, dump_backtrace);
    }

The do_explicit_dump function requests a dump of the specified task. With -b it prints the backtrace to stdout; without it, it asks for a tombstone to be written and prints the tombstone path.

    static int do_explicit_dump(pid_t tid, bool dump_backtrace) {
        fprintf(stdout, "Sending request to dump task %d.\n", tid);

        if (dump_backtrace) {
            fflush(stdout);
            if (dump_backtrace_to_file(tid, fileno(stdout)) < 0) {
                fputs("Error dumping backtrace.\n", stderr);
                return 1;
            }
        } else {
            char tombstone_path[PATH_MAX];
            if (dump_tombstone(tid, tombstone_path, sizeof(tombstone_path)) < 0) {
                fputs("Error dumping tombstone.\n", stderr);
                return 1;
            }
            fprintf(stderr, "Tombstone written to: %s\n", tombstone_path);
        }
        return 0;
    }

☞  dump_backtrace_to_file(tid, fileno(stdout))

☞  dump_tombstone(tid, tombstone_path, sizeof(tombstone_path))
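Both calls end up sending a tid and an action to the debuggerd daemon over its socket. The exchange can be sketched with a socketpair standing in for the real debuggerd socket. Everything named below (dump_action_t, dump_request_t, the three functions, and the two-field layout) is an assumption made for illustration and may not match the actual wire format.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed action codes, modeled on the three client types in the text. */
typedef enum {
    ACTION_CRASH = 0,
    ACTION_DUMP_TOMBSTONE = 1,
    ACTION_DUMP_BACKTRACE = 2
} dump_action_t;

/* Assumed request layout: just the action and the target thread id. */
typedef struct {
    int32_t action;
    int32_t tid;
} dump_request_t;

/* Client side: writes one request. Returns 0 on success, -1 on error. */
int send_dump_request(int fd, dump_action_t action, pid_t tid) {
    dump_request_t req = { (int32_t)action, (int32_t)tid };
    ssize_t n = write(fd, &req, sizeof(req));
    return n == (ssize_t)sizeof(req) ? 0 : -1;
}

/* Server side: reads one request and rejects short reads, unknown
 * actions, and non-positive thread ids. */
int read_dump_request(int fd, dump_request_t* out) {
    ssize_t n = read(fd, out, sizeof(*out));
    if (n != (ssize_t)sizeof(*out)) return -1;
    if (out->action < ACTION_CRASH || out->action > ACTION_DUMP_BACKTRACE) return -1;
    if (out->tid <= 0) return -1;
    return 0;
}

/* Sends a request through a local socketpair and reads it back. */
int request_round_trip(dump_action_t action, pid_t tid, dump_request_t* out) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    int rc = send_dump_request(sv[0], action, tid);
    if (rc == 0) rc = read_dump_request(sv[1], out);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

In this sketch, running the tool with -b would correspond to ACTION_DUMP_BACKTRACE and running it without -b to ACTION_DUMP_TOMBSTONE. Validating the request on the server side matters because any local process can connect to the socket.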

Keywords: socket Linker Android shell

Added by kfresh on Sun, 30 Jun 2019 04:57:55 +0300