Saturday, April 11, 2015

Getting the stack trace programmatically in C

When an application runs for long periods on remote systems and then crashes, there is often little information about the crash: no core dump and no useful log. In such cases it helps to capture and log the stack trace at the moment of the crash, which gives more information for solving the problem.
Getting some information about the stack trace is possible using C, as we will see below.

A link to a GitHub repo containing the full example code is at the bottom of the page, just in case you need to quickly copy-paste the code.

The first thing we need to do in order to be notified when the program crashes is to register a handler for the SIGSEGV signal. We use the sigaction interface rather than the traditional signal call because the handler needs to receive the sigcontext structure as a parameter:
// register for SIGSEGV
struct sigaction sa;
memset(&sa, 0, sizeof(sa));   // do not leave any field uninitialized
sigemptyset(&sa.sa_mask);     // block no additional signals in the handler
sa.sa_handler = (__sighandler_t)sig_hup;
sa.sa_flags = SA_RESTART;
sigaction(SIGSEGV, &sa, NULL);
The sig_hup function will have the following signature:
void sig_hup(int sig, struct sigcontext ctx);
The sigcontext data gives the signal handler access to the instruction pointer at the moment of the crash, which is needed to build the stack trace.
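The handler above receives the legacy sigcontext argument, which is not available in the same way on every platform. As a hedged alternative (a sketch, not the code from the repository), the handler can be registered with SA_SIGINFO, in which case it receives a siginfo_t and a pointer to a ucontext_t. The names on_signal, install_handler and last_signal are hypothetical and exist only for this illustration:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Last signal seen by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t last_signal = 0;

/* Three-argument handler form, selected by the SA_SIGINFO flag. */
static void on_signal(int sig, siginfo_t *info, void *context)
{
    (void)info;     /* info->si_addr holds the faulting address for SIGSEGV */
    (void)context;  /* points to a ucontext_t with the saved registers */
    last_signal = sig;
    if (sig == SIGSEGV)
        _exit(EXIT_FAILURE);  /* returning would re-run the faulting instruction */
}

/* Hypothetical helper: installs on_signal for signo; returns 0 on success. */
int install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_signal;   /* sa_sigaction, not sa_handler */
    sa.sa_flags = SA_SIGINFO;
    return sigaction(signo, &sa, NULL);
}
```

Calling install_handler(SIGSEGV) once at startup would be the typical use. A real crash handler would log the trace before calling _exit.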

Once SIGSEGV is handled we need to call the backtrace function to get the array of addresses for the function calls currently active in the program:
static const int traceSize = 16;
void *trace[traceSize];
int trace_size = backtrace(trace, traceSize);
Then we need to overwrite one entry of the trace with the instruction pointer taken from the signal's context. This replaces the caller's address, so that the trace points at the instruction that crashed instead of at the code that prints the stack trace:
trace[1] = (void *)ctx.rip;
The name of the instruction pointer field depends on the architecture: eip for x86, rip for x64, arm_pc for ARM, etc.
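The per-architecture difference can be hidden behind one small function. The sketch below assumes a handler registered with SA_SIGINFO, whose third argument points to a ucontext_t, and assumes glibc's field names on Linux. context_pc is a hypothetical helper, not part of any library:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes REG_RIP / REG_EIP in glibc's <ucontext.h> */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <ucontext.h>

/* Hypothetical helper: returns the saved program counter from a
 * ucontext_t, or NULL on architectures this sketch does not cover. */
void *context_pc(const ucontext_t *uc)
{
#if defined(__x86_64__) && defined(REG_RIP)
    return (void *)(uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__) && defined(REG_EIP)
    return (void *)(uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#elif defined(__aarch64__)
    return (void *)(uintptr_t)uc->uc_mcontext.pc;
#elif defined(__arm__)
    return (void *)(uintptr_t)uc->uc_mcontext.arm_pc;
#else
    (void)uc;
    return NULL;
#endif
}
```

Inside such a handler, the overwrite step would then read trace[1] = context_pc((const ucontext_t *)context), guarded by a NULL check.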

After we have the backtrace we need to call backtrace_symbols in order to translate the addresses into an array of strings which describe them symbolically:
char **messages = (char **)NULL;
messages = backtrace_symbols(trace, trace_size);
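The array returned by backtrace_symbols is allocated with malloc and has to be released by the caller with a single free. A minimal sketch of printing and releasing it follows; print_trace_symbols is a hypothetical name, and skipping frame 0 is an assumption made to match the numbering of the output shown in this post:

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints one "[bt] #i ..." line per frame, starting
 * at frame 1. Returns the number of frame lines written, or -1 if the
 * symbol array could not be allocated. */
int print_trace_symbols(void *const *trace, int trace_size, FILE *out)
{
    char **messages = backtrace_symbols(trace, trace_size);
    if (messages == NULL)
        return -1;

    int printed = 0;
    fprintf(out, "[bt] Execution path:\n");
    for (int i = 1; i < trace_size; ++i) {
        fprintf(out, "[bt] #%d %s\n", i, messages[i]);
        ++printed;
    }

    free(messages);   /* one free releases the whole array */
    return printed;
}
```

Because backtrace_symbols allocates memory, it is not strictly safe inside a signal handler. glibc also provides backtrace_symbols_fd(trace, trace_size, fd), which writes the same strings directly to a file descriptor without calling malloc.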
At this point our call stack will look like this:
[bt] Execution path:
[bt] #1 [(nil)]
[bt] #2 /lib64/ [0x3d06e34950]
[bt] #3 ./testc() [0x400bcd]
[bt] #4 ./testc(main+0x23) [0x400bf5]
[bt] #5 /lib64/ [0x3d06e1ffe0]
[bt] #6 ./testc() [0x4009f9]
It gives some information, but no file and line number. We can get more information by using the addr2line utility present on Linux systems:
addr2line <address> -e <executable name>
Note: the executable must be compiled without optimizations (-O0) and with debug symbols (-g).

The call to addr2line may be done programmatically like so:
static const int buflen = 1024;
char syscom[buflen];
snprintf(syscom, buflen, "addr2line %p -e %s", trace[i], exePath);
FILE *f = popen(syscom, "r");
if (f != NULL) {
    char buffer[buflen];
    memset(buffer, 0, sizeof(buffer));
    // addr2line prints "file:line" for the given address
    while (fgets(buffer, sizeof(buffer), f) != NULL)
        printf("%s", buffer);
    pclose(f);   // a stream opened with popen must be closed with pclose
}
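The snippet above uses an exePath variable without showing where it comes from. One possible way to obtain it on Linux is to read the /proc/self/exe symlink. The sketch below shows this, along with building the command with a truncation check. Both helper names are hypothetical, and the single quotes around the path assume it contains no single quote itself:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resolves the running executable through
 * /proc/self/exe (Linux only). Returns the path length, or -1. */
ssize_t current_exe_path(char *buf, size_t len)
{
    if (len == 0)
        return -1;
    ssize_t n = readlink("/proc/self/exe", buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';   /* readlink does not terminate the string */
    return n;
}

/* Hypothetical helper: builds the addr2line command line.
 * Returns 0 on success, -1 if the command did not fit in cmd. */
int build_addr2line_cmd(char *cmd, size_t len, const char *exe, const void *addr)
{
    int n = snprintf(cmd, len, "addr2line -e '%s' 0x%" PRIxPTR,
                     exe, (uintptr_t)addr);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

The resulting string can be passed to popen as before. Note that for position-independent executables the runtime addresses are offset by the load base, so they would need adjusting before addr2line can resolve them.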
At the end we'll get a nice stack trace printed which would look like this:
[bt] Execution path:
[bt] #1 [(nil)]
[bt] #2 /lib64/ [0x3d06e34950]??:0
[bt] #3 ./testc() [0x400cad]/home/alex/workarea/c-stacktrace/testc/testc.c:33
[bt] #4 ./testc(main+0x23) [0x400cd5]/home/alex/workarea/c-stacktrace/testc/testc.c:43
[bt] #5 /lib64/ [0x3d06e1ffe0]??:0
[bt] #6 ./testc() [0x400ad9]??:?

With this output in the logs, debugging program crashes can be easier.

The full example and library-like packaging can be found at the following link:
