EE Times-Asia

How to catch elusive bugs without using a debugger

Posted: 16 Sep 2014

Keywords: software bugs, hardware debugger, debugging, wireless sensor networks, GCC compiler

Most software bugs in deeply embedded systems can be debugged with a hardware debugger. However, there are times when a problem may be so elusive that it appears only after prolonged code execution or complex interactions with other nodes in the system. Depending on the system design, it may not always be practical to attach a hardware debugger, so alternative debugging methods must be employed.

This article discusses such problems and presents a software technique that captures the call stack in real time and uses the stack dump taken from the embedded system at the point of failure. It also discusses a Python-based tool that matches the contents of the stack against the disassembler output to recover the full function call stack and, eventually, to find the point of failure.

For this article we will focus on debugging wireless sensor networks, but the same techniques can be applied to any system with a large number of distributed devices, or even a single device, that cannot be run under the debugger.

One of the major problems with debugging networks of devices is that the behaviour of each individual device depends on the behaviour of the surrounding nodes and on the amount of traffic being exchanged. This makes it impractical to debug such systems at a small scale, because the failure may appear only in a full-size network.

The method presented here is based on the analysis of the stack contents of the failed node and matching the contents to the disassembly of the application binary in order to recover the function call stack. A watchdog timer is used to detect an infinite loop condition and to trigger saving of the stack contents.

While the idea behind this method is generic, different MCUs and compilers handle stacks in slightly different ways, so the tools shown here will have to be modified for a specific target system. For this article we will focus on Atmel's AVR MCUs, in particular the ATmega128RFA1 and ATmega256RFR2, and the GCC compiler.

To use the method proposed here effectively, we need to understand how compilers use stacks to store the return addresses of called functions. Fundamentally, there are two types of data that have to be stored temporarily on the stack: the local variables of the called function and the return address into the calling function. Some compilers use two separate stacks, which makes it easier to recover the call stack, since the return addresses are stored in one contiguous memory region.

The GCC compiler uses the same stack to store both return addresses and local variables. This makes it harder to parse the stack without deep analysis of how each function operates. Fortunately, it is possible to recover most of the useful information using a brute-force method. Instead of trying to recover return addresses from known locations, we go over every sequence of bytes of the appropriate length on the stack and check whether it could be a return address. This method will occasionally give false positives by interpreting data on the stack as a valid return address, but these are usually easy to spot in the full output.
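The brute-force scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's actual tool: the function name `find_return_candidates` is hypothetical, and the sketch assumes that return addresses are stored on the stack as big-endian word addresses, 2 bytes wide (as on AVR devices with 128 KB of flash or less; larger devices such as the ATmega256RFR2 would likely need `addr_size=3`). Verify both assumptions against your target before relying on the output.

```python
def find_return_candidates(stack, return_addrs, addr_size=2):
    """Slide a window over every byte offset of the stack dump and report
    the offsets whose contents match a known return address.

    stack        -- bytes object holding the raw stack dump
    return_addrs -- set (or dict keys) of valid return addresses, in bytes,
                    taken from the disassembly listing
    addr_size    -- assumed width of a stored return address, in bytes
    """
    hits = []
    for offset in range(len(stack) - addr_size + 1):
        # Assumption: the address is stored high byte first.
        word_addr = int.from_bytes(stack[offset:offset + addr_size], "big")
        # Assumption: the stack holds word addresses, while the listing
        # uses byte addresses, so convert before comparing.
        byte_addr = word_addr * 2
        if byte_addr in return_addrs:
            hits.append((offset, byte_addr))
    return hits


if __name__ == "__main__":
    # Made-up stack dump and return addresses, for illustration only.
    dump = bytes.fromhex("0012010a34ff0095")
    known = {0x214, 0x12A}
    for offset, addr in find_return_candidates(dump, known):
        print(f"stack offset {offset}: possible return to 0x{addr:x}")
```

Because every offset is tested, local variables that happen to look like a return address will also be reported; these are the false positives mentioned above.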

To identify all possible function calls stored on the stack, we need to find all call instructions in the disassembly listing. The disassembly listing can be obtained from the ELF file using the avr-objdump utility. The example output will have the format shown below.

Here, the first line shows the address and the name of the function. The following lines show detailed information about the instructions that make up the function. The first column contains the address of the instruction (in bytes), the second column contains the opcode of the instruction, and the last column shows the instruction mnemonic with an optional comment. In this output we should look for all variants of the call instruction. For each call instruction, we need to note the instruction address and the size of the opcode; their sum is the return address. Alternatively, we can note the address of the next instruction.
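As a rough sketch of this step, the Python fragment below walks a disassembly listing and records, for each call instruction, the address of the instruction that follows it. The function name `collect_return_addresses`, the regular expressions, and the sample listing (including the symbol names `app_task`, `radio_send` and `led_toggle`) are all hypothetical; the listing only imitates the column layout described above, and real avr-objdump output may need a more tolerant parser. The set of mnemonics assumes the AVR call variants are `call`, `rcall`, `icall` and `eicall`.

```python
import re

# Assumed set of AVR call mnemonics; adjust for your target core.
CALL_MNEMONICS = {"call", "rcall", "icall", "eicall"}

# "00000128 <app_task>:" -> function header line
FUNC_RE = re.compile(r"^([0-9a-f]+) <(.+)>:$")
# " 128:  0e 94 a0 00   call 0x140" -> address, opcode bytes, mnemonic
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2} )+)\s*(\S+)")


def collect_return_addresses(listing):
    """Map each return address (in bytes) to the function containing the call."""
    returns = {}
    current = None
    for line in listing.splitlines():
        m = FUNC_RE.match(line)
        if m:
            current = m.group(2)
            continue
        m = INSN_RE.match(line)
        if m and m.group(3) in CALL_MNEMONICS:
            addr = int(m.group(1), 16)
            size = len(m.group(2).split())  # opcode size in bytes
            returns[addr + size] = current
    return returns


# Hypothetical listing, for illustration only.
SAMPLE = (
    "00000128 <app_task>:\n"
    "     128:\t0e 94 a0 00 \tcall\t0x140\t; 0x140 <radio_send>\n"
    "     12c:\t03 d0       \trcall\t.+6      \t; 0x134 <led_toggle>\n"
    "     12e:\t08 95       \tret\n"
)

if __name__ == "__main__":
    for addr, func in sorted(collect_return_addresses(SAMPLE).items()):
        print(f"0x{addr:x} returns into {func}")
```

The resulting table of return addresses is what the stack contents are later matched against.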

A full set of call instructions for the AVR core is listed in the table below.
