Stacktool is no longer maintained; please read about its
successor here.
Using Static Analysis to Bound Stack Depth
Shortcuts:
Stack overflow is what happens when an execution stack smashes memory
that is being used for other purposes. This is rare for programs
running in a process under a general-purpose operating system since
these OSs use the MMU to provide transparent dynamic expansion of the
stack. However, stack overflow is a common source of crashes in
embedded systems that have relatively little RAM and often lack an
MMU. Stack overflow is difficult to diagnose because the worst-case
stack size is typically encountered only rarely, for example when
several interrupts happen to be signaled at almost the same
time.
The typical ad hoc method for choosing stack sizes is to add a fudge
factor to the largest stack depth seen during testing. This is
potentially wasteful, since the fudge factor may be too large, and
also unsafe, since there is no guarantee that it is large
enough.
It is possible to statically and accurately bound
the size of the execution stack for embedded programs that were
compiled from moderately large (~30 Kloc) C programs. The technique
that makes this possible is context-sensitive abstract interpretation
of machine code. Accurate bounds on stack depth cannot be computed
from source code for a number of reasons:
-
The compiler has considerable freedom when performing the translation.
-
Source code for libraries and the RTOS is often not available.
-
Inlined assembly language is common in embedded software, and a sound
analysis must account for its effects.
Furthermore, whole-program analysis across compilation units is
inconvenient -- it's much easier to let the linker do its job and then
analyze a program that has no unresolved references.
For simple embedded systems, analyzing stack size is almost trivial.
For example, I spent a few hours writing a Perl program less than 350
lines long (available for download below) that
can analyze embedded programs for AVR microcontrollers that are
structured as an infinite main loop plus interrupts. For example, it
does fine on the onboard control program flybywire.elf from the Autopilot project.
On the other hand, analyzing more sophisticated systems is a
research-grade problem because it can be difficult to reconstruct an
accurate preemption graph for interrupts, and because systems often
contain features that can complicate or defeat static analysis:
-
indirect calls and jumps, for example from virtual function
tables, exceptions, device driver interfaces, event loops,
callbacks, and context switches
-
loads into the stack pointer, for example to allocate or free
a large function-scope data structure, to perform a context switch,
or to implement alloca()
-
recursion, fortunately not that common in embedded software
-
reentrant interrupts, which are unfortunately all too common
-
self-modifying code, which can at least be detected in embedded
systems that have a Harvard architecture
This research has resulted in a paper that appeared
in EMSOFT 2003 and a (more readable) article called Say
No to Stack Overflow that appeared in the October 2004 issue of
Embedded Systems Programming. It was the cover story:
This research has also resulted in the stacktool, which implements our
analysis for programs that run on Atmel AVR microcontrollers.
The three different approaches
to bounding stack depth that we looked at are:
-
Adding up the individual stack requirements of main() and all
interrupts, a simple approach that tends to overestimate stack depth.
-
A whole-program analysis of stack depth taking interrupt masks into
account, where these masks were computed using a context insensitive
analysis. This approach does better, but procedural aliasing hurts
its accuracy.
-
A whole-program analysis where interrupts masks were computed using a
context sensitive analysis. For the set of programs that we looked at
this analysis could successfully identify a concrete value for the
interrupt mask at almost all points in each test program. This means
that very little extra precision could be gained by moving to a more
powerful analysis (such as one that models the store in addition to
the registers).
Here's a graph (png, jpg) showing the differences in accuracy
between the three approaches when applied to a bunch of programs
written on TinyOS, an executive for wireless sensor network nodes.
Also, we used the stacktool to drive a whole-program function inliner.
This can significantly improve the stack memory utilization of
embedded programs when compared to simpler inlining policies such as:
-
no inlining,
-
maximal inlining,
-
a standard whole-program inlining heuristic designed to produce a
reasonable balance between performance and code size.
The EMSOFT paper contains more details about the inlining experiments.
Our current work has three main thrusts:
-
The "stacktool" is due for a name change as it has become a fairly
general-purpose analyzer for object codes. For example, our current
internal version (as of July 2004) supports pretty good worst-case
execution time analysis. Pointer analysis is in the works as well.
-
We are increasing the precision of our bitwise analysis, and
supporting more abstract domains, using operations generated by Hoist
-
We are increasing the size of the set of programs whose stack depth
can be successfully analyzed. Current rough spots are indirect
control flow, recursion, fancy manipulation of the stack pointer, and
fancy manipulation of the interrupt mask.
An snapshot of the stacktool is available here:
Please send us bug reports,
feature requests, experience reports, etc.
This is supported research software! It is known to work fine on
Linux, FreeBSD, and Cygwin. It properly analyzes most of the TinyOS
example kernels although it still fails on a few of them (but never
silently, as far as we know). We know how to solve the problems that
lead to analysis failure and are actively working on fixing
them!
Two versions of the stlite perl script are available:
Analyzing binary programs is hard, and stacktool sometimes needs some
help.
Question:
What do I do if I get an error like this?
oops: stack top 10fd 10fd
10db 10dd
at 7122
FATAL ERROR: at line 589 of file src/dataflow.cpp
Aborted
Answer: First, this uninformative error report has been improved
in the CVS version, which will appear on the web presently.
The problem here is that at address 0x7122 in your
program, stacktool cannot reliably determine the value stored in the
stack pointer. You will need to dump the asm for your program
(i.e., by typing avr-objdump -d foo.elf) and look at the address
where the problem occurred.
If the code at this position is in a gcc intrinsic function
such as __divmodhi4_exit, then you will need to build your
avr-gcc after applying this patch.
The patch simply cleans up a bit of excessive ugliness in the intrinsic
code. If the problem was not in an intrinsic, or if the patch does
not solve the problem, then drop John Regehr a line, including your
TinyOS elf file and app.c as attachments.
To build your own avr-gcc, first fetch the AVR gcc, binutils, and libc from here.
Then run commands like these:
tar zxvf avr-binutils-2.15tinyos.tgz
cd binutils-2.16.1
./configure --target=avr
make
make install
tar zxvf avr-gcc-3.4.3.tgz
cd avr-gcc-3.4.3
patch -p1 < patch-for-avr-gcc-3.3tinyos
./configure --disable-nls --enable-languages="c" --target=avr
make
make install
tar zxvf avr-libc-1.2.3.tar.gz
cd avr-libc-1.2.3
./doconf
./domake
./domake install
It is important that if you use the
"prefix" option to configure, that
all three programs are installed with the same prefix.
Question:
What do I do if I get an error like this?
probable missing stack macro-insn
Answer: The short answer is that this is probably easy to fix, please
mail your TinyOS elf file
to John Regehr. The long answer is that the AVR's non-atomic stack pointer
accesses create major problems for stacktool, which falls back on
pattern-matching
the compiler output. This works because avr-gcc has only a few idioms
for stack pointer manipulation.
Question:
What do I do if I get an error like this?
OOPS -- found icall at xxx with inconclusive arguments
Answer:
The most likely reason is that gcc turned a switch statement into a jump
table, which stacktool cannot currently resolve. Solving this problem from
first principles is hard and a pattern-matching solution seems too ugly,
so jump tables are unlikely to be supported in the near future. The workaround
is to give the -mno-tablejump option to avr-gcc.
Question:
What do I do if I get an error that is not mentioned on this page?
Answer:
Please mail two things to John Regehr: the elf version of your TinyOS app,
and the app.c file that nesC places into the build directory. I'll take
a look and see if it's feasible to support your application.
Question:
Will stacktool ever support msp430 or ARM?
Answer:
This would be great! It would be a considerable
amount of work. I don't have time.
John Regehr, October 2004