Many of the folks that come to this blog by way of search engines do so because they are having problems with stack overflow. I’ve already given my take on the likely causes of a stack overflow. Today I’d like to offer some hints on a related topic – how to set about computing the stack size for your application. This is an extremely difficult problem, which can be approached in one of three ways – experimentally, analytically or randomly. The latter is by far the most common technique, which consists essentially of choosing a number and seeing whether it works! In an effort to reduce the use of the random approach, I’ll try and summarize the other two methods.
In the experimental method, a typically very large stack size is selected. The stack is then filled with an arbitrary bit pattern such as 0xFC. The code is then executed for a ‘suitable’ amount of time, and then the stack is examined to see how far it has grown. Most people will typically take this number, add a safety margin and call it the required stack size. The main advantage of this approach is that it’s easy to do (indeed many good debuggers have this feature built in to them). It also has the advantage of being ‘experimental data’. However, there are two big problems with this approach, which will catch the unwary.
The biggest single problem with the experimental approach is the implicit assumption that the experiment that is run is representative of the worst case conditions. What do I mean by the worst case conditions? Well, the maximum stack size occurs in an embedded system when an interrupt that uses the most stack size occurs at a point in the code that the foreground application is also using the maximum stack size. On the assumption that most interrupts are asynchronous to the foreground application, the problem should be clear. How exactly do you know after your testing whether or not the interrupt that uses the most stack size did indeed trigger at the worst (best?) possible moment? Thus even if your testing had 100% code coverage, it still isn’t possible to know for sure whether you have covered all possible scenarios. If, as is the normal case, you don’t even begin to approach full code coverage, it should be clear to you that testing tends to reveal the typical ‘worst-case’ condition, rather than the genuine worst case condition.
The second major issue with testing is that it tends to be done when the code is close to being completed, rather than when it is completed. The problem is that small changes in the source code can have a huge impact on the required stack size. For example, let’s say that during testing it is discovered that an interrupt service routine is taking so long to complete that another interrupt is being occasionally missed. A ‘quick fix’ is to simply enable interrupts in the long interrupt handler, so that the other interrupts can do their thing. This one line change can lead to a dramatic increase in stack usage. (If you aren’t cognizant of the stack usage of interrupt handlers, you should read this article I wrote).
In the analytical approach, the idea is to examine the source code and from the analysis work out the maximum stack usage of the foreground application, and then to add to this the worst case interrupt handler usage. This is obviously a daunting task for anything but the simplest of applications. You will not be surprised to hear that computer programs have been written to perform this analysis. Indeed good quality linkers will now do this for you as a matter of course. Furthermore, my favorite third party tool, PC-Lint from Gimpel, will also now do this starting with version 9. However be warned that it takes a lot of work to set up PC-Lint to perform the analysis.
Although analysis can theoretically give an accurate answer, it does have several problems.
It’s almost impossible for an analytical approach to compute the stack usage of a program that uses recursion. Indeed it’s because of the unbounded effect on stack size that recursion is a really bad idea in embedded systems. Indeed MISRA bans it, and I personally banned it about twenty years ago.
Indirect Function Calls
Pointers to functions are something that I use extensively and heartily recommend (for a discussion see this article I wrote). Although they don’t have a deleterious effect on stack size, they do make it quite difficult for analysis programs to track what is going on. Indeed PC-Lint cannot handle pointers to functions when it comes to computing stack usage. Thus if you use an analytical approach and you use pointers to functions, then make absolutely sure that the analysis program can track all the indirect calls.
Code optimization can play havoc with the stack usage. Some optimizations reduce stack usage (by e.g. placing function parameters in registers), while others can increase stack usage. I should note that it’s only third party tools that should be bamboozled by the optimizer. The linker that makes up part of the compilation package should be aware of everything that the compiler has done.
Even if you have a linker that will compute stack usage, interpreting the output of the linker is always a daunting task. For example, the linker from IAR will compute your stack usage. However, it isn’t nice enough to simply say: You need 279 bytes of stack space. Instead you have to study the linker output carefully to glean the requisite information.
A Practical Approach
It’s clear from the above that it isn’t easy to determine the stack size for an application. So how exactly does one set about this in practice? Well here’s what I do.
- Locate the stack at the beginning / end of memory (depending upon how the stack grows) and place all variables at the other end of the memory. This essentially means that you are implicitly allowing the maximum amount of memory possible for the stack. Note that many good compilers / linkers will do this automatically for you.
- As a starting point, I allocate 10% of the available memory for stack use. If I know I will be using functions that are huge users of the stack (such as printf, scanf and their brethren), then I’ll typically set it to 20% of available memory.
- I set up the debug environment from day 1 to monitor and report stack usage. This way as I progress through the development process I get a very good feel for the application’s stack consumption. This also helps in spotting changes to the code that have big impacts on the required stack size.
- Once I have ‘all the code written’, I start to make use of the information in the linker report. The more tight I am on memory, the closer I examine the linker output. In particular, what I often find is that there is one and only one function call chain that leads to a stack usage that is much greater than all the other call chains. In which case, I look to see if I can restructure that call chain so as to bring the maximum stage usage more in line with the typical stack usage.
If you stumbled upon this blog courtesy of a search engine, then I hope you found the above useful. I invite you to check out some of my other posts, which you may find useful. If you are a regular reader, then as always, thanks for stopping by.