Back in July 2008 I promised that the next blog post would be on why you should use speed optimization instead of size optimization. Well, four other posts somehow got in the way, for which I apologize. Anyway, onto the post!
In “Efficient C Tips #2” I made the case for always using full optimization on your released code. Back when I was a lad, the conventional wisdom when it came to optimization was to use the following algorithm:
1. Use size optimization by default
2. For those few pieces of code that get executed the most, use speed optimization.
This algorithm was based on the common observation that most code is executed infrequently, and so in the grand scheme of things its execution time is irrelevant. Furthermore, since memory is constrained and expensive, this rarely executed code should consume as little memory as possible. At first blush, this approach seems reasonable. However, IMHO it was flawed back then and is definitely flawed now. Here is why:
1. In an embedded system, you typically are not sharing memory with other applications (unlike on a general purpose computer). Thus there are no prizes for using less than the available memory. Of course, if size optimization lets you fit the application into a smaller memory device, then use size optimization and the smaller, cheaper part. However, in my experience this rarely happens. Instead, you typically have a system that comes with, say, 32K, 64K or 128K of Flash. If your application consumes 50K with speed optimization and 40K with size optimization, then you’ll still be using the 64K part, and so size optimization has bought you nothing. Conversely, speed optimization will also cost you nothing, but your code will presumably run faster and consume less power.
2. In an interesting quirk of optimization technology, it turns out that in some cases speed optimization can result in a smaller image than size optimization! The converse, size optimization producing faster code than speed optimization, is almost never the case. See however this article that I wrote which discusses one possible exception. Thus even if you are memory constrained, try speed optimization.
3. Size optimization comes with a potentially very big downside. After a compiler has done all the usual optimizations (constant folding, strength reduction, etc.), a compiler that is set up to do size optimization will usually perform what is properly called “procedural abstraction” or code outlining (it is sometimes loosely described as “common sub-expression elimination”, although strictly speaking that name belongs to a different optimization). What this consists of is looking at the object code and identifying small blocks of assembly language that are used repeatedly throughout the application. These repeated blocks are converted into subroutines. This process can be repeated ad nauseam, such that one subroutine calls another which calls another and so on. As a result, an innocuous looking piece of C code can be translated into a call tree that nests many levels deep, and there is the rub. Although this technique can dramatically reduce code size, it comes at the price of increasing the call stack depth. Thus code that runs fine in debug mode may well suffer from a call stack overflow when you turn on size optimization. Speed optimization will not do this to you!
4. As I mentioned in “Efficient C Tips #2”, one downside of optimization is that it can rearrange instruction sequences such that the special access requirements often needed by watchdogs, EEPROM, etc. are violated. In my experience, this only happens when one uses size optimization and never with speed optimization. Note that I don’t advocate relying on this; it is, however, a bonus if you have forgotten to follow the advice I give in “Efficient C Tips #2” for these cases.
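To make the arithmetic in point 1 concrete, here is a minimal sketch in C. The function name `smallest_part_kb` and the table `part_sizes_kb` are my own illustrative names, and the 32K/64K/128K family is simply the example used above rather than any particular vendor's line-up.

```c
#include <stddef.h>

/* Flash sizes (in KB) of a hypothetical device family, smallest first.
   These are the example values from point 1, not a real part list. */
static const unsigned part_sizes_kb[] = { 32u, 64u, 128u };

/* Return the Flash size (in KB) of the smallest part that can hold an
   image of image_kb KB, or 0 if the image fits in none of the parts. */
unsigned smallest_part_kb(unsigned image_kb)
{
    const size_t count = sizeof part_sizes_kb / sizeof part_sizes_kb[0];

    for (size_t i = 0; i < count; i++) {
        if (image_kb <= part_sizes_kb[i]) {
            return part_sizes_kb[i];
        }
    }
    return 0u; /* too big for every part in the family */
}
```

With these assumed part sizes, `smallest_part_kb(50)` and `smallest_part_kb(40)` both return 64. The 10K saved by size optimization does not get you onto the 32K part, so it saves no money.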
The bottom line – speed optimization is superior to size optimization. Now I just have to get the compiler vendors to select speed optimization by default!