
Efficient C Tips #2 – Using the optimizer

Saturday, July 5th, 2008 by Nigel Jones

In my first post on “Efficient C” I talked about how to use the optimal integer data type to achieve the best possible performance. In this post, I’ll talk about using the code optimization settings in your compiler to achieve further performance gains.

I assume that if you are reading this, then you are aware that compilers have optimization settings or switches. Invoking these settings usually has a dramatic effect on the size and speed of the compiled image. Typical results that I have observed over the years are a 40% reduction in code size and a halving of execution time for fully optimized versus non-optimized code. Despite these amazing numbers, I’d say about half of the code that I see (and I see a lot) is released to the field without full optimization turned on. When I ask developers about this, I typically get one of the following explanations:

1. I forgot to turn the optimizer on.
2. The code works fine as is, so why bother optimizing it?
3. When I turned the optimizer on, the code stopped working.

The first answer is symptomatic of a developer who is just careless. I can guarantee that the released code will have a lot of problems!

The second answer on the face of it has some merit. It’s the classic “if it ain’t broke, don’t fix it” argument. However, notwithstanding that it means that your code will take longer to execute and thus almost certainly consume more energy (see my previous post on “Embedded Systems and the Environment”), it also means that there are potential problems lurking in your code. I address this issue below.

The third answer is of course the most interesting. You have a “perfectly good” piece of code that is functioning just fine, yet when you turn the optimizer on, the code stops working. Whenever this happens, the developer blames the “stupid compiler” and moves on. Well, after having this happen to me a fair number of times over my career, I’d say that the chances that the compiler is to blame are less than 1 in 10. The real culprit is normally the developer’s poor understanding of the rules of the programming language and how compilers work.

Typically when a compiler is set up to do no optimization, it generates object code for each line of source code in the order in which the code is encountered and then simply stitches the result together (for the compiler aficionados out there I know it’s more involved than this – but it serves my point). As a result, code is executed in the order in which you write it, constants are tested to see if they have changed, variables are stored to memory and then immediately loaded back into registers, invariant code is repeatedly executed within loops, all the registers in the CPU are stacked in an ISR and so on.
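
To make that concrete, here is a hypothetical fragment (the function and variable names are mine, purely for illustration). With optimization off, the product gain * offset is typically recomputed on every pass and the loop counter may be written out to memory and re-loaded each time; an optimizing compiler hoists the invariant product out of the loop and keeps the counter in a register.

    #include <stdint.h>

    #define N_SAMPLES 64U

    /* Hypothetical example: add a constant correction to a buffer of ADC
     * samples. The expression (gain * offset) does not change inside the
     * loop, so an optimizer computes it once; an unoptimized build
     * re-evaluates it on every iteration. */
    void correct_samples(uint16_t samples[N_SAMPLES], uint16_t gain, uint16_t offset)
    {
        for (uint_fast8_t i = 0U; i < N_SAMPLES; i++)
        {
            samples[i] = (uint16_t)(samples[i] + (gain * offset));
        }
    }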

Now, when the optimizer is turned on, the optimizer rearranges code execution order, looks for constant expressions, redundant stores, common sub-expressions, unused registers and so on, and eliminates everything that it perceives to be unnecessary. And therein, dear reader, lies the source of most of the problems. What the compiler perceives as unnecessary, the coder thinks is essential – and indeed is relying upon the “unnecessary” code to be executed.

So what’s to be done about this? Firstly, you have to understand what the keyword volatile means and does. Even if you think you understand volatile, go and read this article I wrote a number of years back for Embedded Systems Programming magazine. I’d say that well over half of the optimization problems out there relate to failure to use volatile correctly.
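
As a minimal sketch of the classic failure (the flag name is mine, and real ISR declaration syntax is compiler-specific), consider a flag that an interrupt sets and main-line code polls:

    #include <stdbool.h>

    /* Set by the ISR, polled by main-line code. Without the volatile
     * qualifier, an optimizer may read the flag once, cache it in a
     * register and spin forever; with it, the flag is re-read from
     * memory on every pass of the loop. */
    static volatile bool g_rx_complete = false;

    void uart_rx_isr(void)          /* hypothetical ISR */
    {
        g_rx_complete = true;
    }

    void wait_for_rx(void)
    {
        while (!g_rx_complete)
        {
            /* wait for the interrupt */
        }
        g_rx_complete = false;      /* consume the event */
    }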

The second problematic area concerns specialized protective hardware such as watchdogs. In an effort to make inadvertent modification of certain registers less likely, the CPU manufacturers insist upon a certain set of instructions being executed in order within a certain time. An optimizer can often break these specialized sequences. In that case, the best bet is to put the specialized sequence into its own function and then use the appropriate #pragma directive to disable optimization of that function.
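
The details are inevitably part-specific, but the shape of the solution is something like the sketch below. The register addresses and key values are invented, and the #pragma shown is IAR-style; other compilers spell it differently (GCC, for example, offers a per-function optimize attribute).

    #include <stdint.h>

    /* Hypothetical memory-mapped watchdog registers. */
    #define WDT_KEY_REG   (*(volatile uint8_t *)0x0060u)
    #define WDT_CTRL_REG  (*(volatile uint8_t *)0x0061u)

    /* Keep the timed unlock sequence in its own small function and turn
     * optimization off for just that function, so the instruction
     * sequence (and hence its timing) stays exactly as written. */
    #pragma optimize=none
    void wdt_configure(uint8_t timeout_bits)
    {
        WDT_KEY_REG  = 0x55u;          /* first key                        */
        WDT_KEY_REG  = 0xAAu;          /* second key: must follow promptly */
        WDT_CTRL_REG = timeout_bits;   /* window open: write control value */
    }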

Now what to do if you are absolutely sure that you are using volatile appropriately and correctly, and that specialized coding sequences have been protected as suggested, yet your code still does not work when the optimizer is turned on? The next thing to look for is software timing sequences, either explicit or implicit. The explicit timing sequences are things such as software delay loops, and are easy to spot. The implicit ones are a bit tougher and typically arise when you are doing something like bit-banging a peripheral, where the instruction cycle time implicitly acts as a setup or hold time for the hardware being addressed.
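
Both kinds show up in bit-banged drivers. Here is a hypothetical sketch (the port address, bit assignments and cycle counts are invented) of an explicit delay loop and of an implicit setup-time dependence made explicit:

    #include <stdint.h>

    #define DATA_PORT  (*(volatile uint8_t *)0x0032u)   /* hypothetical output port */
    #define BIT_DATA   0x01u
    #define BIT_CLK    0x02u

    /* Explicit timing: a crude software delay. Note that the loop counter
     * is volatile; without that, an optimizer may delete the entire loop. */
    static void short_delay(void)
    {
        for (volatile uint16_t i = 0U; i < 500U; i++)
        {
            /* burn cycles */
        }
    }

    /* Implicit timing: in the unoptimized build, the code generated between
     * the two port writes happened to satisfy the peripheral's setup time.
     * Once the optimizer tightens the sequence, that margin can vanish, so
     * the dependence is made explicit with a deliberate delay. */
    static void clock_out_bit(uint8_t bit)
    {
        DATA_PORT = (uint8_t)(bit ? BIT_DATA : 0U);              /* present data      */
        short_delay();                                           /* setup time        */
        DATA_PORT = (uint8_t)((bit ? BIT_DATA : 0U) | BIT_CLK);  /* clock rising edge */
    }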

OK, what if you’ve checked for software timing and things still don’t work? In my experience you are now into what I’ll call the “Suspect Code / Suspect Compiler (SCSC)” environment. With an SCSC problem, the chances are you’ve written some very complex, convoluted code. With this type of code, two things can happen:

1. You are working in a grey area of the language (i.e. an area where the behavior is not well specified by the standard; see the sketch after this list for an example). Your best defense against this is to use Lint from Gimpel. Lint will find all your questionable coding constructs. Once you have fixed them, you’ll probably find your optimization problems have gone away.
2. The optimizer is genuinely getting confused. Although this is regrettable, the real blame may lie with you for writing gnarly code. The bottom line in my experience is that optimizers work best on simple code. Of course, if you have written simple code and the optimizer is getting it wrong, then do everyone a favor and report it to the compiler vendor.
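
As an illustration of the first case (this fragment is mine, not from any project), the following compiles cleanly, often appears to work in an unoptimized build, and is exactly the sort of order-of-evaluation dependence Lint will flag:

    #include <stdio.h>

    int main(void)
    {
        int a[2] = { 0, 0 };
        int i = 0;

        /* 'i' is both modified and used as the array index in the same
         * expression with no intervening sequence point. The behaviour is
         * undefined, so different optimization levels are free to produce
         * different (and equally "correct") results. */
        a[i] = i++;

        printf("a[0]=%d a[1]=%d i=%d\n", a[0], a[1], i);
        return 0;
    }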

In my next post I’ll take on the size / speed dichotomy and make the case for using speed rather than size as the “usual” optimization method.


9 Responses to “Efficient C Tips #2 – Using the optimizer”

  1. kewarken says:

    You have to be careful with volatile though, since there are quite a few compiler bugs with generating the code: "Volatiles Are Miscompiled, and What to Do about It"

  2. Nigel Jones says:

    Yes there are. However IMHO the best way to tackle this is to do the following:
    1. Write simple code.
    2. If the compiler still screws it up – complain loudly to the vendor. The rest of us will thank you!

  3. ashleigh says:

    Write simple code… YES! I try and write Pascal in every language I ever use… plain, simple, easy to read, and easy to compile. The day you start writing terribly complex clever code (with ? operators inside the conditional part of an IF statement, by way of example), lots of pre and post increments, and so on, the compiler's life becomes much more difficult.

    All that said, I have actually found optimiser defects in a commercial (and very expensive) compiler on 2 occasions. The vendor gave me a year's free maintenance and support for helping to find those. Look at the generated code…

    Another thing I've found that makes a huge difference is if-then-else statements:

    x = 1;
    if (condition) { x = 2; }

    tends to generate faster, tighter code (on some machines) than:

    if (condition) { x = 2; }
    else { x = 1; }

  4. Anonymous says:

    I find compilers typically make errors because of something other than complicated statements (after all, they can't compile much of anything without a stack). They either have some sort of overflow, or they simply do something unbelievably stupid. Not to mention any names, but I'm using a compiler that apparently uses #define-like functionality to implement const types. So if by "simple" code you mean code that just doesn't use some part of the language (like const), then all will be well. Otherwise, you just have to learn what the compiler can't do.

    Mike Layton

  5. Robert Chi says:

    The “Next Tip” hyperlink takes me to the previous tip rather than the next one. Would you please fix it? By the way, I really enjoy your posts.

  6. […] take this to mean that the code probably doesn’t work when optimization is turned on. I have written about this in the past and consider this to be a major indicator of potential code quality […]

  7. Aba Sumayya says:

    Speed is determined by having fewer execution cycles, right? But after I turned on optimization for speed, the execution cycle count for one specific function is higher compared to before. I'm using the Green Hills compiler for PowerPC. Any comment?

    • Nigel Jones says:

      Yes. Speed optimization means that the compiler uses various algorithms that usually result in faster execution. However it isn’t guaranteed! For example, let’s say you have a switch statement. For most code most of the time the default case is expected to never or rarely be executed, and thus the speed optimizer may proceed on this assumption. However if your code is structured such that the default case is the one most usually taken, then suddenly speed optimization has made things worse. This concept is closely allied with whether optimization is designed to improve the average execution time or the worst-case execution time. As a final note, my experience with Green Hills is that their compiler is not exactly the best out there, particularly given its price. If you have the opportunity or inclination, try it with another compiler.
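
      As a contrived sketch of what I mean (the command values and handlers are invented), if the default case is the hot path then a speed optimizer that favours the explicit cases can slow the typical call down:

      #include <stdint.h>

      /* Hypothetical command handlers. */
      static void handle_read(void)  { /* ... */ }
      static void handle_write(void) { /* ... */ }
      static void handle_idle(void)  { /* ... */ }

      void dispatch(uint8_t cmd)
      {
          switch (cmd)
          {
              case 0x01u: handle_read();  break;
              case 0x02u: handle_write(); break;
              default:    handle_idle();  break;   /* the hot path in this system */
          }
      }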
