Archive for the ‘Coding Standards’ Category

Setting a bad example – part 1

Wednesday, August 4th, 2010 Nigel Jones

About a month ago I bought an ARM evaluation board from my favorite compiler vendor, IAR Systems. Like most people I bought the board in part because it comes with a lot of example code showing how to configure the peripherals on the NXP LPC1768. Upon receiving the board, I hooked it up and started browsing the example code that came with it. To my dismay I discovered a plethora of what I consider to be shoddy coding practices. Now I am well aware that one person’s shoddy practices are another person’s standard operating procedure, and so I intend to steer clear of areas that are just a matter of taste. Notwithstanding this, over the next few posts I will describe some of the things I found in the code – and more importantly why I think they are shoddy.

Before I describe the first example, I will comment on why I’m doing this. Three years ago, almost to the day, I wrote about the lousy code that hardware manufacturers supply as part of their application notes. When I wrote the post I had in the back of my mind the idea that hardware manufacturers aren’t really interested in code, and so what should one expect? However, when it comes to compiler vendors, what possible excuse do they have? To put it another way, where should engineers go to look for great examples of how to write embedded systems code if not the example code provided by compiler vendors? Do they not have a responsibility to publish code that uses best practices and sets a great example? Furthermore, even if they contracted the code out to some third party to write, I still think it is incumbent upon them to demand (and check) that the code is indeed the sort of code that they want to put their name on.

With all that being said, I will start with an example that is not particularly egregious!

Right associativity of the = operator

One of the less appreciated aspects of the C programming language is that the assignment (=) operator associates right to left. As a result, the following is perfectly legal code:

uint16_t a,b,c;

a = b = c = 6;

This code has the effect of setting all three variables to the value of 6. With that as a preamble, I found that the author of the evaluation board code loves this construct – even when it is inappropriate. For example consider this excerpt.

/*************************************************************************
 * Function Name: GpioInit
 * Parameters: void
 * Return: void
 *
 * Description: Reset all GPIO pins to default: primary function
 *
 *************************************************************************/
void GpioInit(void)
{
 // Set to inputs
 FIO0DIR = \
 FIO1DIR = \
 FIO2DIR = \
 FIO3DIR = \
 FIO4DIR = 0;

 // clear mask registers
 FIO0MASK =\
 FIO1MASK =\
 FIO2MASK =\
 FIO3MASK =\
 FIO4MASK = 0;

 // Reset all GPIO pins to default primary function
 PINSEL0 =\
 PINSEL1 =\
 PINSEL2 =\
 PINSEL3 =\
 PINSEL4 =\
 PINSEL7 =\
 PINSEL8 =\
 PINSEL9 =\
 PINSEL10 = 0;
}

So why don’t I like this code? Well, I have several complaints. The main problem is that, in my opinion, one should only make use of the right associative property of the assignment operator when it is essential that multiple variables have the same starting value. That is, if I have a piece of code that can only possibly work if two variables start with the same value, then I will make use of this construct. For example:

uint16_t foo, bar;
foo = bar = 21;     /* foo and bar must be equal to each other and start at 21 */

If instead it is merely coincidental that two variables start with the same value, then I will put their assignment on separate lines.

uint16_t foo, bar;
foo = 14;
bar = 14;

In the IAR code there is no requirement that the various registers be locked together with a starting value.

My second complaint is that because there is no requirement that the various registers be locked together, this code is now a pain to maintain. For example, consider the case where it became necessary to change the value of FIO2DIR to 0x12345678. One would now have to do this:

// Set to inputs
 FIO0DIR = \
 FIO1DIR = 0;
 FIO2DIR = 0x12345678;
 FIO3DIR = \
 FIO4DIR = 0;

In other words I would have to edit an unrelated line (FIO1DIR) – a recipe for disaster. To drive the point home, consider the case where rather than having to edit FIO1DIR I wanted to change just FIO4DIR. I would say that the likelihood of someone doing this is very high:

 // Set to inputs
 FIO0DIR = \
 FIO1DIR = \
 FIO2DIR = \
 FIO3DIR = \
 FIO4DIR = 0x12345678;

In other words they might not notice the line continuation operator and simply change the value of the register they are interested in. The code would compile; the results would be disastrous.

While I am at it, I will opine on the line continuation operator. Although it exists for good reason, it almost never leads to easy-to-read, easy-to-maintain code. In short, I never use it unless I absolutely have no choice. To use it in a casual manner such as this is not good.

As a final point, I am sure that many people who purchase the evaluation board are quite inexperienced. The GpioInit function excerpted above makes unnecessary use of two advanced constructs (right associativity and the line continuation character). Furthermore, the use of these constructs doesn’t make the code better – it makes it worse! I am quite sure that a novice C programmer would look at this code and be baffled as to what is going on – possibly triggering a call to IAR technical support.

Evaluating embedded code

Sunday, June 20th, 2010 Nigel Jones

One of the interesting aspects of being an embedded systems consultant is that I get to look at a lot of code written by others. This can come about in a number of ways, but most commonly occurs when someone wants changes made to an existing code base and the original author(s) of the code are no longer available. When faced with a situation such as this, it is essential that I quickly get a sense for how maintainable the code is – and thus how difficult changes will be. As a result I have developed a few techniques to help me assess code maintainability which I thought I’d share with you.

SourceMonitor

After installing the code on my system, the first thing I do is run SourceMonitor over the code. SourceMonitor is a free utility that computes various metrics. The metrics and their values for a typical code base of mine are shown below.

Number of Files: 476

Lines of code: 139,013

Statements: 61,144

% branches: 6.3%

% comments: 41.7%

Functions: 2,509

Average statements / function: 11.8

Max Complexity: 158

Max depth: 9+

Average depth: 0.54

Average complexity: 2.38

Probably the only thing that needs explanation is ‘complexity’. The author of SourceMonitor is not computing the McCabe complexity index, but rather is computing complexity based upon Steve McConnell’s methodology. The details of the implementation aren’t particularly important to me, as I’m more interested in comparative values.

While SourceMonitor helps give me the big picture, it is nowhere near enough – and this is where it gets interesting.

Optimization Level

Next I look at the optimization levels being used. These can be very revealing. For example, if a high level of optimization is being used for the debug build, then it might be indicative that a non-optimized build either will not fit into the available memory, or possibly that the code doesn’t run fast enough unless optimization is turned on. Either is indicative of a system that is probably going to be tough to maintain. Conversely, if the release build doesn’t use full optimization, then I take this to mean that the code probably doesn’t work when optimization is turned on. I have written about this in the past and consider this to be a major indicator of potential code quality problems.

C-V Qualifiers

Having looked at the optimization levels, I then perform a grep on the code base looking for the number of instances of ‘volatile‘ and ‘const‘. If the number of instances of volatile is zero (and it often is) and the optimization level is turned way down, then it’s almost certain that the author of the code didn’t understand volatile and that the code is riddled with potential problems. Whenever this happens, I get a sinking feeling because if the author didn’t understand volatile, then there is no chance that he had any appreciation for race conditions, priority inversion, non-atomic operations etc. In short, the author was a PC programmer.

The ‘const‘ count is less revelatory. If the author makes use of const then this is normally an indicator that they know their way around the compiler and understand the value of defensive programming. In short I take the use of const to be very encouraging. However, I can say that I have known some excellent embedded systems programmers who rarely used const, and thus its absence doesn’t fill me with the same despair as the absence of volatile.

Incidentally, in my code base described above, there are 53 instances of ‘volatile’ (note that I have excluded compiler vendor supplied header files, which define all the various hardware registers as volatile). There are also 771 instances of ‘const’.

Static qualifiers

Regular readers of this blog will know I am a big fan of the ‘static‘ qualifier. Static not only makes for safer and more maintainable code, it also makes for faster code. In fact, IMHO the case for static is so overwhelming that I find its absence or infrequent use a strong indicator that the author of the code was an amateur. In my example code base, static appears 1484 times.

Case statements

Regular readers of this blog also know that I am not a big fan of the case statement. While it has its place, too often I see it used as a substitute for thought. Indeed I have observed a strong inverse correlation between programmer skill and frequency of use of the case statement. As a result, I will usually run a grep to see what the case statement frequency is. In my example code, a case statement occurs 683 times, or once every 90 statements.

Compilation

All of the above ‘tests’ can be performed without compiling the code. In some cases I own the target compiler (or can download an evaluation copy), in which case I will of course attempt to compile the code. When I do this I’m looking for several things:

  1. An absence of compiler warnings / errors. Alan Bowens has written concisely and eloquently on this topic. The bottom line – compilation warnings in the release build are a major issue for me. Note that I’m more forgiving of compiler warnings in the debug build, since by its nature debug often ignores things such as inline commands, which can generate warnings on some compilers.
  2. The compilation speed. Massive files containing very large functions compile very slowly. They are also a bear to maintain.
  3. The final image size. This is relevant both in absolute terms (8K versus 128K versus 2M) and also in comparison to the available memory. Small images using a small percentage of the available memory are much easier to maintain than large images that nearly fill the available memory.

Lint

The final test that I perform only rarely is to Lint the code base. I do this rarely because quite frankly it takes a long time to configure PC-Lint. Thus only if I have already created a PC-Lint configuration file for the target compiler do I perform this step. Previously un-linted code will always generate thousands of warnings. However, what I’m looking for are the really serious warnings – uninitialized variables, indexing beyond the end of an array, possible null pointer dereferences etc. If any of these are present then I know the code base is in bad shape.

I can typically run the above tests on a code base in an hour or so. At the end of it I usually have a great idea of the overall code quality and how difficult it will be to modify. I would be very interested to hear from readers that are willing to perform the same tests on their code base and to publish the results. (Incidentally, I’m not trying to claim that my metrics are necessarily good – they are intended merely as a reference / discussion point).

Considerate coding

Monday, May 3rd, 2010 Nigel Jones

One of my major recreational pursuits is bike riding. I live in a rural area with some great terrain, and more to the point a very low traffic density. Naturally on a 5 or 6 hour ride one does encounter some traffic and I’m always struck by the different degrees of consideration afforded to cyclists by motorists. Some are extremely solicitous and will wait so that they can pass you slowly and with a wide separation; others are complete jerks and will pass you as close and as fast as possible, often sounding their horn as they go by. Then there are the bulk of the drivers who will attempt to give you as much room as possible commensurate with slowing them down as little as possible. I was pondering this view of human nature yesterday while out on a ride, when it occurred to me that I see a similar range of consideration when it comes to embedded software. To see what I mean, read on …

I’ve mentioned several times in this blog that the main purpose of source code is not to generate a binary image but rather to educate those that come after you (including possibly a future version of yourself). You may or may not subscribe to this belief. However once you realize that source code often has a life of decades, and that the same source code may end up in dozens of products, then perhaps you may start to change your mind. So with this said, I think I can make a number of observations.

  1. It may be obvious to you, the author of the code, what the intended compilation platform is – after all it’s the one you are using. Alas, it is not obvious to someone else who has been handed the source code and told to use it. (I ran into this problem six months ago, when I was handed a very large ARM project – with no indication of which ARM compiler it was intended for.)
  2. It may also be obvious to you what hardware platform the code is intended for – again it’s the one you are working on.
  3. It may also be obvious to you that the way to build the various targets is to change to directory X and invoke command Y with parameter Z – after all you do it ten times a day.
  4. It may also be obvious to you that the 27 warnings produced during the final build are benign – as after all you have checked them out.

However none of the above is clear to someone 5 years from now!

Clearly the above is just a partial list of what I call implicit information about a project. That is, information that is essential to being able to use the code base, but which is often omitted from the documentation by the author. It’s my contention that the degree to which you explicitly provide this implicit information governs whether you are a jerk, a typical coder, or a considerate coder. Most of us (myself included) are typical coders (and I know this because I’ve seen a lot of code). If you want to make the move up to being a considerate coder, then here are a few things I suggest you do.

  1. Place all the implicit information in main.c. Why there, you ask? Well, if I were to dump three hundred source files on you, which one would you look at first? (An acceptable alternative is to state in main.c that useful information may be found in file X. Be aware, however, that non-obvious source files sometimes get stripped out of source code archives.)
  2. Include in main as a minimum information about the compiler (including its version), the intended hardware target, and how to build the code.
  3. Think for a minute or two about all the other information you are implicitly using in writing the source code and building it – and take the time to include it in main.c. Typically this includes additional tools, scripts etc.
  4. For an excellent discourse on why leaving warnings in your code is downright inconsiderate, see this posting from Alan Bowens.
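As a concrete sketch, the sort of header block I have in mind at the top of main.c might look like the following. Every name, version, and command here is invented purely for illustration – the point is the categories of information, not the specifics.

```c
/*
 * Build / environment notes (all specifics below are illustrative):
 *
 *   Target hardware : Widget controller rev B (NXP LPC1768)
 *   Compiler        : IAR EWARM 6.10, ARM edition
 *   How to build    : cd build/ then run "make release" (or "make debug")
 *   Warnings        : release build compiles clean; debug emits 2 benign
 *                     inline-related warnings (checked and documented)
 *   Other tools     : srec_cat for checksumming the final image
 */
```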

If you do the above, then you are well on the way to becoming a ‘considerate coder’. Will doing this get you a pay increase, or at least a pat on the back from the boss – probably not. However just like the person who slows down and passes cyclists with a wide berth, you can go home at night knowing you aren’t a jerk. That has to be worth something.

Efficient C Tip #12 – Be wary of switch statements

Saturday, April 10th, 2010 Nigel Jones

This is the twelfth in a series of tips on writing efficient C for embedded systems. Like the previous topic, I suspect that this will be a bit controversial. As the title suggests, if you are interested in writing efficient C, you need to be wary of switch statements. Before I explain why, a little background will be useful. I did all of my early embedded systems programming in assembly language. This wasn’t out of some sense of machismo, it was simply a reflection of the fact that there were no high level languages available (with the possible exception of PL/M). Naturally as well as programming embedded systems I also did computer programming, initially in Pascal and BASIC, and later in C. One of the major differences I found in using the HLL was the wonderful switch / case statement. I found it to be a beautiful tool – with a few lines of source code I could do all sorts of powerful things that were simply very difficult or tedious to do in assembly language. Fast forward a number of years and C compilers began to become available for small embedded systems and so I naturally started using them, together with of course the attendant switch statement. All was well in paradise until the day I used a switch statement in an interrupt service routine and found to my horror that the ISR was taking about ten times longer to execute than I thought was reasonable.

This precipitated an investigation into how exactly switch statements are implemented by the compiler. When I did this, I discovered a number of things that should give one pause.

Heuristic Algorithms

The first thing I discovered is that compilers typically have a number of ways of implementing a switch statement. They seem to be loosely divided into the following trichotomy:

  1. An if-else-if-else-if chain. In this implementation, the switch statement is treated as syntactic sugar for an if-else-if chain.
  2. Some form of jump or control tables, or as they are sometimes called a computed goto. This is a favorite technique of assembly language programmers and the compiler writers can use it to great effect.
  3. A hybrid of 1 & 2.

Where it gets interesting is how the compiler decides which approach to use. If the case values are contiguous (e.g. zero through ten), then it’s likely the compiler will use some form of jump table. Conversely, if the case values are completely disjointed (e.g. zero, six, twenty, four hundred and a thousand) then an if-else implementation is likely. However, what does the compiler do when, for example, you have a bifurcated set of ranges such as zero to ten and ninety to one hundred? Well, the answer is that each compiler seems to have some form of heuristic algorithm for determining the ‘best’ way of implementing a given set of cases. Although some compilers allow you to force a particular implementation, for the most part you are at the mercy of the compiler.

Comparative Execution Speeds

If you think about it, it should become apparent that a jump table approach is likely to give a highly consistent time of execution through the decision tree, whereas the if-else-if chain has a highly variable time of execution depending upon the particular value of the switched variable. Notwithstanding this, the jump table approach has a certain amount of execution overhead associated with it. This means that although its mean execution time (which is normally the same as its worst and best execution time) may be dramatically better than the mean execution time of the if-else-if chain, the if-else-if chain’s best execution time may be considerably better. So what, you say! Well, in some cases a particular value is far more likely to occur than the other values, and thus it would be very nice if this value were tested first. However, as you will now see, this isn’t guaranteed…

Order of Execution

For many years I wrote switch statements under the assumption that the case values would be evaluated from top to bottom. That is, if the compiler chose to implement the switch statement as an if-else-if chain, then it would first test the first case, then the second case and so on down to the default case at the bottom of my source code. Well it turns out that my assumption was completely wrong. The compiler is under no such obligation, and indeed will often evaluate the values bottom to top. Furthermore, the compiler will often evaluate the default value first. For example consider a defaulted switch statement with contiguous case values in the range zero to ten. If the index variable is an unsigned int, then there are at least 65525 possible values handled by the default case, and so it makes sense to eliminate them first. Now if you know that the index variable can only possibly take on the values zero to ten, then you can of course eliminate the default statement – and then get excoriated by the coding standards / MISRA folks.

Maintenance

This is the area where I really get worried. Consider the case where you have a switch statement in an ISR. The code is working with no problems until one day it is necessary to make a change to the switch statement – by for example adding an additional case value. This simple change can cause the compiler to completely change the implementation of the switch statement. As a result, you may find that:

  1. The worst case execution time has jumped dramatically.
  2. The mean execution time has jumped dramatically.
  3. The stack space required by the ISR has jumped dramatically.

Any of these three possibilities can cause your program to fail catastrophically. Now of course one could argue ‘that’s why you test all changes’. However, in my opinion it’s far better to be proactive and to avoid putting yourself in this situation in the first place.

I’d also be remiss in not noting the dreaded missing break statement maintenance problem. However as a religious user of Lint, I’m not normally too concerned about this.

Switch statement alternatives

If performance and stability are your goals, then I strongly recommend that you implement your code the way you want it executed. This means either explicitly use an if-else-if chain or use function pointers. If function pointers scare you, then you might want to read this article I wrote on the subject.

Recommendations

Based on my experience, I have a number of things that I do when it comes to switch statements. If you find my analysis compelling, you may want to adopt them:

  1. Switch statements should be the last language construct you reach for – and not the first.
  2. Learn how to use function pointers. Once you do you’ll find a lot of the reasons for using switch statements go away.
  3. Try to keep all the case values contiguous.
  4. If you can’t keep the case values contiguous, go to the other extreme and make them disparate – that way you are less likely to have the compiler change the algorithm on you.
  5. If your compiler supports it, consider using pragmas to lock in a particular implementation.
  6. Be very wary of using switch statements in interrupt service routines or any other performance critical code.
  7. Use Lint to guard against missing break statements.

Comments welcome!


Hardware vs. firmware naming conventions

Sunday, March 28th, 2010 Nigel Jones

Today’s post is motivated in part by Gary Stringham. Gary is the newest member of EmbeddedGurus and he consults and blogs on what he calls the bridge between hardware and firmware. Since I work on both hardware and firmware, I’m looking forward to what Gary has to say in the coming months. Anyway, I’d recently read his posting on Early hardware / firmware collaboration when I found myself looking at a fairly complex schematic. The microprocessor had a lot of IO pins, most of which were being used. When I looked at the code to gain insight on how some of the IO was being used I found that the hardware engineer and firmware engineer had adopted completely different naming conventions. For example, what appeared on the schematic as “Relay 6” appeared in the code as “ALARM_RELAY_2”. As a result the only way I could reconcile the schematic and the code was to look at a signal’s port pin assignment on the schematic and then search the code to see what name was associated with that port pin. After I’d done this a few times, I realized I needed a more systematic approach and ended up going through all the port pin assignments in the code and using them to hand mark up the schematic. Clearly this was not only a colossal time waster, it also had the potential for introducing stupid bugs.

So how had this come about? Well, if you have ever designed hardware, you will know that naming nets is essentially optional. In other words, one can create a perfectly correct schematic without naming any of the nets. Instead all you have to do is ensure that their connectivity is correct. (This is loosely analogous in firmware to referring to variables via their absolute addresses instead of assigning a name to the variable and using it. However, the consequences for the hardware design are nowhere near as dire.) Furthermore, if the engineer does decide to name a net, then in most schematic packages I’ve seen, one is free to use virtually any combination of characters. For example “~LED A” would be a perfectly valid net name – but is most definitely not a valid C variable name. If one throws in the usual issue of numbering things from zero or one (should the first of four LEDs be named LED0 or LED1?), together with hardware engineers’ frequent (and understandable) desire to indicate whether a signal is active low or active high by using some form of naming convention, then one has the recipe for a real mess.

So what’s to be done? Well here are my suggestions:

  1. The hardware team should have a rigorously enforced naming convention (in much the same way that most companies have a coding standards manual).
  2. All nets that are used by firmware must be named on the schematic.
  3. The net names must adhere to the C standard for naming variables.
  4. The firmware must use the identical name to that appearing on the schematic.

Clearly this can be facilitated by having very early meetings between the hardware and firmware teams, such that when the first version of the schematic is released, there is complete agreement on the net names. If you read Gary’s blog post you’ll see that this is his point too – albeit in a slightly different field.