embedded software boot camp

A volatile tempest

September 27th, 2010 by Nigel Jones

Regular readers will know that I often comment on the use of volatile in embedded systems. As a result, I am occasionally asked for my opinion on whether a compiler is generating correct code, particularly when hardware is being accessed. Last week I was contacted by Ratish Punoose, who had a classic problem: his code compiled okay on GCC but not on IAR. He had contacted IAR, who in turn basically said that the compiler was correct and supplied an explanation. Ratish turned to me and John Regehr for our opinions. John and I came to similar conclusions, namely:

  1. Ratish’s code was a bit weird, but not dramatically so.
  2. The explanation from IAR made no sense.
  3. It did indeed appear to be a compiler bug.

Ratish then posted his issue to the MSP430 forum on Yahoo. You can read his post and the responses here.

I’m sure many of you are at this point thinking that IAR is in for another round of bashing from me. Well you’d be wrong. One of the first responders to Ratish’s post was Paul Curtis of Rowley compilers. Paul gives an admirable explanation as to why Ratish’s code is wrong (and by extension so am I). Now I’m sure that IAR and Rowley are fierce competitors, and so Paul is also to be commended for leaping to the defense of IAR.

Furthermore, later in the thread Anders Lindgren of IAR chimes in and adds his detailed and compelling explanation.

Having read the posts from Paul and Anders, I think they are right and I’m wrong. So thanks, gentlemen, for:

  1. Setting me straight.
  2. Proving that in the wonderful world of volatile accesses, there is always something more to learn.

I think there are several other lessons to be learned from this episode. However I think I’ll save them for another post.

#include "includes.h"

September 9th, 2010 by Nigel Jones

I am sure that the title of this blog posting is familiar to most readers of this blog: you have opened up a C source file and found a single #include statement that references a file typically called ‘includes.h’. On opening ‘includes.h’ one invariably finds an enormous list of other header files. Furthermore, as you go through other source files in the project, you will find that they all use ‘includes.h’. I suspect that by this point the readers of this blog are divided into two camps, namely:

  1. Either: So what, I do it all the time because it makes my life a lot easier.
  2. Or: I want to scream whenever I see this done.

I’m one of the screamers – and this is my rationale.

Back in the dark ages when one had to compile on computers with extremely limited resources, the compilation time of a module was a major issue. One of the things that significantly affected compile time was the number and the size of the header files that a module opened. As a result, most of us took steps to ensure that we only included the header files that were needed. However, as processor speeds increased and compilers started using pre-compiled header files, this became less of an issue, such that today I seriously doubt if you’d notice much difference in compilation times regardless of the number of header files that are included. I don’t know but I suspect that this was the enabler that caused people to start using ‘includes.h’.

So if compilation time is no longer an issue, what’s the big deal? After all, we have all had the hassle of compiling a file only to be told that we are missing a prototype or a data type. At which point we have to hunt down the requisite header file, include it and recompile. If you do this half a dozen times in a new module, then it takes you, say, 15 minutes before everything is OK – and who has 15 minutes to waste on such irritating details? Well, in my opinion it’s time well spent. Here’s my case:

Coupling Indication

The number of header files a module needs to use is a crude but effective indicator of coupling. A module that needs to include almost no header files is clearly a module that is extremely self-contained; that is, it isn’t relying upon the outside world. Modules like this are typically easier to maintain and also more immune to changes made elsewhere in the system. In short, I like modules that don’t have to include a lot of header files. Indeed, when I have finished writing a module, I take a look at its include list. If the list is long, then it really makes me wonder whether I should be breaking the module apart in some way so as to reduce the degree of coupling between it and the outside world.

Maintenance – understanding coupling

This is related to the first point. If I need to do some maintenance on a module, then a quick look at the include list can tell me how this module interacts with the rest of the code. This can be extremely useful if one is trying to understand how a program is put together.

Maintenance – understanding functionality

If I look at the include list and I see ‘math.h’, then I know that the module is probably using floating point math, quite possibly transcendental functions, which in turn implies potentially long execution times. In a similar manner, if it includes the header for the hardware interrupt handler, then I know I’m dealing with something related to the chip. I can get all this sort of information in a two second scan of the include list.

Documentation

If you use an automated documentation tool such as Doxygen, then only including the header files that are needed by a module ensures that Doxygen generates a meaningful documentation set for you, rather than including hyperlinks to useless files.

Not getting what you want

I have left what is probably the biggest problem to last. By including an enormous number of header files you lay yourself wide open to problems like this:

Header1.h

#define FALSE 0
#define TRUE  !FALSE

Header17.h

#ifndef FALSE
#define FALSE 0UL
#define TRUE  1UL
#endif

Header26.h

#pragma(IGNORE_REDEFINITION_OF_MACROS)
#define FALSE NULL
#define TRUE !FALSE
#pragma(ERROR_ON_REDEFINITION_OF_MACROS)

Trust me when I tell you I have seen this done! In other words the more files you include, the more likely it is that the macro that you are blithely using does not in fact have the value you think it does. Time to debug problems such as these – a lot longer than 15 minutes!
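To make the hazard concrete, here is a minimal sketch in C. It compiles the same expression twice in one file, once with the definitions from Header1.h and once with those from Header17.h, using #undef to stand in for two different include orders. The function names are invented for illustration and the expression is deliberately contrived; the point is only that the type of TRUE, and hence the result, depends on which header got there first.

```c
/* Case A: as if Header1.h had been seen first. */
#define FALSE 0
#define TRUE  !FALSE
/* TRUE expands to !0, an int with value 1, so TRUE - 2 is -1. */
static int true_minus_two_is_positive_a(void)
{
    return (TRUE - 2) > 0;
}
#undef TRUE
#undef FALSE

/* Case B: as if Header17.h had been seen first. */
#define FALSE 0UL
#define TRUE  1UL
/* TRUE is now an unsigned long, so TRUE - 2 wraps around to a very
 * large positive value instead of going negative. */
static int true_minus_two_is_positive_b(void)
{
    return (TRUE - 2) > 0;
}
#undef TRUE
#undef FALSE
```

The two functions contain identical source text, yet the first returns 0 and the second returns 1. In a real project the two definitions would be buried in different headers, which is what makes this kind of bug so slow to find.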

Remedial Action

On the off chance that I have convinced an ‘includes.h’ fan of the error of their ways, it would be remiss of me to not tell you how to quickly find out just the header files needed by a module.

  1. Paste the include list of includes.h into the module.
  2. Delete the entry for includes.h.
  3. Compile the code to make sure you haven’t broken anything.
  4. Lint the file. Lint will tell you all the header files that aren’t being used.
  5. Delete the unnecessary include statements.
  6. Repeat from step 3 until Lint is happy.
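The end state of that procedure might look something like the sketch below: a small module whose include list names only what it actually uses, with a comment recording why each header is there. The module and its function, checksum8, are hypothetical and exist only to illustrate the idea.

```c
/* checksum.c: hypothetical module, used only for illustration. */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint8_t */

/* Return the sum of all bytes in the buffer, modulo 256. */
uint8_t checksum8(const uint8_t *data, size_t length)
{
    uint8_t sum = 0u;

    for (size_t i = 0u; i < length; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return sum;
}
```

A two second scan of this include list tells the reader that the module touches no hardware, no floating point and no other part of the application.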

Of course the chances are that if you use ‘includes.h’ you aren’t using Lint. If you do start using Lint then it will do a lot more for you than just telling you about unnecessary includes.

Classic race conditions and thoughts on testing

August 30th, 2010 by Nigel Jones

At a rather fundamental level this blog is about how I do embedded systems. Implicit in a lot of the articles is the concept that I believe what I’m doing is ‘right’, or at least ‘better’. Well today I thought I’d write about something I got wrong (at least on the first pass).

This is the scenario. I’m currently working on an NXP LPC17xx ARM Cortex design. Like all modern processors, the LPC17xx has a number of sophisticated timers with all sorts of operating modes. It so happens that I am using four (out of a possible six) interrupt sources for one particular timer. The hardware architecture of the processor routes all of these interrupts to one vector and thus one interrupt handler. Here’s what I wrote:

void TMR3_IRQHandler(void)
{
    if (T3IR_bit.MR0INT)
    {
        /* Do stuff */
    }

    if (T3IR_bit.CR0INT)
    {
        /* Do stuff */
    }

    if (T3IR_bit.MR1INT)
    {
        /* Do stuff */
    }

    if (T3IR_bit.CR1INT)
    {
        /* Do stuff */
    }

    T3IR = 0x3F;            /* Acknowledge all interrupts */

    ...
}

Thus in the ISR I tested each of my interrupt sources, took the appropriate actions in the sections marked ‘Do stuff’, acknowledged the interrupts, did a bit of cleanup and I was done. The ‘Do stuff’ sections were quite complicated, and so this was where I spent my time. Anyway, having finished coding the ISR, I took a short break and came back to re-examine the code. As I was re-reading it, I realized that I had made a classic mistake. In case you haven’t spotted it, the problem is in the line where I acknowledge all interrupts. Consider the following sequence of events:

  1. Interrupt source CR1INT is asserted and the CPU vectors to this ISR.
  2. I test the various interrupt flags and discover that CR1INT is set and do the requisite work.
  3. While I’m doing the requisite work, interrupt source MR1INT becomes active.
  4. I clear all interrupt sources (including MR1INT) and terminate the ISR.
  5. As a result I have missed an interrupt.

The way this should have been coded is to acknowledge each interrupt bit individually, like this:

void TMR3_IRQHandler(void)
{
    if (T3IR_bit.MR0INT)
    {
        /* Do stuff */
        T3IR_bit.MR0INT = 1;                    /* Clear the interrupt */
    }

    if (T3IR_bit.CR0INT)
    {
        /* Do stuff */
        T3IR_bit.CR0INT = 1;                    /* Clear the interrupt */
    }

    if (T3IR_bit.MR1INT)
    {
        /* Do stuff */
        T3IR_bit.MR1INT = 1;                    /* Clear the interrupt */
    }

    if (T3IR_bit.CR1INT)
    {
        /* Do stuff */
        T3IR_bit.CR1INT = 1;                    /* Clear the interrupt */
    }

    ...
}
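The lost-interrupt sequence can be replayed on a host PC with a small model of the flag register. The sketch below is not LPC17xx driver code: fake_ir is a plain variable standing in for T3IR, the helper names are invented, and the bit masks are assumptions chosen for illustration. It also assumes, as the ‘Clear the interrupt’ comments above imply, that writing a 1 to a flag clears it and writing a 0 leaves it alone.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MR1INT_MASK      (1u << 1)
#define CR1INT_MASK      (1u << 5)
#define ALL_SOURCES_MASK 0x3Fu

/* Plain variable standing in for the interrupt flag register. */
static uint32_t fake_ir;

/* Model of a write-1-to-clear register: every 1 written clears a flag. */
static void fake_ir_write(uint32_t value) { fake_ir &= ~value; }

/* Model of the hardware asserting one or more interrupt sources. */
static void fake_ir_raise(uint32_t mask) { fake_ir |= mask; }

enum ack_style { ACK_ALL, ACK_ONE_MASK, ACK_READ_MODIFY_WRITE };

/* Replays the sequence from the text and returns the flags that are
 * still pending when the handler exits. */
static uint32_t run_scenario(enum ack_style style)
{
    fake_ir = 0u;
    fake_ir_raise(CR1INT_MASK);              /* 1. CR1INT is asserted    */

    if (fake_ir & CR1INT_MASK) {             /* 2. Handler sees CR1INT   */
        fake_ir_raise(MR1INT_MASK);          /* 3. MR1INT arrives mid-work */

        if (style == ACK_ONE_MASK) {
            /* Write only the bit that was just serviced. */
            fake_ir_write(CR1INT_MASK);
        } else if (style == ACK_READ_MODIFY_WRITE) {
            /* Read the register, set one bit, write the whole value back. */
            fake_ir_write(fake_ir | CR1INT_MASK);
        }
    }

    if (style == ACK_ALL) {
        fake_ir_write(ALL_SOURCES_MASK);     /* 4. Blanket acknowledge   */
    }
    return fake_ir;
}
```

In this model run_scenario(ACK_ALL) returns 0, so the MR1INT event raised in the middle of the handler has vanished, whereas run_scenario(ACK_ONE_MASK) returns MR1INT_MASK, so that event is still pending and the handler will run again. The third case is a caveat worth checking on real hardware: with write-1-to-clear flags, a bit-field assignment such as T3IR_bit.MR1INT = 1 may be compiled as a read-modify-write of the whole register, which writes a 1 back to every flag that happens to be set. In the model that also returns 0. Inspecting the generated code, or writing a single-bit mask to the register directly, guards against it.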

So how did this mistake come about? I think there were two culprits:

Mistake 1

The first mistake I made was in using another timer ISR as a template. The code I copied had just a single interrupt source, and thus acknowledging all of the sources was reasonable.

Mistake 2

I was too concerned with the ‘real work’ of the ISR. I should have written the ISR outline first and only then worried about the real work.

Notwithstanding the above, I did do one thing correctly – and that was to finish the code, walk away, and then come back to re-examine it. At no time did I reach for the debugger to test my code – which was just as well because quite frankly the chances of this bug being caught by testing are vanishingly small. Indeed just about the only way a bug like this would get caught is via code inspection – which is why I’m such a firm believer in code inspection as a debugging tool.

Anyway if you found this informative, you may find this account of another mistake I made equally enlightening.

A foreign perspective on variable names

August 18th, 2010 by Nigel Jones

This blog is read by people from all over the world. I make this point not to brag, but rather to demonstrate that designing embedded systems is a truly global effort. Remarkably, despite this, it appears that a huge amount of embedded code is commented in English and / or uses English nomenclature for variable and function names. This is of course wonderful for those of us that are native English speakers. However I’ve often thought that designing embedded systems is hard enough without having the additional burden of working in a foreign language.

Anyway, I mention this as preamble because last week I found myself in a rather unusual situation for me. Namely I was handed a fairly sophisticated driver which was written by a native German speaker. Now one of the things I have always liked about the Germans is that they don’t kowtow to the altar of the English language – and so I found myself looking at code that was commented entirely in German and that used almost exclusively German for function and variable names. I was thus faced with trying to understand it – which with my limited knowledge of German was not at all easy.

Anyway, as I went through the code I found myself entering variable names into an online German-English dictionary, with very limited success. While part of the problem was undoubtedly the technical nature of the words, I don’t have the slightest doubt that the real problem was that the author was using abbreviations, slang and jargon, as well as concatenating words (the English equivalent would be something like bufferindex), such that the online dictionaries were flummoxed. The net result was that I had a much harder time interpreting the code than would have been the case if I had understood the variable names. Needless to say, this got me thinking. How many times has a non-native English speaker looked at some of my code and entered variable names into a dictionary, only to be told that there is no such word? If you subscribe to the belief that you write code for other people to read, then it follows that one should take the spoken language barrier into consideration. If one does, then certain ‘rules’ become apparent:

  1. Don’t abbreviate unless you have to. While BufrWrtLmt may be understandable to native English speakers, it must be really hard to comprehend for others.
  2. In concatenating words, make the word boundary clear either via underscore or via camel-case. Thus buffer_index or bufferIndex.
  3. Pay attention to your spelling. A simple spelling mistake such as writing temprature when you meant temperature can completely stymie someone using a dictionary. While I don’t know of a tool that spell checks variable names, there are several tools available for spell checking comments.
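As a small illustration of rules 1 and 2, here is the same routine written twice in C. Both versions and all of their names are invented for this example; BufrWrtLmt is simply borrowed from rule 1.

```c
#include <stddef.h>

/* Hard to look up in a dictionary: abbreviated, no word boundaries. */
static size_t nxtbufidx(size_t bufidx, size_t BufrWrtLmt)
{
    return (bufidx + 1u < BufrWrtLmt) ? bufidx + 1u : 0u;
}

/* Same logic: whole words, with the boundaries marked by underscores. */
static size_t next_buffer_index(size_t buffer_index, size_t buffer_write_limit)
{
    return (buffer_index + 1u < buffer_write_limit) ? buffer_index + 1u : 0u;
}
```

Every word in the second version can be typed into a dictionary as it stands, which is the whole point.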

As a passing observation, not only will these changes make your code easier to comprehend for non-native English speakers, they will also do wonders for those of us that purport to speak English as our native tongue!

Setting a bad example – final thoughts

August 15th, 2010 by Nigel Jones

While I am sure that I could extend the setting a bad example series of articles, I think it’s time to move on to other topics. Before I do so, I’d like to give some final thoughts. The series has generated a lot of excellent comments. While the majority have been in response to a particular coding construct, a number of readers have expressed their frustration at how pervasive this problem is with vendor supplied code. My experience agrees with this assessment. In other words, while this series has taken IAR to task, I don’t have the slightest doubt that if I had bought an ARM evaluation board from Keil, ImageCraft, etc., I would have found similar things to complain about. My experience is the norm and not the exception. Now I don’t think I’m going too far out on a limb by observing that the code supplied with evaluation boards is very influential in that:

  1. It is likely to find itself incorporated into hundreds, if not thousands of products.
  2. It will be used both verbatim and as a template for future code by huge numbers of inexperienced engineers.

Thus the bottom line is that the code supplied with evaluation boards needs to be of the highest quality and to incorporate as many best practices as possible. While it would be great if this blog was influential enough to cause the vendors to change their ways, I suspect that little will really happen until people start complaining. Those of you that work for large organizations which buy a commensurate number of licenses are in the best position to make the change happen by loudly complaining to your sales representative every time you find some lousy code.

As always, thanks for reading.