Archive for August, 2010

Classic race conditions and thoughts on testing

Monday, August 30th, 2010 Nigel Jones

At a rather fundamental level this blog is about how I do embedded systems. Implicit in a lot of the articles is the concept that I believe what I’m doing is ‘right’, or at least ‘better’. Well today I thought I’d write about something I got wrong (at least on the first pass).

This is the scenario. I’m currently working on an NXP LPC17xx ARM Cortex design. Like all modern processors, the LPC17xx has a number of sophisticated timers with all sorts of operating modes. Well it so happens that I am using  four (out of a possible six) interrupt sources for one particular timer. The hardware architecture of the processor routes all of these interrupts to one vector and thus one interrupt handler. Here’s what I wrote:

void TMR3_IRQHandler(void)
{
 if (T3IR_bit.MR0INT)
 {                                            
  /* Do stuff */
 }

 if (T3IR_bit.CR0INT)
 {        
  /* Do stuff */        
 }

 if (T3IR_bit.MR1INT)
 {                                            
  /* Do stuff */
 }

 if (T3IR_bit.CR1INT)
 {                                            
  /* Do stuff */
 }

 T3IR = 0x3F;            /* Acknowledge all interrupts */

 ...
}

Thus in the ISR I tested each of my interrupt sources, took the appropriate actions in the sections marked ‘Do stuff’, acknowledged the interrupts, did a bit of clean up and I was done.  The ‘Do Stuff’ sections were quite complicated and so this was where I spent my time. Anyway having finished coding the ISR, I took a short break and came back to re-examine the code. As I was re-reading the code, I realized that I had made a classic mistake. In case you haven’t spotted it, the problem is in the line where I acknowledge all interrupts. Consider the following sequence of events:

  1. Interrupt source CR1INT is asserted and the CPU vectors to this ISR.
  2. I test the various interrupt flags and discover that CR1INT is set and do the requisite work.
  3. While I’m doing the requisite work, interrupt source MR1INT becomes active.
  4. I clear all interrupt sources (including MR1INT) and terminate the ISR
  5. As a result I have missed an interrupt.

The way this should have been coded is to acknowledge each interrupt bit individually. I.e. like this:

void TMR3_IRQHandler(void)
{
 if (T3IR_bit.MR0INT)
 {                                            
  /* Do stuff */
  T3IR_bit.MR0INT = 1;                    /* Clear the interrupt */
 }

 if (T3IR_bit.CR0INT)
 {        
  /* Do stuff */    
  T3IR_bit.CR0INT = 1;                    /* Clear the interrupt */
 }

 if (T3IR_bit.MR1INT)
 {                                            
  /* Do stuff */
  T3IR_bit.MR1INT = 1;                    /* Clear the interrupt */
 }

 if (T3IR_bit.CR1INT)
 {                                            
  /* Do stuff */
  T3IR_bit.CR1INT = 1;                    /* Clear the interrupt */        
 }

 ...
}

So how did this mistake come about? I think there were two culprits:

Mistake 1

The first mistake I made was in using another timer ISR as a template. The code I copied had just a single interrupt source, and thus acknowledging all of the sources was reasonable.

Mistake 2

I was too concerned with the ‘real work’ of the ISR. I should have written the ISR outline first and only then worried about the real work.

Notwithstanding the above, I did do one thing correctly – and that was to finish the code, walk away, and then come back to re-examine it. At no time did I reach for the debugger to test my code – which was just as well because quite frankly the chances of this bug being caught by testing are vanishingly small. Indeed just about the only way a bug like this would get caught is via code inspection – which is why I’m such a firm believer in code inspection as a debugging tool.

Anyway if you found this informative, you may find this account of another mistake I made equally enlightening.

A foreign perspective on variable names

Wednesday, August 18th, 2010 Nigel Jones

This blog is read by people from all over the world. I make this point not to brag, but rather to demonstrate that designing embedded systems is a truly global effort. Remarkably, despite this, it appears that a huge amount of embedded code is commented in English and / or uses English nomenclature for variable and function names. This is of course wonderful for those of us that are native English speakers. However I’ve often thought that designing embedded systems is hard enough without having the additional burden of working in a foreign language.

Anyway, I mention this as preamble because last week I found myself in a rather unusual situation for me. Namely I was handed a fairly sophisticated driver which was written by a native German speaker. Now one of the things I have always liked about the Germans is that they don’t kowtow to the altar of the English language – and so I found myself looking at code that was commented entirely in German and that used almost exclusively German for function and variable names. I was thus faced with trying to understand it – which with my limited knowledge of German was not at all easy.

Anyway, as I went through the code I found myself entering variable names into an online German-English dictionary – with very limited success. Now while part of the problem was undoubtedly the technical nature of the words, I don’t have the slightest doubt that the real problem was that the author was using abbreviations / slang / jargon as well as concatenating words (e.g. (in English) bufferindex) such that the online dictionaries were flummoxed. The net result was that I had a much harder time interpreting the code than would have been the case if I had understood the variable names. Needless to say this got me thinking. How many times has a non-native English speaker looked at some of my code and entered variable names into a dictionary only to be told that there is no such word? If you subscribe to the belief that you write code for other people to read then it follows that one should take the spoken language barrier into consideration. If one does, then certain ‘rules’ become apparent:

  1. Don’t abbreviate unless you have to. While BufrWrtLmt may be understandable to native English speakers, it must be really hard to comprehend for others.
  2. In concatenating words, make the word boundary clear either via underscore or via camel-case. Thus buffer_index or bufferIndex.
  3. Pay attention to your spelling. A simple spelling mistake such as writing temprature when you meant temperature can completely stymie someone using a dictionary. While I don’t know of a tool that spell checks variable names, there are several tools available for spell checking comments.

As a passing observation, not only will these changes make your code easier to comprehend for non-native English speakers, it will also do wonders for those of us that purport to speak English as our native tongue!

Setting a bad example – final thoughts

Sunday, August 15th, 2010 Nigel Jones

While I am sure that I could extend the setting a bad example  series of  articles I think it’s time to move on to other topics. Before I do so I’d like to give some final thoughts. The series has generated a lot of excellent comments. While the majority have been in response to a particular coding construct, a number of readers have expressed their frustration at how pervasive this problem is with vendor supplied code. My experience agrees with this assessment. In other words while this series has taken IAR to task, I don’t have the slightest doubt that if I had bought an ARM evaluation board from Keil, ImageCraft etc that I would have found similar things to complain about. In other words my experience is the norm and not the exception. Now I don’t think I’m going too far out on a limb by observing that  the code supplied with evaluation boards is very influential in that:

  1. It is likely to find itself incorporated into hundreds, if not thousands of products.
  2. Will be used both verbatim and as a template for future code by huge numbers of inexperienced engineers.

Thus the bottom line is that the code supplied with evaluation boards needs to be of the highest quality and to incorporate as many best practices as possible. While it would be great if this blog was influential enough to cause the vendors to change their ways, I suspect that little will really happen until people start complaining. Those of you that work for large organizations which buy a commensurate number of licenses are in the best position to make the change happen by loudly complaining to your sales representative every time you find some lousy code.

As always, thanks for reading.

Setting a bad example – part 5

Wednesday, August 11th, 2010 Nigel Jones

This is the fifth part in a series of  articles in which I’m ‘highlighting’ what I consider to be lousy code provided by IAR in their NXP LPC1764 development kit. This code excerpt is taken from usb_t9.c in the AudioDevice project.

/*************************************************************************
 * Function Name: UsbCoreReq
 * Parameters:  UsbCoreReqType_t Type
 *
 * Return: Int32U
 *
 * Description: Return device states
 *
 *************************************************************************/
Int32U UsbCoreReq (UsbCoreReqType_t Type)
{
 switch(Type)
 {
 case UsbCoreReqDevState:
   return(UsbDevCtrl.State.DS);
 case UsbCoreReqDevSusState:
   return(UsbDevCtrl.State.Suspend);
 case UsbCoreReqConfiquration:
   return(UsbDevCtrl.Configuration);
 case UsbCoreReqInterface:
   return(UsbDevCtrl.Interfaces);
 case UsbCoreReqDevOwnAddress:
   return(UsbDevCtrl.DevAdd);
 case UsbCoreReqWakeUpEnableStatus:
   return(UsbDevCtrl.Feature.RemoteWakeupEnable);
 case UsbCoreReqSelfPoweredStatus:
   return(USB_SELF_POWERED);
#if USB_HIGH_SPEED > 0
 case UsbCoreReqSpeed:
   return(UsbDevCtrl.HighSpeed);
#endif // USB_HIGH_SPEED > 0
 }
 return((Int32U) -1);
}

I suspect that on first blush many readers will not be too offended by this function. However it contains several things that are pet peeves of mine. I’ve listed them in the order that they occur in the function

  1. As with most of the code, I found the function header to be almost useless. Having read it multiple times and also having looked at the code, I’m still not really clear what the purpose of this function is other than to provide access to some of the parameters of the structure UsbDevCtrl.
  2. This function does nothing more than switch on a parameter and return a value. As I outlined in this post, IMHO one should use multiple simple functions rather than one large function such as this. The resulting code is smaller, faster and the purpose of each function is much clearer. It also avoids the problem that is discussed in point 7 below.
  3. The author has used the variable name ‘Type’. While I know this is legal and it may even be somewhat meaningful in this case, I find the idea of using ‘Type’ as a variable name just plain confusing.
  4. The function contains no-less than nine return statements. Most coding standards require that a function have a single exit point. I have to admit that if you’d asked me ten years ago whether I thought this was a good idea I’d have probably equivocated. However in recent years I have really made an effort to ensure that all my functions have a single exit point – and I have found it has improved my code a lot. Of course there are the odd cases where it really is unavoidable. However in this case, maintaining a single exit point would have been trivial.
  5. The argument to the return statement is in parentheses. This seems to be a popular construct, which I simply do not understand. What benefit do the parentheses confer? To those that ask what harm are they doing, then I would suggest that they make the code harder to read.
  6. The switch statement has no default clause. Yes, one will fall through to the last return statement – but it’s a sloppy way of handling it.
  7. The pièce de résistance is of course the cast of -1 to an unsigned integer. The function is declared as returning an unsigned integer – but in one case it returns a signed integer that is cast to unsigned. I’m sure that you will not be surprised to hear that none of the functions that call UsbCoreReq() check to see if “-1” has been returned.

I don’t think this example is quite as horrific as some of the others I have highlighted – but it’s still pretty bad. As always your comments make this interesting!

Setting a bad example – part 4

Monday, August 9th, 2010 Nigel Jones

This is the fourth part in a series of  articles in which I’m ‘highlighting’ what I consider to be lousy code provided by IAR in their NXP LPC1764 development kit. This example is taken from the VirtualCom project but can be found in almost all of the projects.

This code excerpt is taken from a file ‘Terminal_18_24x12.c”.

/*************************************************************************
 *
 *    Used with ICCARM and AARM.
 *
 *    (c) Copyright IAR Systems 2006
 *
 *    File name   : Terminal_18_24x12.c
 *    Description : Font Terminal 18 (24x12)
 *
 *    History :
 *    1. Date        : December 2, 2006
 *       Author      : Stanimir Bonev
 *       Description : Create
 *
 *    $Revision: 28532 $
 **************************************************************************/
#include "drv_glcd.h"

static const unsigned char TextStream [] = {
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,

... One thousand five hundred and thirty+ more lines of data without a comment

0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

FontType_t Terminal_18_24_12 =
{
 12,
 24,
 0,
 255,
 (pInt8U)TextStream,
 "Terminal 18 (24x12)"
};

Other than the header shown above, there is not a single comment in a 1566 line file. To help complete the picture, here is the excerpt from the header file drv_glcd.h which defines the data type FontType_t.

typedef struct _FontType_t {
 Int32U H_Size;
 Int32U V_Size;
 Int32U CharacterOffset;
 Int32U CharactersNuber;
 pInt8U pFontStream;
 pInt8U pFontDesc;
} FontType_t, *pFontType_t;

I have not omitted the comments – there just aren’t any.

Now it’s reasonably clear that this file is used to define a font. To illustrate my complaint with this file, I will pose what I consider to be a quite reasonable situation.

The sales department comes to engineering and says “90% of our sales for this product are in Germany. We have received complaints from Germany that the digit seven is rendered incorrectly – can you please correct it?” (For those of you that don’t know, in Germany a ‘7’ is usually rendered with a  horizontal bar through the body. I would demonstrate it – but I can’t within the limitations of my editing environment).

Now the question is, how big a job should this be – and how big a job will it actually be based upon how the font has been implemented? I will let you the reader decide the answer to the first question. To the second question however, I will suggest an answer of days. The reason is that before I can possibly make the change I have to get the following questions answered:

  1. How is TextStream[] indexed? It’s likely by ASCII number – but who knows for sure?
  2. Just how many bytes are there per character?
  3. How do the bytes map onto the character – by row or by column and do they start at the bottom, the top, the left or the right?
  4. What about italics and bold? Are they included in this font definition or not? If not, is there a separate file that defines them, or is it done algorithmically?
  5. Is there a German style ‘7’ already in the font map that I could just use?

All of these questions could have and should have been answered in the font file. If it was me, I would have done it via a combination of coding changes and comments. For example I might have done something like this:

static const unsigned char TextStream [N_TYPES][N_CHARS][FONT_SIZE_IN_BYTES] =
{
 /* Start of standard fonts (c.f. bold and italic) */
 {
  /*  Character 0. This is the nul character and is non-printing.
  Appears so as to make indexing into TextStream[][] easy */
  {
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    ...

Would this have been a lot of work up-front? Absolutely. However compared to what the poor sod that has to maintain the font has to go through, it would have been time well spent. Unfortunately the industry seems to measure productivity based on how quickly code is written – and not on how easy it is to maintain. I for one would like to see it changed!