Archive for the ‘General C issues’ Category

A volatile tempest

Monday, September 27th, 2010 Nigel Jones

Regular readers will know that I often comment on the use of volatile in embedded systems. As a result I am occasionally contacted about my opinion on whether a compiler is generating correct code – particularly when hardware is being accessed. Well I was contacted last week by Ratish Punoose who had a classic problem in that his code compiled okay on GCC but not on IAR. He had contacted IAR, who in turn basically said the compiler is correct – and here is the explanation. Ratish turned to me and John Regehr for our opinions. Well John and I came to similar opinions – namely:

  1. Ratish’s code was a bit weird, but not dramatically so.
  2. The explanation from IAR made no sense.
  3. It did indeed appear to be a compiler bug.

Ratish then posted his issue to the Msp430 forum on Yahoo. You can read his post and the responses here.

I’m sure many of you are at this point thinking that IAR is in for another round of bashing from me. Well you’d be wrong. One of the first responders to Ratish’s post was Paul Curtis of Rowley compilers. Paul gives an admirable explanation as to why Ratish’s code is wrong (and by extension so am I). Now I’m sure that IAR and Rowley are fierce competitors, and so Paul is also to be commended for leaping to the defense of IAR.

Furthermore, later in the thread Anders Lindgren of IAR chimes in and adds his detailed and compelling explanation.

Having read the posts from Paul and Anders I think they are right and I’m wrong. So thanks Gentlemen for:

  1. Setting me straight
  2. Proving that in the wonderful world of volatile accesses, there is always something more to learn.

I think there are several other lessons to be learned from this episode. However I think I’ll save them for another post.

#include “includes.h”

Thursday, September 9th, 2010 Nigel Jones

I am sure that the title of this blog posting is familiar to most of the readers of this blog, in that you have opened up a C source file and found a single #include statement that references a file that is typically called ‘includes.h’. On opening ‘includes.h’ one invariably finds an enormous list of other header files. Furthermore, as you go through other source files in the project, you will find that they all use ‘includes.h’. I suspect that by this point, the readers of this blog are divided into camps, namely:

  1. Either: So what, I do it all the time because it makes my life a lot easier.
  2. Or: I want to scream whenever I see this done.

I’m one of the screamers – and this is my rationale.

Back in the dark ages when one had to compile on computers with extremely limited resources, the compilation time of a module was a major issue. One of the things that significantly affected compile time was the number and the size of the header files that a module opened. As a result, most of us took steps to ensure that we only included the header files that were needed. However, as processor speeds increased and compilers started using pre-compiled header files, this became less of an issue, such that today I seriously doubt if you’d notice much difference in compilation times regardless of the number of header files that are included. I don’t know but I suspect that this was the enabler that caused people to start using ‘includes.h’.

So if compilation time is no longer an issue, what’s the big deal? After all we have all had the hassle of compiling a file only to be told that we are missing a prototype or a data type. At which point we have to hunt down the requisite header file, include it and recompile. If you do this half a dozen times in a new module, then it takes you say 15 minutes before everything is OK – and who has 15 minutes to waste on such irritating details? Well, in my opinion it’s time well spent. Here’s my case:

Coupling Indication

The number of header files a module needs to use is a crude but effective indicator of coupling. A module that needs to include almost no header files is clearly a module that is extremely self contained. That is it isn’t relying upon the outside world. Modules like this are typically easier to maintain and also more immune from changes made elsewhere in the system. In short I like modules that don’t have to include a lot of header files. Indeed when I have finished writing a module, I take a look at its include list. If the list is long then it really makes me wonder whether I should be breaking the module apart in some way so as to reduce the degree of coupling between it and the outside world.

Maintenance – understanding coupling

This is related to the first point. If I need to do some maintenance on a module, then a quick look at the include list can tell me how this module interacts with the rest of the code. This can be extremely useful if one is trying to understand how a program is put together.

Maintenance – understanding functionality

If I look at the include list and I see ‘math.h’, then I know that the module is using transcendental functions, which in turn implies complex floating point operations, which in turn implies potentially long execution times. In a similar manner, if it includes the header for the hardware interrupt handler, then I know I’m dealing with something related to the chip. I can get all this sort of information in a two second scan of the include list.

Documentation

If you use an automated documentation tool such as Doxygen, then only including the header files that are needed by a module ensures that Doxygen generates a meaningful documentation set for you, rather than including hyperlinks to useless files.

Not getting what you want

I have left what is probably the biggest problem to last. By including an enormous number of header files you lay yourself wide open to problems like this:

Header1.h

#define FALSE 0
#define TRUE  !FALSE

Header17.h

#ifndef FALSE
#define FALSE 0UL
#define TRUE  1UL
#endif

Header26.h

#pragma(IGNORE_REDEFINITION_OF_MACROS)
#define FALSE NULL
#define TRUE !FALSE
#pragma(ERROR_ON_REDEFINITION_OF_MACROS)

Trust me when I tell you I have seen this done! In other words the more files you include, the more likely it is that the macro that you are blithely using does not in fact have the value you think it does. Time to debug problems such as these – a lot longer than 15 minutes!

Remedial Action

On the off chance that I have convinced an ‘includes.h’ fan of the error of their ways, it would be remiss of me to not tell you how to quickly find out just the header files needed by a module.

  1. Paste the include list of includes.h into the module.
  2. Delete the entry for includes.h
  3. Compile the code to make sure you haven’t broken anything.
  4. Lint the file. Lint will tell you all the header files that aren’t being used.
  5. Delete the unnecessary include statements.
  6. Repeat from step 3 until Lint is happy.

Of course the chances are that if you use ‘includes.h’ you aren’t using Lint. If you do start using Lint then it will do a lot more for you than just telling you about unnecessary includes.

A foreign perspective on variable names

Wednesday, August 18th, 2010 Nigel Jones

This blog is read by people from all over the world. I make this point not to brag, but rather to demonstrate that designing embedded systems is a truly global effort. Remarkably, despite this, it appears that a huge amount of embedded code is commented in English and / or uses English nomenclature for variable and function names. This is of course wonderful for those of us that are native English speakers. However I’ve often thought that designing embedded systems is hard enough without having the additional burden of working in a foreign language.

Anyway, I mention this as preamble because last week I found myself in a rather unusual situation for me. Namely I was handed a fairly sophisticated driver which was written by a native German speaker. Now one of the things I have always liked about the Germans is that they don’t kowtow to the altar of the English language – and so I found myself looking at code that was commented entirely in German and that used almost exclusively German for function and variable names. I was thus faced with trying to understand it – which with my limited knowledge of German was not at all easy.

Anyway, as I went through the code I found myself entering variable names into an online German-English dictionary – with very limited success. Now while part of the problem was undoubtedly the technical nature of the words, I don’t have the slightest doubt that the real problem was that the author was using abbreviations / slang / jargon as well as concatenating words (e.g. (in English) bufferindex) such that the online dictionaries were flummoxed. The net result was that I had a much harder time interpreting the code than would have been the case if I had understood the variable names. Needless to say this got me thinking. How many times has a non-native English speaker looked at some of my code and entered variable names into a dictionary only to be told that there is no such word? If you subscribe to the belief that you write code for other people to read then it follows that one should take the spoken language barrier into consideration. If one does, then certain ‘rules’ become apparent:

  1. Don’t abbreviate unless you have to. While BufrWrtLmt may be understandable to native English speakers, it must be really hard to comprehend for others.
  2. In concatenating words, make the word boundary clear either via underscore or via camel-case. Thus buffer_index or bufferIndex.
  3. Pay attention to your spelling. A simple spelling mistake such as writing temprature when you meant temperature can completely stymie someone using a dictionary. While I don’t know of a tool that spell checks variable names, there are several tools available for spell checking comments.

As a passing observation, not only will these changes make your code easier to comprehend for non-native English speakers, it will also do wonders for those of us that purport to speak English as our native tongue!

Setting a bad example – final thoughts

Sunday, August 15th, 2010 Nigel Jones

While I am sure that I could extend the setting a bad example  series of  articles I think it’s time to move on to other topics. Before I do so I’d like to give some final thoughts. The series has generated a lot of excellent comments. While the majority have been in response to a particular coding construct, a number of readers have expressed their frustration at how pervasive this problem is with vendor supplied code. My experience agrees with this assessment. In other words while this series has taken IAR to task, I don’t have the slightest doubt that if I had bought an ARM evaluation board from Keil, ImageCraft etc that I would have found similar things to complain about. In other words my experience is the norm and not the exception. Now I don’t think I’m going too far out on a limb by observing that  the code supplied with evaluation boards is very influential in that:

  1. It is likely to find itself incorporated into hundreds, if not thousands of products.
  2. Will be used both verbatim and as a template for future code by huge numbers of inexperienced engineers.

Thus the bottom line is that the code supplied with evaluation boards needs to be of the highest quality and to incorporate as many best practices as possible. While it would be great if this blog was influential enough to cause the vendors to change their ways, I suspect that little will really happen until people start complaining. Those of you that work for large organizations which buy a commensurate number of licenses are in the best position to make the change happen by loudly complaining to your sales representative every time you find some lousy code.

As always, thanks for reading.

Setting a bad example – part 5

Wednesday, August 11th, 2010 Nigel Jones

This is the fifth part in a series of  articles in which I’m ‘highlighting’ what I consider to be lousy code provided by IAR in their NXP LPC1764 development kit. This code excerpt is taken from usb_t9.c in the AudioDevice project.

/*************************************************************************
 * Function Name: UsbCoreReq
 * Parameters:  UsbCoreReqType_t Type
 *
 * Return: Int32U
 *
 * Description: Return device states
 *
 *************************************************************************/
Int32U UsbCoreReq (UsbCoreReqType_t Type)
{
 switch(Type)
 {
 case UsbCoreReqDevState:
   return(UsbDevCtrl.State.DS);
 case UsbCoreReqDevSusState:
   return(UsbDevCtrl.State.Suspend);
 case UsbCoreReqConfiquration:
   return(UsbDevCtrl.Configuration);
 case UsbCoreReqInterface:
   return(UsbDevCtrl.Interfaces);
 case UsbCoreReqDevOwnAddress:
   return(UsbDevCtrl.DevAdd);
 case UsbCoreReqWakeUpEnableStatus:
   return(UsbDevCtrl.Feature.RemoteWakeupEnable);
 case UsbCoreReqSelfPoweredStatus:
   return(USB_SELF_POWERED);
#if USB_HIGH_SPEED > 0
 case UsbCoreReqSpeed:
   return(UsbDevCtrl.HighSpeed);
#endif // USB_HIGH_SPEED > 0
 }
 return((Int32U) -1);
}

I suspect that on first blush many readers will not be too offended by this function. However it contains several things that are pet peeves of mine. I’ve listed them in the order that they occur in the function

  1. As with most of the code, I found the function header to be almost useless. Having read it multiple times and also having looked at the code, I’m still not really clear what the purpose of this function is other than to provide access to some of the parameters of the structure UsbDevCtrl.
  2. This function does nothing more than switch on a parameter and return a value. As I outlined in this post, IMHO one should use multiple simple functions rather than one large function such as this. The resulting code is smaller, faster and the purpose of each function is much clearer. It also avoids the problem that is discussed in point 7 below.
  3. The author has used the variable name ‘Type’. While I know this is legal and it may even be somewhat meaningful in this case, I find the idea of using ‘Type’ as a variable name just plain confusing.
  4. The function contains no-less than nine return statements. Most coding standards require that a function have a single exit point. I have to admit that if you’d asked me ten years ago whether I thought this was a good idea I’d have probably equivocated. However in recent years I have really made an effort to ensure that all my functions have a single exit point – and I have found it has improved my code a lot. Of course there are the odd cases where it really is unavoidable. However in this case, maintaining a single exit point would have been trivial.
  5. The argument to the return statement is in parentheses. This seems to be a popular construct, which I simply do not understand. What benefit do the parentheses confer? To those that ask what harm are they doing, then I would suggest that they make the code harder to read.
  6. The switch statement has no default clause. Yes, one will fall through to the last return statement – but it’s a sloppy way of handling it.
  7. The pièce de résistance is of course the cast of -1 to an unsigned integer. The function is declared as returning an unsigned integer – but in one case it returns a signed integer that is cast to unsigned. I’m sure that you will not be surprised to hear that none of the functions that call UsbCoreReq() check to see if “-1” has been returned.

I don’t think this example is quite as horrific as some of the others I have highlighted – but it’s still pretty bad. As always your comments make this interesting!