Archive for October, 2009

Embedded systems boot times

Monday, October 26th, 2009 Nigel Jones

Last week saw the release of Windows 7. Looking over the new features, the one that struck me the most was the effort Microsoft had put into decreasing the boot time of the OS. If the reports are to be believed, then Windows 7 boots dramatically faster than its predecessors – to which I say: about time! Almost contemporaneously with the Windows 7 announcement, I took delivery of a beautiful new Tektronix Mixed Signal Oscilloscope. It’s a model MSO2024 with four analog channels, 16 digital channels, a huge color display, a great user interface, tremendous connectivity and so on. Despite all this, I’m disappointed with the product. The reason – it takes 75 seconds to boot. Now if I’m preparing for a major debug session, then this 75 seconds isn’t terrible. However, most of the time when I turn a scope on, I’m interested in just getting a quick look at a signal – and then I’m done. For this usage mode, the MSO2024 fails miserably.

Now I’d like to think that this scope is an oddball in this respect – but it isn’t. I purchased a big fancy flat screen TV last year – and it takes about 5 seconds to boot from standby (i.e. powered, yet ‘off’) to being ‘on’. Maybe it’s my type A personality, but I find that time unacceptable (in part because I’m never sure if I’ve actually turned the thing on, or whether the remote control signal missed its mark).

Now without a doubt, these long boot times are partly a function of large processors, huge memories, complex RTOSes and so on. However, I think they are equally a result of poor design by the engineers (or perhaps poor specification by the marketing department).

Thus the bottom line – think about the boot time of your product. Your end user will appreciate you doing so.

Incidentally, if there is sufficient interest, I may publish some tips on how to minimize boot times in future blog postings.


Whither white space?

Tuesday, October 20th, 2009 Nigel Jones

I was looking over some code I wrote a year ago in preparation for making some minor enhancements, when I noticed that in one place I had two blank lines between functions, instead of the single blank line mandated by my coding standard. I immediately and instinctively corrected it – as is my norm. However, having done so, I paused to consider what I’d just done – and why.

On the one hand, this was a clear violation of my coding standard – and so it had to be corrected. On the other hand, the violation wasn’t doing any harm, per se, and correcting it came at a cost – namely that someone (probably me) browsing the version control system at a later date will see that the file has been touched, and may choose to investigate what was changed, only to find out that it was a simple white space correction. (I appreciate that version control systems can be set up to ignore white space. I choose not to use that option.)

Now I suspect that readers of this blog will be divided. Some will think I was quite right to eliminate the extra line, whereas others are thinking – doesn’t this guy have better things to do in life? Which brings me to my point!

Some people are completely anal retentive when it comes to white space. They are very careful about indentation, alignment of comments, use of blank lines and so on. I fall squarely into this category – as does Jean Labrosse of µC/OS-II fame. Others could not care less about white space. They will arbitrarily have six blank lines between two functions, and then no lines between the next two functions. Their comments are aligned all over the place, and they rarely put spaces between, for example, the elements of a for loop statement. Finally, there’s the third (and largest) group, who fall somewhere in between these two extremes.
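To make the contrast concrete, here’s a contrived example of my own – the same trivial function written first as the disciplined camp would lay it out, and then as the careless camp tends to:

#include <stdint.h>

#define N_SAMPLES (16u)

/* Disciplined layout: consistent indentation, blank lines, spaces in the for loop. */
uint32_t sum_disciplined(uint16_t const samples[])
{
    uint32_t sum = 0u;
    uint8_t  i;

    for (i = 0u; i < N_SAMPLES; i++)
    {
        sum += samples[i];
    }

    return sum;
}

/* Identical logic, careless layout. */
uint32_t sum_careless(uint16_t const samples[]){
uint32_t sum=0u; uint8_t i;
for(i=0u;i<N_SAMPLES;i++){sum+=samples[i];}
return sum;}

Both functions compile to exactly the same thing – which is rather the point.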

Now I look at a lot of code, and having done so, I think I can make a sweeping generalization, which I’ll call the “Nigel Jones white space principle”. Succinctly put, it states:

White space discipline is highly correlated with coding discipline.

That is, those who are careless about white space are often careless about a lot of other things, and the converse also seems to apply. As a result, when I look at code, literally the first thing I note is how disciplined the author was in the use of white space. If the code is cleanly and consistently laid out, then I get a good first impression, and the chances are the code will be first rate.

Now I am unsure which is the cause and which is the effect here. In other words, does white space discipline lead to more disciplined code overall, or is it the other way around? Regardless, if your code looks like a mess, then I’d humbly suggest that you literally clean up your act – your career will thank you!


Effective C Tip #7 – Use strongly typed function parameters

Monday, October 12th, 2009 Nigel Jones

This is the seventh in a series of tips on writing effective C. Today’s topic concerns function parameters – more to the point, how you should choose them in order to make your code considerably more resilient to parameter passing errors. What do I mean by parameter passing errors? Well, consider a function that is intended to draw a rectangle on a display. The lousy way to design this function interface would be something like this:

void draw_rect(int x1, int y1, int x2, int y2, int color, int fill)
{
...
}

I must have seen functions like this many times. So what’s wrong with this, you ask? Well, in computer jargon, the parameters are too weakly typed. To put it in plain English, it’s way too easy to pass a Y ordinate when you are supposed to pass an X ordinate, or indeed to pass a color when you are supposed to be passing an ordinate or a fill pattern. Although in this case (and indeed in most cases) these kinds of mistakes are clearly discernible at run time, I’m a firm believer in catching as many problems as possible at compile time. So how do I do this? Well, there are various things one can do. The most powerful technique is to use considerably more meaningful data types. In this case, I’d do something like this:

typedef struct
{
 int x;
 int y;
} COORDINATE;

typedef enum
{
 Red, Black, Green, Purple .... Yellow
} COLOR;

typedef enum
{
 Solid, Dotted, Dashed .. Morse
} FILL_PATTERN;

void draw_rect(COORDINATE p1, COORDINATE p2, COLOR color, FILL_PATTERN fill)
{
...
}

Now clearly it’s highly likely that your compiler will complain if you attempt to pass a coordinate to a color and so on – and thus this is a definite improvement. However, nothing I’ve done here will prevent the X & Y ordinates being interchanged. Unfortunately, most of the time you are out of luck on this one – except in the case where you are dealing with certain sizes of display panels with resolutions such as 320 * 64, 320 * 128 and so on. In these cases, the X ordinate must be represented by a uint16_t whereas the Y ordinate may be represented by a uint8_t. In which case my COORDINATE data type becomes:

typedef struct
{
 uint16_t x;
 uint8_t y;
} COORDINATE;

This will at least cut down on the incidence of parameters being passed incorrectly.
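As a hedged illustration of the kind of mistake the compiler can now catch (the abbreviated enums and the call site below are my own inventions, not part of any real display driver):

#include <stdint.h>

typedef struct
{
    uint16_t x;
    uint8_t  y;
} COORDINATE;

typedef enum { Red, Black, Green, Yellow } COLOR;       /* abbreviated for the example */
typedef enum { Solid, Dotted, Dashed } FILL_PATTERN;    /* abbreviated for the example */

void draw_rect(COORDINATE p1, COORDINATE p2, COLOR color, FILL_PATTERN fill);

void example(void)
{
    COORDINATE p1 = { 10u, 20u };
    COORDINATE p2 = { 100u, 50u };

    draw_rect(p1, p2, Red, Solid);          /* OK */

    /* draw_rect(p1, Red, p2, Solid); */    /* Passing a COLOR where a COORDINATE is
                                               expected (and vice versa) - rejected at
                                               compile time. */
}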

Although you probably will not get much help from the compiler, you can also often get a degree of protection by declaring appropriate parameters as const. A good example of this is the standard C function memcpy(). If, like me, you find yourself wondering whether it’s memcpy(to, from) or memcpy(from, to), then an examination of the function prototype tells you all you need to know:

void *memcpy(void *s1, const void *s2, size_t n);

That is, the first parameter is simply declared as a pointer to void, whereas the second parameter is declared as a pointer to const void. In short, the second parameter points to what we are reading from, and hence memcpy is indeed memcpy(to, from). Now I’m sure that many of you are thinking to yourselves – so what, the real solution to this is to give the parameters in the function prototype meaningful names. For example:

void *memcpy(void *destination, const void *source, size_t nos_bytes);

Although I agree wholeheartedly with this sentiment, I’ll make two observations:

  1. You are assuming that the person reading your code is sufficiently fluent in the language (English in this case) that the names are meaningful to them.
  2. Your idea of a meaningful label may not be shared by others. I’ve noticed that this is particularly the case with software, as it seems that all too often the ability to write code and the ability to put a meaningful sentence together are inversely correlated.
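Incidentally, coming back to the const point for a moment, here’s a minimal sketch of where that protection actually bites (the arrays and their names are mine, purely for illustration):

#include <string.h>
#include <stdint.h>

static const uint8_t default_config[8] = { 0u };    /* factory defaults - in flash, say */
static uint8_t       working_config[8];

void restore_defaults(void)
{
    memcpy(working_config, default_config, sizeof(working_config));     /* OK: (to, from) */

    /* memcpy(default_config, working_config, sizeof(working_config)); */
    /* Arguments swapped by mistake: the compiler complains about discarding the
       const qualifier - exactly the degree of protection described above. */
}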

The final technique that I employ concerns psychology! Now one can argue that the failure to pass parameters correctly is due to laziness on the part of the caller. At the end of the day, this is indeed the case. However, I suspect that in many cases it’s not because the caller was lazy, but rather because the caller thought they knew what the parameter ordering was (or should be). A classic example of this of course concerns dates. Being from the UK (or, more relevantly, Europe), I grew up thinking of dates as day / month / year. Here in the USA, they of course use the month / day / year format. Thus when designing a function that needs to be passed the day, month and year, in what order should one declare the parameters? Well, in my opinion it’s year, month, day. That is, the function should look like this:

void foo(int16_t year, MONTH month, uint8_t day);

There are several things to note:

  1. By putting the year first, one causes both Europeans and Americans to think twice. This is where the psychology comes in!
  2. I’ve made the year signed – because it can indeed be negative, whereas the month and day cannot.
  3. I’ve made the month a MONTH data type, thus considerably increasing the likelihood that an attempt to pass a day when a month is required will be flagged by the compiler.
  4. I’ve made the day a uint8_t – yet another distinct type, and one that maps well onto its expected range. Furthermore, attempts to pass most year values to this parameter will result in a compilation warning.

Thus I’ve used a combination of psychology and good coding practice to achieve a more robust function interface.
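To pull the date example together, here’s a minimal sketch of how the declarations might look. Note that the MONTH type isn’t defined above, so the definition here is purely my assumption:

#include <stdint.h>

typedef enum
{
    Jan = 1, Feb, Mar, Apr, May, Jun,
    Jul, Aug, Sep, Oct, Nov, Dec
} MONTH;

void foo(int16_t year, MONTH month, uint8_t day);

void date_example(void)
{
    foo(2009, Oct, 26u);        /* OK - and the year-first ordering makes callers stop and think */

    /* foo(26, Oct, 2009); */   /* Day and year transposed: 2009 will not fit in a uint8_t,
                                   so a decent compiler (or Lint) warns about the conversion. */
}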
Thus the bottom line when it comes to designing function interfaces:

  1. Use strongly typed parameters.
  2. Use const where you can.
  3. Don’t assume that what is ‘natural’ to you is ‘natural’ to everyone.
  4. Do indeed use descriptive parameter names – but don’t assume that everyone will understand them.
  5. Apply some pop psychology if necessary.

I hope you find this useful.

Next Tip

Previous Tip


Is MISRA compliance worthwhile?

Wednesday, October 7th, 2009 Nigel Jones

I had been planning on talking about MISRA C compliance in a month or two from now. However, in the comments section of my recent post about bitfields, Anand posed the following question:

However I am writing to ask your opinion about MISRA C compliance. This is the first time a client company has asked us for MISRA C compliance and I am not quite sure where to start. I started reading the guidelines from the web, however soon realised that its an enormous task and I would never get through all of them. So what I would like to know from you is how realistic is it to expect to comply with all the guidelines? Since my compiled code size is less than 2KB so should we even bother with MISRA C compliance?
Your valuable insights would be highly appreciated

I first gave my thoughts about MISRA C in an article published in 2002 in Embedded Systems Programming magazine (now Embedded Systems Design). Looking back over the article, I don’t see anything in it that I disagree with now. However, there are certainly some things that I’d add, having attempted to adhere to the MISRA guidelines in several large projects. After I have given my thoughts, I’ll try and address Anand’s questions.

I’ll start by noting that since I wrote the aforementioned article, MISRA released a second edition of their guidelines in 2004. The second edition was a major revision, and attempted to address many of the ambiguities in the 1998 version. As such, if someone is asking for MISRA compliance, they usually mean the 2004 rules; however it would behoove you to check!

Most of the MISRA rules can be checked via static analysis (i.e. by a compiler-like tool), and indeed many more compilers now come with a MISRA checking option. Thus for those of you using such a compiler, conformance to most of the rules may be checked by simply using the correct compiler switch. For those of you that don’t have such a compiler, a great alternative is my favorite tool – PC-Lint from Gimpel – which brings me nicely to the main point I wish to make. MISRA attempts to protect you from the darkest, nastiest corners of the C language – which is exactly what PC-Lint does as well. However, MISRA C does it by banning various constructs, whereas PC-Lint detects and informs you when you are using a construct in a potentially dangerous way. To put it another way, MISRA treats you like a child and PC-Lint treats you like an adult. As a result, I’ll take code that is ‘Lint free’ over code that is ‘MISRA compliant’ any day. I’d also add that making code Lint free is often a lot more challenging than making it MISRA compliant.

Now does this mean that I think the MISRA rules are not worthwhile? Absolutely not! Indeed, the vast majority of the rules are pretty much good programming practice in codified form. For example, rule 2.2 states: “Source code shall only use ISO9899:1990 ‘C’ style comments”. Now regardless of whether you agree with this rule or not, it’s my opinion that source code that mixes C style comments and C++ style comments reflects a lack of discipline on the author’s part. Thus I like this rule – and I adhere to it.

Where I start to run into problems with MISRA is with rules such as 20.6: “The macro offsetof in stddef.h shall not be used”. I wrote an article in 2004 for Embedded Systems Programming magazine entitled “Learn a new trick with the offsetof() macro”. The examples I give in the article are elegant and robust solutions to certain common classes of problems in embedded systems. Solving these problems without using the offsetof() macro is hard and/or tedious and/or dangerous. In short, the medicine prescribed by MISRA is worse than the supposed disease.
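To give a flavor of the sort of construct that rule 20.6 outlaws, here’s a simplified sketch of my own (not an example from the article): table-driven access to the fields of a configuration block.

#include <stddef.h>
#include <stdint.h>

/* A hypothetical non-volatile configuration block. */
typedef struct
{
    uint16_t checksum;
    uint8_t  contrast;
    uint8_t  backlight;
    uint32_t serial_number;
} CONFIG;

/* One descriptor per field: where it lives within the block and how big it is. Generic
   read / write / restore-defaults code can then walk this table, rather than being
   hand-written (and hand-maintained) for every field. */
typedef struct
{
    size_t offset;
    size_t size;
} FIELD_DESCRIPTOR;

static const FIELD_DESCRIPTOR config_fields[] =
{
    { offsetof(CONFIG, contrast),      sizeof(uint8_t)  },
    { offsetof(CONFIG, backlight),     sizeof(uint8_t)  },
    { offsetof(CONFIG, serial_number), sizeof(uint32_t) },
};

Doing the same thing without offsetof() means either hard-coding the offsets (dangerous) or writing field-specific code (tedious) – hence my objection to the rule.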

Putting this all together, my feelings on MISRA are as follows.

  1. Its intentions are excellent – and I wholeheartedly support them.
  2. Most of its rules are really good.
  3. There are times when I just have to say – sorry, your attempts to make my code ‘safer’ are actually having the opposite effect, and so, as an experienced embedded systems engineer, I’m choosing to ignore the rule in the interest of making a safer product. Note that I don’t do this on a whim!

Now clearly, when it comes to my third point, I can certainly be accused of hubris. However, at the end of the day my clients typically hire me for my experience / knowledge and not for my ability to follow a rule book.

As a final note, I must say that I think the MISRA committee has overall done a very fine job. Trying to come up with a set of rules for the vastly disparate embedded systems industry (I know they are only really aimed at the car industry) is essentially an impossible task.

So with all of the above as a preamble, I think I can address Anand’s questions.

Where to Start?
Order a copy of the guidelines from MISRA. You can get an electronic copy for £10 (about $15). If you are using a C compiler that has a MISRA checking switch, then turn it on. If not, buy a copy of PC-Lint. If you are not already using PC-Lint, then you should be. It will be the best $389 you ever spent.

Then What?
Take a snapshot of your code base in your version control system before starting.
The first time you check your code for compliance, you will undoubtedly get an enormous number of errors. However, I think you will find that half a dozen rules are creating 90% of the problems, so you should be able to knock the error count down fairly quickly to something manageable. At that point you will be into some of the tougher problems. My recommendation is not to blow off the MISRA errors, but rather to understand why MISRA thinks your constructs are unsafe. Only once you are convinced that your particular instance is indeed safe should you choose to ignore the violation.

Is it worth it for 2K of object code?
Yes – and no. With an executable image of 2K, the chances are most of your code is very tightly tied to the hardware. In my experience, the closer you get to the hardware, the harder it is to achieve MISRA compliance, simply because interaction with the hardware typically relies upon extensions to standard C – and these extensions violate MISRA rule 1.1. Thus you have no hope of making your code compliant, literally starting with the first rule. (The MISRA committee aren’t stupid, and so they have a formal method for allowing you to waive certain rules – and this is a clear example of why it’s necessary. However, it’s a tough way to start out). Does this mean that the exercise is pointless though? Probably not, as at the end of the day you’ll probably have cleaner, more portable, more easily maintained code. However, I seriously doubt that it will be compliant.
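As a flavor of what I mean (a contrived example of my own, with a made-up register address): even the canonical way of touching a memory-mapped register relies on a cast between an integer and a pointer, which the MISRA rules flag – and that’s before you get to genuine vendor extensions such as interrupt keywords and #pragmas.

#include <stdint.h>

/* The address and bit value are invented purely for illustration. */
#define TIMER_CTRL    (*(volatile uint16_t *)0x40001000u)
#define TIMER_ENABLE  ((uint16_t)0x0001u)

void timer_start(void)
{
    TIMER_CTRL |= TIMER_ENABLE;     /* perfectly normal embedded C - yet non-compliant
                                       without a documented deviation */
}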

Finally, I’ll answer a question that you didn’t ask. What should I say to a client that asks for MISRA compliance? Well the first thing to determine is whether compliance is desired or required. If it’s the former, then what they are probably asking for is work that conforms to industry best practices. In which case what I normally do is explain to them that the code I deliver will be Lint free – and that this is a far higher hurdle to cross than MISRA compliance. If however MISRA compliance is a must, then you have no option other than to bite the bullet and get to work. It would probably make sense to retain a consultant for a day or two to help you get up to speed quickly.


A taxonomy of bug types in embedded systems

Wednesday, October 7th, 2009 Nigel Jones

Over the next few months I’ll be touching upon the subjects of debugging and testing embedded systems. Although much has been written about these topics (often by companies looking to sell you something), I’ve always been struck by the fact that many of these discussions treat errors as if they were all cut from the same cloth. Clearly this is foolhardy, as it’s my experience that understanding what class of error you have is key to adopting an effective debugging and testing strategy. With that being said, my taxonomy of embedded systems errors appears below, arranged roughly in the order in which one encounters them in an embedded project. I might also add that the difficulty of solving these problems roughly follows the same order, with syntax errors being trivial to identify and fix, while race conditions can be extremely difficult to identify (even if the fix is fairly easy).

Group 1 – Building a linked image
Syntax errors
Language errors
Build environment problems (makefile dependencies, linker configurations)

Group 2 – Getting the board up and running
Hardware configuration errors (failure to setup peripherals correctly)
Run time environment errors (stack & heap allocation, memory models etc)
Software configuration errors (failure to use library code correctly)

Group 3 – Knocking off the obvious mistakes
Coding errors (initialization, pointer dereferencing, N + 1 issues etc)
Algorithmic errors

Group 4 – Background / Foreground issues
Re-entrancy
Atomicity
Interrupt response times

Group 5 – Timing related
Resource allocation mistakes
Priority / scheduling issues
Deadlocks
Priority inversion
Race conditions
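To give just a single concrete flavor of the later groups ahead of those future posts, here’s the classic Group 4 atomicity bug, in a contrived sketch of my own (the interrupt control macros are placeholders for whatever your compiler provides):

#include <stdint.h>

/* Placeholders only - substitute your compiler's interrupt-control intrinsics. */
#define DISABLE_INTERRUPTS()
#define ENABLE_INTERRUPTS()

static volatile uint16_t tick_count;    /* incremented in a timer ISR */

/* Buggy: on an 8-bit CPU this 16-bit read takes more than one instruction, so if the
   ISR fires part way through, the two halves of the result come from different counts. */
uint16_t get_ticks_buggy(void)
{
    return tick_count;
}

/* Fixed: make the read atomic by briefly locking out the interrupt. */
uint16_t get_ticks(void)
{
    uint16_t snapshot;

    DISABLE_INTERRUPTS();
    snapshot = tick_count;
    ENABLE_INTERRUPTS();

    return snapshot;
}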

It’s my intention over the next few months to discuss how I set about solving these sorts of problems, so it’s important that I’ve got the groups right. Thus if anyone thinks this taxonomy is missing an important group, then perhaps you could let me know via the comments section or email.
