Archive for the ‘Coding Standards’ Category

What’s in your main() header?

Saturday, February 2nd, 2013 Nigel Jones

One of the consequences of being in the consulting business is that I get to look at a lot of code written by other people. Usually it is necessary for me to get up to speed on the code as quickly as possible, and so to this end, one of the first things I do is look for main.c, or if it doesn’t exist the file that contains main(). Here’s what I usually find:

/********************************************************************************
 main.c
 Possibly a one line description.
 Legal notice. Sometimes many lines long.
 *********************************************************************************/

That’s it. Now maybe it’s just me, but I find this a bit inadequate. Before I describe what I put in the header for main.c, I should first note my motivation. Anyone that has written code for many years realizes that the whole point of writing code is to allow it to be maintained. The person maintaining the code may be a future version of yourself, but often is some poor sod who gets thrown a bunch of code and told to get on with it. As a result, it is imperative that this future maintainer be told as much as possible about what it is they are maintaining. Now I realize that a lot of what I describe below could be described elsewhere. However, it’s my experience that Word documents and other non source code related documents tend to get lost over time (or perhaps more accurately not packaged with the source code when it is given to someone else), and so by putting this information in main.c, you pretty much guarantee that the maintainer will receive the information. With this as a background, here’s what I think should be in the header for main.c.

A product description

I typically write somewhere between 10 and 100 lines of text describing the product, what it does, how it does it, what makes it unique and the things about it that make it difficult. Note I’m not describing the code. I always find this a challenge because it forces me to really get to the core of the product. I can sometimes take many hours on this stage, as I try to refine and precis my description to include as much useful information as possible. Who has this sort of time you ask? Well if you think about it, if you can’t write a concise, yet detailed description of the product, surely you aren’t ready to start writing code? Thus if you go through this exercise and find yourself stymied, then you simply shouldn’t be sitting in front of a text editor.

Text Editor settings

Talking of text editors, the next thing you should have is an entry that describes how you have your text editor configured. I’m not interested in getting into a discussion about what your text editor settings should be – I’d just like to know what they are so that I can configure my text editor to match. This is a critical step as there is no bigger time waster than trying to understand code that looks like a disaster because you used tabs with an indentation of 2 and my editor is using an indentation of 4.

Development Environment

The text editor is of course part of the larger development environment. While it’s obvious to you what build environment you are using, it isn’t to someone else. Thus if you are using an IDE from Keil then say so. Conversely if you are from the IDE’s are evil camp and instead rely upon makefiles, well make that clear as well. Note the presence of a makefile in the source code directory does not IMHO constitute adequate documentation that this is how you intend the code to be built.

Compiler make and version

Almost every project I look at fails to make it clear what compiler make and version the code was written for. This always blows me away, because I’ve never yet seen an embedded system that doesn’t rely on compiler specific facets for it to successfully compile. Thus you should spell out exactly what compiler you used – even if you don’t think it really matters much.

Libraries

If you are using libraries, particularly ones from a third party, then you should really be spelling this out and of course specifying what version of the library you used. If there are special licensing restrictions on the use of the library, then this isn’t a bad place to mention it either.

Other tools

If you use other tools, particularly code generation tools, then it would be really nice to let the reader know that your code relies upon tool X, version Y. If you are using make rather than an IDE, it would also be nice to let us all know what version of make you used.

CPU configuration

Many CPUs are configurable via fuse bits of some type (PICs and AVRs are prime examples). These configuration bits usually have a dramatic impact on how the CPU behaves, and so it is critical that you document what fuse bit settings you are assuming. It’s possible to waste many hours debugging a system that in fact has no code problems per se, but rather is simply misconfigured at the fuse bit level.

How to build

Finally, it would be really nice if you told everyone how to actually make the executable. I’m constantly amazed at the number of projects I see where either the method of building is unclear, or worse, the ‘obvious method’ (e.g. typing make) results in a build failure because prior to e.g. invoking make, it is necessary to run some batch file etc.

While I think there’s a lot more project specific information that can go in the header, I think the above is a pretty decent start. I’d be interested in hearing about other information that you put in your main.c header.

 

 

What does 0x47u mean anyway?

Saturday, July 21st, 2012 Nigel Jones

In the last couple of years I have had a large number of folks end up on this blog as a result of search terms such as “what does 0X47u mean?” In an effort to make their visit more productive, I’ll explain and also offer some thoughts on the topic.

Back in the mists of time, it was considered perfectly acceptable to write code that looks like this:

unsigned int foo = 6;

Indeed I’m guessing that just about every C textbook out there has just such a construct somewhere in its first few chapters. So what’s wrong with this you ask? Well, according to the C90 semantics, constants by default are of type ‘signed int’. Thus the above line of code takes a signed int and assigns it to an unsigned int. Now not so many years ago, most people would have just shrugged and got on with the task of churning out code. However, the folks at MISRA looked askance at this practice (and correctly so IMHO), and promulgated rule 10.6:

“Rule 10.6 (required): A “U” suffix shall be applied to all constants of unsigned type.”

Now in the world of computing, unsigned types don’t seem to crop up much. However in the embedded arena, unsigned integers are extremely common. Indeed IMHO you should use them. For information on doing so, see here.

Thus what has happened as MISRA adoption has spread throughout the embedded world, is you are starting to see code that looks like this:

unsigned int foo = 6u;

So this brings me to the answer to the question posed in the title – what does 0x47u mean? It means that it is an unsigned hexadecimal constant of value 47 hex = 71 decimal. If the ‘u’ is omitted then it is a signed hexadecimal constant of value 47 hex.

Some observations

You actually have three ways that to satisfy rule 10.6. Here are examples of the three methods.

unsigned int foo = 6u;
unsigned int foo = 6U;
unsigned int foo = (unsigned int)6;

Let’s dispense with the third method first. I am not a fan of casting, mainly because casting makes code hard to read and can inadvertently cover up all sorts of coding mistakes. As a result, any methodology that results in increased casts in code is a bad idea. If that doesn’t convince you, then consider initializing an unsigned array using casts:

unsigned int bar[42] = {(unsigned int)89, (unsigned int)56, (unsigned int)12, ... };

The result is a lot of typing and a mess to read. Don’t do it!

What then of the first methods? Should you use a lower case ‘u’ or an upper case ‘U’. Well I have reluctantly come down in favor of using an upper case ‘U’. Aesthetically I think that the lower case ‘u’ works better, in that the lower case letter is less intrusive and keeps your eye on the digits (which after all is what’s really important). Here’s what I mean:

unsigned int bar[42] = {89u, 56u, 12u, ... };
unsigned int bar[42] = {89U, 56U, 12U, ... };

So why do I use upper case ‘U’? Well it’s because ‘U’ isn’t the only modifier that one can append to an integer constant. One can also append an ‘L’  or ‘l’ meaning that the constant is of type ‘long’. They can also be combined as in ‘UL’, ‘ul’, ‘LU’ or ‘lu’, to signify an unsigned long constant. The problem is that a lower case ‘l’ looks an awful lot like a ‘1’ in most editors. Thus if you write this:

long bar = 345l;

Is that 345L or 3451? To really see what I mean, try these examples in a standard text editor. Anyway as a result, I always use upper case ‘L’ to signify a long constant – and thus to be consistent I use an upper case ‘U’ for unsigned. I could of course use ‘uL’ – but that just looks weird to me.

Incidentally based upon the code I have looked at over the last decade or so, I’d say that I’m in the minority on this topic, and that more people use the lower case ‘u’. I’d be interested to know what the readers of this blog do – particularly if they have a reason for doing so rather than whim!

 

The Crap Code Conundrum

Friday, June 29th, 2012 Nigel Jones

Listed below are three statements. Based on my nearly thirty years in the embedded space I can confidently state that: One I have never heard stated. Another I have rarely heard stated, and the third I hear a lot. Here they are in order:

  1. I write crap code.
  2. You know so-and-so. (S)he writes really good code.
  3. This code is complete crap.

If your experience comports with mine, then it leads to what I have coined the ‘crap code conundrum’. In short, crap code is everywhere – but no one admits to or realizes they are writing it! So how can this be? Well I see several possibilities:

  1. In fact a lot of so called crap code is labeled as such because the author did things differently to the way the reader would have done it. I think it’s important to recognize this before summarily dismissing some code. Notwithstanding this, I all too often find myself saying ‘this code is complete crap’ – because it is!
  2. This is related to point 1, and essentially comes down to different people have different ideas about what constitutes good code. For example I think wrapping code in complex macros is an invitation for disaster. Others see it as a perfectly good way of simplifying things. (I’m right :-))
  3. The code started out being pretty good and has degenerated over time because the author hasn’t been allowed the time to perform the necessary refactoring. I think this does explain a lot of the bad code I see.
  4. The folks that write crap code are completely oblivious to the fact they are doing it. Indeed it’s only the self aware / self critical types that would even bother to ask themselves the question ‘is my code any good?’ Indeed, the first step to improving ones code is to ask oneself the question – how can I improve my code?

My gut feel is that point 4 is most likely the main cause. Now if you are so self-absorbed that you wouldn’t even dream to ask yourself the question ‘do I write crap code?’, then I seriously doubt whether you’d be reading this article. However if you have crossed this hurdle, then how can you determine if the code you are writing is any good? Well I took a stab at this a while back with this article . However some of the commenters pointed out that it’s quite easy to write code that has good metrics – yet is still complete crap. So clearly the code metrics approach is part of the story – but not the entire story.

So a couple of weeks ago I found myself in a bar in San Francisco having a beer with Michael Barr and a very smart  guy Steve Loudon. The topic of crap code came up and I posed the question ‘how can you tell code is crap?’ After all I think that crap code is a bit like pornography – you know it when you see it. After a spirited debate, the most pithy statement we could come up with is this:

If it’s hard to maintain, it’s crap.

Clearly there are all sorts of exceptions and qualifications, but at the end of the day I think this statement pretty much says it all. Thus if you are wondering if you write crap code, just ask yourself the question – how hard is this code to maintain? If you don’t like the answer, then it’s time to make a change.

 

 

 

Optimizing for the CPU / compiler

Sunday, June 3rd, 2012 Nigel Jones

It is well known that standard C language features map horribly on to the architecture of many processors. While the mapping is obvious and appalling for some processors (low end PICs, 8051 spring to mind), it’s still not necessarily great at the 32 bit end of the spectrum where processors without floating point units can be hit hard with C’s floating point promotion rules. While this is all obvious stuff, it’s essentially about what those CPUs are lacking. Where it gets really interesting in the embedded space is when you have a processor that has all sorts of specialized features that are great for embedded systems – but which simply do not map on to the C language view of the world. Some examples will illustrate my point.

Arithmetic vs. Logical shifting

The C language does of course have support for performing shift operations. However, these are strictly arithmetic shifts. That is when bits get shifted off the end of an integer type, they are simply lost. Logical shifting, sometimes known as rotation, is different in that bits simply get rotated back around (often through the carry bit but not always). Now while arithmetic shifting is great for, well arithmetic operations, there are plenty of occasions in which I find myself wanting to perform a rotation. Now can I write a rotation function in C – sure – but it’s a real pain in the tuches.

Saturated addition

If you have ever had to design and implement an integer digital filter, I am sure you found yourself yearning for an addition operator that will saturate rather than overflow. [In this form of arithmetic, if the integral type would overflow as the result of an operation, then the processor simply returns the minimum or maximum value as appropriate].  Processors that the designers think might be required to perform digital filtering will have this feature built directly into their instruction sets.  By contrast the C language has zero direct support for such operations, which must be coded using nasty checks and masks.

Nibble swapping

Swapping the upper and lower nibbles of a byte is a common operation in cryptography and related fields. As a result many processors include this ever so useful instruction in their instruction sets. While you can of course write C code to do it, it’s horrible looking and grossly inefficient when compared to the built in instruction.

Implications

If you look over the examples quoted I’m sure you noticed a theme:

  1. Yes I can write C code to achieve the desired functionality.
  2. The resultant C code is usually ugly and horribly inefficient when compared to the intrinsic function of the processor.

Now in many cases, C compilers simply don’t give you access to these intrinsic functions, other than resorting to the inline assembler. Unfortunately, using the inline assembler causes a lot of problems. For example:

  1. It will often force the compiler to not optimize the enclosing function.
  2. It’s really easy to screw it up.
  3. It’s banned by most coding standards.

As a result, the intrinsic features can’t be used anyway. However, there are embedded compilers out there that support intrinsic functions. For example here’s how to swap nibbles using IAR’s AVR compiler:

foo = __swap_nibbles(bar);

There are several things to note about this:

  1. Because it’s a compiler intrinsic function, there are no issues with optimization.
  2. Similarly because one works with standard variable names, there is no particular likelihood of getting this wrong.
  3. Because it looks like a function call, there isn’t normally a problem with coding standards.

This then leads to one of the essential quandaries of embedded systems. Is it better to write completely standard (and hence presumably portable) C code, or should one take every advantage of neat features that are offered by your CPU (and if it is any good), your compiler?

I made my peace with this decision many years ago and fall firmly into the camp of take advantage of every neat feature offered by the CPU / compiler – even if it is non-standard. My rationale for doing so is as follows:

  1. Porting code from one CPU to another happens rarely. Thus to burden the bulk of systems with this mythical possibility seems weird to me.
  2. End users do not care. When was the last time you heard someone extoll the use of standard code in the latest widget? Instead end users care about speed, power and battery life. All things that can come about by having the most efficient code possible.
  3. It seems downright rude not to use those features that the CPU designer built in to the CPU just because some purist says I should not.

Having said this, I do of course understand completely if you are in the business of selling software components (e.g. an AES library), where using intrinsic / specialized instructions could be a veritable pain. However for the rest of the industry I say use those intrinsic functions! As always, let the debate begin.

 

Configuring hardware – part 1.

Saturday, November 13th, 2010 Nigel Jones

One of the more challenging tasks in embedded systems programming is configuring the hardware peripherals in a microcontroller. This task is challenging because:

  1. Some peripherals are stunningly complex. If you have ever configured the ATM controller on a PowerQUICC processor then you know what I mean!
  2. The documentation is often poor. See for example just about any LCD controller’s data sheet.
  3. The person configuring the hardware (i.e. me in my case) has an incomplete understanding of how the peripheral works.
  4. One often has to write the code before the hardware is available for testing.
  5. Manufacturer supplied example code is stunningly bad

I think I could extend this list a little further – but you get the idea. Anyway, I have struggled with this problem for many years. Now while it is impossible to come up with a methodology that guarantees correct results, I have come up with a system that really seems to make this task easier. In the first part of this series I will address the most elemental task – and that is how to set the requisite bits in the register.

By way of example, consider this register definition.

This is a control register for an ADC found in the MSP430 series of microcontrollers. The task at hand is how to write the code to set the desired bits. Now in some ways this is trivial. However if you are serious about your work, then your concern isn’t just setting the correct bits – but doing so in such a manner that it is crystal clear to someone else (normally a future version of yourself) as to what you have done – and why. With this as a premise, let’s look at some of the ways you can tackle this problem.

Magic Number

Probably the most common methodology I see is the magic number approach. For example:

ADC12CTL0 = 0x362C;

This method is an abomination. It’s error prone, and very difficult to maintain. Having said that, there is one case in which this approach is useful – and that’s when one wants to shutdown a peripheral. In which case I may use the construct:

ADC12CTL0 = 0;   /* Return register to its reset condition */

Other than that, I really can’t see any justification for this approach.

Bit Fields

Even worse than the magic number approach is to attempt to impose a bit field structure on to the register. While on first glance this may be appealing – don’t do it! Now while I think bitfields have their place, I certainly don’t recommend them for mapping on to hardware registers. The reason is that in a nutshell the C standard essentially allows the compiler vendor carte blanche in how they implement them. For a passionate exposition on this topic, see this comment on the aforementioned post. Anyway, this approach is so bad I refuse to give an example of it!

Defined fields – method 1

This method is quite good. The idea is that one defines the various fields. The definitions below are taken from an IAR supplied header file:

#define ADC12SC             (0x001)   /* ADC12 Start Conversion */
#define ENC                 (0x002)   /* ADC12 Enable Conversion */
#define ADC12TOVIE          (0x004)   /* ADC12 Timer Overflow interrupt enable */
#define ADC12OVIE           (0x008)   /* ADC12 Overflow interrupt enable */
#define ADC12ON             (0x010)   /* ADC12 On/enable */
#define REFON               (0x020)   /* ADC12 Reference on */
#define REF2_5V             (0x040)   /* ADC12 Ref 0:1.5V / 1:2.5V */
#define MSC                 (0x080)   /* ADC12 Multiple Sample Conversion */
#define SHT00               (0x0100)  /* ADC12 Sample Hold 0 Select 0 */
#define SHT01               (0x0200)  /* ADC12 Sample Hold 0 Select 1 */
#define SHT02               (0x0400)  /* ADC12 Sample Hold 0 Select 2 */
#define SHT03               (0x0800)  /* ADC12 Sample Hold 0 Select 3 */
#define SHT10               (0x1000)  /* ADC12 Sample Hold 1 Select 0 */
#define SHT11               (0x2000)  /* ADC12 Sample Hold 1 Select 1 */
#define SHT12               (0x4000)  /* ADC12 Sample Hold 2 Select 2 */
#define SHT13               (0x8000)  /* ADC12 Sample Hold 3 Select 3 */

With these definitions, one can now write code that looks something like this:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REFON + MSC;

However, there is a fundamental problem with this approach. To see what I mean, examine the comment associated with the define REF2_5V. You will notice that in this case, setting the bit to zero selects a 1.5V reference. Thus in my example code, I have implicitly set the reference voltage to 1.5V. If one examines the code at a later date, then it’s unclear if I intended to select a 1.5V reference – or whether I just forgot to select any reference – and ended up with the 1.5V by default. One possible way around this is to add the following definition:

#define REF1_5V             (0x000)   /* ADC12 Ref = 1.5V */

One can then write:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC;

Clearly this is an improvement. However there is nothing stopping you writing:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + REF2_5V;

Don’t laugh – I have seen this done. There is also another problem with the way the fields have been defined, and that concerns the fields which are more than 1 bit wide. For example the field SHT0x is used to define the number of clock cycles the sample and hold should be active. It’s a 4 bit field, and thus has 16 possible combinations. If I need 13 clocks of sample and hold, then I have to write code that looks like this:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT00 + SHT02 + SHT03;

It’s not exactly clear from the above that I desire 13 clock samples on the sample and hold. Now clearly one can overcome this problem by having additional defines – and that’s precisely what IAR does. For example:

#define SHT0_0               (0*0x100u)
#define SHT0_1               (1*0x100u)
#define SHT0_2               (2*0x100u)
...
#define SHT0_15             (15*0x100u)

Now you can write:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT0_13;

However, if you use this approach you will inevitably end up confusing SHT00 and SHT0_0 – with disastrous and very frustrating results.

Defining Fields – method 2

In this method, one defines the bit position of the fields. Thus our definitions now look like this:

#define ADC12SC             (0)   /* ADC12 Start Conversion */
#define ENC                 (1)   /* ADC12 Enable Conversion */
#define ADC12TOVIE          (2)   /* ADC12 Timer Overflow interrupt enable */
#define ADC12OVIE           (3)   /* ADC12 Overflow interrupt enable */
#define ADC12ON             (4)   /* ADC12 On/enable */
#define REFON               (5)   /* ADC12 Reference on */
#define REF2_5V             (6)   /* ADC12 Ref */
#define MSC                 (7)   /* ADC12 Multiple Sample Conversion */
#define SHT0                (8)   /* ADC12 Sample Hold 0 */
#define SHT1                (12)  /* ADC12 Sample Hold 1 */

Our example configuration now looks like this:

ADCT12CTL0 = (1 << ADC12TOVIE) + (1 << ADC12ON) + (1 << REFON) + (0 << REF2_5V) + (1 << MSC) + (13 << SHT0);

Note that zero is given to the REF2_5V argument and 13 to the SHT0 argument. This was my preferred approach for a long time. However it had certain practical weaknesses:

  1. It relies upon the manifest constants being correct / me using the correct manifest constant. You only need to spend a few hours tracking down a bug that ends up being an incorrect #define to know how frustrating this can be.
  2. It still doesn’t really address the issue of fields that aren’t set. That is, was it my intention to leave them at zero, or was it an oversight?
  3. There is often a mismatch between what the compiler vendor calls a field and what appears in the data sheet. For example, the data sheet shows that the SHT0 field is called SHT0x. However the compiler vendor may choose to simply call this SHT0, or SHT0X etc. Thus I end up fighting compilation errors because of trivial naming mismatches.
  4. When debugging, I often end up looking at a window that tells me that ADC12CTL0 bit 6 is set – and I’m stuck trying to determine what that means. (I recognize that some debuggers will symbolically label the bits – however it isn’t universal).

Eschewing definitions

We now come to my preferred methodology. What I wanted was a method that has the following properties:

  1. It requires me to explicitly set / clear every bit.
  2. It is not susceptible to errors in definition / use of #defines.
  3. It allows easy interaction with a debugger.

This is what I ended up with:

ADC12CTL0 =
 (0u << 0) |        /* Don't start conversion yet */
 (0u << 1) |        /* Don't enable conversion yet */
 (1u << 2) |        /* Enable conversion-time-overflow interrupt */
 (0u << 3) |        /* Disable ADC12MEMx overflow-interrupt */
 (1u << 4) |        /* Turn ADC on */
 (1u << 5) |        /* Turn reference on */
 (0u << 6) |        /* Reference = 1.5V */
 (1u << 7) |        /* Automatic sample and conversion */
 (13u <<  8) |      /* Sample and hold of 13 clocks for channels 0-7 */
 (0u << 12);        /* Sample and hold of don't care clocks for channels 8-15 */

There are multiple things to note here:

  1. I have done away with the various #defines. At the end of the day, the hardware requires that bit 5 be set to turn the reference on. The best way to ensure that bit 5 is set is to explicitly set it. Now this thinking tends to fly in the face of conventional wisdom. However, having adopted this approach I have found it to be less error prone – and a lot easier to debug / maintain.
  2. Every bit position is explicitly set or cleared. This forces me to consider every bit in turn and decide what it’s appropriate value should be.
  3. The layout is important. By looking down the columns, I can check that I haven’t missed any fields. Just as important, many debuggers present the bit fields of a register as a column just like this. Thus it’s trivial to map what you see in the debugger to what you have written.
  4. The value being shifted has a ‘u’ appended to it. This keeps the MISRA folks happy – and it’s a good habit to get into.
  5. The comments are an integral part of this approach

There are still a few problems with this approach. This is what I have discovered so far:

  1. It can be tedious with a 32 bit register.
  2. Lint will complain about shifting zero (as it considers it pointless). It will also complain about shifting anything zero places (as it also considers it pointless). In which case you have to suppress these complaints. The following macros do the trick:
#define LINT_SUPPRESS(n)  /*lint --e{n} */
LINT_SUPPRESS(835)        /**< Inform Lint not to worry about zero being given as an argument to << */
LINT_SUPPRESS(845)        /**< Inform Lint not to worry about the right side of the | operator being zero */

In the next part of this article I will describe how one can extend this technique to make configuring peripherals a lot less painful.