embedded software boot camp

What’s in your main() header?

February 2nd, 2013 by Nigel Jones

One of the consequences of being in the consulting business is that I get to look at a lot of code written by other people. Usually it is necessary for me to get up to speed on the code as quickly as possible, and so to this end, one of the first things I do is look for main.c, or if it doesn’t exist the file that contains main(). Here’s what I usually find:

/********************************************************************************
 main.c
 Possibly a one line description.
 Legal notice. Sometimes many lines long.
 *********************************************************************************/

That’s it. Now maybe it’s just me, but I find this a bit inadequate. Before I describe what I put in the header for main.c, I should first note my motivation. Anyone that has written code for many years realizes that the whole point of writing code is to allow it to be maintained. The person maintaining the code may be a future version of yourself, but often is some poor sod who gets thrown a bunch of code and told to get on with it. As a result, it is imperative that this future maintainer be told as much as possible about what it is they are maintaining. Now I realize that a lot of what I describe below could be described elsewhere. However, it’s my experience that Word documents and other non source code related documents tend to get lost over time (or perhaps more accurately not packaged with the source code when it is given to someone else), and so by putting this information in main.c, you pretty much guarantee that the maintainer will receive the information. With this as a background, here’s what I think should be in the header for main.c.

A product description

I typically write somewhere between 10 and 100 lines of text describing the product, what it does, how it does it, what makes it unique and the things about it that make it difficult. Note I’m not describing the code. I always find this a challenge because it forces me to really get to the core of the product. I can sometimes take many hours on this stage, as I try to refine and precis my description to include as much useful information as possible. Who has this sort of time you ask? Well if you think about it, if you can’t write a concise, yet detailed description of the product, surely you aren’t ready to start writing code? Thus if you go through this exercise and find yourself stymied, then you simply shouldn’t be sitting in front of a text editor.

Text Editor settings

Talking of text editors, the next thing you should have is an entry that describes how you have your text editor configured. I’m not interested in getting into a discussion about what your text editor settings should be – I’d just like to know what they are so that I can configure my text editor to match. This is a critical step as there is no bigger time waster than trying to understand code that looks like a disaster because you used tabs with an indentation of 2 and my editor is using an indentation of 4.

Development Environment

The text editor is of course part of the larger development environment. While it’s obvious to you what build environment you are using, it isn’t to someone else. Thus if you are using an IDE from Keil then say so. Conversely if you are from the IDE’s are evil camp and instead rely upon makefiles, well make that clear as well. Note the presence of a makefile in the source code directory does not IMHO constitute adequate documentation that this is how you intend the code to be built.

Compiler make and version

Almost every project I look at fails to make it clear what compiler make and version the code was written for. This always blows me away, because I’ve never yet seen an embedded system that doesn’t rely on compiler specific facets for it to successfully compile. Thus you should spell out exactly what compiler you used – even if you don’t think it really matters much.

Libraries

If you are using libraries, particularly ones from a third party, then you should really be spelling this out and of course specifying what version of the library you used. If there are special licensing restrictions on the use of the library, then this isn’t a bad place to mention it either.

Other tools

If you use other tools, particularly code generation tools, then it would be really nice to let the reader know that your code relies upon tool X, version Y. If you are using make rather than an IDE, it would also be nice to let us all know what version of make you used.

CPU configuration

Many CPUs are configurable via fuse bits of some type (PICs and AVRs are prime examples). These configuration bits usually have a dramatic impact on how the CPU behaves, and so it is critical that you document what fuse bit settings you are assuming. It’s possible to waste many hours debugging a system that in fact has no code problems per se, but rather is simply misconfigured at the fuse bit level.

How to build

Finally, it would be really nice if you told everyone how to actually make the executable. I’m constantly amazed at the number of projects I see where either the method of building is unclear, or worse, the ‘obvious method’ (e.g. typing make) results in a build failure because prior to e.g. invoking make, it is necessary to run some batch file etc.

While I think there’s a lot more project specific information that can go in the header, I think the above is a pretty decent start. I’d be interested in hearing about other information that you put in your main.c header.

 

 

Real world variables

January 16th, 2013 by Nigel Jones

Part of what makes embedded systems fun for me is that they normally interact with the physical world. The physical world contains real parameters which we measure using transducers, signal conditioning circuits and so on, such that ultimately we end up with a variable in our embedded code that purports to represent this real world parameter. For example, we might have this:

uint16_t pressure;     /* Pressure */

Don’t laugh. I have seen this a million times. What’s wrong with this you ask? Well, when dealing with real world variables, it is crucial that you as the author of the code make crystal clear at least four things about the real world variable:

  1. What the parameter actually is.
  2. The units of the variable.
  3. The resolution of the variable, otherwise known as the value of a LSB.
  4. The dynamic range of the variable.

Identifying the parameter

Many embedded systems measure a multitude of real world parameters, many of which may be of the same ‘type’ (for example, ‘pressure’). Thus while it is blindingly obvious to you when you write the code which pressure you are thinking of, it most certainly isn’t to someone coming into the code cold. Even if there is only one measured pressure in your system, the chances are there are various flavors of it. For example, a common architecture when measuring real world variables is to:

  1. Have the raw value. This is typically the value resulting from converting the latest ADC reading. This raw value is then:
  2. Median filtered so as to eliminate egregious outliers. The median value is then:
  3. Low pass filtered so as to remove Gaussian noise.

There are thus at least three representations of the pressure in the system I have described. My preference is that the variable name identify which instance you are dealing with. However, if for various reasons this makes for unwieldy names then at the very least make sure the comment that accompanies the variable declaration spells it out. For example:

uint16_t pressure_co2_median;    /* Median filtered CO2 pressure */

Units

Failure to specify the units drives me up the wall. For example there are literally dozens of units of pressure, there are at least four units of temperature and a huge number of  units for speed. While it is of course obvious to you the author that you are measuring pressure in lbs/square inch, you are likely to find that the rest of world that works on the SI system probably aren’t even aware that such an arcane unit exists. The bottom line: specify your units. My example now becomes:

uint16_t pressure_co2_median;    /* Median filtered CO2 pressure. Units: bar */

Related to this is the requirement that the units are consistent with the variable name. For example, most of us would criticize a declaration that looks like this:

uint16_t pressure_co2_median;    /* Median filtered CO2 pressure. Units: Celsius */

This is clearly wrong. However, what about this declaration:

uint16_t symbol_rate;    /* Symbol rate reported by demodulator. Units: bps */

What’s confusing about this declaration is that a symbol rate should have units of symbols per second. For the special case, where there is one bit per symbol, the symbol rate does happen to equal the bit rate, which is usually measured in bits per second or bps. Thus upon seeing a declaration such as this, I’m left in a quandary, as the following are all possibilities:

  1. I’m dealing with the special case where the symbol rate and bit rate are the same.
  2. The author was sloppy in his commenting and meant to write ‘sps’ rather than ‘bps’ and so the variable genuinely does reflect a symbol rate.
  3. The author was sloppy in his variable naming, such that the variable actually represents a bit rate with units of bps.

That’s a lot of confusion to sow for failure to ensure that the variable name and its units are consistent.

Resolution

Closely allied with units is the resolution or scaling. In a nutshell what does 1LSB of the variable represent? Again I find that this all too important parameter is deemed obvious by the author of the code. While it is common that 1 LSB = 1 unit, systems that need to maximize dynamic range will often use a different scaling. For example: 1 LSB = 0.0125 bar. The bottom line: if you don’t specify what 1LSB represents you are really doing future readers of your code a major disservice. It’s common to incorporate the resolution and units together, such that our example now becomes:

uint16_t pressure_co2_median;    /* Median filtered CO2 pressure. 1 LSB = 0.0125 bar */

Dynamic range

The last thing that should be reported is the expected dynamic range of the variable. This covers a multitude of issues:

  1. What is the legal range of values? For example, consider the popular LM35 series of temperature sensors. These devices are inherently designed to report temperatures above zero Celsius. As such negative temperatures are not expected when using this sensor. Other sensors will of course have other physical limits which it’s important to report.
  2. What is the sensor offset? For example, absolute pressure sensors will typically have a non zero output when exposed to atmospheric pressure.

Thus it is essential that you specify the expected range. If you do so, the resolution can be implicit. However I still like to make it explicit. For example:

uint16_t pressure_co2_median;    /* Median filtered CO2 pressure. 
                                    Range 0x0000 - 0x3FFF = 1.000 - 204.7875 bar. 
                                    1 LSB = 0.0125 bar. */

I urge you to take a look at your code and see if you are doing this for your real world variables. If you aren’t then consider making some changes. Future generations will thank you.

2012 Explained – Toyota

December 27th, 2012 by Nigel Jones

Regular readers of this blog will no doubt have noticed the paltry number of articles posted by me this year. While there have been a number of contributing factors, one of the more significant has been Toyota. I have been part of the team that has spent a large part of 2012 examining the engine control module hardware and firmware for Toyota cars and light trucks in light of reports on unintended acceleration. Yesterday, a tentative settlement was announced in the class action lawsuit. Other litigation concerning personal injury is still pending. As a result of the pending litigation, together with various protective orders (confidentiality agreements) I can’t comment or provide any details. For those of you that are interested, here are some key links.

Settlement announced: Approximately 5pm EST on December 26th 2012

Statement by the plaintiffs.

Statement by Toyota.

Information about the settlement.

The proposed settlement

Update: The federal judge has given provisional approval to the settlement.

For a primer on Toyota’s engine control module, you can read the lengthy report published by NASA. Readers of this blog will find appendix A the most illuminating.

Anyway, it is my hope that I will be able to blog considerably more frequently in 2013.

 

All variables are equal, but some are more equal than others

July 31st, 2012 by Nigel Jones

With all due apologies to George Orwell for the title, I thought I’d offer a little tidbit on the practice of the following construct:

uint8_t a,b,c,d;
a = b = c = d = 0;

This code declares four variables (a,b,c,d) and sets them all equal to 0. The question is, is this a good, bad or indifferent practice? Well, I think it is an excellent practice in one very limited case, but otherwise should be avoided. Consider these two examples:

#define MAX_SPD (42U)
void fna(void)
{
uint8_t spd1 = MAX_SPD;
uint8_t spd2 = MAX_SPD;
...
}
void fnb(void)
{
uint8_t spd1, spd2;
spd1 = spd2 = MAX_SPD;
...
}

What difference, if any, is there between these two functions? Well clearly they both declare two variables and assign them the value MAX_SPD. However, I would suggest that there is a very subtle difference. In fna() there are two variables that happen to have the same initialized value, whereas in fnb() there are two variables that are initialized to the same value, which happens to be MAX_SPD. So what you ask? Well consider someone maintaining this code. For fna() all they know is that the two variables happen to be initialized to the same value, and thus a change such as this is perhaps a reasonable thing to do:

void fna(void)
{
uint8_t spd1 = MAX_SPD;
uint8_t spd2 = MAX_SPD - 1;
...
}

Conversely, for fnb() there is a barrier to and a subtle hint against making spd1 different from spd2. Thus if the algorithm requires that spd1 and spd2 always be initialized to the same value then the construct in fnb() is better. Conversely, if it is essentially happenstance that spd1 and spd2 share the same initial value then fna() is better.

To put it another way, if you find yourself using this construct to save on typing or lines of code then the chances are you are doing the wrong thing. Conversely if you find yourself doing this to impart a subtle hint then the chances are you are doing the right thing.

A comment on comments. One can (and should) argue that in the case of fnb() one should have a comment to the effect that spd1 and spd2 must be initialized to the same value. While I agree wholeheartedly, I always try and use code constructs that minimize reliance on someone actually reading the comments.

As a final thought. I have seem coding standards that ban this practice. If your coding standard does ban it, perhaps its time to revisit it?

 

 

What does 0x47u mean anyway?

July 21st, 2012 by Nigel Jones

In the last couple of years I have had a large number of folks end up on this blog as a result of search terms such as “what does 0X47u mean?” In an effort to make their visit more productive, I’ll explain and also offer some thoughts on the topic.

Back in the mists of time, it was considered perfectly acceptable to write code that looks like this:

unsigned int foo = 6;

Indeed I’m guessing that just about every C textbook out there has just such a construct somewhere in its first few chapters. So what’s wrong with this you ask? Well, according to the C90 semantics, constants by default are of type ‘signed int’. Thus the above line of code takes a signed int and assigns it to an unsigned int. Now not so many years ago, most people would have just shrugged and got on with the task of churning out code. However, the folks at MISRA looked askance at this practice (and correctly so IMHO), and promulgated rule 10.6:

“Rule 10.6 (required): A “U” suffix shall be applied to all constants of unsigned type.”

Now in the world of computing, unsigned types don’t seem to crop up much. However in the embedded arena, unsigned integers are extremely common. Indeed IMHO you should use them. For information on doing so, see here.

Thus what has happened as MISRA adoption has spread throughout the embedded world, is you are starting to see code that looks like this:

unsigned int foo = 6u;

So this brings me to the answer to the question posed in the title – what does 0x47u mean? It means that it is an unsigned hexadecimal constant of value 47 hex = 71 decimal. If the ‘u’ is omitted then it is a signed hexadecimal constant of value 47 hex.

Some observations

You actually have three ways that to satisfy rule 10.6. Here are examples of the three methods.

unsigned int foo = 6u;
unsigned int foo = 6U;
unsigned int foo = (unsigned int)6;

Let’s dispense with the third method first. I am not a fan of casting, mainly because casting makes code hard to read and can inadvertently cover up all sorts of coding mistakes. As a result, any methodology that results in increased casts in code is a bad idea. If that doesn’t convince you, then consider initializing an unsigned array using casts:

unsigned int bar[42] = {(unsigned int)89, (unsigned int)56, (unsigned int)12, ... };

The result is a lot of typing and a mess to read. Don’t do it!

What then of the first methods? Should you use a lower case ‘u’ or an upper case ‘U’. Well I have reluctantly come down in favor of using an upper case ‘U’. Aesthetically I think that the lower case ‘u’ works better, in that the lower case letter is less intrusive and keeps your eye on the digits (which after all is what’s really important). Here’s what I mean:

unsigned int bar[42] = {89u, 56u, 12u, ... };
unsigned int bar[42] = {89U, 56U, 12U, ... };

So why do I use upper case ‘U’? Well it’s because ‘U’ isn’t the only modifier that one can append to an integer constant. One can also append an ‘L’  or ‘l’ meaning that the constant is of type ‘long’. They can also be combined as in ‘UL’, ‘ul’, ‘LU’ or ‘lu’, to signify an unsigned long constant. The problem is that a lower case ‘l’ looks an awful lot like a ‘1’ in most editors. Thus if you write this:

long bar = 345l;

Is that 345L or 3451? To really see what I mean, try these examples in a standard text editor. Anyway as a result, I always use upper case ‘L’ to signify a long constant – and thus to be consistent I use an upper case ‘U’ for unsigned. I could of course use ‘uL’ – but that just looks weird to me.

Incidentally based upon the code I have looked at over the last decade or so, I’d say that I’m in the minority on this topic, and that more people use the lower case ‘u’. I’d be interested to know what the readers of this blog do – particularly if they have a reason for doing so rather than whim!