embedded software boot camp

Hardware vs. firmware naming conventions

March 28th, 2010 by Nigel Jones

Today’s post is motivated in part by Gary Stringham. Gary is the newest member of EmbeddedGurus and he consults and blogs on what he calls the bridge between hardware and firmware. Since I work on both hardware and firmware, I’m looking forward to what Gary has to say in the coming months. Anyway, I’d recently read his posting on Early hardware / firmware collaboration when I found myself looking at a fairly complex schematic. The microprocessor had a lot of IO pins, most of which were being used. When I looked at the code to gain insight on how some of the IO was being used I found that the hardware engineer and firmware engineer had adopted completely different naming conventions. For example, what appeared on the schematic as “Relay 6” appeared in the code as “ALARM_RELAY_2”. As a result the only way I could reconcile the schematic and the code was to look at a signal’s port pin assignment on the schematic and then search the code to see what name was associated with that port pin. After I’d done this a few times, I realized I needed a more systematic approach and ended up going through all the port pin assignments in the code and using them to hand mark up the schematic. Clearly this was not only a colossal time waster, it also had the potential for introducing stupid bugs.

So how had this come about? Well, if you have ever designed hardware, you will know that naming nets is essentially optional. In other words, one can create a perfectly correct schematic without naming any of the nets. Instead all you have to do is ensure that their connectivity is correct. (This is loosely analogous in firmware to referring to variables via their absolute addresses instead of assigning a name to the variable and using it. However, the consequences for the hardware design are nowhere near as dire.) Furthermore, if the engineer does decide to name a net, then in most schematic packages I’ve seen, one is free to use virtually any combination of characters. For example “~LED A” would be a perfectly valid net name – but is most definitely not a valid C variable name. If one throws in the usual issue of numbering things from zero or one (should the first of four LEDs be named LED0 or LED1?), together with hardware engineers’ frequent (and understandable) desire to indicate if a signal is active low or active high by using some form of naming convention, then one has the recipe for a real mess.

So what’s to be done? Well here are my suggestions:

  1. The hardware team should have a rigorously enforced naming standards convention (in much the same way that most companies have a coding standards manual).
  2. All nets that are used by firmware must be named on the schematic.
  3. The net names must adhere to the C standard for naming variables.
  4. The firmware must use the identical name to that appearing on the schematic.

Clearly this can be facilitated by having very early meetings between the hardware and firmware teams, such that when the first version of the schematic is released, there is complete agreement on the net names. If you read Gary’s blog post you’ll see that this is his point too – albeit in a slightly different field.
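As a minimal sketch of suggestions 3 and 4, suppose the two teams have agreed that the alarm relay nets are called ALARM_RELAY_1 and ALARM_RELAY_2 on the schematic, and that they are wired to bits 4 and 5 of an output port. The bit positions, the helper functions net_set() and net_clear(), and the variable sim_portb (a stand-in for the real port register so that the fragment runs on a host) are all assumptions made for illustration. The only point is that the identifiers in the code are spelled exactly as the net names on the schematic.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the real output port register. On a target this would be
   the compiler's own register definition (assumption for illustration). */
static volatile uint8_t sim_portb;

/* Net names, spelled exactly as on the (hypothetical) schematic.
   Each macro records the port bit the net is wired to. */
#define ALARM_RELAY_1   (1u << 4)   /* schematic net ALARM_RELAY_1, bit 4 */
#define ALARM_RELAY_2   (1u << 5)   /* schematic net ALARM_RELAY_2, bit 5 */

/* Drive a named net high. */
static void net_set(uint8_t net_mask)
{
    assert(net_mask != 0u);          /* a zero mask would name no net */
    sim_portb |= net_mask;
}

/* Drive a named net low. */
static void net_clear(uint8_t net_mask)
{
    assert(net_mask != 0u);
    sim_portb &= (uint8_t)~net_mask;
}
```

With this in place a call such as net_set(ALARM_RELAY_2) can be matched against the schematic by a simple text search, with no need to trace port pins.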

Reading a register for its side effects in C and C++

March 15th, 2010 by Nigel Jones

Although today’s post is the first real post on the new EmbeddedGurus, it’s special for another reason. This post is being jointly written with John Regehr. John is an Associate Professor of Computer Science at the University of Utah and maintains an excellent blog, Embedded in Academia which I heartily recommend. This blog posting grew out of a lengthy email exchange which started with John alerting me to some blatant plagiarism of my work and then evolved (dissolved?) into what you find here. John is also posting this article on his blog.

Anyway, enough preamble, on to the topic at hand.

Once in a while one finds oneself having to read a device register, but without needing or caring what the value of the register is. A typical scenario is as follows. You have written some sort of asynchronous communications driver. The driver is set up to generate an interrupt upon receipt of a character. In the ISR, the code first of all examines a status register to see if the character has been received correctly (e.g. no framing, parity or overrun errors). If an error has occurred, what should the code do? Well, in just about every system we have worked on, it is necessary to read the register that contains the received character – even though the character is useless. If you don’t perform the read, then you will almost certainly get an overrun error on the next character. Thus you find yourself in the position of having to read a register even though its value is useless. The question then becomes, how does one do this in C? In the following examples, assume that SBUF is the register holding the data to be discarded and that SBUF is understood to be volatile. The exact semantics of the declaration of SBUF vary from compiler to compiler.

If you are programming in C and if your compiler correctly supports the volatile qualifier, then this simple code suffices:

void cload_reg1 (void)
{
   SBUF;
}

This certainly looks a little strange, but it is completely legal C and should generate the requisite read, and nothing more. For example, at the -Os optimization level, the MSP430 port of GCC gives this code:

cload_reg1:
    mov &SBUF, r15
    ret

Unfortunately, there are two practical problems with this C code. First, quite a few C compilers incorrectly translate this code, although the C standard gives it an unambiguous meaning. We tested the code on a variety of general-purpose and embedded compilers, and present the results below. These results are a little depressing.

The second problem is even scarier. The problem is that the C++ standard is not 100% clear about what the code above means. On one hand, the standard says this:

In general, the semantics of volatile are intended to be the same in C++ as they are in C.

A number of C++ compilers, including GCC and LLVM, generate the same code for cload_reg1() when compiling in C++ mode as they do in C mode. On the other hand, several high-quality C++ compilers, such as those from ARM, Intel, and IAR, turn the function cload_reg1() into object code that does nothing. We discussed this issue with people from the compiler groups at Intel and IAR, and both gave essentially the same response. Here we quote (with permission) from the Intel folks:

The operation that turns into a load instruction in the executable code is what the C++ standard calls the lvalue-to-rvalue conversion; it converts an lvalue (which identifies an object, which resides in memory and has an address) into an rvalue (or just value; something whose address can’t be taken and can be in a register). The C++ standard is very clear and explicit about where the lvalue-to-rvalue conversion happens. Basically, it happens for most operands of most operators – but of course not for the left operand of assignment, or the operand of unary ampersand, for example. The top-level expression of an expression statement, which is of course not the operand of any operator, is not a context where the lvalue-to-rvalue conversion happens.

In the C standard, the situation is somewhat different. The C standard has a list of the contexts where the lvalue-to-rvalue conversion doesn’t happen, and that list doesn’t include appearing as the expression in an expression-statement.

So we’re doing exactly what the various standards say to do. It’s not a matter of the C++ standard allowing the volatile reference to be optimized away; in C++, the standard requires that it not happen in the first place.

We think the last sentence sums it up beautifully. How many readers were aware that the semantics for the volatile qualifier are significantly different between C and C++? The additional implication is that as shown below, GCC, the Microsoft compiler, and Open64, when compiling C++ code, are in error.

We asked about this on the GCC mailing list and received only one response which was basically “Why should we change the semantics, since this will break working code?” This is a fair point. Frankly speaking, the semantics of volatile in C are a bit of a mess and C++ makes the situation much worse by permitting reasonable people to interpret it in two totally different ways.

Experimental Results

To test C and C++ compilers, we compiled the following two functions to object code at a reasonably high level of optimization:

extern volatile unsigned char foo;
void cload_reg1 (void)
{
   foo;
}
void cload_reg2 (void)
{
   volatile unsigned char sink;
   sink = foo;
}

For embedded compilers that have built-in support for accessing hardware registers, we tested two additional functions where as above, SBUF is understood to be a hardware register defined by the semantics of the compiler under test:

void cload_reg3 (void)
{
   SBUF;
}

void cload_reg4 (void)
{
   volatile unsigned char sink;
   sink = SBUF;
}

The results were as follows.

GCC

We tested version 4.4.1, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os. The C compiler loads from foo in both cload_reg1() and cload_reg2(). No warnings are generated. The C++ compiler shows the same behavior as the C compiler.

Intel Compiler

We tested icc version 11.1, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os. The C compiler emits code loading from foo for both cload_reg1() and cload_reg2(), without giving any warnings. The C++ compiler emits a warning “expression has no effect” for cload_reg1() and this function does not load from foo. cload_reg2() does load from foo and gives no warnings.

Sun Compiler

We tested suncc version 5.10, hosted on x86 Linux and also targeting x86 Linux, using optimization level -O. The C compiler does not load from foo in cload_reg1(), nor does it emit any warning. It does load from foo in cload_reg2(). The C++ compiler has the same behavior as the C compiler.

x86-Open64

We tested opencc version 4.2.3, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os. The C compiler does not load from foo in cload_reg1(), nor does it emit any warning. It does load from foo in cload_reg2(). The C++ compiler has the same behavior as the C compiler.

LLVM / Clang

We tested subversion rev 98508, which is between versions 2.6 and 2.7, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os. The C compiler loads from foo in both cload_reg1() and cload_reg2(). A warning about unused value is generated for cload_reg1(). The C++ compiler shows the same behavior as the C compiler.

CrossWorks for MSP430

We tested version 2.0.8.2009062500.4974, hosted on x86 Linux, using optimization level -O. This compiler supports only C. foo was not loaded in cload_reg1(), but it was loaded in cload_reg2().

IAR for AVR

We tested version 5.30.6.50191, hosted on Windows XP, using maximum speed optimization. The C compiler performed the load in all four cases. The C++ compiler did not perform the load for cload_reg1() or cload_reg3(), but did for cload_reg2() and cload_reg4().

Keil 8051

We tested version 8.01, hosted on Windows XP, using optimization level 8, configured to favor speed. The Keil compiler failed to generate the required load in cload_reg1() (but did at least give a warning), yet did perform the load in all other cases, including cload_reg3(), suggesting that the Keil compiler treats its IO register (SFR) semantics differently from its volatile variable semantics.

HI-TECH for PIC16

We tested version 9.70, hosted on Windows XP, using Global optimization level 9, configured to favor speed. This was very interesting in that the results were almost a mirror image of the Keil compiler’s. In this case the load was performed in all cases except cload_reg3(). Thus the HI-TECH semantics for IO registers and volatile variables also appear to be different – just the opposite to Keil! No warning was generated by the HI-TECH compiler when it failed to generate code.

Microchip Compiler for PIC18

We tested version 3.35, hosted on Windows XP, using full optimization level. This rounded out the group of embedded compilers quite nicely in that it didn’t perform the load in either cload_reg1() or cload_reg3() – but did in the rest. It also failed to warn about the statements having no effect. This was the worst performing of all the compilers we tested.

Summary

The level of non-conformance among the C compilers, together with the genuine uncertainty as to what the C++ compilers should do, presents a real quandary. If you need the most efficient code possible, then you have no option other than to investigate what your compiler does. If you are looking for a generally reliable and portable solution, then the methodology in cload_reg2() is probably your best bet. However it would be just that: a bet. Naturally, we (and the other readers of this blog) would be very interested to hear what your compiler does. So if you have a few minutes, please run the sample code through your compiler and let us know the results.
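For what it is worth, here is a minimal sketch of how the cload_reg2() pattern might sit inside the receive handler described at the start of this post. So that it runs on a host, the status and data registers are simulated by ordinary volatile variables. The names sim_status, sim_sbuf, RX_ERROR_MASK, discard_rx_byte() and uart_rx_handler() are hypothetical and are not taken from any particular part or compiler. Whether a given compiler actually emits the load still has to be confirmed by inspecting the generated code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated device registers (assumption: real code would use the
   compiler's own definitions of the status and data registers). */
static volatile uint8_t sim_status;
static volatile uint8_t sim_sbuf;

#define RX_ERROR_MASK 0x07u  /* hypothetical framing/parity/overrun bits */

/* Read the data register purely for its side effect, using the
   assignment-to-volatile form of cload_reg2(). */
static void discard_rx_byte(void)
{
    volatile uint8_t sink;
    sink = sim_sbuf;
}

/* Returns true and stores the byte if it was received cleanly.
   Otherwise the byte is read and thrown away so that the next
   character does not trigger an overrun. */
static bool uart_rx_handler(uint8_t *out)
{
    assert(out != NULL);
    if ((sim_status & RX_ERROR_MASK) != 0u) {
        discard_rx_byte();
        return false;
    }
    *out = sim_sbuf;
    return true;
}
```

Keeping the discard in its own small function means there is exactly one place to re-check (or replace with a compiler-specific idiom) if a new toolchain turns out not to generate the read.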

Acknowledgments

We’d like to thank Hans Boehm at HP, Arch Robison at Intel, and the compiler groups at both Intel and IAR for their valuable feedback that helped us construct this post. Any mistakes are, of course, ours.

Welcome to the new stack-overflow!

March 3rd, 2010 by Nigel Jones

Regular visitors will no doubt have noticed a rather dramatic change to the visual appearance of this blog. EmbeddedGurus has grown dramatically in the last year and so we’ve moved to a better platform (WordPress) to manage the growth. Although the switch over from Blogger has been relatively painless, it’s still necessary for me to manually check all my previous posts making sure they are right. I should have this done in the next few days at which point I will resume regular blogging.

If you’ve posted a comment to my blog in the last week or so it may not have made the transition – for which I apologize.

So you want to be an independent contractor?

February 19th, 2010 by Nigel Jones

Today’s post is motivated by the events that happened yesterday in Austin, Texas. For my overseas visitors, a software engineer, Joe Stack, decided to fly his light aircraft into an office building that housed the regional offices of the IRS (the American tax office). He created tremendous damage and likely murdered at least one person, while killing himself. Notwithstanding that I wrote just a few weeks ago about the propensity for engineers to be involved in terrorist acts, what is relevant about this news item is that it appears that Joe Stack’s principal complaint concerned a portion of the US tax code (via Andrew Leonard) that applies almost uniquely to consultants / independent contractors in the software / firmware field.

So while this isn’t a tax advice blog, I thought I’d weigh in on the issue, since it’s something that applies to me, and indeed anyone thinking of becoming a consultant / independent contractor in the USA.

The main issue revolves around who is an employee and who is an independent contractor. From a tax perspective this is an important distinction, because companies can avoid a lot of overhead by classifying employees as independent contractors. For example, employers avoid paying the employer contribution to social security, which for a typical engineer in the USA was around US$7000 per person in 2009. Instead the independent contractor is responsible for this payment. Conversely, independent contractors get some benefits that employees do not. For example an independent contractor can normally deduct from his taxable income the cost of travel to and from a client’s office.

Now whether one prefers employee or independent status is of course a matter of income levels and personal preference. However, what is crucial is that one not fall somewhere in the middle – because if you do you stand the risk of being re-classified by the IRS – at which point the tax bills can start getting very large for everyone involved. This falling in the middle tends to occur when someone is classified as an independent by the company – but acts like an employee. That is, they work the same hours, and do the same work at the same time in the same location as someone who is an employee. If this describes you, or it describes a job that you are considering, then I suggest you read on.

Note that in the following, I have assumed that you want to be an independent contractor. If you are classified this way and instead want to be an employee, then do the opposite of what is advised!

Time
An independent contractor must be free to set their own work hours. Although it is OK for an organization to say you can’t work, e.g. after 9 pm or before 6 am, it is not OK for them to specify your exact work hours. Furthermore, it is important that you exercise this right. For example if you are required to be on site 40 hours a week, then to preserve your independent contractor status it would be smart to work e.g. four 10 hour days, rather than the normal five 8 hour days. I also recommend that you strive to get the right to work from your home office for a certain percentage of the time. This helps establish your home office as a bona fide work place while simultaneously bolstering your independent status.

Tools
An independent contractor is normally expected to provide their own tools. Now clearly you are unlikely to own a $25,000 spectrum analyzer. However as an independent contractor it is certainly reasonable that you provide your own computer and other tools such as compilers, email clients etc. The problem with this is that it often clashes with a company’s IT policy. When this happens I strongly suggest that you sit down with the various parties (HR, IT, your recruiting manager etc) to address the issue. There are various options available, but the bottom line is you need to protect your status – and the company (if it’s on the ball) will want to do the same.

Multiple Clients
The final way in which I handle this issue is by having multiple concurrent clients. This not only helps you meet the time requirement, but it also strongly reinforces the fact that you are free to work for whom you want, when you want – almost the definition of an independent contractor.

Well that’s my practical guide to not falling afoul of the IRS rules. Hopefully for those of you contemplating going into the consulting business you’ll have found it useful.

I’ll be returning to my more normal fare with my next post. As a heads up, embedded-gurus is undergoing a major face-lift over the next few weeks, which may impact not only my posting schedule, but also all the other bloggers here.

Efficient C Tip #11 – Avoid passing parameters by using more small functions

February 6th, 2010 by Nigel Jones

This is the eleventh in a series of tips on writing efficient C for embedded systems. Today’s topic will, I suspect, be slightly controversial. This post is based upon two basic observations:

  1. Passing parameters to functions is costly.
  2. Conditional branch instructions can be very costly on CPUs that have instruction caches (even with branch prediction).

I don’t think that too many people will disagree with me on the above. Despite this I too often see a style of coding that incurs these costs unnecessarily. I think it’s best illustrated by a (real world) example. The issue is one that will be familiar to most of you.  An embedded system contains a number of discrete LEDs (say 3), and the requirement is to write some code to allow higher level code to either turn on, turn off, or toggle a particular LED. The way I often see this coded is as follows:

typedef enum
{
 LED1, LED2, LED3
} LED_NO;
typedef enum
{
 LED_OFF, LED_ON, LED_TOGGLE
} LED_ACTION;
void led(LED_NO led_no, LED_ACTION led_action)
{
 switch (led_no)
 {
  case LED1:
   switch (led_action)
   {
    case LED_OFF:
     PORTB_PORTB0 = 0;
     break;
    case LED_ON:
     PORTB_PORTB0 = 1;
     break;
    case LED_TOGGLE:
     PORTB_PORTB0 ^= 1;
     break;
    default:
     break;
   }
   break;
  case LED2:
   ...
 }
}

So what’s wrong with this you ask? Well in a nutshell the parameters passed to the function are used strictly to control the order of execution. There is no code common to any pair or group of parameters. When faced with a situation such as this, I instead implement the code as a large number of very small functions. For example:

void led1_Off(void)
{
 PORTB_PORTB0 = 0;
}
void led1_On(void)
{
 PORTB_PORTB0 = 1;
}
void led1_Toggle(void)
{
 PORTB_PORTB0 ^= 1;
}
...

Let’s compare the two approaches.

Efficiency

This blog posting is supposedly about efficiency, so let’s start with the results. I coded these two approaches up together with a main() function that exercised all 9 possible combinations. I then turned full speed optimization on and looked at the results for an AVR processor.

Single function approach: 78 bytes for main(), 94 bytes for the LED code. Execution time 208 cycles.

Multiple function approach: 42 bytes for main(), 54 bytes for the LED code. Execution time 96 cycles.

Clearly my approach is significantly more efficient.

Usability

By usability I’m referring to the case where someone else needs to use your code. They know they need to, say, toggle LED2, so they hunt around and find the file led.h. The question is, once they have opened up led.h, how quickly can they determine what they have to do in order to toggle LED2? In the single function case they are presented with just one function (which is a plus), but then they have to locate the enumerations and work out the parameters that need to be passed to the function (which is a minus). In the multiple function case, they have to search through a list of functions looking for the correct one. However once they have found it, it’s very clear what the function does.

For me, I think it is a toss up between the two approaches as to which is more usable.

Maintainability

In this case the multiple function approach is the big winner. To see how this is, consider what happens to the single function case when one adds an LED or adds an action. The single function case just explodes in size, whereas with the multi-function approach one simply adds more very simple functions.

Conclusions

If you buy my analysis then clearly the multi-function approach is superior in both efficiency and maintainability – two areas that are dear to my heart. Now granted this is a fairly extreme example. However in my experience if you look through a reasonable amount of code you will soon discover a function that essentially does one thing or another based upon a function parameter. When you locate such a function you might want to try breaking it into two functions in the manner described here – I think you’ll be pleased with the results.
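As a smaller illustration of the same refactoring, consider a hypothetical routine that switches a heater on or off depending on a flag. The names heater_set(), heater_on(), heater_off(), heater_self_check() and the simulated register sim_heater_pin are assumptions made for this sketch. On a real target the assignments would be to the port bit itself, and the saving would need to be measured rather than taken on trust.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated output pin (assumption: stands in for a real port bit). */
static volatile uint8_t sim_heater_pin;

/* Before: the parameter does nothing except select a branch. */
static void heater_set(bool on)
{
    if (on) {
        sim_heater_pin = 1;
    } else {
        sim_heater_pin = 0;
    }
}

/* After: two parameterless functions, no branch, nothing to pass. */
static void heater_on(void)
{
    sim_heater_pin = 1;
}

static void heater_off(void)
{
    sim_heater_pin = 0;
}

/* Host-side check that both styles leave the pin in the same state. */
static void heater_self_check(void)
{
    heater_set(true);
    assert(sim_heater_pin == 1);
    heater_off();
    assert(sim_heater_pin == 0);
    heater_on();
    assert(sim_heater_pin == 1);
    heater_set(false);
    assert(sim_heater_pin == 0);
}
```

Callers that previously wrote heater_set(true) would instead write heater_on(), which also reads more clearly at the call site.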

****
As the readership of this blog has grown I must say I have been really impressed with the many insightful comments that have been posted. I know I learn a lot from them, and so I suspect, do a lot of the other readers. Thus for those of you that have commented in the past – thank you. For those of you yet to post a comment, I encourage you to take the plunge!
