Archive for November, 2010

Configuring hardware – part 1.

Saturday, November 13th, 2010 Nigel Jones

One of the more challenging tasks in embedded systems programming is configuring the hardware peripherals in a microcontroller. This task is challenging because:

  1. Some peripherals are stunningly complex. If you have ever configured the ATM controller on a PowerQUICC processor then you know what I mean!
  2. The documentation is often poor. See for example just about any LCD controller’s data sheet.
  3. The person configuring the hardware (i.e. me in my case) has an incomplete understanding of how the peripheral works.
  4. One often has to write the code before the hardware is available for testing.
  5. Manufacturer supplied example code is stunningly bad

I think I could extend this list a little further – but you get the idea. Anyway, I have struggled with this problem for many years. Now while it is impossible to come up with a methodology that guarantees correct results, I have come up with a system that really seems to make this task easier. In the first part of this series I will address the most elemental task – and that is how to set the requisite bits in the register.

By way of example, consider this register definition.

This is a control register for an ADC found in the MSP430 series of microcontrollers. The task at hand is how to write the code to set the desired bits. Now in some ways this is trivial. However if you are serious about your work, then your concern isn’t just setting the correct bits – but doing so in such a manner that it is crystal clear to someone else (normally a future version of yourself) as to what you have done – and why. With this as a premise, let’s look at some of the ways you can tackle this problem.

Magic Number

Probably the most common methodology I see is the magic number approach. For example:

ADC12CTL0 = 0x362C;

This method is an abomination. It’s error prone, and very difficult to maintain. Having said that, there is one case in which this approach is useful – and that’s when one wants to shutdown a peripheral. In which case I may use the construct:

ADC12CTL0 = 0;   /* Return register to its reset condition */

Other than that, I really can’t see any justification for this approach.

Bit Fields

Even worse than the magic number approach is to attempt to impose a bit field structure on to the register. While on first glance this may be appealing – don’t do it! Now while I think bitfields have their place, I certainly don’t recommend them for mapping on to hardware registers. The reason is that in a nutshell the C standard essentially allows the compiler vendor carte blanche in how they implement them. For a passionate exposition on this topic, see this comment on the aforementioned post. Anyway, this approach is so bad I refuse to give an example of it!

Defined fields – method 1

This method is quite good. The idea is that one defines the various fields. The definitions below are taken from an IAR supplied header file:

#define ADC12SC             (0x001)   /* ADC12 Start Conversion */
#define ENC                 (0x002)   /* ADC12 Enable Conversion */
#define ADC12TOVIE          (0x004)   /* ADC12 Timer Overflow interrupt enable */
#define ADC12OVIE           (0x008)   /* ADC12 Overflow interrupt enable */
#define ADC12ON             (0x010)   /* ADC12 On/enable */
#define REFON               (0x020)   /* ADC12 Reference on */
#define REF2_5V             (0x040)   /* ADC12 Ref 0:1.5V / 1:2.5V */
#define MSC                 (0x080)   /* ADC12 Multiple Sample Conversion */
#define SHT00               (0x0100)  /* ADC12 Sample Hold 0 Select 0 */
#define SHT01               (0x0200)  /* ADC12 Sample Hold 0 Select 1 */
#define SHT02               (0x0400)  /* ADC12 Sample Hold 0 Select 2 */
#define SHT03               (0x0800)  /* ADC12 Sample Hold 0 Select 3 */
#define SHT10               (0x1000)  /* ADC12 Sample Hold 1 Select 0 */
#define SHT11               (0x2000)  /* ADC12 Sample Hold 1 Select 1 */
#define SHT12               (0x4000)  /* ADC12 Sample Hold 2 Select 2 */
#define SHT13               (0x8000)  /* ADC12 Sample Hold 3 Select 3 */

With these definitions, one can now write code that looks something like this:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REFON + MSC;

However, there is a fundamental problem with this approach. To see what I mean, examine the comment associated with the define REF2_5V. You will notice that in this case, setting the bit to zero selects a 1.5V reference. Thus in my example code, I have implicitly set the reference voltage to 1.5V. If one examines the code at a later date, then it’s unclear if I intended to select a 1.5V reference – or whether I just forgot to select any reference – and ended up with the 1.5V by default. One possible way around this is to add the following definition:

#define REF1_5V             (0x000)   /* ADC12 Ref = 1.5V */

One can then write:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC;

Clearly this is an improvement. However there is nothing stopping you writing:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + REF2_5V;

Don’t laugh – I have seen this done. There is also another problem with the way the fields have been defined, and that concerns the fields which are more than 1 bit wide. For example the field SHT0x is used to define the number of clock cycles the sample and hold should be active. It’s a 4 bit field, and thus has 16 possible combinations. If I need 13 clocks of sample and hold, then I have to write code that looks like this:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT00 + SHT02 + SHT03;

It’s not exactly clear from the above that I desire 13 clock samples on the sample and hold. Now clearly one can overcome this problem by having additional defines – and that’s precisely what IAR does. For example:

#define SHT0_0               (0*0x100u)
#define SHT0_1               (1*0x100u)
#define SHT0_2               (2*0x100u)
...
#define SHT0_15             (15*0x100u)

Now you can write:

ADCT12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT0_13;

However, if you use this approach you will inevitably end up confusing SHT00 and SHT0_0 – with disastrous and very frustrating results.

Defining Fields – method 2

In this method, one defines the bit position of the fields. Thus our definitions now look like this:

#define ADC12SC             (0)   /* ADC12 Start Conversion */
#define ENC                 (1)   /* ADC12 Enable Conversion */
#define ADC12TOVIE          (2)   /* ADC12 Timer Overflow interrupt enable */
#define ADC12OVIE           (3)   /* ADC12 Overflow interrupt enable */
#define ADC12ON             (4)   /* ADC12 On/enable */
#define REFON               (5)   /* ADC12 Reference on */
#define REF2_5V             (6)   /* ADC12 Ref */
#define MSC                 (7)   /* ADC12 Multiple Sample Conversion */
#define SHT0                (8)   /* ADC12 Sample Hold 0 */
#define SHT1                (12)  /* ADC12 Sample Hold 1 */

Our example configuration now looks like this:

ADCT12CTL0 = (1 << ADC12TOVIE) + (1 << ADC12ON) + (1 << REFON) + (0 << REF2_5V) + (1 << MSC) + (13 << SHT0);

Note that zero is given to the REF2_5V argument and 13 to the SHT0 argument. This was my preferred approach for a long time. However it had certain practical weaknesses:

  1. It relies upon the manifest constants being correct / me using the correct manifest constant. You only need to spend a few hours tracking down a bug that ends up being an incorrect #define to know how frustrating this can be.
  2. It still doesn’t really address the issue of fields that aren’t set. That is, was it my intention to leave them at zero, or was it an oversight?
  3. There is often a mismatch between what the compiler vendor calls a field and what appears in the data sheet. For example, the data sheet shows that the SHT0 field is called SHT0x. However the compiler vendor may choose to simply call this SHT0, or SHT0X etc. Thus I end up fighting compilation errors because of trivial naming mismatches.
  4. When debugging, I often end up looking at a window that tells me that ADC12CTL0 bit 6 is set – and I’m stuck trying to determine what that means. (I recognize that some debuggers will symbolically label the bits – however it isn’t universal).

Eschewing definitions

We now come to my preferred methodology. What I wanted was a method that has the following properties:

  1. It requires me to explicitly set / clear every bit.
  2. It is not susceptible to errors in definition / use of #defines.
  3. It allows easy interaction with a debugger.

This is what I ended up with:

ADC12CTL0 =
 (0u << 0) |        /* Don't start conversion yet */
 (0u << 1) |        /* Don't enable conversion yet */
 (1u << 2) |        /* Enable conversion-time-overflow interrupt */
 (0u << 3) |        /* Disable ADC12MEMx overflow-interrupt */
 (1u << 4) |        /* Turn ADC on */
 (1u << 5) |        /* Turn reference on */
 (0u << 6) |        /* Reference = 1.5V */
 (1u << 7) |        /* Automatic sample and conversion */
 (13u <<  8) |      /* Sample and hold of 13 clocks for channels 0-7 */
 (0u << 12);        /* Sample and hold of don't care clocks for channels 8-15 */

There are multiple things to note here:

  1. I have done away with the various #defines. At the end of the day, the hardware requires that bit 5 be set to turn the reference on. The best way to ensure that bit 5 is set is to explicitly set it. Now this thinking tends to fly in the face of conventional wisdom. However, having adopted this approach I have found it to be less error prone – and a lot easier to debug / maintain.
  2. Every bit position is explicitly set or cleared. This forces me to consider every bit in turn and decide what it’s appropriate value should be.
  3. The layout is important. By looking down the columns, I can check that I haven’t missed any fields. Just as important, many debuggers present the bit fields of a register as a column just like this. Thus it’s trivial to map what you see in the debugger to what you have written.
  4. The value being shifted has a ‘u’ appended to it. This keeps the MISRA folks happy – and it’s a good habit to get into.
  5. The comments are an integral part of this approach

There are still a few problems with this approach. This is what I have discovered so far:

  1. It can be tedious with a 32 bit register.
  2. Lint will complain about shifting zero (as it considers it pointless). It will also complain about shifting anything zero places (as it also considers it pointless). In which case you have to suppress these complaints. The following macros do the trick:
#define LINT_SUPPRESS(n)  /*lint --e{n} */
LINT_SUPPRESS(835)        /**< Inform Lint not to worry about zero being given as an argument to << */
LINT_SUPPRESS(845)        /**< Inform Lint not to worry about the right side of the | operator being zero */

In the next part of this article I will describe how one can extend this technique to make configuring peripherals a lot less painful.

Subscribing to comments

Thursday, November 11th, 2010 Nigel Jones

I heard from Jeff Gros the other day asking if it’s possible to subscribe to all the comments posted on this blog. Given the quality of the comments that are posted here, I thought it was an excellent request. Anyway, the answer is yes.  Just follow this link.

Median Filter Performance Results

Tuesday, November 9th, 2010 Nigel Jones

In my earlier post on median filtering I made the claim that for filter sizes of 3, 5 or 7 that using a simple insertion sort is ‘better’ than using Phil Ekstrom’s technique.  It occurred to me that this claim was based upon my testing with 8 bit processors quite a few years ago, and that the results might not be valid for 32 bit processors with their superior pointer manipulation.  Accordingly I ran some bench marks comparing an insertion sort based approach with Ekstrom’s method.

The procedure was as follows:

  1. I generated an array of random integers on the interval 900 – 1000. The idea is that these would represent data from a typical 10 bit ADC found on many microcontrollers.
  2. I then put together a base line project which performed all the basic house keeping functions, but without performing any filtering. The idea was to try and get a feel for the non-algorithm specific overhead.
  3. I then put together a project which median filtered using an insertion sort, for sizes, 3, 5, 7, 9, 11, and 13. Note that I elected to take a copy of the data prior to sorting. See this comment thread for a discussion of whether this is necessary or not.
  4. I put together another project which median filtered using Ekstrom’s method.
  5. I compiled the above for an ARM Cortex M3 target using an IAR compiler with full speed optimization.

The results were a clear win for Ekstrom. His code size was 132 bytes versus 224. His code was 5%, 32%, 61%, 89%,113% and 146% faster than the insertion sort for filters sizes of 3, 5, 7, 9, 11 and 13 respectively. To be fair to the insertion sort technique, I have made no effort to optimize it. Notwithstanding this, I think I can say that for 32 bit targets, you may as well just use Ekstrom’s approach for all filter sizes.

I’ll endeavor to update this post with results for a 16 bit target (MSP430) in the next few days.

Well I finally got around to running the tests on an MSP430 target. In this case Ekstrom’s method produced a larger code size (186 bytes versus 160). Much to my surprise, Ekstrom’s method was dramatically superior to the insertion sort approach, with speeds of 69% faster for a filter size of 3, going up to a whopping 250% faster with a filter size of 13.  The bottom line: I think my original claim is bunk. Use Ekstrom’s method by default!