embedded software boot camp

Configuring hardware – part 1.

November 13th, 2010 by Nigel Jones

One of the more challenging tasks in embedded systems programming is configuring the hardware peripherals in a microcontroller. This task is challenging because:

  1. Some peripherals are stunningly complex. If you have ever configured the ATM controller on a PowerQUICC processor then you know what I mean!
  2. The documentation is often poor. See for example just about any LCD controller’s data sheet.
  3. The person configuring the hardware (i.e. me in my case) has an incomplete understanding of how the peripheral works.
  4. One often has to write the code before the hardware is available for testing.
  5. Manufacturer-supplied example code is stunningly bad.

I think I could extend this list a little further – but you get the idea. Anyway, I have struggled with this problem for many years. Now while it is impossible to come up with a methodology that guarantees correct results, I have come up with a system that really seems to make this task easier. In the first part of this series I will address the most elemental task – and that is how to set the requisite bits in the register.

By way of example, consider this register definition.

This is a control register for an ADC found in the MSP430 series of microcontrollers. The task at hand is how to write the code to set the desired bits. Now in some ways this is trivial. However if you are serious about your work, then your concern isn’t just setting the correct bits – but doing so in such a manner that it is crystal clear to someone else (normally a future version of yourself) as to what you have done – and why. With this as a premise, let’s look at some of the ways you can tackle this problem.

Magic Number

Probably the most common methodology I see is the magic number approach. For example:

ADC12CTL0 = 0x362C;

This method is an abomination. It’s error prone, and very difficult to maintain. Having said that, there is one case in which this approach is useful – and that’s when one wants to shut down a peripheral. In which case I may use the construct:

ADC12CTL0 = 0;   /* Return register to its reset condition */

Other than that, I really can’t see any justification for this approach.

Bit Fields

Even worse than the magic number approach is to attempt to impose a bit field structure on to the register. While at first glance this may be appealing – don’t do it! Now while I think bit fields have their place, I certainly don’t recommend them for mapping on to hardware registers. The reason, in a nutshell, is that the C standard essentially gives the compiler vendor carte blanche in how they are implemented; the order in which bits are allocated within a storage unit, for instance, is implementation-defined. For a passionate exposition on this topic, see this comment on the aforementioned post. Anyway, this approach is so bad I refuse to give an example of it!

Defined fields – method 1

This method is quite good. The idea is that one defines the various fields. The definitions below are taken from an IAR supplied header file:

#define ADC12SC             (0x001)   /* ADC12 Start Conversion */
#define ENC                 (0x002)   /* ADC12 Enable Conversion */
#define ADC12TOVIE          (0x004)   /* ADC12 Timer Overflow interrupt enable */
#define ADC12OVIE           (0x008)   /* ADC12 Overflow interrupt enable */
#define ADC12ON             (0x010)   /* ADC12 On/enable */
#define REFON               (0x020)   /* ADC12 Reference on */
#define REF2_5V             (0x040)   /* ADC12 Ref 0:1.5V / 1:2.5V */
#define MSC                 (0x080)   /* ADC12 Multiple Sample Conversion */
#define SHT00               (0x0100)  /* ADC12 Sample Hold 0 Select 0 */
#define SHT01               (0x0200)  /* ADC12 Sample Hold 0 Select 1 */
#define SHT02               (0x0400)  /* ADC12 Sample Hold 0 Select 2 */
#define SHT03               (0x0800)  /* ADC12 Sample Hold 0 Select 3 */
#define SHT10               (0x1000)  /* ADC12 Sample Hold 1 Select 0 */
#define SHT11               (0x2000)  /* ADC12 Sample Hold 1 Select 1 */
#define SHT12               (0x4000)  /* ADC12 Sample Hold 2 Select 2 */
#define SHT13               (0x8000)  /* ADC12 Sample Hold 3 Select 3 */

With these definitions, one can now write code that looks something like this:

ADC12CTL0 = ADC12TOVIE + ADC12ON + REFON + MSC;

However, there is a fundamental problem with this approach. To see what I mean, examine the comment associated with the define REF2_5V. You will notice that in this case, setting the bit to zero selects a 1.5V reference. Thus in my example code, I have implicitly set the reference voltage to 1.5V. If one examines the code at a later date, then it’s unclear if I intended to select a 1.5V reference – or whether I just forgot to select any reference – and ended up with the 1.5V by default. One possible way around this is to add the following definition:

#define REF1_5V             (0x000)   /* ADC12 Ref = 1.5V */

One can then write:

ADC12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC;

Clearly this is an improvement. However there is nothing stopping you writing:

ADC12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + REF2_5V;

Don’t laugh – I have seen this done. There is also another problem with the way the fields have been defined, and that concerns the fields which are more than 1 bit wide. For example the field SHT0x is used to define the number of clock cycles the sample and hold should be active. It’s a 4 bit field, and thus has 16 possible combinations. If I need 13 clocks of sample and hold, then I have to write code that looks like this:

ADC12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT00 + SHT02 + SHT03;

It’s not exactly clear from the above that I desire 13 clock samples on the sample and hold. Now clearly one can overcome this problem by having additional defines – and that’s precisely what IAR does. For example:

#define SHT0_0               (0*0x100u)
#define SHT0_1               (1*0x100u)
#define SHT0_2               (2*0x100u)
...
#define SHT0_15             (15*0x100u)

Now you can write:

ADC12CTL0 = ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT0_13;

However, if you use this approach you will inevitably end up confusing SHT00 and SHT0_0 – with disastrous and very frustrating results.
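The pitfalls described in this section are easy to reproduce on a PC. The sketch below repeats a handful of the definitions quoted above and evaluates the expressions as plain integers. The function names are made up for illustration, and nothing here touches a real ADC12CTL0 register.

```c
#include <stdint.h>

/* Definitions repeated from the text above */
#define ADC12TOVIE  (0x004)
#define ADC12ON     (0x010)
#define REFON       (0x020)
#define REF2_5V     (0x040)
#define MSC         (0x080)
#define SHT00       (0x0100)
#define SHT02       (0x0400)
#define SHT03       (0x0800)
#define REF1_5V     (0x000)       /* The extra definition suggested above */
#define SHT0_0      (0*0x100u)
#define SHT0_13     (13*0x100u)

/* Intended configuration: 1.5V reference, SHT0x field = 13 */
uint16_t adc_config_intended(void)
{
  return (uint16_t)(ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT0_13);
}

/* The same field value spelled with the single-bit definitions */
uint16_t adc_config_single_bits(void)
{
  return (uint16_t)(ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + SHT00 + SHT02 + SHT03);
}

/* Both references 'selected': REF1_5V adds nothing, so 2.5V silently wins */
uint16_t adc_config_both_refs(void)
{
  return (uint16_t)(ADC12TOVIE + ADC12ON + REF1_5V + REFON + MSC + REF2_5V);
}

/* The SHT00 / SHT0_0 mix-up: the names differ by one character,
   the values by a whole step of the sample and hold field */
uint16_t sht_mixup_difference(void)
{
  return (uint16_t)(SHT00 - SHT0_0);
}
```

The first two functions return the same value (0x0DB4), which is exactly why the two naming schemes are so easy to confuse. The third compiles without a murmur even though it names both references.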

Defining Fields – method 2

In this method, one defines the bit position of the fields. Thus our definitions now look like this:

#define ADC12SC             (0)   /* ADC12 Start Conversion */
#define ENC                 (1)   /* ADC12 Enable Conversion */
#define ADC12TOVIE          (2)   /* ADC12 Timer Overflow interrupt enable */
#define ADC12OVIE           (3)   /* ADC12 Overflow interrupt enable */
#define ADC12ON             (4)   /* ADC12 On/enable */
#define REFON               (5)   /* ADC12 Reference on */
#define REF2_5V             (6)   /* ADC12 Ref */
#define MSC                 (7)   /* ADC12 Multiple Sample Conversion */
#define SHT0                (8)   /* ADC12 Sample Hold 0 */
#define SHT1                (12)  /* ADC12 Sample Hold 1 */

Our example configuration now looks like this:

ADC12CTL0 = (1 << ADC12TOVIE) + (1 << ADC12ON) + (1 << REFON) + (0 << REF2_5V) + (1 << MSC) + (13 << SHT0);

Note that zero is given to the REF2_5V argument and 13 to the SHT0 argument. This was my preferred approach for a long time. However it had certain practical weaknesses:

  1. It relies upon the manifest constants being correct / me using the correct manifest constant. You only need to spend a few hours tracking down a bug that ends up being an incorrect #define to know how frustrating this can be.
  2. It still doesn’t really address the issue of fields that aren’t set. That is, was it my intention to leave them at zero, or was it an oversight?
  3. There is often a mismatch between what the compiler vendor calls a field and what appears in the data sheet. For example, the data sheet shows that the SHT0 field is called SHT0x. However the compiler vendor may choose to simply call this SHT0, or SHT0X etc. Thus I end up fighting compilation errors because of trivial naming mismatches.
  4. When debugging, I often end up looking at a window that tells me that ADC12CTL0 bit 6 is set – and I’m stuck trying to determine what that means. (I recognize that some debuggers will symbolically label the bits – however it isn’t universal).

Eschewing definitions

We now come to my preferred methodology. What I wanted was a method that has the following properties:

  1. It requires me to explicitly set / clear every bit.
  2. It is not susceptible to errors in definition / use of #defines.
  3. It allows easy interaction with a debugger.

This is what I ended up with:

ADC12CTL0 =
 (0u << 0) |        /* Don't start conversion yet */
 (0u << 1) |        /* Don't enable conversion yet */
 (1u << 2) |        /* Enable conversion-time-overflow interrupt */
 (0u << 3) |        /* Disable ADC12MEMx overflow-interrupt */
 (1u << 4) |        /* Turn ADC on */
 (1u << 5) |        /* Turn reference on */
 (0u << 6) |        /* Reference = 1.5V */
 (1u << 7) |        /* Automatic sample and conversion */
 (13u <<  8) |      /* Sample and hold of 13 clocks for channels 0-7 */
 (0u << 12);        /* Sample and hold of don't care clocks for channels 8-15 */

There are multiple things to note here:

  1. I have done away with the various #defines. At the end of the day, the hardware requires that bit 5 be set to turn the reference on. The best way to ensure that bit 5 is set is to explicitly set it. Now this thinking tends to fly in the face of conventional wisdom. However, having adopted this approach I have found it to be less error prone – and a lot easier to debug / maintain.
  2. Every bit position is explicitly set or cleared. This forces me to consider every bit in turn and decide what its appropriate value should be.
  3. The layout is important. By looking down the columns, I can check that I haven’t missed any fields. Just as important, many debuggers present the bit fields of a register as a column just like this. Thus it’s trivial to map what you see in the debugger to what you have written.
  4. The value being shifted has a ‘u’ appended to it. This keeps the MISRA folks happy – and it’s a good habit to get into.
  5. The comments are an integral part of this approach.
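Since one often has to write this code before the hardware is available, it can help to compile the initialization on a PC first. The following is only a sketch under that assumption: ADC12CTL0 is replaced by an ordinary variable, and the helper functions adc_configure(), adc_read_config() and adc_bit() are hypothetical names, not part of any vendor header.

```c
#include <stdint.h>

/* Stand-in for the memory-mapped register so that the sketch runs on a PC */
static volatile uint16_t ADC12CTL0;

/* The configuration from the text, written to the stand-in register */
void adc_configure(void)
{
  ADC12CTL0 = (uint16_t)(
   (0u << 0) |        /* Don't start conversion yet */
   (0u << 1) |        /* Don't enable conversion yet */
   (1u << 2) |        /* Enable conversion-time-overflow interrupt */
   (0u << 3) |        /* Disable ADC12MEMx overflow-interrupt */
   (1u << 4) |        /* Turn ADC on */
   (1u << 5) |        /* Turn reference on */
   (0u << 6) |        /* Reference = 1.5V */
   (1u << 7) |        /* Automatic sample and conversion */
   (13u << 8) |       /* Sample and hold field for channels 0-7 */
   (0u << 12));       /* Sample and hold field for channels 8-15 */
}

/* Configure, then return what was written */
uint16_t adc_read_config(void)
{
  adc_configure();
  return ADC12CTL0;
}

/* State of one bit, as a debugger's register window would list it */
uint16_t adc_bit(uint16_t bit)
{
  return (uint16_t)((ADC12CTL0 >> bit) & 1u);
}
```

With the values shown, adc_read_config() yields 0x0DB4, and walking adc_bit() from 0 upwards reproduces the column of ones and zeros in the source.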

There are still a few problems with this approach. This is what I have discovered so far:

  1. It can be tedious with a 32 bit register.
  2. Lint will complain about shifting zero (as it considers it pointless). It will also complain about shifting anything zero places (as it also considers it pointless). In which case you have to suppress these complaints. The following macros do the trick:
#define LINT_SUPPRESS(n)  /*lint --e{n} */
LINT_SUPPRESS(835)        /**< Inform Lint not to worry about zero being given as an argument to << */
LINT_SUPPRESS(845)        /**< Inform Lint not to worry about the right side of the | operator being zero */

In the next part of this article I will describe how one can extend this technique to make configuring peripherals a lot less painful.

Subscribing to comments

November 11th, 2010 by Nigel Jones

I heard from Jeff Gros the other day asking if it’s possible to subscribe to all the comments posted on this blog. Given the quality of the comments that are posted here, I thought it was an excellent request. Anyway, the answer is yes.  Just follow this link.

Median Filter Performance Results

November 9th, 2010 by Nigel Jones

In my earlier post on median filtering I made the claim that for filter sizes of 3, 5 or 7, using a simple insertion sort is ‘better’ than using Phil Ekstrom’s technique.  It occurred to me that this claim was based upon my testing with 8 bit processors quite a few years ago, and that the results might not be valid for 32 bit processors with their superior pointer manipulation.  Accordingly I ran some benchmarks comparing an insertion sort based approach with Ekstrom’s method.

The procedure was as follows:

  1. I generated an array of random integers on the interval 900 – 1000. The idea is that these would represent data from a typical 10 bit ADC found on many microcontrollers.
  2. I then put together a baseline project which performed all the basic housekeeping functions, but without performing any filtering. The idea was to try and get a feel for the non-algorithm specific overhead.
  3. I then put together a project which median filtered using an insertion sort, for sizes 3, 5, 7, 9, 11, and 13. Note that I elected to take a copy of the data prior to sorting. See this comment thread for a discussion of whether this is necessary or not.
  4. I put together another project which median filtered using Ekstrom’s method.
  5. I compiled the above for an ARM Cortex M3 target using an IAR compiler with full speed optimization.
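For readers who want to repeat the experiment, steps 1 and 3 might look something like the sketch below. It is not the code that was benchmarked. The function names, the use of rand() and the MAX_FILTER_SIZE limit are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FILTER_SIZE (13u)

/* Fill 'data' with pseudo-random readings on the interval 900 - 1000.
   rand() % 101 is slightly biased, which is of no consequence here. */
void generate_test_data(uint16_t *data, size_t count, unsigned int seed)
{
  size_t i;

  srand(seed);
  for (i = 0u; i < count; ++i)
  {
    data[i] = (uint16_t)(900 + (rand() % 101));
  }
}

/* Median of 'size' readings, where size is odd and <= MAX_FILTER_SIZE.
   The readings are copied first so that the caller's buffer is untouched. */
uint16_t median_by_insertion_sort(const uint16_t *readings, size_t size)
{
  uint16_t copy[MAX_FILTER_SIZE];
  size_t i;

  if ((size == 0u) || (size > MAX_FILTER_SIZE))
  {
    return 0u;                          /* Unsupported filter size */
  }

  memcpy(copy, readings, size * sizeof(copy[0]));

  for (i = 1u; i < size; ++i)
  {
    uint16_t key = copy[i];
    size_t j = i;

    while ((j > 0u) && (copy[j - 1u] > key))
    {
      copy[j] = copy[j - 1u];           /* Shift larger values up */
      --j;
    }
    copy[j] = key;
  }
  return copy[(size - 1u) / 2u];        /* Middle of the sorted copy */
}
```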

The results were a clear win for Ekstrom. His code size was 132 bytes versus 224 for the insertion sort. His code was 5%, 32%, 61%, 89%, 113% and 146% faster than the insertion sort for filter sizes of 3, 5, 7, 9, 11 and 13 respectively. To be fair to the insertion sort technique, I have made no effort to optimize it. Notwithstanding this, I think I can say that for 32 bit targets, you may as well just use Ekstrom’s approach for all filter sizes.

I’ll endeavor to update this post with results for a 16 bit target (MSP430) in the next few days.

Well I finally got around to running the tests on an MSP430 target. In this case Ekstrom’s method produced a larger code size (186 bytes versus 160). Much to my surprise, Ekstrom’s method was dramatically superior to the insertion sort approach, with speeds of 69% faster for a filter size of 3, going up to a whopping 250% faster with a filter size of 13.  The bottom line: I think my original claim is bunk. Use Ekstrom’s method by default!

DigiView Logic Analyzer

October 6th, 2010 by Nigel Jones

Today is one of those rare days on which I recommend a product. I only do this when I find a product that has genuinely made my life easier, and which by extension I think will also make your life easier. The product in question is a  DigiView logic analyzer. Now the fact that logic analyzers are useful tools should not be news to you. Indeed if you have been in this business long enough you will no doubt remember the bad old days of debugging code by decoding execution traces on a logic analyzer. That being said, I almost stopped using logic analyzers because they were big, expensive, difficult to set up and highly oriented towards bus-based systems. Given that I had my own consulting company with limited cash, limited space and a propensity to work on non-bus based systems (i.e. single chip microcontrollers), it’s hardly surprising that a logic analyzer wasn’t part of my toolbox.

This state of affairs persisted for a number of years until I obtained via a convoluted route a DigiView DV1-100. This is a USB powered, hand-sized box, with 18 channels at 100 MHz. Its successor (the DV3100) sells for $499. The device sat on my shelf for a while until I decided to give it a spin one day. Since then I have found it to be an indispensable tool. Interestingly I find I use it the most when implementing the myriad of synchronous protocols that seem to exist on peripheral ICs today. While it is of course very useful for getting the interfaces working, I also find it extremely useful in fine tuning the interfaces. Via the use of the logic analyzer I can really examine set-up and hold times, clock frequencies, transmission latencies and so on. Doing so has allowed me to dramatically improve the performance of these interfaces in many cases. Indeed, I have had such success in this area that I now routinely hook the analyzer up, even when the interface works first time. If nothing else it gives me a nice warm fuzzy feeling that the interface is working the way it was designed – and not by luck.

Another area where I find it very useful is when I need to reverse engineer a product. I do this a lot as part of my expert witness work – and it is really quite remarkable how much you can learn from looking at a logic analyzer trace.

Anyway, the bottom line is this. $499 gets you an 18 channel 100 MHz personal logic analyzer that can handle most of the circuitry most of us see on a daily basis. If you value your time at all, then the DigiView will pay for itself the first time you use it. Go hassle your boss to get one.

Median filtering

October 2nd, 2010 by Nigel Jones

NOTE: I have heavily edited this blog post since it was originally published, based on some recent testing

If your engineering education was anything like mine then I’m sure that you learned all about different types of linear filters whose essential objective was  to pass signals within a certain frequency band and to reject as far as possible all others. These filters are of course indispensable for many types of ‘noise’. However in the real world of embedded systems it doesn’t take one too long to realize that these classical linear filters are useless against  burst noise. This kind of noise typically arises from a quasi-random event. For example a 2-way radio may be keyed next to your product or an ESD event may occur close to your signal. Whenever this happens your input signal may transiently go to a ridiculous value. For example I have often seen A2D readings that look something like this: 385, 389, 388, 388, 912, 388, 387. The 912 value is presumably anomalous and as such should be rejected. If you try and use a classical linear filter then you will almost certainly find that the 912 reading actually ends up having a significant impact on the output. The ‘obvious’ answer in this case is to use a median filter. Despite the supposed obviousness of this, it’s my experience that median filters are used remarkably infrequently in embedded systems. I don’t know why this is, but my guess is that it is a combination of a lack of knowledge of their existence, coupled with difficulty of implementation. Hopefully this post will go some way to rectifying both issues.

As its name suggests, a median filter is one which takes the middle of a group of readings. It’s normal for the group to have an odd number of members such that there is no ambiguity about the middle value.  Thus the general idea is that one buffers a certain number of readings and takes the middle reading.
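To put numbers on the burst noise example, the sketch below compares a three-sample moving average with a three-sample median. Both functions are illustrative only. The median here is found by discarding the smallest and largest readings, which is a different technique from the if-statement version given later in this post.

```c
#include <stdint.h>

/* Three-sample moving average (about the simplest linear filter there is) */
uint16_t mean_of_3(uint16_t a, uint16_t b, uint16_t c)
{
  return (uint16_t)(((uint32_t)a + b + c) / 3u);
}

/* Three-sample median: the total minus the smallest and the largest */
uint16_t median_of_3(uint16_t a, uint16_t b, uint16_t c)
{
  uint16_t lo = (a < b) ? a : b;
  uint16_t hi = (a < b) ? b : a;

  if (c < lo)
  {
    lo = c;
  }
  if (c > hi)
  {
    hi = c;
  }
  return (uint16_t)(((uint32_t)a + b + c) - lo - hi);
}
```

For the readings 388, 912, 388 the moving average returns 562, whereas the median returns 388. The spike has vanished from the median output but has dragged the average up by well over a hundred counts.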

Until recently I recognized three classes of median filter, based purely on the size of the filter. They were:

  • Filter size of 3 (i.e. the smallest possible).
  • Filter size of 5, 7 or 9 (the most common).
  • Filter size of 11 or more.

However, I now espouse a simple dichotomy:

  • Filter size of 3
  • Filter size > 3

Filter size of 3

The filter size of three is of course the smallest possible filter. It’s possible to find the middle value simply via a few if statements. The code below is based on an algorithm described here. Clearly this is small and fast code.

#include <stdint.h>

uint16_t middle_of_3(uint16_t a, uint16_t b, uint16_t c)
{
 uint16_t middle;

 if ((a <= b) && (a <= c))
 {
   middle = (b <= c) ? b : c;
 }
 else if ((b <= a) && (b <= c))
 {
   middle = (a <= c) ? a : c;
 }
 else
 {
   middle = (a <= b) ? a : b;
 }
 return middle;
}
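One possible way to use this function on a stream of readings is sketched below. The wrapper median3_update() is a hypothetical name. It remembers the previous two readings and, as an assumption about start-up behaviour, pads the history with the first reading until real data has arrived. middle_of_3() is repeated so that the sketch compiles on its own.

```c
#include <stdint.h>

/* middle_of_3() repeated from above so that the sketch compiles on its own */
uint16_t middle_of_3(uint16_t a, uint16_t b, uint16_t c)
{
  uint16_t middle;

  if ((a <= b) && (a <= c))
  {
    middle = (b <= c) ? b : c;
  }
  else if ((b <= a) && (b <= c))
  {
    middle = (a <= c) ? a : c;
  }
  else
  {
    middle = (a <= b) ? a : b;
  }
  return middle;
}

/* Returns the median of the newest reading and the two before it.
   On the very first call the history is filled with that reading. */
uint16_t median3_update(uint16_t reading)
{
  static uint16_t previous[2];
  static uint8_t primed = 0u;
  uint16_t result;

  if (primed == 0u)
  {
    previous[0] = reading;
    previous[1] = reading;
    primed = 1u;
  }

  result = middle_of_3(previous[0], previous[1], reading);

  previous[0] = previous[1];          /* Age the history by one reading */
  previous[1] = reading;
  return result;
}
```

Feeding in the sequence 385, 389, 388, 388, 912, 388, 387 produces 385, 385, 388, 388, 388, 388, 388. The 912 never reaches the output.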

Filter size > 3

For filter sizes greater than 3 I suggest you turn to an algorithm described by Phil Ekstrom in the November 2000 edition of Embedded Systems Programming magazine. With the recent hatchet job on embedded.com I can’t find the original article. However there is a copy here. Ekstrom’s approach is to use a linked list. The approach works essentially by observing that once an array is sorted, the act of removing the oldest value and inserting the newest value doesn’t result in the array being significantly unsorted. As a result his approach works well – particularly for large filter sizes.

Be warned that there are some bugs in the originally published code (which Ekstrom corrected). However given the difficulty of finding anything on embedded.com nowadays I have opted to publish my implementation of his code. Be warned that the code below was originally written in Dynamic C and has been ported to standard C for this blog posting. It is believed to work. However it would behoove you to check it thoroughly before use!

#include <stddef.h>                                    /* NULL */
#include <stdint.h>                                    /* uint16_t */

#define STOPPER 0                                      /* Smaller than any datum */
#define    MEDIAN_FILTER_SIZE    (13)

uint16_t median_filter(uint16_t datum)
{
 struct pair
 {
   struct pair   *point;                              /* Pointers forming list linked in sorted order */
   uint16_t  value;                                   /* Values to sort */
 };
 static struct pair buffer[MEDIAN_FILTER_SIZE] = {0}; /* Buffer of MEDIAN_FILTER_SIZE pairs */
 static struct pair *datpoint = buffer;               /* Pointer into circular buffer of data */
 static struct pair small = {NULL, STOPPER};          /* Chain stopper */
 static struct pair big = {&small, 0};                /* Pointer to head (largest) of linked list.*/

 struct pair *successor;                              /* Pointer to successor of replaced data item */
 struct pair *scan;                                   /* Pointer used to scan down the sorted list */
 struct pair *scanold;                                /* Previous value of scan */
 struct pair *median;                                 /* Pointer to median */
 uint16_t i;

 if (datum == STOPPER)
 {
   datum = STOPPER + 1;                             /* No stoppers allowed. */
 }

 if ( (++datpoint - buffer) >= MEDIAN_FILTER_SIZE)
 {
   datpoint = buffer;                               /* Increment and wrap data in pointer.*/
 }

 datpoint->value = datum;                           /* Copy in new datum */
 successor = datpoint->point;                       /* Save pointer to old value's successor */
 median = &big;                                     /* Median initially to first in chain */
 scanold = NULL;                                    /* Scanold initially null. */
 scan = &big;                                       /* Points to pointer to first (largest) datum in chain */

 /* Handle chain-out of first item in chain as special case */
 if (scan->point == datpoint)
 {
   scan->point = successor;
 }
 scanold = scan;                                     /* Save this pointer and   */
 scan = scan->point ;                                /* step down chain */

 /* Loop through the chain, normal loop exit via break. */
 for (i = 0 ; i < MEDIAN_FILTER_SIZE; ++i)
 {
   /* Handle odd-numbered item in chain  */
   if (scan->point == datpoint)
   {
     scan->point = successor;                      /* Chain out the old datum.*/
   }

   if (scan->value < datum)                        /* If datum is larger than scanned value,*/
   {
     datpoint->point = scanold->point;             /* Chain it in here.  */
     scanold->point = datpoint;                    /* Mark it chained in. */
     datum = STOPPER;
   }

   /* Step median pointer down chain after doing odd-numbered element */
   median = median->point;                       /* Step median pointer.  */
   if (scan == &small)
   {
     break;                                      /* Break at end of chain  */
   }
   scanold = scan;                               /* Save this pointer and   */
   scan = scan->point;                           /* step down chain */

   /* Handle even-numbered item in chain.  */
   if (scan->point == datpoint)
   {
     scan->point = successor;
   }

   if (scan->value < datum)
   {
     datpoint->point = scanold->point;
     scanold->point = datpoint;
     datum = STOPPER;
   }

   if (scan == &small)
   {
     break;
   }

   scanold = scan;
   scan = scan->point;
 }
 return median->value;
}

To use this code, simply call the function every time you have a new input value. It will return the median of the last MEDIAN_FILTER_SIZE readings. This approach can consume a fair amount of RAM as one has to store both the values and the pointers. However if this isn’t a problem for you then it really is a nice algorithm that deserves to be in your tool box as it is dramatically faster than algorithms based upon sorting.
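To get a feel for the RAM cost, the sketch below adds up the static storage used by the linked list version and compares it with a plain buffer of readings. The two function names are made up for illustration, and the totals depend entirely on the pointer size and structure padding of your target.

```c
#include <stddef.h>
#include <stdint.h>

#define MEDIAN_FILTER_SIZE (13)

/* Same layout as the structure inside median_filter() */
struct pair
{
  struct pair *point;
  uint16_t value;
};

/* Static RAM for the linked list filter: the buffer, the 'small' and 'big'
   list ends, and the datpoint pointer */
size_t linked_list_filter_ram(void)
{
  return (sizeof(struct pair) * (MEDIAN_FILTER_SIZE + 2u)) + sizeof(struct pair *);
}

/* RAM for a plain buffer holding the same number of readings */
size_t plain_buffer_ram(void)
{
  return sizeof(uint16_t) * MEDIAN_FILTER_SIZE;
}
```

Assuming 2-byte pointers and no padding, as one might find on a small 16 bit target, this works out to 62 bytes against 26. Bear in mind that a sorting approach that takes a copy of the data also needs room for that copy while it runs.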

Median filtering based on sorting

In the original version of this article I espoused using a sorting based approach to median filtering when the filter size was 5, 7 or 9. I no longer subscribe to this belief. However for those of you that want to do it, here’s the basic outline:

 if (ADC_Buffer_Full)
 {
   uint_fast16_t adc_copy[MEDIAN_FILTER_SIZE];
   uint_fast16_t filtered_cnts;

   /* Copy the data */
   memcpy(adc_copy, ADC_Counts, sizeof(adc_copy));
   /* Sort it */
   shell_sort(adc_copy, MEDIAN_FILTER_SIZE);
   /* Take the middle value */
   filtered_cnts = adc_copy[(MEDIAN_FILTER_SIZE - 1U) / 2U];
   /* Convert to engineering units */
   ...
 }
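The outline above calls a shell_sort() function without showing it. A minimal sketch of what such a function might look like follows. The signature is inferred from the call in the outline, and the simple halving gap sequence is an assumption chosen for brevity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Sort 'data' into ascending order using Shell's method with a halving
   gap sequence. With a gap of 1 this degenerates to an insertion sort. */
void shell_sort(uint_fast16_t *data, size_t count)
{
  size_t gap;
  size_t i;
  size_t j;

  for (gap = count / 2u; gap > 0u; gap /= 2u)
  {
    for (i = gap; i < count; ++i)
    {
      uint_fast16_t key = data[i];

      /* Shift larger elements of this gap-spaced chain up by one gap */
      for (j = i; (j >= gap) && (data[j - gap] > key); j -= gap)
      {
        data[j] = data[j - gap];
      }
      data[j] = key;
    }
  }
}
```

For filters of the sizes discussed here the choice of gap sequence makes little practical difference.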

Final Thoughts

Like most things in embedded systems, median filters have certain costs associated with them. Clearly median filters introduce a delay to a step change in value which can be problematic at times. In addition median filters can completely clobber frequency information in the signal. Of course if you are only interested in DC values then this is not a problem. With these caveats I strongly recommend that you consider incorporating median filters in your next embedded design.