embedded software boot camp

Configuring hardware – part 3

Wednesday, January 26th, 2011 by Nigel Jones

This is the final part in a series on configuring the hardware peripherals in a microcontroller. In the first part I talked about how to set / clear bits in a configuration register, and in the second part I talked about putting together the basic framework for the driver. When I finished part 2, we had got as far as configuring all the bits in the open function. It’s at this point that things get interesting. In my experience the majority of driver problems fall into three areas:

  1. Failing to place the peripheral into the correct mode.
  2. Getting the clocking wrong.
  3. Mishandling interrupts.

I think most people tend to focus on the first item. Personally I have learned that it’s usually better to tackle the above problems in the reverse order.

Mishandling interrupts

Almost all peripheral drivers need interrupt handlers, and these are often a source of problems. If you have followed my advice, then at this stage you should have a skeleton interrupt handler for every possible interrupt vector that the peripheral uses. You should also have an open and close function. A smart thing to do at this stage is to download your code to your debug environment. I then place a break-point on every interrupt handler and call the open function. If the open function merely configures the peripheral, yet does not enable it, then presumably no interrupts should occur. If they do, then you need to find out why and fix the problem.

At this point I now add just enough code to each interrupt handler such that it will clear the source of the interrupt and generate the requisite interrupt acknowledge. Sometimes this is done for you in hardware. In other cases you have to write a surprising amount of code to get the job done. I strongly recommend that you take your time over this stage as getting an interrupt acknowledge wrong can cause you endless problems.
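As a rough illustration of what "clearing the source" can involve, here is a sketch for a hypothetical timer whose status flags are cleared by writing a 1 to them (a common convention, but by no means a universal one). The register is simulated with an ordinary variable so that the logic can be exercised on a host. The names `timer_status`, `TIMER_FLAG_OVERFLOW`, `TIMER_FLAG_MATCH` and `sim_write_status` are all invented for the example; the data sheet for your part is the only authority on the real acknowledge sequence.

```c
#include <stdint.h>

/* Hypothetical flag positions in a hypothetical timer status register. */
#define TIMER_FLAG_OVERFLOW  (1u << 0)
#define TIMER_FLAG_MATCH     (1u << 1)

/* Stand-in for the hardware register. On a real part this would be a
   volatile access to a fixed address taken from the data sheet. */
static volatile uint32_t timer_status;

/* Models write-1-to-clear behaviour: every bit written as 1 is cleared,
   every bit written as 0 is left alone. */
static void sim_write_status(uint32_t value)
{
    timer_status &= ~value;
}

/* Skeleton overflow handler: acknowledge only the flag being serviced. */
void timer_overflow_isr(void)
{
    /* Write just the overflow bit. On write-1-to-clear hardware, a
       read-modify-write of the whole register would write back any other
       pending flags as 1s and silently clear them as well. */
    sim_write_status(TIMER_FLAG_OVERFLOW);
}
```

The point of the sketch is the comment in the handler: with this style of register, the "obvious" read-modify-write can discard a pending interrupt that you have not serviced yet.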

The next stage is to write the enable function, download the code and open and enable the peripheral. This time you need to check that you do get the expected interrupts (e.g. a timer overflow interrupt) and that you acknowledge them correctly. Just as importantly you also need to check that you don’t get an unexpected interrupt (e.g. a timer match interrupt). On the assumption that all is well, then you can be reasonably confident that there are no egregious errors in your setup of interrupts. At this point you will probably have to further flesh out the interrupt handlers in order to give the driver some limited functionality. Although I’m sure you’ll be tempted to get on with the problem at hand, I recommend that you don’t do this, but rather write code to help tackle the next problem, namely that of clocking verification.
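If break-points are inconvenient (for example because halting the core disturbs the peripheral), one alternative is to count interrupts in each skeleton handler and inspect the counters afterwards. The following is a minimal sketch of that idea for the timer example above; the handler names, the counters and `timer_irq_setup_looks_sane` are hypothetical, and the real acknowledge code is deliberately left as a comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Debug-only counters, one per vector the (hypothetical) timer can raise. */
static volatile uint32_t overflow_irq_count;
static volatile uint32_t match_irq_count;

/* Expected to fire once the timer has been enabled. */
void timer_overflow_isr(void)
{
    overflow_irq_count++;
    /* acknowledge the interrupt here, as the data sheet requires */
}

/* Should never fire in this configuration. */
void timer_match_isr(void)
{
    match_irq_count++;
    /* acknowledge the interrupt here, as the data sheet requires */
}

/* True only if the expected interrupt has been seen at least once
   and the unexpected one has not been seen at all. */
bool timer_irq_setup_looks_sane(void)
{
    return (overflow_irq_count > 0u) && (match_irq_count == 0u);
}
```

A check like this only tells you which vectors fired. It says nothing about whether each one was acknowledged correctly, so it complements stepping through the handlers rather than replacing it.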

Clocking

Most peripherals use a clock source internal to the microcontroller. Now modern processors have multiple clock domains, PLL based frequency multipliers, and of course multi-level prescalers. As a result it can be a real nightmare trying to get the correct frequency to a peripheral. Even worse it is remarkably easy to get the approximately correct frequency to a peripheral. This issue can be a real problem with asynchronous communications links where a 1% error in frequency may be OK with one host and fail with another. As a result I now make it a rule to always try and verify that I am indeed clocking a peripheral with the correct frequency. To do this, there is no substitute for breaking out the oscilloscope or logic analyzer and measuring something. For timers one can normally output the signal on a port pin (even if this is just for verification purposes). For communications links one can simply set up the port to constantly transmit a fixed pattern. For devices such as A/D converters I usually have to resort to toggling a port pin at the start and end of conversion. Regardless of the peripheral, it’s nearly always worth taking the time to write some code to help you verify that the peripheral is indeed being clocked at the correct frequency.
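Before reaching for the oscilloscope it helps to know what number you expect to see. The sketch below computes the nearest divisor and the resulting baud rate error for a UART, on the assumption that the UART oversamples by 16 so that baud = clock / (16 × divisor). That formula is common but not universal, and the function names are made up for the example, so check the data sheet for the actual relationship on your part.

```c
#include <stdint.h>

/* Nearest integer divisor for the requested baud rate, assuming
   baud = clk_hz / (16 * divisor). baud must be non-zero. */
uint32_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
    uint32_t denom = 16u * baud;
    return (clk_hz + denom / 2u) / denom;   /* round to nearest */
}

/* Signed error of the achieved baud rate relative to the requested one,
   in parts per million (10000 ppm = 1%). Positive means too fast.
   baud and divisor must both be non-zero. */
int64_t uart_baud_error_ppm(uint32_t clk_hz, uint32_t baud, uint32_t divisor)
{
    int64_t ideal_clk = (int64_t)16 * (int64_t)divisor * (int64_t)baud;
    return (((int64_t)clk_hz - ideal_clk) * 1000000) / ideal_clk;
}
```

With a 7.3728 MHz clock and 115200 baud the divisor is 4 and the error is zero. With an 8 MHz clock the nearest divisor is still 4, but the error is about 85000 ppm (8.5%), which is exactly the sort of "approximately correct" frequency that a measurement on a port pin should expose.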

When you are doing this, there are a couple of things to watch out for:

  1. If your processor has an EMI reduction mode, then consider turning it off while performing clocking measurements. The reason for this is that ‘EMI reduction’ is actually achieved by dithering (quasi randomly varying) the clock frequency. Clearly a randomly varying clock isn’t conducive to accurate frequency measurements.
  2. Make sure that your system is indeed being clocked by the correct source. I mention this because some debuggers can provide the clock to the target.

Finally, if you find that you have an occasional problem with a peripheral, then checking that the clocking is precise is always a good place to start.

Mode

At this stage you have done the following:

  1. Considered every bit in every register in your open function.
  2. Verified that you have interrupts set up correctly.
  3. Written the enable function and at least part of the interrupt handler(s).
  4. Verified that you have the correct frequency clocks going to the peripheral.

You should now complete writing the driver. This is where you write the bulk of the code, and clearly this part is highly application specific. Notwithstanding this, I can offer one piece of advice. Probably the single biggest mistake that I have made over the years is to assume that because the driver ‘works’ it must be correct. I will give you a simple example to demonstrate what I mean.

It’s well known that the popular SPI port found on many devices can operate in one of four modes (often imaginatively called Mode0, Mode1, Mode2 & Mode3). These modes differ in clock polarity (whether the clock idles low or high) and clock phase (whether the data are sampled on the leading or the trailing edge of each clock pulse), which together determine whether the data are valid on the rising or falling edge of the clock. Thus it’s necessary to study the data sheet of the SPI peripheral to find out its required mode. Let’s assume that after studying the data sheet you conclude that Mode 2 operation is called for, and you implement the code and it works. If you then walk away from the code then I humbly suggest you are asking for it. The reason is that it’s possible that a peripheral will ‘work’ in Mode 2, even though it should be operated in Mode 3. It ‘works’ only because you are right on the edge of violating the required setup and hold times. A different temperature or a different chip lot and your code will fall over. It’s for this reason that I strongly recommend that you break out the logic analyzer and carefully compare the signals to what is specified in the data sheet. There is nothing quite like comparing waveforms to what is in the data sheet to give you a warm fuzzy feeling that the driver really is doing its job correctly.
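When comparing waveforms it helps to write down, before looking at the analyzer, which idle level and which sampling edge each mode implies. The sketch below does this for the commonly used numbering in which the mode number is (CPOL << 1) | CPHA. Treat that numbering as an assumption, since not every vendor uses it; the type and function names are invented for the example.

```c
#include <stdbool.h>

typedef struct {
    bool cpol;           /* true: clock idles high */
    bool cpha;           /* true: data sampled on the trailing clock edge */
    bool sample_rising;  /* derived: the sampling edge is a rising edge */
} spi_mode_info;

/* Decode a mode number 0..3, assuming mode = (CPOL << 1) | CPHA. */
spi_mode_info spi_decode_mode(unsigned mode)
{
    spi_mode_info info;
    info.cpol = (mode & 2u) != 0u;
    info.cpha = (mode & 1u) != 0u;
    /* With the clock idling low the leading edge rises; with it idling
       high the leading edge falls. Sampling therefore lands on a rising
       edge exactly when CPOL and CPHA are equal. */
    info.sample_rising = (info.cpol == info.cpha);
    return info;
}
```

Under this numbering Modes 2 and 3 share the same idle level but sample on opposite clock edges, so the edge on which the data lines change and the edge on which they are sampled are the first things to check against the timing diagram in the data sheet.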

Final Thoughts

Driver writing is hard. Engineers that can take on this task and write clean, fast and correct drivers in a timely manner are immensely valuable to organizations. Thus even if you cringe at the thought of having to write a device driver, you might want to put the effort into learning how to do it – your career will thank you!

7 Responses to “Configuring hardware – part 3”

  1. > I then place a break-point on every interrupt handler and then I call the open function. If the open function merely configures the peripheral, yet does not enable it, then presumably no interrupts should occur.

    That’s very good methodology and that same breakpoint can also help with other problems, like when you think you are acknowledging and you’re not. Virtual platforms and Instruction-set simulators can be a great help with that. I made a short video a while back about how I used a virtual platform to diagnose an interrupt bug (see: http://www.tentech.ca/index.php/2010/10/5-minute-interrupt-controller-bug-chase-and-fix-with-simics/ )

    The extra info virtual platforms provide (like warning that you are writing to a read-only register) can be invaluable when writing drivers. This is especially true with highly complex SoCs that have more than 50 internal and external sources, as well as multiple cores. In these cases, debugging a “simple driver bug” can be fraught with pitfalls because of the sheer number of things that can go wrong.

    • Nigel Jones says:

      Indeed when one is writing a driver for a complex peripheral then you need a lot more than I have outlined here. I fixed a minor error in your hyperlink (I got a 404 error initially). I found the video interesting. Thirty years ago one used to patch assembly language because it took so long to assemble and reprogram. Here we are in 2011 – and we are still doing it! Seriously, the SIMICS tool looks very powerful. While I enjoyed watching how you found and solved the problem, I wonder if you have given any thought as to how you could have prevented the problem in the first place?

  2. Like I mention in my post, it was cut’n’paste 🙂 I would say that cut’n’paste mistakes are one of my biggest bug categories. In that case, I had made macros to shorten the boring repetitive bits of the assembler interrupt handling code, like loading a 32-bit constant in scratch registers. When I needed one for the EOI (End-of-interrupt) flag writing, I copied bits of a macro used for IACK and forgot to change the register names. That’s how I got in trouble. When I reviewed my code, everything seemed alright, but I had missed it…

    • Nigel Jones says:

      I understood that it was a cut and paste error. I was thinking along the lines of doing something such that the assembler would generate an error if an attempt was made to write to a read-only register. Whether you can do this will depend upon the sophistication of your assembler. You can sort of do this in C by declaring a register to be const. Attempts to write to a const variable should be flagged by the compiler.
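      A minimal sketch of the const idea in C, assuming a hypothetical timer with one read/write register and one read-only register. The struct layout and names are invented, and a static instance stands in for the hardware so that the example runs on a host; on a target the struct would be overlaid on the peripheral's address instead.

```c
#include <stdint.h>

/* Hypothetical register block: one read/write and one read-only register. */
typedef struct {
    volatile uint32_t       control;  /* read/write */
    volatile const uint32_t status;   /* read-only as far as the compiler is concerned */
} timer_regs;

/* Host-side stand-in for the hardware, with a made-up status value. */
static timer_regs sim_timer = { 0u, 0x5u };

/* Reading the read-only register is allowed as usual. */
uint32_t timer_read_status(const timer_regs *t)
{
    return t->status;
}

/* Writing the read/write register is allowed as usual. */
void timer_write_control(timer_regs *t, uint32_t value)
{
    t->control = value;
    /* t->status = 0u;   would be rejected by the compiler: the member is const */
}
```

      The const qualifier only protects against writes made through this declaration. It does not stop a write through a differently typed pointer, which is why this only "sort of" solves the problem.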

  3. Lundin says:

    > I then place a break-point on every interrupt handler and then I call the open function. If the open function merely configures the peripheral, yet does not enable it, then presumably no interrupts should occur.

    To take this even further, good “defensive programming” requires that one doesn’t only write an ISR for every single interrupt on the MCU, but also adds code in every unused one, which disables the interrupt and then possibly indicates that an error occurred (“unexpected SPI interrupt” etc).

    This is common practice in safety-critical systems, but there is no reason one shouldn’t do this no matter the application. If you get unexpected interrupts caused by buggy peripheral setup code, EMI on pins, broken hardware or runaway code, you can disable those faulty interrupts and carry on, instead of having the whole MCU mysteriously crash in the middle of runtime.

    > (often imaginatively called Mode0, Mode1, Mode2 & Mode3)

    Yikes! I actually haven’t heard that one before, though I’m sitting in the Big Endian camp so that might be why 🙂 On Freescale these correspond to two bits, “clock phase” (clock on edge or middle of flank) and “clock polarity” (active high/low).
    No matter, SPI is a very good example, as it often “seems to work”, even though one has managed to get a subtle clock skew going, which will manifest itself in mysterious intermittent runtime errors.
