Archive for May, 2010

Lowering power consumption tip #4 – transmitting serial data

Thursday, May 20th, 2010 Nigel Jones

This is the fourth in a series of tips on lowering power consumption in embedded systems. For this post I thought I’d delve into the common task of transmitting serial data. I compare polling and interrupting and show you how a hybrid approach can sometimes be optimal.

Almost every embedded system I have ever worked on has contained serial links. At its most abstract level, a serial link takes in parallel data and converts it to a serial stream. This serialization inherently takes longer than the write to the register that holds the data and thus to send multiple bytes back to back there is an inevitable delay. The process thus looks like this:

Store data to be transmitted
Wait for data to be sent out
Store data to be transmitted
Wait for data to be sent out
...

Store data to be transmitted
Wait for data to be sent out

From a power consumption perspective, the question is – how best to wait for the data to be sent out? Well, you have four basic approaches – open loop, polling, interrupting or a hybrid combination.  In assessing them from a power consumption perspective, what I look at is how many non-useful clock cycles I have to execute in order to transmit a byte of data.

Open Loop

I use the term open loop to describe a technique whereby you make use of the properties of a synchronous link to know (actually more accurately presume) that it is safe to send the next byte. This technique is only of use when the transmit frequency is very high in comparison to the CPU speed. For example, consider an SPI link between a CPU and a peripheral. In many cases, this link may be clocked at up to half the CPU clock frequency, in which case it takes a mere 16 CPU clocks to shift out an 8-bit datum. As a result one can simply delay 16 clock cycles between writing successive bytes. The code looks something like this:

SBUF = datum[0];
delay(16 - LOAD_TIME);
SBUF = datum[1];
delay(16 - LOAD_TIME);
...

LOAD_TIME is a constant that takes into account the number of cycles required to get the next datum from memory and write it to SBUF. Thus the number of non-useful clock cycles per byte is (16 - LOAD_TIME).
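The arithmetic behind the SPI example above can be sketched as follows. This is a minimal illustration only: the function names, the 16 MHz CPU clock and the LOAD_TIME value of 4 used in the usage note are assumptions chosen for the example, not values from a particular part.

```c
#include <assert.h>
#include <stdint.h>

/* CPU clocks that elapse while one datum is shifted out.
 * cpu_hz and link_hz are the CPU and serial clock frequencies. */
uint32_t cpu_clocks_per_datum(uint32_t cpu_hz, uint32_t link_hz,
                              uint32_t bits_per_datum)
{
    assert(link_hz != 0u);
    /* 64-bit intermediate so cpu_hz * bits_per_datum cannot overflow. */
    return (uint32_t)(((uint64_t)cpu_hz * bits_per_datum) / link_hz);
}

/* Non-useful clocks per datum for the open loop method: the shift time
 * minus the clocks spent usefully fetching and storing the next datum. */
uint32_t open_loop_idle_clocks(uint32_t shift_clocks, uint32_t load_time)
{
    return (shift_clocks > load_time) ? (shift_clocks - load_time) : 0u;
}
```

With a hypothetical 16 MHz CPU and an 8 MHz SPI clock, cpu_clocks_per_datum(16000000u, 8000000u, 8u) gives the 16 clocks quoted above, and open_loop_idle_clocks(16u, 4u) gives 12 non-useful clocks per byte.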

Now most of you are probably thinking that I’m nuts for advocating this approach – and I’d tend to agree with you! It’s a technique I’ve only used a few times – and then only when I had to get the data out with the least possible latency and with the least amount of power consumed. I much prefer the next technique which can be almost as efficient – but a lot safer.

Polling

Polling differs from the open loop approach in that one polls a status register to determine when it is safe to write the next byte. This can be quite power efficient as long as, just as in the previous example, the transmit speed is very high in comparison to the CPU speed. Thus the SPI link given in the open loop example is also a good candidate for this approach. The code looks something like this:

SBUF = datum[0];
wait_for_sbuf_empty();
SBUF = datum[1];
wait_for_sbuf_empty();
...

The key to making this approach as efficient as possible is to code the wait function so that you read the status register on the first clock after you expect SBUF will become available.  In other words you still use a pre-calculated delay, but you throw in a check of the status register just to make sure before you load the next byte. By pre-fetching the next byte to be loaded and doing some other tweaking it’s often possible to get this approach almost as efficient as the open loop method. Notwithstanding these optimizations, the number of non-useful polling clock cycles will be greater than the number of CPU clocks required to transmit the data.
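One way to picture the "delay first, then check" wait function is the host-side sketch below. It does not touch real hardware: sim_tx_t, sim_advance() and POLL_COST_CLOCKS are hypothetical stand-ins for the transmitter, the passage of CPU clocks and the cost of one status read, so that the logic can be run and counted on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the transmitter: CPU clocks left until
 * SBUF can accept another byte. */
typedef struct {
    uint32_t clocks_until_empty;
} sim_tx_t;

enum { POLL_COST_CLOCKS = 2 }; /* assumed cost of one status read + branch */

/* Let simulated time pass, saturating at zero. */
static void sim_advance(sim_tx_t *tx, uint32_t clocks)
{
    tx->clocks_until_empty = (tx->clocks_until_empty > clocks)
                                 ? (tx->clocks_until_empty - clocks)
                                 : 0u;
}

/* Stand-in for reading the "transmit buffer empty" status bit. */
static bool sim_sbuf_empty(const sim_tx_t *tx)
{
    return tx->clocks_until_empty == 0u;
}

/* Wait out the pre-calculated delay, then poll until the status bit
 * confirms SBUF is free. Returns the number of status reads performed. */
uint32_t wait_for_sbuf_empty(sim_tx_t *tx, uint32_t expected_clocks)
{
    uint32_t status_reads = 1u;

    assert(tx != NULL);
    sim_advance(tx, expected_clocks);       /* the pre-calculated delay */
    while (!sim_sbuf_empty(tx)) {           /* the safety check */
        sim_advance(tx, POLL_COST_CLOCKS);
        status_reads++;
    }
    return status_reads;
}
```

When the delay is well matched to the link, the status register is read exactly once per byte; extra reads only occur when the estimate was too short.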

Interrupting

When the transmit frequency starts to slow down with respect to the CPU frequency, the number of non-useful clock cycles quickly starts to rise if one uses a polling method. The classic example of this is of course asynchronous serial links running at standard baud rates. In these cases, the transmit time is a large fraction of a millisecond and a polling approach consumes a huge number of CPU cycles (and hence power). The solution here is of course to turn to an interrupt driven approach. In this case the overhead of the ISR is ‘non-useful’ clock cycles. As I showed in this article, the overhead of even a simple looking ISR can be quite significant. Notwithstanding this, for asynchronous serial links, an interrupt based approach is nearly always the most efficient.
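For comparison with the other methods, an interrupt driven transmitter might be sketched along the following lines. This is an illustrative host-side model, not code for a specific part: SBUF and tx_irq_enabled are ordinary variables standing in for the transmit register and the interrupt enable bit, and TX_QUEUE_SIZE, serial_write() and the queue layout are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_QUEUE_SIZE 16u        /* assumed queue size */

volatile uint8_t SBUF;           /* stand-in for the transmit register */
volatile bool tx_irq_enabled;    /* stand-in for the interrupt enable bit */

static uint8_t tx_queue[TX_QUEUE_SIZE];
static volatile size_t tx_head;  /* next free slot, advanced by foreground */
static volatile size_t tx_tail;  /* next byte to send, advanced by the ISR */

/* Body of the "transmit register empty" ISR: send the next queued byte,
 * or disable the interrupt once the queue has drained. */
void sbuf_tx_isr(void)
{
    if (tx_tail != tx_head) {
        SBUF = tx_queue[tx_tail];
        tx_tail = (tx_tail + 1u) % TX_QUEUE_SIZE;
    } else {
        tx_irq_enabled = false;
    }
}

/* Queue up to len bytes for transmission. Returns the number accepted;
 * one slot is always left unused so that full and empty are distinct. */
size_t serial_write(const uint8_t *data, size_t len)
{
    size_t n = 0u;

    assert(data != NULL || len == 0u);
    while (n < len) {
        size_t next = (tx_head + 1u) % TX_QUEUE_SIZE;
        if (next == tx_tail) {
            break;               /* queue full */
        }
        tx_queue[tx_head] = data[n++];
        tx_head = next;
    }
    if (n > 0u) {
        tx_irq_enabled = true;   /* let the ISR start draining the queue */
    }
    return n;
}
```

The foreground code returns immediately after queuing the data and is free to sleep; the per-byte cost is then the ISR entry, body and exit. Whether enabling the interrupt is enough to trigger the first transfer depends on the part, so treat that detail as an assumption.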

Hybrid

The final methodology is what I term the hybrid approach. It’s typically the most power efficient and is well suited to medium to fast serial links. The code for it looks like this:

SBUF = datum[0];
__sleep();
SBUF = datum[1];
__sleep();
...

__interrupt void sbuf_tx_isr(void)
{
 /* Empty */
}

In this approach, I enable the transmit interrupt, but have no code in the interrupt handler. After each write to SBUF I execute a sleep instruction, effectively stopping op code processing. Once SBUF has emptied, it generates an interrupt. The processor vectors to the empty ISR, returns immediately and then processes the next instruction which stores the next byte in SBUF. In this case the overhead is the number of clock cycles to enter and exit sleep mode, plus the number of cycles to vector to an ISR and return. Depending upon your processor architecture this can be anything from almost nothing to quite a lot. However it is always less than a full-blown interrupt handler approach and is, in my experience, often less than the polling or open loop methods.

Notwithstanding the above, this method has several weaknesses:

  1. It should be obvious that the only interrupt that can be enabled is the SBUF transmit interrupt. (Actually it’s more accurate to say that the only interrupt that can cause the processor to exit sleep mode is the SBUF transmit interrupt. The MSP430, for example, allows one to do this).
  2. While I don’t consider this a kludge, it’s certainly not crystal clear what is going on. Thus clear documentation is a must.
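Where other wake-up sources cannot be masked, one possible variation (not the method described above, and at the cost of an extra status read per wake-up) is to re-check the transmitter status after every sleep and go back to sleep if the wake-up was caused by something else. The sketch below models this on a host: sim_cpu_t, sim_sleep() and the scripted list of wake-up sources are hypothetical stand-ins for the real sleep instruction and status flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAKE_OTHER, WAKE_TX_EMPTY } wake_source_t;

/* Hypothetical simulated CPU: a scripted list of the interrupts that
 * end successive sleeps, plus a mirror of the "SBUF empty" flag. */
typedef struct {
    const wake_source_t *wakes;
    size_t count;
    size_t next;
    bool tx_empty;
} sim_cpu_t;

/* Stand-in for the sleep instruction: returns when the next scripted
 * interrupt arrives. An exhausted script is treated as a transmit
 * interrupt so that the simulation always terminates. */
static void sim_sleep(sim_cpu_t *cpu)
{
    if (cpu->next >= cpu->count || cpu->wakes[cpu->next] == WAKE_TX_EMPTY) {
        cpu->tx_empty = true;
    }
    if (cpu->next < cpu->count) {
        cpu->next++;
    }
}

/* Sleep until the transmitter reports empty, ignoring unrelated
 * wake-ups. Returns how many times the CPU went to sleep. */
size_t sleep_until_sbuf_empty(sim_cpu_t *cpu)
{
    size_t sleeps = 0u;

    assert(cpu != NULL);
    while (!cpu->tx_empty) {
        sim_sleep(cpu);
        sleeps++;
    }
    cpu->tx_empty = false;  /* the caller now refills SBUF */
    return sleeps;
}
```

On real hardware the status check and the entry into sleep must be atomic with respect to the transmit interrupt; otherwise the interrupt can fire between the two and the processor sleeps through it.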

Summary

  1. If you feel the need for the utmost efficiency then go open loop. It’s a bit like drag-racing in that it’s fast, furious and undoubtedly gets you from A to B ASAP. Just don’t be surprised if you blow up in the process.
  2. If open-loop isn’t for you then polling may make sense provided you can crank up the transmit speed high enough. This makes for the simplest code – and that’s always a plus in my book.
  3. If you have an asynchronous link, then an interrupt based approach is the right answer 99% of the time.
  4. If you have a medium to high speed link, then the hybrid approach has much to commend it. Once you’ve seen it done a few times it becomes less weird looking.

Considerate coding

Monday, May 3rd, 2010 Nigel Jones

One of my major recreational pursuits is bike riding. I live in a rural area with some great terrain, and more to the point a very low traffic density. Naturally on a 5 or 6 hour ride one does encounter some traffic and I’m always struck by the different degrees of consideration afforded to cyclists by motorists. Some are extremely solicitous and will wait so that they can pass you slowly and with a wide separation; others are complete jerks and will pass you as close and as fast as possible, often sounding their horn as they go by. Then there are the bulk of the drivers who will attempt to give you as much room as possible commensurate with slowing them down as little as possible. I was pondering this view of human nature yesterday while out on a ride, when it occurred to me that I see a similar range of consideration when it comes to embedded software. To see what I mean, read on …

I’ve mentioned several times in this blog that the main purpose of source code is not to generate a binary image but rather to educate those that come after you (including possibly a future version of yourself). You may or may not subscribe to this belief. However once you realize that source code often has a life of decades, and that the same source code may end up in dozens of products, then perhaps you may start to change your mind. So with this said, I think I can make a number of observations.

  1. It may be obvious to you, the author of the code, what the intended compilation platform is – after all it’s the one you are using. Alas it is not obvious to someone else who has been handed the source code and told to use it. (I ran into this problem six months ago, when I had a very large ARM project but no indication of which ARM compiler it was intended for.)
  2. It may also be obvious to you what hardware platform the code is intended for – again it’s the one you are working on.
  3. It may also be obvious to you that the way to build the various targets is to change to directory X and invoke command Y with parameter Z – after all you do it ten times a day.
  4. It may also be obvious to you that the 27 warnings produced during the final build are benign – as after all you have checked them out.

However none of the above is clear to someone 5 years from now!

Clearly the above is just a partial list of what I call implicit information about a project, that is, information that is essential to being able to use the code base but which is often omitted from the documentation by the author. It’s my contention that the degree to which you explicitly provide this implicit information governs whether you are a jerk, a typical coder, or a considerate coder. Most of us (myself included) are typical coders (and I know this because I’ve seen a lot of code). If you want to make the move up to being a considerate coder, then here are a few things I suggest you do.

  1. Place all the implicit information in main.c. Why is this you ask? Well if I were to dump three hundred source files on you, which one would you look at first? (An acceptable alternative is to state in main.c that useful information may be found in file X. Be aware however that non-obvious source files sometimes get stripped out of source code archives.)
  2. Include in main.c, as a minimum, information about the compiler (including its version), the intended hardware target, and how to build the code.
  3. Think for a minute or two about all the other information you are implicitly using in writing the source code and building it – and take the time to include it in main.c. Typically this includes additional tools, scripts etc.
  4. For an excellent discourse on why leaving warnings in your code is downright inconsiderate, see this posting from Alan Bowens.
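As an illustration of the suggestions above, the comment block at the top of main.c might look something like the template below. The field names and layout are only one possible arrangement, and every entry in angle brackets is a placeholder to be filled in for the actual project.

```c
/*
 * main.c
 *
 * PROJECT INFORMATION (read this first)
 *
 * Compiler:        <compiler name>, version <x.y.z>
 * Compiler flags:  <optimization level, language standard, other options>
 * Hardware target: <processor part number>, <board name and revision>
 *
 * How to build:
 *   1. Change to directory <X>.
 *   2. Run <command Y> with parameter <Z> for each target.
 *
 * Additional tools: <scripts, code generators, programmers, versions>
 *
 * Known warnings:   <none, or each remaining warning and why it is benign>
 *
 * Further information: see <file X>.
 */
```

Keeping this block in the source file itself, rather than in a separate document, means it travels with the code when the code is archived or handed on.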

If you do the above, then you are well on the way to becoming a ‘considerate coder’. Will doing this get you a pay increase, or at least a pat on the back from the boss? Probably not. However, just like the person who slows down and passes cyclists with a wide berth, you can go home at night knowing you aren’t a jerk. That has to be worth something.