embedded software boot camp

An unfortunate consequence of a 32-bit world

Wednesday, August 29th, 2007 by Nigel Jones

Back in the bad old days when I was a lad, one learned about microprocessors by programming 8 bit devices in assembly language. In fact I can still remember my first lab assignment – namely to multiply two 8 bit unsigned quantities together to get a 16 bit result (without the use of a hardware multiplier of course). One of the indelible lessons that comes from doing an exercise such as this, is that it can take many instructions to perform even the most innocuous of high level language statements.

I mention this, because today I was looking at some code written by a young engineer who was recommended to me. In examining some of his code, I noticed the following construct:

void some_function(void)
{
 ...
 ++ivar;
 ...
}

interrupt void isr_handler(void)
{
 ...
 --ivar;
 ...
}

Notwithstanding the fact that ivar should have been declared volatile, the most egregious mistake here was the assumption that the statement ++ivar is an atomic operation. Now if one is used to working on 32 bit machines, the concept of incrementing an integer being anything other than an atomic operation is of course ludicrous. However, in the 8 or 16 bit world where many of us labor in the embedded space, the idea of incrementing an integer being an atomic operation is equally ridiculous. The trouble with bugs like this is that they are difficult to spot, and will only rear their heads after months or even years of operation.
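
To make the failure concrete, here is a minimal host-side sketch that spells ++ivar out as the load, modify and store steps a compiler may emit, and lets a stand-in for the ISR run in the middle. The names simulated_isr, increment_expanded and run_scenario are hypothetical and exist only for this illustration; on real hardware the interrupt timing is not under your control like this.

```c
#include <stdint.h>

/* Shared counter; volatile because an ISR also modifies it. */
static volatile uint16_t ivar;

/* Stand-in for the ISR body in the post: it decrements ivar. */
static void simulated_isr(void)
{
    --ivar;
}

/*
 * ++ivar written as three separate steps. If fire_isr is nonzero,
 * the "interrupt" runs between the load and the store, which is the
 * window the post warns about.
 */
static void increment_expanded(int fire_isr)
{
    uint16_t tmp = ivar;          /* 1. load   */
    tmp = (uint16_t)(tmp + 1u);   /* 2. modify */
    if (fire_isr) {
        simulated_isr();          /* interrupt lands mid-sequence */
    }
    ivar = tmp;                   /* 3. store: overwrites the ISR's update */
}

/* One increment plus one decrement; the correct net result equals start. */
static uint16_t run_scenario(uint16_t start, int isr_mid_sequence)
{
    ivar = start;
    increment_expanded(isr_mid_sequence);
    if (!isr_mid_sequence) {
        simulated_isr();          /* interrupt arrives after the increment */
    }
    return ivar;
}
```

With the interrupt arriving after the increment, run_scenario(10, 0) ends where it started. With the interrupt arriving mid-sequence, run_scenario(10, 1) ends one too high, because the ISR's decrement is silently lost.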

So, is this a case of an incompetent individual? Although nominally yes, I suspect that the real problem is that he was raised on a diet of big CPUs. Perhaps the universities could do these engineers a favor, and throw away the ARM based evaluation boards and replace them with an 8051 based system.


14 Responses to “An unfortunate consequence of a 32-bit world”

  1. Anonymous says:

    ARM is a load-store architecture, so any read-modify-write must necessarily involve multiple instructions. The point is that ++ivar is actually rarely atomic, even on 32-bit machines.

  2. Anonymous says:

    Would you please give some additional detail on why this example (in its current condition) is not a good idea, and how it could be modified for safe usage. Thank you.

  3. Nigel Jones says:

    To make this safe (actually a better term is correct), it is necessary to disable the interrupt around the increment. E.g.

    void some_function(void)
    {
     …
     __DISABLE_INTERRUPTS();
     ++ivar;
     __ENABLE_INTERRUPTS();
     …
    }

    This of course assumes that interrupts are enabled in some_function(). If it’s possible that they are not, then you should do this:

    void some_function(void)
    {
     uint16_t isr_state;
     …
     isr_state = __GET_ISR_STATE();
     __DISABLE_INTERRUPTS();
     ++ivar;
     __SET_ISR_STATE(isr_state);
     …
    }

    If you think it’s messy, then you’re right. Unfortunately, there is no way around this problem.
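
The difference between re-enabling unconditionally and restoring the saved state can be sketched on a host machine. The intrinsics named in the comment are toolchain-specific, so the functions below (get_isr_state, disable_interrupts, set_isr_state) and the sim_irq_enabled flag are hypothetical stand-ins that only model a global interrupt-enable bit; they are assumptions for illustration, not any vendor's API.

```c
#include <stdint.h>

/* Host-side stand-in for the CPU's global interrupt-enable bit. */
static uint16_t sim_irq_enabled = 1;

/* Hypothetical stand-ins for the toolchain intrinsics in the comment. */
static uint16_t get_isr_state(void)           { return sim_irq_enabled; }
static void     disable_interrupts(void)      { sim_irq_enabled = 0; }
static void     set_isr_state(uint16_t state) { sim_irq_enabled = state; }

static volatile uint16_t ivar;

/* Increment ivar with interrupts masked, then restore the caller's state. */
static void increment_protected(void)
{
    uint16_t isr_state = get_isr_state();
    disable_interrupts();
    ++ivar;
    set_isr_state(isr_state);
}

/* A caller that already has interrupts off and nests the helper. */
static uint16_t nested_call_leaves_irq_state(void)
{
    disable_interrupts();
    increment_protected();
    return get_isr_state();   /* still 0: the inner restore did not re-enable */
}
```

Because the helper restores whatever state it found, it can be called both from code that runs with interrupts enabled and from code that has already masked them, which is the case the second version in the comment is guarding against.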

  4. Gordon J Milne says:

    Hmm, perhaps it would be better to throw the 8051 out the door and start taking advantage of those very inexpensive, very powerful and downright friendly ARM processors. Working on embedded systems does not have to mean torturing yourself.

    • Mark Lakata says:

      8051’s aren’t that bad. They are simpler than ARM cores and if that is all you need, then it is easier. I did 8-bit for a while, then 32-bit for a while, and now I’m doing 8-bit again for a project that requires it and I’ve discovered, hey, it is actually easier… then I realized I only have 8K of RAM and got sad again.

  5. riveywood says:

    Just reading through your excellent blog, so apologies for commenting on a year old post.

    I’ve seen problems with precisely this code construct on an ARM and commented that the writer was obviously more familiar with 8/16 bit architectures – you’ve come to the opposite conclusion from the same facts :-)

    ++ivar certainly would not be guaranteed atomic on an ARM. A 32 bit value in a register could be incremented with ADD Rx, Rx, #1, but if it was a variable in memory (e.g. a local spilled to the stack), it would need an assembly language load/add sequence (with a possible store at the end). Likewise if it was a char or short int, there might be extra instructions to check for overflow.

    Some 16 bit processors have direct INC and DEC operations that operate directly on variables in memory doing two bus accesses and on those processors, the ++ would be an atomic operation.

  6. Nigel Jones says:

    I love the fact that you came to a different conclusion based upon the same facts! Your point is well taken in that a 32 bit architecture doesn’t guarantee that an increment is an atomic operation. I guess my point was that on a 32 bit machine there is the possibility of the increment being atomic, whereas on an 8 bit machine it patently cannot be. It’s rather interesting that as you point out, some 16 bit machines did permit making an increment operation an atomic operation. The fact that ARM (and presumably other 32 bit devices) do not is surprising – and disappointing.

    Perhaps the real issue here is not the 32 bit world, but rather the general lack of understanding of what an i++ operation can translate into. The fact that someone posted a comment asking why the operation was dangerous is indicative of the general lack of understanding.

  7. Juergen says:

    The discussion confounds the two accesses in a read-update-write operation and the need to use more than one memory access if the value is bigger than the word size of the machine.

    If you only write the variable in one place and read it elsewhere, then increment can be considered atomic on a 32-bit machine, because the new value is read or written atomically.

    An increment of a 16-bit or bigger variable on an 8-bit machine will need more than one read or write operation to access the complete value, so it is not atomic. You need to disable interrupts both for reading and writing such values to ensure consistency.
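
The read half of that problem can be shown with a small host-side simulation. It models a 16-bit counter as two separate bytes, the way an 8-bit CPU would access it, and lets a stand-in ISR increment the counter between the two byte reads. The names set_count, simulated_isr and read_count are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* 16-bit counter stored as two bytes, as an 8-bit CPU would see it. */
static volatile uint8_t count_lo;
static volatile uint8_t count_hi;

static void set_count(uint16_t value)
{
    count_lo = (uint8_t)(value & 0xFFu);
    count_hi = (uint8_t)(value >> 8);
}

/* Stand-in for an ISR that increments the 16-bit counter. */
static void simulated_isr(void)
{
    if (++count_lo == 0) {
        ++count_hi;               /* carry into the high byte */
    }
}

/*
 * Byte-at-a-time read. If fire_isr is nonzero the "interrupt" lands
 * between the two byte reads, which is what disabling interrupts
 * around the read would prevent.
 */
static uint16_t read_count(int fire_isr)
{
    uint8_t lo = count_lo;
    if (fire_isr) {
        simulated_isr();
    }
    uint8_t hi = count_hi;
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

Starting from 0x00FF, an undisturbed read returns 0x00FF. If the increment lands between the byte reads, the result is 0x01FF: a value the counter never held, built from the old low byte and the new high byte.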

    • Mark Lakata says:

      If you write a variable in one place and read it elsewhere and it is register sized (or smaller), then it should be atomic in most cases and systems. But ++ and -- are not reads or writes, they are read/modify/write. So you can’t use ++ or -- atomically unless the architecture guarantees it.

      Luckily, the world of multicore processors hasn’t gotten into the world of conventional microcontrollers, otherwise there is another whole world of hurt with memory coherency, memory fences, out of order execution …. sigh. Hopefully someone invents a better language than C to handle this soon!

  8. e says:

    Juergen, thank you for pointing out that disabling interrupts during the variable update only half-solves the problem. Disabling interrupts during the read works to solve the other half of the problem, but in some cases is not desirable. Since it increases interrupt latency, it is often something to be avoided.

    One technique that avoids disabling the interrupt around the read is to read the volatile variable repeatedly until it has not changed over two successive reads. As long as the variable update happens infrequently enough (less than once every handful of memory cycles) the read loop is guaranteed to terminate in a few iterations, and typically will not iterate at all. When interrupt latency is critical, this technique is invaluable.
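
A minimal sketch of that read-until-stable loop, run against a simulated two-byte counter so that the torn read can be forced on a host machine. Everything prefixed sim_ or named simulated_isr, read_raw, read_stable, isr_at_read and raw_reads is a hypothetical name introduced for this illustration. The sketch relies on the same assumption the comment states: updates are rare compared with the time two reads take.

```c
#include <stdint.h>

/* 16-bit counter held as two bytes, the way an 8-bit CPU accesses it. */
static volatile uint8_t count_lo;
static volatile uint8_t count_hi;

/* Simulation controls (host-only). */
static int isr_at_read = -1;  /* index of the raw read the ISR interrupts; -1 = never */
static int raw_reads;         /* number of raw reads performed so far */

static void sim_reset(uint16_t value, int isr_read_index)
{
    count_lo = (uint8_t)(value & 0xFFu);
    count_hi = (uint8_t)(value >> 8);
    isr_at_read = isr_read_index;
    raw_reads = 0;
}

/* Stand-in for an ISR that increments the counter. */
static void simulated_isr(void)
{
    if (++count_lo == 0) {
        ++count_hi;
    }
}

/* Unprotected two-byte read; may return a torn value. */
static uint16_t read_raw(void)
{
    uint8_t lo = count_lo;
    if (raw_reads++ == isr_at_read) {
        simulated_isr();          /* interrupt lands between the byte reads */
    }
    uint8_t hi = count_hi;
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Re-read until two successive reads agree; interrupts stay enabled. */
static uint16_t read_stable(void)
{
    uint16_t prev = read_raw();
    for (;;) {
        uint16_t cur = read_raw();
        if (cur == prev) {
            return cur;
        }
        prev = cur;
    }
}
```

In the undisturbed case the loop returns after two raw reads. When the first read is torn, the loop discards it and returns the consistent value after one extra read, without ever masking interrupts.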

    • unknown says:

      This is an interesting technique. Has anyone else used this method? Does anyone have any examples? Thanks.

      • Peter Lund says:

        Yes. They are called seqlocks in the Linux kernel.

        I rediscovered the same technique as a teenager back in the eighties when writing code that needed access to the BIOS tick counter on a PC, which was a 32-bit word on a 16-bit machine (pre-386 real-mode). Of course, I could have used the LES reg, [addr] instruction to do the read atomically but that was a bit unwieldy from Turbo Pascal. I also used it on 8051 machines later.

      • Mark Lakata says:

        You need to do something like that when reading large volatile integers that may wrap around while you are reading them. For example, a 32-bit RTC tick counter being read on a 16-bit architecture. You can’t stop the counter or disable interrupts, since it is hardware driven.

        read low
        read high
        read low again – if it changed
        read high again

        rtc = (high , low)

        This assumes that the high word won’t change twice within a few cycles.
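
The read sequence above could be sketched in C roughly as follows. The hardware counter is replaced by a simulated one so the sketch runs on a host: sim_counter, sim_setup, read_low and read_high are hypothetical stand-ins for the two 16-bit register reads, and the simulation can make the counter tick just before a chosen read. The sketch assumes the counter advances at most once during the whole sequence.

```c
#include <stdint.h>

/* Simulated free-running 32-bit counter (host-only stand-in for hardware). */
static uint32_t sim_counter;
static int sim_tick_before_read = -1;  /* read index before which it ticks; -1 = never */
static int sim_reads;

static void sim_setup(uint32_t start, int tick_before_read)
{
    sim_counter = start;
    sim_tick_before_read = tick_before_read;
    sim_reads = 0;
}

static void sim_maybe_tick(void)
{
    if (sim_reads++ == sim_tick_before_read) {
        ++sim_counter;
    }
}

/* Hypothetical 16-bit register reads of the two counter halves. */
static uint16_t read_low(void)
{
    sim_maybe_tick();
    return (uint16_t)(sim_counter & 0xFFFFu);
}

static uint16_t read_high(void)
{
    sim_maybe_tick();
    return (uint16_t)(sim_counter >> 16);
}

/* Read low, read high, read low again; if low changed, read high again. */
static uint32_t read_rtc(void)
{
    uint16_t low  = read_low();
    uint16_t high = read_high();
    uint16_t low2 = read_low();
    if (low2 != low) {
        low  = low2;              /* keep the newer low word */
        high = read_high();       /* and pair it with a fresh high word */
    }
    return ((uint32_t)high << 16) | low;
}
```

Starting from 0x0001FFFF, a tick between the first low read and the high read would give a naive (high, low) result of 0x0002FFFF. The re-read detects that the low word changed and returns 0x00020000 instead. If the counter can tick more than once during the sequence, this simple form is not sufficient and a loop would be needed.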

  9. R. D. Poor says:

    Handling concurrency properly is so difficult that only 12 out of 10 programmers get it right.
