embedded software boot camp

A volatile tempest

Monday, September 27th, 2010 by Nigel Jones

Regular readers will know that I often comment on the use of volatile in embedded systems. As a result I am occasionally contacted about my opinion on whether a compiler is generating correct code – particularly when hardware is being accessed. Well, I was contacted last week by Ratish Punnoose, who had a classic problem in that his code compiled okay on GCC but not on IAR. He had contacted IAR, who in turn basically said the compiler is correct – and here is the explanation. Ratish turned to me and John Regehr for our opinions. Well, John and I came to similar opinions – namely:

  1. Ratish’s code was a bit weird, but not dramatically so.
  2. The explanation from IAR made no sense.
  3. It did indeed appear to be a compiler bug.

Ratish then posted his issue to the MSP430 forum on Yahoo. You can read his post and the responses here.

I’m sure many of you are at this point thinking that IAR is in for another round of bashing from me. Well you’d be wrong. One of the first responders to Ratish’s post was Paul Curtis of Rowley compilers. Paul gives an admirable explanation as to why Ratish’s code is wrong (and by extension so am I). Now I’m sure that IAR and Rowley are fierce competitors, and so Paul is also to be commended for leaping to the defense of IAR.

Furthermore, later in the thread Anders Lindgren of IAR chimes in and adds his detailed and compelling explanation.

Having read the posts from Paul and Anders I think they are right and I’m wrong. So thanks Gentlemen for:

  1. Setting me straight
  2. Proving that in the wonderful world of volatile accesses, there is always something more to learn.

I think there are several other lessons to be learned from this episode. However I think I’ll save them for another post.

24 Responses to “A volatile tempest”

  1. Lundin says:

    Wouldn’t this whole issue depend on the calling convention used rather than on the global volatile variable involved? If the global wasn’t volatile, the copy of the address must still occur if said global is used elsewhere in the program.

    But there is no telling in the C standard where that parameter “ch” should be allocated, the compiler is free to allocate it on the stack, or in a CPU register, or in thin air. The latter would no doubt be most convenient if the variable isn’t used.

    So from a C standard point of view, the function may set the global to point at “whatever the implementation’s calling convention prefers”, as long as it is set to something. It doesn’t matter if the global is volatile.

    Also as a side note, I have yet to understand the use of a volatile pointer (to volatile data). Is there actually any existing CPU architecture that allocates pointers in hardware registers located in the memory-mapped address space rather than in internal address registers?

  2. Hi Nigel, Lundin,

    I followed this thread there too… and my first instinct was similar to yours.

    As I see it, the compiler-jockeys are right, in that up till the line which takes the address of the value and uses it, the parameter quoted can be assumed to be fine and dandy sitting in a CPU register.
    Now you COULD take the view that when the code takes its address, the compiler should harrumph and copy the value to memory, or if its structure allows, re-generate the code for that procedure with the value copied to memory. But it would add overhead to the standard and burden any noddy compiler’s ability to comply.

    I was thinking that if you could define the parameter ITSELF as volatile – as in

    proc( volatile int x )

    ( which is what the patch-and-mend of copying it immediately into a local volatile variable actually achieves, )
    this would actually patch the leak where it is created… but we can’t change C now.
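
    A minimal sketch of that copy-into-a-local-volatile workaround, for illustration only. The names dma_src and send_byte are hypothetical, and dma_src is an ordinary variable standing in for the DMA source-address register. The hardware transfer is simulated by reading back through the pointer while the local is still alive.

    ```c
    /* Hypothetical stand-in for the DMA source-address register. */
    unsigned char volatile * volatile dma_src;

    unsigned char send_byte(unsigned char ch)
    {
        /* Compilers generally treat a store to a volatile object as a
           side effect they must keep, so the byte lands in memory. */
        volatile unsigned char copy = ch;
        unsigned char seen;

        dma_src = &copy;
        seen = *dma_src;   /* simulated transfer, done while copy is alive */
        dma_src = 0;       /* do not leave a pointer to a dead local behind */
        return seen;
    }
    ```

    The important part is that everything that uses the address happens before the function returns; once it returns, copy no longer exists.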

    Lundin

    Doesn’t this discussion lead toward a possible example of a volatile pointer ? It was created by trying to handle the registers of the DMA inside the MSP430 CPU as memory locations. Now C permits this, but doesn’t exactly support it.

    If you consider the “DMA start address” register, then wouldn’t that be naturally described as a volatile unsigned char * ??

    David

  3. Ratish Punnoose says:

    David,
    That is an interesting thought.
    I’m not sure what the C standard requires.

    But regarding this statement,
    “Now you COULD take the view that when the code takes its address, the compiler should harrumph and copy the value to memory”

    I think the compiler already does this in certain cases.
    For the following example:
    ———
    void func(unsigned char ch)
    {
        f2(&ch);
    }
    ——-
    I think the compiler has to copy the value to memory in this case.
    If not, I can see this breaking a lot of existing code.

    But these days, I find myself doubting what the C standard requires vs what is common practice.

    Ratish
    (I posted the initial example that stirred up the discussion)

    • Lundin says:

      I ran the original code on the compiler I’m currently using (Codewarrior for Freescale HC12) where the calling convention is such that the parameter is passed through CPU registers rather than the stack. But when entering the function, the compiler pushed the CPU register where the parameter resided onto the stack. I think it did that solely because a CPU register has no address that could be taken.

      The parameter will of course still instantly go out of scope and can be optimized away, but since it was passed in a CPU register there was no relation between the parameter and the stack pointer before the compiler pushed it onto the stack. Therefore they couldn’t simply set it to point at the stack pointer, as in the original example. Had the calling convention been such that the parameter is always passed on the stack, then it would have been a different story.

      To sum this whole thing up… the only thing you can be sure of after calling that function is that the pointer is pointing at a bad location after execution. It doesn’t really matter if it points to the wrong part of the stack or at a variable that has gone out of scope. The program will crash horribly in both cases.

      —-

      void func(unsigned char ch)
      {
          f2(&ch);
      }

      In that example there is no way that the code could be optimized. The compiler will clearly have to pass the address of a local object, no matter what f2() does with it. Making the parameter volatile in the original example would have the same effect.
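
      To illustrate the point about f2(), here is a small sketch in which f2() actually reads through the pointer it is given. The body of f2 and the variable seen are assumptions made for the example; the thread never shows them.

      ```c
      static unsigned char seen;   /* hypothetical: records what f2 observed */

      static void f2(unsigned char *p)
      {
          seen = *p;               /* reads the parameter through its address */
      }

      void func(unsigned char ch)
      {
          f2(&ch);                 /* ch must behave as an addressable object here */
      }
      ```

      Whatever the compiler does internally (it may even inline f2), the value read through the pointer has to be the argument that was passed in.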

  4. OK R.

    So suppose you correctly redefine the thing you store to as a pointer – does the compiler then correct its code?

    • Ratish Punnoose says:

      David, It still doesn’t.
      For example
      —————–
      unsigned char volatile * volatile v3; // volatile pointer to volatile
      void iar_buggy_func3(unsigned char ch)
      {
          v3 = &ch;
      }
      —————
      This doesn’t force the copy to stack memory.

  5. Lundin says:

    How about

    unsigned char volatile * volatile v4;
    void iar_buggy_func4(volatile unsigned char ch)
    {
        v4 = &ch;
    }

    Again, it doesn’t actually matter if the -global- is volatile or not, as long as it is used by the program elsewhere. Though in the above example the global needs to be volatile, or the volatile qualifier will disappear in the typecast, i.e. “volatile correctness”.

    • Ratish Punnoose says:

      For the following function, it still does not store the value of ch on the stack.
      —————————
      unsigned char volatile * volatile v4;
      void func4(volatile unsigned char ch)
      {
          v4 = &ch;
      }
      ————————–

      Ratish

      • Lundin says:

        That could possibly be considered a compiler bug then. You have three scenarios:

        Calling convention = the caller stores the parameter on the stack.
        If that is the case, then the code is working as expected.

        Calling convention = the function keeps the parameter in a register.
        If that is the case, I think it is a compiler bug. There is no connection between a CPU register and the stack pointer. v4 would be pointing at memory completely unrelated to the parameter, regardless of volatile qualifiers. Unless of course, the parameter was stacked before the address copy.

        Calling convention = the function stores the parameter on the stack.
        If that is the case, then I’m not sure. If the function is supposed to stack the parameter, then “ch” must be accessed when doing so, and an access to a volatile variable mustn’t be optimized away. Although calling convention is not mentioned in ISO C, so it isn’t obvious whether this must be done or not from the C standard’s point of view. But it would be odd if the internal workings of the compiler treated an implicit volatile access differently than an explicit one. I.e. it could still be a bug, even if the result conforms to ISO C.

        The “address of” operator makes ch an operand in an expression, but I don’t think it is to be considered an access of the stored value and thus wouldn’t cause any side-effects, so it can be optimized away.

        At any rate one would think that v4 should point at SP+1 rather than SP. This would possibly be a compiler bug as well, unrelated to the C standard. Imagine that the function was written with the purpose of getting the address of the next available byte on the stack, for example when checking for stack overflow from C code.

        • Ratish Punnoose says:

          In this case, the calling convention was 2) keep the parameter in a register.

          The stack pointer assignment was the equivalent of SP++; so it did correctly point to the correct place
          in memory where “ch” would have been stored.

          Ratish

  6. lundin

    R’s original code did much more around the problem statement. basically it was a routine to DMA 1 char to a USART, and the volatile location in question was a hardware pointer in the DMA engine to that char. ( you may ask yourself “elephant and nut?”, but there we are 🙂 )

    if all optimisation is turned off, then a hole is made on the stack, ch is put there, and all works perfectly.

    However, by turning on optimisation, you get the weird situation that while space is still allocated on the stack (and in part of its mind the compiler still thinks of that as the “home” of ch, coz that’s where it points the pointer), the store to that location can be “clearly seen” by the optimiser to be unnecessary, and it is omitted.

    As far as I can understand, what the compiler writers are saying is “we have optimised this code as if all things referred to in it are memory locations – in that situation, the code would perform every function you had asked of it.

    In particular you point a pointer at something which is a local and evanescent byte on my routine stack, and then do nothing with it. I’m entitled to optimise that out.

    But now you tell me that BEYOND SIMPLY BEING ‘VOLATILE’ these are not memory locations, they are bits of some vast machine which “does things beyond the knowledge of C”. In that case all bets are off – these are side-effects, and if you turn on the slightest optimisation, you will not get the effect you might have expected”

    Phew….

    • Nigel Jones says:

      As the comments demonstrate, this is a very interesting case. There’s never a dull moment in the land of embedded systems!

    • Lundin says:

      If we are to be engineers for a moment, and not programming language nerds, we’d be pragmatic. Neither the C language nor the compiler know or care about DMA. The only sane solution to the problem is to make a work-around for this fact. And while doing so, we’ll find the optimal solution to the problem:

      unsigned char volatile * volatile v;
      void func(unsigned char *ch)
      {
          v = ch;
      }

      This code is better, because it doesn’t concern itself with the data, while it only ever wanted the address. It doesn’t care where the data is stored and it makes no needless hardcopy of the data. So it is slightly more efficient than the original version. The compiler can optimize where the address is stored at, but we need not care about that.
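
      A possible caller for this version, as a sketch: the byte lives in a static buffer, so the address handed over stays valid after the call returns. tx_byte and queue_byte are hypothetical names, and v is still just a variable standing in for the hardware pointer.

      ```c
      unsigned char volatile * volatile v;

      void func(unsigned char *ch)
      {
          v = ch;
      }

      static unsigned char tx_byte;   /* hypothetical buffer with static lifetime */

      void queue_byte(unsigned char c)
      {
          tx_byte = c;                /* store the data first */
          func(&tx_byte);             /* then publish its address */
      }
      ```

      On real hardware the buffer would probably need to be volatile as well (with func taking a pointer to volatile), so that the compiler cannot move the data store past the pointer write.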

      • Ratish Punnoose says:

        Certainly this is better and would work. I was just puzzled by the C compiler behavior to the original problem.
        Ratish

        • I think the purist’s view boils down to “if this runs on a virtual machine then it would fulfil all the externally visible effects without putting that value in memory”

          But I still think it would have been within the compiler’s ability to produce a warning that it had optimised your code away in a suspicious manner.

          In the end, I can’t see why it didn’t boil the routine down to empty 🙂

          D

  7. I have asked IAR on the MSP430 group whether it wouldn’t be reasonable, for optimisation purposes, to treat the setting of a volatile pointer to point at something, as the “functional equivalent” of reading it.

    actually it would have to be the functional equivalent of “this variable cannot be stored in a register” too…. Hmmmm….

  8. GroovyD says:

    i have always learned not to rely on any function variables outside of the scope of the function. to me this usage was immediately obvious as wrong. so much so that i worry whoever has doubt must have numerous code problems perhaps yet to be discovered.

    • The code example was cut down – all the “action” in fact took place within the routine, and the routine looped till it was over – so your fear of “access outside the scope” was not the real issue here.
      It wasn’t working “as hoped” even within the scope.

  9. Lundin says:

    CPU hardware registers must be declared at file scope for linker- and access time reasons. So must variables communicating with ISRs and static member variables (private encapsulation). So it is pretty much impossible to write embedded programs without file scope variables.

    • GroovyD says:

      Not sure what you mean here…

      There is indeed no requirement of file scope for hardware registers, nor is it required for file scope variables to ‘communicate’ with ISRs (not sure if this even makes sense to say). If what you meant is that most ISRs need to modify global variables that other functions also access I would at least understand what you are saying.

      This is about non-static variables within a function not static variables within a file.

      • Lundin says:

        Ah ok now I understand what you mean. Agreed, the original code looks very strange and may not even make sense.

        • OK – just to get picky, I don’t believe the compiler is obliged to store even global values in memory

          so

          #include <stdio.h>

          int main(void)
          {
              int x = 7;
              printf("hullo world %d", x);
              return 0;
          }
          would not be under any obligation to store the value 7 in memory, if it could fulfil the program’s objectives without doing so ???

          D

          • Lundin says:

            Nope, but then “x” isn’t a global variable. If you rewrite the code to

            int x=7;
            int main()

            then the compiler will most likely store it in RAM, in the area for “statically allocated variables”. It may or may not be allocated on the stack depending on system, but it probably won’t be allocated in CPU registers or program memory.

            This is because of a requirement in the C standard stating that all file scope variables must be initialized just as if they were static variables. And for static variables, there is a requirement that they -must- be initialized to zero, unless explicitly initialized to something else. The number 7 in this case.

            Because of this requirement, all such globals and statics must be initialized, making it hard for the compiler to perform optimizations. If you hadn’t initialized the variable, the compiler would be forced to print the value zero instead, and not some random junk.

            So if you have, let’s say, 100 bytes of statics/globals, they would all have to be initialized by a generic algorithm in the compiler. I imagine it would be both tricky and inefficient to write such code if some of the variables are optimized out and stored elsewhere. I.e. the compiler wouldn’t be able to memcpy() a whole chunk of init data into a block of memory where all statics are stored in adjacent cells. What they would gain from saving RAM, they’d lose in execution speed.

            Theoretically I think the compiler could perform an optimization, but I doubt it will for practical reasons. I’m just guessing here, I don’t actually know how/if compilers perform such optimizations. It would be interesting to hear how the compiler vendors reason about this.
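
            The zero-initialization rule mentioned above can be checked with a small sketch. The variable and function names are made up for the example.

            ```c
            int uninit_global;          /* file scope, no initializer: starts at 0 */
            int init_global = 7;        /* file scope, explicit initializer */

            int count_calls(void)
            {
                static int calls;       /* static local, no initializer: starts at 0 */
                return ++calls;
            }
            ```

            The rule only fixes the values the program observes. Where, or whether, the compiler keeps these objects in RAM is up to the implementation.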
