embedded software boot camp

Effective C Tip #6 – Creating a flags variable

Thursday, October 1st, 2009 by Nigel Jones

This is the sixth in a series of tips on writing effective C. Today I’m going to address the topic of creating what I call a flags variable. A flags variable is nothing more than an integral type that I wish to treat as an array of bits, where each bit represents a flag or boolean value. I find these particularly valuable in three situations:

  1. When the CPU has part of its address space that is much faster to access than other regions. Examples are the zero page on 6805 type processors, and the lower 256 bytes of RAM on AVR processors. Depending upon your compiler, you may also want to do this with the bit addressable RAM region of the 8051.
  2. When I’m running short on RAM and thus assigning an entire byte or integer to store a single boolean flag is a waste I can’t afford.
  3. When I have a number of related flags where it just makes sense to group them together.

The basic approach is to use bitfields. Now I’m not a huge fan of bitfields – particularly when someone tries to use them to map onto hardware registers. However, for this application they work very well. As usual however, the devil is in the details. To show you what I mean, I’ll first show you a typical implementation of mine, and then explain what I’m doing and why.

typedef union
{
 uint8_t     all_flags;      /* Allows us to refer to the flags 'en masse' */
 struct
 {
  uint8_t foo : 1,        /* Explanation of foo */
          bar : 1,        /* Explanation of bar */
          spare5 : 1,     /* Unused */
          spare4 : 1,     /* Unused */
          spare3 : 1,     /* Unused */
          spare2 : 1,     /* Unused */
          spare1 : 1,     /* Unused */
          spare0 : 1;     /* Unused */
 };
} EX_FLAGS;

static EX_FLAGS    Flags;  /* Allocation for the Flags */

Flags.all_flags = 0U; /* Clear all flags */

...

Flags.bar = 1U; /* Set the bar flag */

There are several things to note here.

Use of a union
The first thing to note is that I have used a union of an integral type (uint8_t) and a structure of bitfields. This allows me to access all the flags ‘en masse’. This is particularly useful for clearing all the flags as shown in the example code. Note that our friends at MISRA disallow unions. However, in my opinion, this is a decent example of where they make for better code – except see the caveat below.

Use of integral type
Standard C guarantees only int, signed int and unsigned int (plus _Bool from C99 onwards) as the base type of a bitfield; support for any other type is implementation-defined. However, many embedded systems compilers allow you to use any integral type as the base type for a bitfield. This is particularly valuable on 8-bit processors. In this case I have taken advantage of the language extension to use a uint8_t type.

Use of an anonymous structure
You will note that the bitfield structure is unnamed, and as such is an anonymous structure. Anonymous structures are not part of C99 (they were only standardized later, in C11), and C++ standardizes only anonymous unions. However, many C compilers support this construct and so I use it as I feel it makes the underlying code a lot easier to read.
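
For readers whose compiler rejects either extension, a more conservative variant is possible. The sketch below is an assumption on my part and not the article's own code: it uses unsigned int as the bitfield base type and gives the inner structure a name (bits, hypothetical). The cost is that the union may grow to the size of an unsigned int, so all_flags is widened to match, and every access needs the extra .bits qualifier.

```c
#include <stdbool.h>

/* Hypothetical standard-C variant of EX_FLAGS:
 * unsigned int base type and a named inner structure. */
typedef union
{
    unsigned int all_flags;         /* Refers to the flags 'en masse' */
    struct
    {
        unsigned int foo    : 1;    /* Explanation of foo */
        unsigned int bar    : 1;    /* Explanation of bar */
        unsigned int spare5 : 1;    /* Unused */
        unsigned int spare4 : 1;    /* Unused */
        unsigned int spare3 : 1;    /* Unused */
        unsigned int spare2 : 1;    /* Unused */
        unsigned int spare1 : 1;    /* Unused */
        unsigned int spare0 : 1;    /* Unused */
    } bits;
} EX_FLAGS_STD;

static EX_FLAGS_STD StdFlags;       /* Allocation for the flags */

/* Clear every flag through the 'en masse' member. */
void std_flags_clear_all(void)
{
    StdFlags.all_flags = 0U;
}

/* Set the bar flag through the named structure. */
void std_flags_set_bar(void)
{
    StdFlags.bits.bar = 1U;
}

/* Report whether the bar flag is currently set. */
bool std_flags_bar_is_set(void)
{
    return StdFlags.bits.bar != 0U;
}
```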

Naming of unused flags
If you look at the way I have named the unused flags, it looks a little odd. That is, the first unused flag is spare5, the next spare4 and so on down to spare0. Now I rarely do things on a whim, and this is no exception. So why do I do it this way? Well, there are two reasons:

  1. When I first create the structure, I label all the flags, starting from spare7 down to spare0. This inherently ensures that I name precisely the correct number of flags in the structure. To see why this is useful, take the above code and allocate an extra flag in the bitfield structure. Then compile and see if you get a compilation error or warning. Whether you will or not depends upon whether your compiler allows bitfields to cross the storage unit boundary. If it does, then your compiler will allocate two bytes, and the all_flags member of the union will not cover all of the flags. This can come as a nasty surprise (and perhaps explains why MISRA is wary of unions). You can prevent this from happening by naming the flags as shown.
  2. When it becomes necessary to allocate a new flag, I simply replace the topmost unused flag (in this example that would be spare5) with its new name, e.g. zap. The remainder of the structure is unchanged. If instead I had named the topmost unused flag ‘spare0’, the next ‘spare1’ and so on, then the code would give a completely misleading picture of how many spare bits are left for future use after I had taken one of the unused flags.
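
The size mismatch described in the first point can also be caught at compile time. The following is a sketch, assuming a C11 compiler (for _Static_assert) that also accepts the two extensions used in the example; the check is an addition for illustration and not part of the original tip.

```c
#include <stdint.h>

typedef union
{
    uint8_t all_flags;          /* Allows us to refer to the flags 'en masse' */
    struct
    {
        uint8_t foo : 1,        /* Explanation of foo */
                bar : 1,        /* Explanation of bar */
                spare5 : 1,     /* Unused */
                spare4 : 1,     /* Unused */
                spare3 : 1,     /* Unused */
                spare2 : 1,     /* Unused */
                spare1 : 1,     /* Unused */
                spare0 : 1;     /* Unused */
    };
} EX_FLAGS;

/* Compilation fails if an extra flag makes the bitfields spill into a
 * second byte, so that all_flags no longer covers every flag. */
_Static_assert(sizeof(EX_FLAGS) == sizeof(uint8_t),
               "EX_FLAGS bitfields no longer fit in all_flags");
```

On compilers that refuse to let a bitfield cross the storage unit boundary, the extra flag is already a compile error; the assertion covers the compilers that silently allocate a second byte.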

If you look at what I have done here, it’s interesting to note that I have relied upon two extensions to standard C (which violates the MISRA requirement for no use of compiler extensions) and I have also violated a third MISRA tenet via the use of a union. I would not be surprised if I’ve also violated a few other rules as well. Now I don’t do these things lightly, and so I only use this construct when I see real benefit in doing so. I’ll leave it for another day to discuss my overall philosophy regarding adherence to the MISRA guidelines. It is of course up to you the reader to make the determination as to whether this is indeed effective C.

20 Responses to “Effective C Tip #6 – Creating a flags variable”

  1. Uhmmmm says:

    One problem I recall with C bitfields is that the order of the bits is unspecified. That causes problems when trying to map onto a hardware register, and it causes problems if you use multiple compilers (for a library and a calling application for example)

  2. Nigel Jones says:

    Couldn't agree more. I never use bitfields to map onto hardware for this very reason. I've noticed that some compiler vendors are getting wise to this problem, and allow you to specify as a compilation switch, in which order the bits should be assigned. Nevertheless until this is mandated, it's safest to just eschew using bitfields to map onto HW.

    • Ashleigh says:

      I used bitfields for mapping to fields in communication protocols. Naughty, I hear you say. Only if you don’t know what your compiler will do. Check the generated code, very carefully.

      Using things like a 3-bit named quantity TENDS in the long run to make your code far more readable and less error prone than explicitly writing something like (buffer[x] & 0x38)>>3.

      When you find yourself doing the above more than about twice, there needs to be a better way. Sometimes an objecty-sort of approach (getter / setter) is the way to go but this can be horribly inefficient for run time speed if there is a function call overhead associated. A bit-fielded structure definition solves the problem. Just so long as you know what the compiler will do, document that in comments, and use the option in your makefile to force the compiler to always allocate fielded structures the way you wanted.
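
The getter / setter approach mentioned here need not cost a function call if the accessors can be inlined. A minimal sketch, assuming a hypothetical 3-bit field occupying bits 3 to 5 of a protocol byte (the field position, the mask and the names are illustrative, not taken from any real protocol):

```c
#include <stdint.h>

#define MODE_SHIFT  3U
#define MODE_MASK   (0x07U << MODE_SHIFT)   /* 0x38: bits 3..5 */

/* Extract the 3-bit mode field from a protocol byte. */
static inline uint8_t mode_get(uint8_t byte)
{
    return (uint8_t)((byte & MODE_MASK) >> MODE_SHIFT);
}

/* Return a copy of 'byte' with the mode field replaced by 'mode'.
 * Bits of 'mode' above the low three are discarded. */
static inline uint8_t mode_set(uint8_t byte, uint8_t mode)
{
    return (uint8_t)((byte & ~MODE_MASK) |
                     (((unsigned int)mode << MODE_SHIFT) & MODE_MASK));
}
```

Unlike a bitfield, the position of the field is fixed by the mask and shift, so the layout does not depend on the compiler.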

  3. Alan Bowens says:

    Having been bitten by bitfield-handling differences when porting a microcontroller design to a new silicon/compiler combination, I would be extremely wary of this approach. In this particular design the engineer concerned had used bitfields to set/clear flag bits in status bytes reported to a host microprocessor. And of course when the design was ported as a 'drop-in' replacement onto new silicon, the system communications stopped working as a result. I guess this is another case (along with the hardware registers mentioned above) where bitfields probably shouldn't be used at all.

    I have to agree that the bitfield approach results in very readable code, but the expense is that you're at the mercy of your compiler vendor. As K&R points out, "Almost everything about fields is implementation-dependent".

    To expand on one point Nigel mentions, if you're using the IAR compiler on AVRs, you can declare flag bytes as __regvars to put them into system registers at the bottom of RAM. As he points out, this results in single opcode accesses to set/clear individual bits, giving you smaller and faster code.

  4. Nigel Jones says:

    Nicely put Alan. This is definitely one of those areas where you don't get something for free.

  5. Anand says:

    Hi Nigel, Excellent post that. I have myself used bitfields exactly as you have mentioned, however I admit I have been bitten by porting issues, in particular by the ordering of the flag bits. Hence as the old saying goes – once bitten twice…cautious, I am more careful with this particular construct. But the material is very lucid in its content and I wish I had read your blog when I was learning about bitfields.

    However I am writing to ask your opinion about MISRA C compliance. This is the first time a client company has asked us for MISRA C compliance and I am not quite sure where to start. I started reading the guidelines from the web, however I soon realised that it's an enormous task and I would never get through all of them. So what I would like to know from you is how realistic is it to expect to comply with all the guidelines? Since my compiled code size is less than 2KB, should we even bother with MISRA C compliance?

    Your valuable insights would be highly appreciated.

    Warm Regards
    Anand

  6. Nigel Jones says:

    Anand: You ask some very profound questions, the answers to which are better given in a blog posting, rather than the comments section. I will endeavor to post something within the next day or so. Thanks for commenting.

  7. Mans says:

    While your intentions are good, I would advise against using bitfields. Many compilers (e.g. gcc) generate very poor code for accessing them, probably negating anything gained from the denser packing.

    Instead, I suggest using a variable (or array) of a standard integral type, and defining a few macros to access individual bits. If more than 8 flags are needed, use a type matching the native word size or, if accesses are scattered, the memory bus width if smaller than the word size.

    GCC will reload the same memory each time a different flag is requested, even if it is already held in a register.
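
A minimal sketch of the macro approach suggested here, under the assumption of an 8-bit flags byte. The macro and flag names (FLAG_SET, FLAG_FOO and so on) are illustrative and do not come from the comment:

```c
#include <stdint.h>

/* One mask per flag; the bit positions are chosen by the programmer. */
#define FLAG_FOO  (1U << 0)
#define FLAG_BAR  (1U << 1)

/* Set, clear and test operate on any uint8_t lvalue. */
#define FLAG_SET(var, mask)    ((var) = (uint8_t)((var) | (mask)))
#define FLAG_CLEAR(var, mask)  ((var) = (uint8_t)((var) & ~(mask)))
#define FLAG_TEST(var, mask)   (((var) & (mask)) != 0U)

static uint8_t flags;   /* static storage, so all flags start clear */

/* Example use: clear bar whenever foo is set. */
void example_update(void)
{
    if (FLAG_TEST(flags, FLAG_FOO))
    {
        FLAG_CLEAR(flags, FLAG_BAR);
    }
}
```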

    • David Brown says:

      gcc comes in many versions, for many targets. For some targets, especially using older versions of gcc, the generated code is poor. In most cases with modern versions, the generated code is good or optimal. The same applies to many commercial compilers.

      Certainly if you are using flag registers like this to get smaller or faster code (or use less RAM space), then it’s important to test it out and check the generated assembly from your compiler. If you are unsure about the bit ordering, then again this is something to check.

      In the end, it’s a style choice. The most important factor is that it produces correct code, and that it is easy to read and understand. If it produces good code from your compiler, then that’s useful too!

      • Ashleigh says:

        Always check the generated code.

        Usually when using fields like this for flags the code will be larger than just accessing a plain boolean in a byte. However, if you have the common problem on a micro of plenty of flash and not enough RAM then it’s a case of beggars not being choosers. A little extra flash to fiddle the bits is nothing if it buys you back the RAM you need.

  8. Lundin says:

    Nigel’s advice is almost always wise and sound, but this tip is not. There are so many problems in that code, I don’t know where to start…

    The most blatant one is indeed that the code will not compile under a standard C compiler. As Nigel points out, bit fields can only be of type int, signed int or unsigned int. Sure, there are many many non-standard compilers out there. Just because they allow some inane non-standard syntax doesn’t mean it should be used. Good code == portable code, especially in embedded systems.

    The next mistake is to use structs/unions for bit mapping. According to the standard, the compiler may add any number of padding bytes into a struct/union, and bit fields are no exception to this rule. This is the main reason MISRA bans unions. So all we know about that union is that its size >= 1 byte. If you use an array of such struct/unions, you are asking for trouble.

    The royal mistake is to use bit fields in the first place. The following is not specified by the standard:

    – whether bit field int is treated as signed or unsigned int
    – the bit order (lsb is where?)
    – whether the bit field can reach past the memory alignment of the CPU or not
    – the alignment of non-bit field members of the struct
    – the memory alignment of bit fields (to the left or to the right?)
    – the endianness of bit fields larger than one byte
    – whether plain int values assigned to them are interpreted as signed or unsigned
    – how bit fields are promoted implicitly by the integer promotions
    – whether signed bit fields are one’s complement or two’s complement
    – padding bytes
    – padding bits
    – values of padding bits.
    – and so on.

    The only thing you know when declaring a bit field is that you have an unportable chunk of random data of a random size, that the program may use in random ways.

    Flags.bar = 1U; /* Set the bar flag */

    If we ignore the fact that 8-bit bit fields aren’t even valid C, here are just a few examples of what result this line of code will yield:

    0x02
    0x40
    0x0002
    0x4000
    0x0200
    0x0040
    0x00000002
    0x40000000
    0x02000000
    0x40000000
    0x02ABCDEF
    0x40ABCDEF

    Most important of all, bit fields will never be more efficient than plain bit-wise operators, but can be less efficient in many ways. Though of course, you have to write a PhD thesis about bit fields before you can actually tell what code they will yield, since they are so incredibly poorly defined.
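
Which of these values a given toolchain actually produces can be checked empirically rather than guessed. A small probe along the following lines, assuming the compiler accepts the EX_FLAGS definition from the article (with both extensions), reports where bar lands. The function name probe_bar_mask is hypothetical:

```c
#include <stdint.h>

typedef union
{
    uint8_t all_flags;
    struct
    {
        uint8_t foo : 1,
                bar : 1,
                spare5 : 1,
                spare4 : 1,
                spare3 : 1,
                spare2 : 1,
                spare1 : 1,
                spare0 : 1;
    };
} EX_FLAGS;

/* Returns the value of all_flags when only bar is set, i.e. the bit mask
 * that this particular compiler assigns to bar. */
uint8_t probe_bar_mask(void)
{
    EX_FLAGS f;

    f.all_flags = 0U;   /* start from a known state */
    f.bar = 1U;         /* set exactly one flag */
    return f.all_flags; /* read it back through the integral member */
}
```

Compilers that allocate bitfields from the least significant bit return 0x02 here, and those that allocate from the most significant bit return 0x40. The result holds only for the compiler and options it was measured with.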

    Don’t use bit fields ever.

    • Nigel Jones says:

      Thanks for the input Daniel. I think the issue you raised about portability is an interesting one. I haven’t opined on portability much in this blog, so perhaps it’s time I did. If portability is not an issue, how do you compare the readability of bit fields with bit masking in this application?

      • Lundin says:

        Perhaps needless to say, this is one of my pet peeves 🙂

        How can something be readable when you have no idea where the bits end up in the memory cells? Assuming that the reader doesn’t have in-depth knowledge of the specific compiler’s bit field handling.

        Flags.bar = 1U;

        From this I can tell that “bar” is given the value 1. I have no idea whatsoever where “bar” is allocated, or how much space that is allocated by the structure it resides in. That might be sufficient for the average PC programmer, but not for them picky embedded folks who want to know exactly where all ones and zeroes end up.

        But the real problems begin when I do like this:

        uint8_t x = Flags.bar;

        There is just no way I can know what value the uint8 has after that line, unless I read the compiler front-end documents. And what about this:

        uint8_t value;
        uint8_t mask;

        if((value & Flags.bar) == mask)

        Even if “Flags” was allocated as an ISO C bit field, I doubt many C programmers can tell you straight away what this code actually results in, because of the implicit integer promotion madness that the compiler will conjure. I cannot, not without scratching my head for at least half an hour and reading up on both the C standard and the compiler docs.

        If I were using the bitwise operators instead, nothing would be ambiguous, and I wouldn’t need to use the struct notation either. Then I could read/write the same code in one minute and the code would be completely deterministic under any C compiler for any system.
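
One of the pitfalls listed earlier is easy to demonstrate: whether a plain int bitfield is signed is implementation-defined, so a 1-bit flag declared as int may read back as -1 rather than 1. A sketch, with illustrative struct and function names:

```c
struct plain_flags
{
    int          maybe_signed    : 1;   /* signedness is implementation-defined */
    unsigned int always_unsigned : 1;   /* always holds 0 or 1 */
};

/* Stores 'on' (0 or 1) in the plain int field and returns what reads back. */
int read_back_plain(int on)
{
    struct plain_flags f = {0, 0};

    f.maybe_signed = on;
    return f.maybe_signed;
}

/* Same round trip through the unsigned field. */
unsigned int read_back_unsigned(unsigned int on)
{
    struct plain_flags f = {0, 0};

    f.always_unsigned = on & 1U;
    return f.always_unsigned;
}
```

A test such as if (f.maybe_signed) behaves the same either way, but a comparison against 1 does not, which is one reason to declare flag bitfields with an unsigned type.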

        • Nigel Jones says:

          I think where we differ is on the use of flags. When I use a flags variable as described here, all I’m interested in doing is grouping a number of related boolean values. All I ever do is set / clear a flag and test it for true / false. I suppose there has been the odd time in which I have set one flag equal to another. For example:
          flags.foo = flags.bar;
          However, in this case I don’t think there is much ambiguity that flags.foo will end up with the value of flags.bar.

          Anyway, if this is how one uses flags variables, then it doesn’t matter where the compiler places them in memory. What matters to me is that I can now write very clear code. For example:
          if (flags.foo)
          {
          flags.bar = 0;
          }

          Doing the same with bit masks and macros, the above code tends to look something like this:
          if (flags & BIT(3))
          {
          flags &= ~BIT(6);
          }
          Clearly one can define some intermediate macros. For example:
          #define FOO BIT(3)
          #define BAR BIT(6)
          if (flags & FOO)
          {
          flags &= ~BAR;
          }
          However IMHO it still isn’t as clear as the bit field representation. I’d also note that for code like this to be reliable, one has to be very careful about the definition of the BIT() macro. There’s also the issue of the underlying type of the flags variable. For example:
          uint8_t flags;

          flags |= BIT(9);
          Good compilers will warn you that this operation is useless. Not so good compilers will happily let you do this. Of course Lint will catch it 🙂
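
As an illustration of the care the BIT() macro needs, here is one possible definition. It is a sketch, not the definition used in any particular code base. It parenthesizes its argument, shifts an unsigned constant to stay clear of signed-shift problems, and is paired with a compile-time check (C11 _Static_assert, an assumption about the toolchain) that each named flag fits in the 8-bit variable:

```c
#include <stdint.h>

#define BIT(n)  (1UL << (n))

#define FOO  BIT(3)
#define BAR  BIT(6)

/* Rejects at compile time a flag such as BIT(9) that cannot fit in a uint8_t. */
_Static_assert((FOO <= UINT8_MAX) && (BAR <= UINT8_MAX),
               "flag does not fit in the uint8_t flags variable");

static uint8_t flags;

/* Clear BAR whenever FOO is set. */
void clear_bar_if_foo(void)
{
    if ((flags & FOO) != 0U)
    {
        flags &= (uint8_t)~BAR;
    }
}
```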

          • Lundin says:

            Ok readability. First of all, I don’t buy the argument that “the C language’s syntax is weird, so therefore we must make a work-around”. One must assume that the code is read by a competent C programmer, one who doesn’t raise a brow when they see flags &= ~BAR. If someone reads my code and thinks it is confusing because it is using the extremely common bitwise operators, they shouldn’t be programming embedded systems.

            As for readability of bit fields:
            The whole point of using flags on bit level, rather than “byte flags” (typedef uint8_t BOOL), is that you want to save memory space. Since we have this requirement to save memory space for the application, we can safely assume that there are a lot of flags present (8 or more), or we wouldn’t use bit fields in the first place.

            Then how do you check all flags in a loop? A very common task in my experience. With bitwise operators this is very straight-forward:

            /* “flags” is a pure integer */
            for(mask=0x01; mask!=0x00; mask<<=1)
            {
            if((flags & mask) > 0)
            {
            do_something();
            }
            }

            This loop could easily be modified by some memory guru to optimize for cache memory access etc. It has the potential to become extremely efficient without sacrificing readability.
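
Fleshed out into a self-contained form, the loop might look like the sketch below. It assumes a 32-bit flags word and replaces do_something() with a counter so that the result can be checked. Both choices are made here for illustration only:

```c
#include <stdint.h>

/* Visits every bit of 'flags' and counts those that are set. */
unsigned int count_set_flags(uint32_t flags)
{
    unsigned int count = 0U;
    uint32_t mask;

    /* mask walks from bit 0 up to bit 31; shifting the top bit out
     * leaves zero, which ends the loop. */
    for (mask = 0x01U; mask != 0x00U; mask <<= 1)
    {
        if ((flags & mask) != 0U)
        {
            count++;
        }
    }
    return count;
}
```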

            With C bit fields, you -cannot- use a loop like that, since you don’t know where the bits are allocated. You -must- do like this:

            /* “flags” is a C bit field */
            BOOL do_it = FALSE;

            switch(i)
            {
            case SOME_CONSTANT:
            if(flags.foo > 0)
            {
            do_it = TRUE;
            }
            break;

            case SOME_OTHER_CONSTANT:
            if(flags.bar > 0)
            {
            do_it = TRUE;
            }
            break;


            /* 30 more case statements here to check a uint32 */

            default:
            break;
            }

            if(do_it)
            {
            do_something();
            }

            And now I’d like to hear exactly how a vast amount of case statements is more readable than a 3 line loop! 🙂 Not to mention that code size and execution speed is far worse with the switch statement.

            Still, the real argument here is that bit fields come with countless amounts of pit falls. Introducing them to your program is to introduce a great bug potential, while at the same time throwing all portability away.

  9. Lundin says:

    To back up the arguments provided, the C safety pioneer/guru Prof. Les Hatton recommends that bit fields should be banned entirely upon implementing a safer subset of the C language. For those who don’t know who he is, his work in the 90s had heavy influence on most “de facto” safety standards such as MISRA-C. All his work is based on actual scientific research.

    Hatton, Les, Safer C: developing software for high-integrity and safety-critical systems. ISBN 0-07-707640-0.

  10. Ashleigh says:

    I’m afraid I have to agree with both Nigel and Lundin on this one.

    I used bit fielded named flags for the same reason as Nigel – when it makes sense to do so and it saves me space. If I want a list of bits for checking, I use an INT16 size value (with comments that it’s bit fielded) and use masks and loops.

    It’s horses for courses.

    If you have ever done any Ada you will find bit fields don’t exist for all the reasons of evilness cited, and this turns out to be a major pain in the neck. I spent a few happy hours writing my own Ada83 bit manipulation package because using things like arrays of booleans turns out to be hideous, as well as slow.

    Moral of the story: There is no one right solution which will solve every problem known to man.

  11. Anonymous says:

    I’m astonished how many comments ban the use of bitfields.
    On embedded systems, memory is often an issue and I see little reason to waste at least one byte for storing a flag. And my code has many more flags than scalar variables.
    Yes, you never know if your bits are filled from MSB or from LSB.
    Almost always, it doesn’t matter.
    Data is more compact and needs less load/store operations, and embedded processors often have good support for bitwise access (set bit, clear bit, test bit).

    But on one point I disagree with Nigel: the use of unions is not necessary. To reset the whole flag variable, just assign a constant:
    static const EX_FLAGS zero_exflags = {0,0,0,0,0,0,0,0};
    Flags = zero_exflags; /* Clear all flags */

    Is just as good as the all_flags access. OK, one drawback: Most compilers will create a zero constant in ROM, and copy it to the variable (instead of load 0; store to variable) – one extra assembler instruction. But it’s worth avoiding the union.

    Indeed, instead of that I would rather not zero all at once, but zero one bit after another. The reason: “Where could this flag possibly be set/reset?” is often a question during debugging, and then I would find them instantly. But unfortunately, all compilers I met indeed clear one bit after the other, instead of masking, which is then too much waste. So you have to look for all-variable access too.

    • Nigel Jones says:

      I hadn’t thought about using the constant method you described. I was going to criticize it on maintenance grounds in that when the size of EX_FLAGS changes you’d have to change the initialization list. However it occurred to me that under the C initialization rules, I believe that static const EX_FLAGS zero_exflags = {0}; is sufficient to initialize all elements to zero, regardless of the number of elements. Thus a very nice idea.
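
A sketch of the union-free variant discussed in this exchange, assuming EX_FLAGS is redefined as a plain structure of unsigned int bitfields (an assumption made here, since the comment does not show its definition). The {0} initializer zeroes the first member explicitly and the remaining members implicitly:

```c
typedef struct
{
    unsigned int foo    : 1;    /* Explanation of foo */
    unsigned int bar    : 1;    /* Explanation of bar */
    unsigned int spare5 : 1;    /* Unused */
    unsigned int spare4 : 1;    /* Unused */
    unsigned int spare3 : 1;    /* Unused */
    unsigned int spare2 : 1;    /* Unused */
    unsigned int spare1 : 1;    /* Unused */
    unsigned int spare0 : 1;    /* Unused */
} EX_FLAGS;

/* All members zero, however many flags the structure ends up holding. */
static const EX_FLAGS zero_exflags = {0};

static EX_FLAGS Flags;          /* Allocation for the flags */

/* Clear all flags by structure assignment, with no union involved. */
void flags_clear_all(void)
{
    Flags = zero_exflags;
}
```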

  12. Milorad says:

    As we are talking about embedded here, I’d like to add a very nasty issue with this approach (for that matter, a bit field in a variable is no different): changing a flag is not atomic by default. In multithreaded applications, changing bits from different threads can cause problems unless access to that variable (or union or structure) is made atomic.
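
Where a C11 toolchain with <stdatomic.h> is available (an assumption; many small-micro compilers do not provide it, and briefly disabling interrupts around the update is the traditional alternative), read-modify-write updates of a mask-based flags word can be made atomic. A sketch with illustrative names:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_FOO  (1U << 0)
#define FLAG_BAR  (1U << 1)

static atomic_uint shared_flags;    /* static storage: starts at zero */

/* Atomically set every bit in 'mask'. */
void shared_flag_set(unsigned int mask)
{
    atomic_fetch_or(&shared_flags, mask);
}

/* Atomically clear every bit in 'mask'. */
void shared_flag_clear(unsigned int mask)
{
    atomic_fetch_and(&shared_flags, ~mask);
}

/* Report whether any bit in 'mask' is currently set. */
bool shared_flag_test(unsigned int mask)
{
    return (atomic_load(&shared_flags) & mask) != 0U;
}
```

This works for the mask style only. Whether atomic types are permitted as bitfield members is implementation-defined in C11, so individual bitfield flags cannot portably be protected this way.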
