
Is “(uint16_t) -1” Portable C Code?

Thursday, June 2nd, 2011 by Michael Barr

Twice recently, I’ve run across third-party middleware that included a statement of the form:

uint16_t variable = (uint16_t) -1;

which I take as the author’s clever way of coding:

0xFFFF

I’m not naturally inclined to like the obfuscation, but wondered if “(uint16_t) -1” is even portable C code? And, supposing it is portable, is there some advantage I don’t know about that suggests using that form over the hex literal? In the process of researching these issues, I learned a helpful fact or two worth sharing.

Q: Is the result of “(uint16_t) -1” guaranteed (by the ISO C standard) to be 0xFFFF?

A: No. But the result is likely to be 0xFFFF on most compilers/processors, since there is really just one common internal CPU representation of unsigned integers. (For signed integers, most if not all processors use the common 2’s complement representation underneath, even though the language standard does not require that in any way.)
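
A minimal sketch to check what a given toolchain actually does (the printed value is what you will almost certainly see, but see the answer above regarding guarantees):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t variable = (uint16_t) -1;

    /* Prints 0xFFFF (65535) on typical platforms. */
    printf("0x%04X (%u)\n", (unsigned) variable, (unsigned) variable);
    return 0;
}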

Q: Is there any advantage to writing 0xFFFF that way?

A: According to the C99 Standard, all conforming implementations support uint_least16_t, but some may not support uint16_t. If the platform doesn’t support uint16_t, then “(uint16_t) -1” won’t compile, but 0xFFFF will compile as a value of some larger unsigned integer type (i.e., a bug waiting to happen).
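
For example, on a hypothetical compiler without an exact-width 16-bit type, only the first declaration below would compile:

#include <stdint.h>

uint_least16_t a = 0xFFFF;         /* always available: at least 16 bits wide */
uint16_t       b = (uint16_t) -1;  /* compile error if no exact 16-bit type exists */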

Of course, platforms that don’t have a fixed-width 16-bit unsigned capability are rare, though it may be that some DSPs fall into that category. The same issue applies to uint32_t and 0xFFFFFFFF, of course. However, I suspect platforms that don’t have a fixed-width 32-bit unsigned capability are even rarer.

Q: What is the best way to represent the maximum unsigned integer value of a given size?

A: The very best way to represent the maximum values for unsigned (and signed) fixed-width types is to use the constants named in C99’s stdint.h header file. These are of the form UINTn_MAX (and INTn_MAX) where n is the number of bits (e.g., UINT16_MAX). That is guaranteed to either work or not compile, with no middle ground for bugs.
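
For example:

#include <stdint.h>

uint16_t all_ones = UINT16_MAX;  /* 0xFFFF, or a compile error if uint16_t is absent */
int16_t  largest  = INT16_MAX;   /* 0x7FFF */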

Hat Tip: Many thanks to C and C++ standards guru Dan Saks for help with these answers.


25 Responses to “Is “(uint16_t) -1” Portable C Code?”

  1. Thomas Thorsen says:

    An interesting thing to note is that the intXX_t types are guaranteed to be represented in 2’s complement (unlike the “raw” signed char, short, int, and long, which can be signed magnitude, 1’s complement, or 2’s complement). According to the precedence rules, the unary - operator is applied first (same precedence as a cast, but right-to-left associativity), so dependable behaviour can be obtained by doing this:

    (uint16_t) -((int16_t) 1)

    This forces the 1 to be converted to a 2’s complement representation before applying the unary – operator, thus guaranteeing the result to be all 1’s, on any compliant platform (and it would have to do software emulation if the architecture does not support 2’s complement!).

    • Lundin says:

      Won’t they be 2’s complement only if the computer is 2’s complement? You won’t have 1’s complement and 2’s complement in the same computer, so the stdint types are likely not going to be implemented by the compiler at all on such a machine.

      So I don’t think that typecast does anything meaningful. Also, be aware that the integer promotions are done implicitly as the - operator is evaluated. So if you have a system where the default int type is larger than 16 bits, you are going to get in trouble no matter what. Both your example and the original one are equivalent to (uint16_t)(int)-1 on a system where int > 16 bits.

      • Thomas Thorsen says:

        Yes, you are indeed right that integer promotion completely destroys this little trick, and the cast doesn’t make any difference. However, I did some more investigation, and it turns out that the internal representation is actually irrelevant. Even though the standard does not define how integers are represented internally, it does explicitly state how signed numbers are converted to unsigned numbers, which is actually what we’re dealing with here, because everything involved is governed by arithmetic and conversion rules:

        6.3.1.3.2: “…, if the new type is unsigned, the value is converted by repeatedly adding or
        subtracting one more than the maximum value that can be represented in the new type
        until the value is in the range of the new type”

        So what happens is this:
        1) -1 is picked up by the compiler and represented in an int or whatever the compiler wants. It’s just a number.
        2) The -1 is converted to unsigned by the cast to uint16_t following the rule of section 6.3.1.3.2 by adding 65536 once, so it takes on the value 65535 which is representable in a uint16_t.
        3) 65535 is assigned to the variable.
        4) As the variable is unsigned, the internal representation is guaranteed to be all ones.

        The internal representation of signed integers never comes into play (I guess we’ve been confusing it with the situation where we cast a pointer to a signed int to a pointer to an unsigned int and then dereference, in which case the internal representation is relevant).

        We rely on a well defined (but probably not well known) rule to convert the numerical value in a well defined way as part of the conversion to uint16_t.

        So I gather that casting -1 to an unsigned type actually is a well-defined and platform-independent way to find the maximum value of any unsigned type.
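
        As a minimal sketch of that conclusion (assuming only a conforming C99 compiler that provides uint16_t):

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
            uint16_t u = (uint16_t) -1;  /* -1 + 65536 = 65535, per 6.3.1.3.2 */

            /* Prints 65535 on any conforming implementation. */
            printf("%u\n", (unsigned) u);
            return 0;
        }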

        Looking at the other suggestion of using ~0, and considering the rules for conversion, I believe that one is actually worse: it does depend on the internal representation, because the ~ operator is bitwise (not arithmetic like -). Section 6.5, paragraph 4, warns about this:

        “Some operators (the unary operator ~, and the binary operators <<, >>, &, ^, and |,
        collectively described as bitwise operators) are required to have operands that have
        integer type. These operators return values that depend on the internal representations of
        integers, and have implementation-defined and undefined aspects for signed types.”

        On a 2’s complement platform, ~0 produces the desired result, but on 1’s complement the result will be 0 (all ones is also 0 in 1’s complement), and on signed magnitude the result is 32769 (all ones is -32767 in signed magnitude, adding 65536 once is 32769 or 0x8001 in hex).

        • Lundin says:

          Interesting, I didn’t know of that conversion rule. But wouldn’t it still be dependent on the signed representation?

          [any negative number] + 65535 + 1

          This should be predictable in any case except when the negative number is -32768 or -32767 (signed representation of 0x8000); then wouldn’t the result still be complement-dependent?

          • Thomas Thorsen says:

            No, the rule is purely arithmetic – forget about the signed representation. If the signed number is -32768 or -32767 the result is -32768 + (1 * 65536) = 32768 and -32767 + (1 * 65536) = 32769 respectively. This is assuming conversion to uint16_t. If it was converted to uint8_t it would be -32768 + (128 * 256) = 0 and -32767 + (128 * 256) = 1 respectively. If it was converted to a uint32_t it would be -32768 + (1 * 4294967296) = 4294934528 and -32767 + (1 * 4294967296) = 4294934529 respectively.

            This is all regardless of the type and internal representation used for holding the signed number, because only its numerical arithmetic value is significant.

            Note that this effectively means that if the signed number happens to be in a 2’s complement representation internally, the platform can just copy the bits of the internal representation unmodified and truncate/sign-extend as needed if the unsigned variable size differs. This is easy to see if you look at the bits. Let’s take the example from before and assume the signed integer is held in a 16-bit signed integer represented as 2’s complement (Dec are arithmetic values, Hex are internal representations):

            Input to conversion:
            int16_t Dec: -32768
            int16_t Hex: 0x8000

            Output from conversion:
            uint16_t Dec: 32768
            uint16_t Hex: 0x8000
            uint8_t Dec: 0
            uint8_t Hex: 0x00 (truncated to lower 8 bits)
            uint32_t Dec: 4294934528
            uint32_t Hex: 0xFFFF8000 (sign-extended into the upper 16 bits)

            So, as you can see, it is trivial for the machine to convert from signed to unsigned when the integer representation is 2’s complement, because the rule of the standard in section 6.3.1.3.2 is designed this way: all it has to do is perform sign extension when converting to a larger size, and just truncate when converting to a smaller size. These operations are commonly built into the instruction set, so it is very efficient.
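
            A short sketch that reproduces the table above (assuming the exact-width types exist):

            #include <stdint.h>
            #include <stdio.h>

            int main(void)
            {
                int16_t s = INT16_MIN;  /* -32768, internal bits 0x8000 */

                printf("%u\n",  (unsigned)      (uint16_t) s);  /* 32768      */
                printf("%u\n",  (unsigned)      (uint8_t)  s);  /* 0          */
                printf("%lu\n", (unsigned long) (uint32_t) s);  /* 4294934528 */
                return 0;
            }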

  2. Boudewijn Dijkstra says:

    I always replace unsigned -1 by ~0, which does not perform sign conversion.

    • Lundin says:

      Depending on the type of “0” it may very well perform an implicit sign conversion because of the integer promotion rules in C. Good practice is to always cast the result of ~ to the intended (“underlying”) type. This is enforced by MISRA-C, for example.
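
      For example (a sketch; the final cast truncates the promoted result back to 16 bits):

      uint16_t mask = (uint16_t) ~(uint16_t) 0u;  /* 0xFFFF on a 2's complement machine */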

  3. unknown says:

    On a related note: I have seen and used a similar expression as part of a compile-time assert to determine whether a type is signed or unsigned:

    #define IS_TYPE_SIGNED(type) (((type) -1) < 0)

    i.e.

    A header file somewhere:
    typedef uint16_t TUserType;

    A source file somewhere:
    STATIC_ASSERT(IS_TYPE_SIGNED(TUserType) == false); /* Ensure type is unsigned. */

    Is this macro for checking whether a type is signed portable C code?
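
    For reference, the corrected macro as a self-contained sketch (using C11’s _Static_assert in place of the home-grown STATIC_ASSERT):

    #include <stdint.h>

    #define IS_TYPE_SIGNED(type) (((type) -1) < 0)

    typedef uint16_t TUserType;

    /* Fails to compile if TUserType is ever changed to a signed type. */
    _Static_assert(!IS_TYPE_SIGNED(TUserType), "TUserType must be unsigned");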

  4. Laz says:

    I prefer not to mix numbers and packed-bit values. If the desired value is “-1”, then the type should be a signed int. If the desired value is binary “all 1s”, then the type should be an unsigned int, and I use (~0x00) to be independent of size. I have seen cases where a packed-bit type such as WORD or BYTE is used to distinguish these from numerical values. This is similar to using these instructions to set and clear a single bit:

    #define FLAG5_BIT 0x20

    packedBits |= FLAG5_BIT;

    packedBits &= ~FLAG5_BIT;

    Similarly, I test them differently:

    if (packedBits) // if any bits are set

    if (intVal != 0) // if not zero

    but not:

    if (intVal) // same as above, but is intVal “logical” or a numeric value?

  5. Dave says:

    Minor correction to the first Q: (uint16_t)-1 is guaranteed (by the ISO Standard) to be 0xFFFF _on all implementations that support uint16_t_. The representation of signed numbers has nothing to do with it, it is an artifact of how C defines unsigned arithmetic.

    I agree that if you want 0xFFFF, that seems a much better way to express it. And if you want the maximum value that can be represented by a uint16_t (which is not necessarily the same thing semantically, though it is numerically), UINT16_MAX is the way to go.

    However, if you want an unsigned type whose width you don’t necessarily know to have all bits set, then

    unsigned ones = -1;

    is guaranteed to provide you that value.

    Regards,

    -=Dave

    • Lundin says:

      If I didn’t know the width I would prefer the following:

      #include <limits.h>
      unsigned ones = UINT_MAX;

      This is logical, readable and completely portable.

  6. antiquus says:

    I, too, prefer the ~0 method, because of its size independence. But, as Lundin points out, integer promotion can grab you, so if I’m paying attention I use ~((uint16_t)0) — that is, cast first, then invert.

  7. Notwithstanding all the previous worthy comments, I would simply rephrase the question:

    Question:
    Is “(uint16_t) -1” Acceptable C Code?

    Answer:
    Definitely not!

    Justification for answer:
    Negative values should not be assigned to unsigned variables. Arguably, an explicit cast might make the expression slightly more acceptable; at least it would show that the programmer had thought for more than a second or two about what (s)he was doing.

    Conclusion:
    Avoid this kind of thing, and use a static analyser to check compliance. A MISRA checker, for example, should flag a deviation from rule 10.1. An explicit cast would pacify MISRA in this situation, but would not satisfy me! I would use UINT16_MAX if I really wanted the maximum value or, for lack of binary notation in C, the explicit 0xFFFFU (via a #define) if I were setting a bit-mask.

    • I’ve just re-read my post and realised that the cast I referred to is actually there, in the question. Mea culpa, for copying and pasting without paying proper attention! This means it would get through the static analyser, but does not scupper my general “don’t do that” point.

  8. Peter says:

    I wonder why nobody suggested “-(uint16_t)1”.

    • Thomas Thorsen says:

      Because it has no effect: the uint16_t is promoted to int before the unary - operator is applied.

  9. Graham says:

    The ADI SHARC DSP is an example of a processor that does not have any exact 8-bit or 16-bit integers: (u)int16_t and (u)int8_t are not available in its C99 compiler. In regular C on SHARC, char/short/long/int are all 32 bits. There is a 64-bit long long.

    • ram says:

      Hi Graham,
      What fix should be used on the SHARC processor for this typecast problem?
      I am facing a problem since all the local (16-bit) variables are placed on the stack (32-bit).

  10. Mohammad Elwakeel says:

    How about:

    uint16_t variable = ~0;

    Will that make a difference?

  11. Rick M says:

    Notwithstanding all the other caveats, if you just do something like uint16_t i = -1, is that not equivalent to doing the cast? That makes it a little easier to change i to be a uint32_t, for example. If instead you cast, you’d have to change the cast as well, and if you just write 0xFFFF, you’d have to fix that, too.

    Personally, I prefer the 0xFFFF-whatever if you’re intending to use the variable as a set of bit flags, and the MAX-whatever if you really want it to hold the max value. It expresses the semantics better.

  12. Jim says:

    So wouldn’t a better way to do this be:

    uint16_t variable = ~0;

  13. Fatih says:

    limits.h can be used to get limit values.

  14. Martin Kunev says:

    Q: Is the result of “(uint16_t) -1” guaranteed (by the ISO C standard) to be 0xFFFF?
    A: Yes.

    C99 6.3.1.3 Signed and unsigned integers
    … if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

    Here is a very good question on the topic:
    http://stackoverflow.com/a/809341/515212
