
Efficient C Tips #10 – Use unsigned integers

Friday, July 31st, 2009 by Nigel Jones

This is the tenth in a series of tips on writing efficient C for embedded systems. Today I consider the topic of whether one should use signed integers or unsigned integers in order to produce faster code. The short answer is that unsigned integers nearly always produce faster code. Why is this, you ask? There are several reasons:

Lack of signed integer support at the op code level

Many low-end microprocessors lack instruction set support (i.e. op codes) for signed integers. The 8051 is a major example; I believe low-end PICs are another. The Rabbit processor is sort of an example, in that my recollection is that it lacks support for signed 8 bit types but does have support for signed 16 bit types! Furthermore, some processors will have instructions for performing signed comparisons, but only directly support unsigned multiplication.
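To make this concrete, here is a sketch of the kind of helper a compiler has to synthesize for a signed comparison on a CPU whose compare instruction handles only unsigned values (an illustration of mine, not taken from any particular compiler's runtime library). Flipping the sign bit maps the signed range onto the unsigned range in an order-preserving way:

#include <stdbool.h>
#include <stdint.h>

/* Signed 8 bit "a < b" built from an unsigned compare. XORing each operand
   with 0x80 adds a bias of 128, mapping -128..127 onto 0..255 while
   preserving order, so a plain unsigned compare then gives the signed answer. */
bool signed_lt8(int8_t a, int8_t b)
{
 return ((uint8_t)a ^ 0x80u) < ((uint8_t)b ^ 0x80u);
}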

Anyway, so what’s the implication of this? Lacking direct instruction set support, use of a signed integer forces the compiler to use a library function or macro to perform the requisite operation. Clearly this is not very efficient. But what if you are programming a processor that does have instruction set support for signed integers? For most basic operations, such as comparison and addition, you should find no difference. However this is not the case for division…

Shift right is not the same as divide by two for signed integers

I doubt there is a compiler in existence that doesn’t recognize that division by 2^N is equivalent to a right shift of N places for unsigned integers. However, this is simply not the case for signed integers, since the issue of what to do with the sign bit always arises. Thus when faced with performing a division by 2^N on a signed integer, the compiler has no choice other than to invoke a signed divide routine rather than a simple shift operation. This holds true for every microprocessor I have ever looked at in detail.
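A small example shows why the compiler cannot simply substitute a shift for a signed operand. C’s integer division truncates toward zero, whereas an arithmetic right shift rounds toward negative infinity, so the two disagree for negative values (this snippet is my illustration, not part of the original post):

#include <stdio.h>

int main(void)
{
 int x = -7;

 printf("%d\n", x / 2);  /* prints -3: division truncates toward zero */
 printf("%d\n", x >> 1); /* typically prints -4: an arithmetic shift rounds
                            toward negative infinity (and >> on a negative
                            value is implementation-defined in any case) */
 return 0;
}

Hence for a signed operand the compiler must emit extra sign-handling code (or call a divide routine), whereas for an unsigned operand a single shift suffices.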

There is a third area where unsigned integers offer a speed improvement over signed integers – but it comes about by a different mechanism…

Unsigned integers can often save you a comparison

From time to time I find myself writing a function that takes as an argument an index into an array or a file. Naturally to protect against indexing beyond the bounds of the array or file, I add protection code. If I declare the function as taking a signed integer type, then the code looks like this:

void foo(int offset)
{
 if ((offset >= 0) && (offset < ARRAY_SIZE))
 {
  //Life is good...
 }
}

However, if I declare the function as taking an unsigned integer type, then the code looks like this:

void foo(unsigned int offset)
{
 if (offset < ARRAY_SIZE)
 {
  //Life is good...
 }
}

Clearly it’s nonsensical to check whether an unsigned integer is >= 0, and so I can dispense with a check. The above are examples of where unsigned integer types are significantly more efficient than signed integer types. In most other cases there isn’t usually any difference between the types. That’s not to say that you should choose one over the other on a whim. See Signed versus unsigned integers (http://www.embeddedgurus.net/stack-overflow/2009/05/signed-versus-unsigned-integers/) for a discussion of some of the other good reasons to use an unsigned integer.

Before I leave this topic, it’s worth asking whether there are situations in which a signed integer is more efficient than an unsigned integer. Offhand I can’t think of any, although there are situations where I could see the possibility of it occurring. For example, when performing pointer arithmetic, the C standard requires that subtraction of two pointers return the data type ptrdiff_t. This is a signed integral type (since the result may be negative). Thus if, after subtracting two pointers, you needed to add an offset to the result, it’s likely that you’ll get better code if the offset is a signed integral type. Of course this touches upon the nasty topic of mixing signed and unsigned integral types in an expression. I’ll address this another day.
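As a sketch of that last point (the function and parameter names here are mine, purely for illustration):

#include <stddef.h>

void bar(char *start, char *end, ptrdiff_t offset)
{
 ptrdiff_t span = end - start; /* pointer subtraction yields a signed
                                  ptrdiff_t by definition */
 ptrdiff_t pos = span + offset; /* a signed offset keeps the arithmetic in a
                                   single type; an unsigned offset would force
                                   an implicit signed/unsigned conversion */

 (void)pos; /* a real function would go on to use pos */
}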



22 Responses to “Efficient C Tips #10 – Use unsigned integers”

  1. Uhmmmm says:

    I don't know if compilers actively exploit this, but assuming an arithmetic right shift, something like the following can be used. It's almost certainly faster than a full-blown division routine:

    // performs x / 2^n, with correct rounding. NBITS is the number of bits in the integer type
    #define DIV2n(x,n) (((x) + (((x) >> (NBITS-1)) & ((1 << (n)) - 1))) >> (n))

    Of course, if the rounding doesn't matter for your use case, you can just use right shift directly, if the compiler you care about maps it to arithmetic right shift.
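    (To spell out the correction with a worked example of mine: with NBITS == 16, DIV2n(-7, 1) evaluates as (-7 + ((-7 >> 15) & 1)) >> 1 = (-7 + 1) >> 1 = -6 >> 1 = -3, which matches C's -7 / 2 = -3, whereas a bare arithmetic -7 >> 1 gives -4.)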

  2. Nigel Jones says:

    The last time I looked at the compiler generated code, they (at least in that case) did indeed do something like you are suggesting. Even so, it's still a lot less efficient than a simple shift.

  3. Luís Fernando Schultz Xavier da Silveira says:

    Consider a processor that has arithmetic support only for word-sized values. Unsigned types of width less than the word width must be represented as a register with the upper bits zeroed, as opposed to signed types, which get their sign extended. The compiler may assume no overflow will happen with a signed type, but that is not the case with unsigned types.

    Suppose now we have a series of arithmetic operations which we know will not overflow, followed by a comparison. In the signed case the compiler may carry out the comparison normally, but in the unsigned case it must zero the upper bits of the registers first, since it must discard the effects of possible overflows. This generates extra instructions, which is bad for both size and speed. Of course, if we could tell the compiler that no overflow may happen with values of a type, maybe through a type attribute, we could have the best of both worlds.

    • Nigel Jones says:

      I’m not entirely sure I follow your logic that the compiler can assume that no overflow will happen with a signed type. Consider a 16 bit CPU that has arithmetic support only for 16 bit values. If I define a signed 8 bit variable, assign it the value of 127 and then increment it, how can overflow not occur in the sense that the eighth bit must now presumably be set? If you think that the sign extension will protect you, what if you have an 8 bit signed variable to which you add 127 a few hundred times? It will certainly overflow a 16 bit variable.

      There is one sense in which the compiler will assume that no overflow will happen with the signed type – and that is because implicit in signed integer operations is that the writer of the code has constructed it so that overflow cannot occur. As I pointed out in this post, signed integer overflow is undefined and hence disastrous.

      • Luís Fernando Schultz Xavier da Silveira says:

        From your post: “That is if you overflow a signed integer, the generated code is at liberty to do anything […]”. This implies the compiler may assume no signed integer overflow will occur, and thus it can safely use sign extension to represent your signed variable in a larger-width register. If you assign the value 127 to a signed 8 bit variable and increment it, your program is wrong.

        To be more clear, consider the following code snippet:

        #include <stdbool.h>
        #include <stdint.h>

        bool
        f (uint16_t x, uint16_t y)
        {
        x += 3;
        y -= 42;
        return x == y;
        }

        Unfortunately my machine is x86_64, and thus CISC-based, and contains arithmetic instructions to deal with any integer width. Since I do not have, say, an ARM7 toolchain at hand to test this out, I can only estimate that the assembly for this has to include two AND operations before the compare, while the signed version does not. This assumes the unsigned integers are represented as the lower bits of a larger-width register, while the signed integers are represented with sign extension into a larger-width register. Additionally, ARM7 was a guess at a processor without a 16 bit compare instruction whose ABI performs extensions as described here.

        • Nigel Jones says:

          I have an IAR ARM Cortex compiler. I ran your code for both uint16_t and int16_t types.
          Here is the resulting assembly language:
          Unsigned
          \ 00000000 C01C ADDS R0,R0,#+3
          \ 00000002 2A39 SUBS R1,R1,#+42
          \ 00000004 80B2 UXTH R0,R0
          \ 00000006 89B2 UXTH R1,R1
          \ 00000008 8842 CMP R0,R1
          \ 0000000A 01D1 BNE.N ??f_0
          \ 0000000C 0120 MOVS R0,#+1
          \ 0000000E 7047 BX LR
          \ ??f_0:
          \ 00000010 0020 MOVS R0,#+0
          \ 00000012 7047 BX LR ;; return

          Signed
          \ 00000000 C01C ADDS R0,R0,#+3
          \ 00000002 2A39 SUBS R1,R1,#+42
          \ 00000004 00B2 SXTH R0,R0
          \ 00000006 09B2 SXTH R1,R1
          \ 00000008 8842 CMP R0,R1
          \ 0000000A 01D1 BNE.N ??f_0
          \ 0000000C 0120 MOVS R0,#+1
          \ 0000000E 7047 BX LR
          \ ??f_0:
          \ 00000010 0020 MOVS R0,#+0
          \ 00000012 7047 BX LR ;; return

          • Luís Fernando Schultz Xavier da Silveira says:

            I assume the UXTH and SXTH instructions perform zero extension and sign extension, respectively. Doesn’t the ARM ABI specify that halfword values passed in registers are already zero/sign extended? Even if it does not, this is sort of an ABI problem. My whole point is that if the ABI demanded sign extension to signed values, the SXTH instructions would not be necessary, while the UXTH instructions still are (even if zero extension is demanded).

            By the way, is optimization turned on? I don’t see why the use of predicated opcodes was abandoned in favor of plain branching.

            Well, I still believe there is a fundamental problem with C providing signed types without overflow semantics but not doing the same with unsigned types. Maybe there should be signed, unsigned and modular types. However, since other languages (such as Ada) do the same, maybe I am wrong, but I can’t grasp why.

          • Nigel Jones says:

            I was in a bit of a hurry when I posted the code. I should have stated that it was compiled with full ‘balanced’ optimization. Anyway I think you have an interesting point – and certainly it isn’t one I have considered before. In my experience unsigned code is never slower than signed code (and is often faster). However I do most of my work on 8 / 16 bit machines where the issue of sign extension is less of an issue. You have given me something to ponder and I will continue to perform ad hoc benchmarking with my code to see if this continues to be the case with the move to 32 bit processors. I’d be interested to hear from anyone who is doing a lot of 32 / 64 bit work who has any thoughts on this topic.

          • Luís Fernando Schultz Xavier da Silveira says:

            What is full ‘balanced’ optimization?

            Just to clarify something, I share your view on how to use unsigned and signed types. My ideal solution would be leaving unsigned overflow unspecified and adding a modular modifier. Or you could add a type attribute to the compiler. My current solution is using different types when declaring variables that should go on registers (u8_reg_t), which is a pain.

            I also have searched a lot for an answer, but found nothing, not even a concern about the problem.

          • Nigel Jones says:

            Balanced optimization is when the compiler attempts to achieve a balance between size and speed.

          • Luís Fernando Schultz Xavier da Silveira says:

            In this case, the instructions

            \ 0000000A 01D1 BNE.N ??f_0
            \ 0000000C 0120 MOVS R0,#+1
            \ 0000000E 7047 BX LR
            \ ??f_0:
            \ 00000010 0020 MOVS R0,#+0
            \ 00000012 7047 BX LR ;; return

            could be replaced by

            MOVEQ R0, 1
            MOVNE R0, 0
            BX R14

            for a faster and smaller executable, no?

  4. itsnotvalid says:

    The link referring to “Signed versus unsigned integers” should be http://www.embeddedgurus.net/stack-overflow/2009/05/signed-versus-unsigned-integers/. Probably a leftover from a changed URL scheme.

  5. Misha says:

    Just a couple of days ago I started reading this site. In my opinion all the tips are great and explained exceptionally well. The other posts are good also; thank you for sharing your experience, Nigel.

    I come from DSP embedded world, which is slightly different than what is in focus here, but I must say that all the tips (13 so far) apply to that domain as well — except this one 🙂

    The reason why signed int produces better code on DSP processors is explained well in the comments of Luís Fernando Schultz Xavier da Silveira. He was considering the case where the size of the type is less than the size of the register (the supported arithmetic), e.g. an 8 bit int in a 16 bit register. Well, this is the case practically all the time in DSPs. The fact is that accumulator registers in DSPs almost always have some kind of guard bits, which are there to catch overflow and subsequently perform saturation. Therefore for unsigned integers the guard bits must be zeroed by the compiler either before every comparison, or after every add/sub and shift-left operation.

    However, in your example with the index boundary check, I think the second version would still be better in most cases, because one comparison plus bit zeroing beats two comparisons 🙂
    (I just tried it with a not commonly used Cirrus Logic DSP, and the result is 4 instructions against 5 in favor of the second version.)
    On the other hand, in the case of, for example, a simple comparison, unsigned integers will introduce two extra instructions (one guard-zeroing for each operand).

    Anyhow, I just wanted to share my experience regarding this tip, because, as I said, all other tips perfectly apply to DSP world as well.

    • Nigel Jones says:

      Very interesting Misha. If I interpret this correctly, what’s going on is this. The C language defines what happens when you overflow unsigned integers (namely modulo arithmetic) and thus the compiler is forced to add a zero operation on the guard bits. By contrast, for signed integers the effect of overflow is undefined, and thus no such zeroing is required. I’m intrigued by the following. Let’s say you perform a series of additions on unsigned integers. Does the compiler zero out the guard bits after every operation, or does it simply wait until the end of the series and then perform the mask? What happens if you do the same thing with signed integers? Also if you actually get an overflow with signed integers, do you get the ‘expected’ result, or do you get garbage?

      • Misha says:

        Yes, you interpreted it correctly.

        Here are the answers (to the best of my knowledge) to the questions you raised.

        “Let’s say you perform a series of additions on unsigned integers. Does the compiler zero out the guard bits after every operation, or does it simply wait until the end of the series and then perform the mask?”
        This particular compiler (for a Cirrus Logic DSP) performs the mask in only two places: before doing a comparison and before shifting right. If I’m not mistaken, those are the only places where it is important. Of course, memory load/store instructions that do not perform saturation just ignore the guard part, so no problem there, and the same thing happens when performing multiplication, etc.
        So I believe that in a way it is different, i.e. simpler, than the case where you have, e.g., 16 bit arithmetic hardware but are using an 8 bit data type. The case in a DSP is that, for example, you use a 16 bit data type, your hardware supports 16 bit arithmetic, but your register has additional special-purpose bits on the left.

        “What happens if you do the same thing with signed integers?”
        With signed integers nothing is done with the guard bits. Since signed overflow is undefined in the C standard, the DSP hardware and compiler define it in whatever way produces the best code. But it is not explained much in the manual of the DSP I’m currently working with; rather, they took the position that “overflow is bad” and the programmer should avoid it.

        “Also if you actually get an overflow with signed integers, do you get the ‘expected’ result, or do you get garbage?”
        If by “expected” you mean an “always the same”, or “determinable”, result, then you do get the “expected” result 🙂 But it is definitely different from what you would get if overflow were handled as wraparound. I call the behavior “extended wraparound” or “delayed wraparound”, because wraparound does happen, but only after the guard part is filled up.

        Here is another interesting thing.
        Luís Fernando says:
        “My ideal solution would be leaving unsigned overflow unspecified and adding a modular modifier. Or you could add a type attribute to the compiler.”
        The DSP compiler I’m working with supports a C language extension called “C – Extensions to support embedded processors” (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf). It targets mostly DSP-related issues. The main thing the extension introduces is fixed-point types. When defining the overflow behavior of fixed-point types, they did something similar to what Luís Fernando proposed. If the type carries the “sat” keyword (e.g. “sat fract”), then saturation has to happen on overflow; otherwise the behavior is undefined, that is, it should be done in the most efficient manner on the target processor. There is also a pragma that can force saturation on all fixed-point types. A similar thing might be done for signed integers.
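        (As a rough illustration of the “sat” qualifier described above; this sketch of mine assumes a compiler that actually implements TR 18037, which only a few do, with <stdfix.h> providing the natural spellings fract/sat for _Fract/_Sat:)

        #include <stdfix.h> /* maps fract / sat onto _Fract / _Sat */

        /* With the sat qualifier, a result that would overflow is clamped to
           the type's limits instead of wrapping or being undefined. */
        sat fract clamped_sum(sat fract a, sat fract b)
        {
        return a + b; /* saturates near +1.0 or -1.0 on overflow */
        }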

  6. Bregalad says:

    Signed and unsigned make a difference only when comparing and dividing. There is no difference at all when adding and multiplying (at least as long as the result is the same width as the operands).

    The shift-right problem can be solved with what they call “arithmetic shift right”. If your instruction set doesn’t have it, then it can be easily simulated by something like this:

    cmp reg, #$80 ; carry = sign bit
    ror reg       ; rotate the carry back in at the top, so the sign bit stays in place

    Now if the compiler writers didn't know about this, it's another problem ^^

    Last but not least, I think (I'm not 100% sure) that in C, when you apply >> to a negative signed value, the compiler is free to perform either an arithmetic or a logical shift right; the behavior is implementation-defined. (For unsigned types, >> is always a plain logical shift.)
    The / operator is guaranteed to get the sign correct at least.
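    (For what it's worth, a portable arithmetic shift right can be written in plain C; this is my sketch, not anything from the comment above, and it assumes two's complement like the rest of this discussion:)

    /* For x < 0, ~x is non-negative, so ~x >> n is well defined; complementing
       the result again yields floor(x / 2^n), exactly what an arithmetic
       shift computes. */
    int asr(int x, unsigned n)
    {
        return (x < 0) ? ~(~x >> n) : (x >> n);
    }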

    Nice set of tips, I was already aware of most of them but it’s still good to know I’m not the only one worrying about efficiency of coding.

  7. Toby says:

    I’m not so sure I agree with this one. I default to unsigned types myself _normally_.
    However, especially considering your earlier tip on preferring count-down rather than count-up loops, one must always watch out for underflow with them such that: `for (uint8_t x = 9; x <= 0; x-=2) {…}` will never stop looping!

  8. Ricky says:

    “for (uint8_t x = 9; x <= 0; x-=2)”: the condition should presumably be “x >= 0”. Otherwise, it will never execute!
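    (A common idiom that sidesteps the wraparound hazard Toby describes, shown here as my sketch rather than anything from the thread, is to do the test before the decrement, so the loop exits while the index is still in range:)

    #include <stdint.h>

    void count_down(void)
    {
        for (uint8_t x = 10; x-- > 0; )
        {
            /* first iteration sees x == 9, last sees x == 0 */
        }
        /* x wraps to 255 after the final, failing test, but the loop has
           already exited, so the wrap is harmless */
    }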
