embedded software boot camp

A tutorial on signed and unsigned integers

Wednesday, August 5th, 2009 by Nigel Jones

One of the interesting things about writing a blog is looking at the search terms that drive traffic to it. In my case, after I posted these thoughts on signed versus unsigned integers, I was amazed to see how many people were ending up here looking for basic information concerning signed and unsigned integers. In an effort to make these folks’ visits more successful, I thought I’d put together some basic information on this topic. I’ve done it in a question and answer format.

All of these questions have been posed to a search engine which has driven traffic to this blog. For regular readers of this blog looking for something a bit more advanced, you will find the last section more satisfactory.

Are integers signed or unsigned?

A standard C integer data type (‘int’) is signed. However, I strongly recommend that you do not use the standard ‘int’ data type and instead use the C99 data types. See here for an explanation.
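As a minimal sketch of what the C99 data types look like in practice (the variable and function names below are illustrative assumptions, not something from the linked explanation):

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical example values; the names are illustrative only. */
uint8_t  status_flags = 0x80u;    /* exactly 8 bits, unsigned  */
int16_t  temperature  = -40;      /* exactly 16 bits, signed   */
uint32_t tick_count   = 100000u;  /* exactly 32 bits, unsigned */

/* Width in bits of tick_count: always 32, however wide 'int' is. */
unsigned int tick_count_bits(void)
{
    return (unsigned int)(sizeof(tick_count) * CHAR_BIT);
}
```

The point of the sketch is that both the width and the signedness are visible in the type name, which is not the case for a plain int.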

How do I convert a signed integer to an unsigned integer?

This is in some ways a very elementary question and in other ways a very profound question. Let’s consider the elementary issue first. To convert a signed integer to an unsigned integer, or to convert an unsigned integer to a signed integer you need only use a cast. For example:

int  a = 6;
unsigned int b;
int  c;

b = (unsigned int)a;

c = (int)b;

Actually, in many cases you can dispense with the cast. However, many compilers will complain, and Lint will most certainly complain. I recommend you always explicitly cast when converting between signed and unsigned types.

OK, well what about the profound part of the question? If you have a variable of type int, and it contains a negative value such as -9, then how do you convert this to an unsigned data type, and what exactly happens if you perform a cast as shown above? On a 2’s complement machine the basic answer is: nothing. No bits are changed; the compiler just treats the bit representation as unsigned. For example, let us assume that the compiler represents signed integers using 2’s complement notation (this is the norm, but is *not* mandated by the C language). If our signed integer is a 16 bit value, and has the value -9, then its binary representation will be 1111111111110111. If you now cast this to an unsigned integer, then the unsigned integer will have the value 0xFFF7, or 65527 in decimal. Note that the result itself does not depend on the representation: C99 section 6.3.1.3 defines the conversion by value (one more than the maximum value of the unsigned type is added until the result is in range), so casting -9 to a 16 bit unsigned type always gives 0xFFF7. What depends on how the compiler represents negative numbers is whether any bits have to change to get there. It is the reverse conversion, from an unsigned value that is too large for the signed type, that is implementation-defined.
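Here is a small sketch of the cast discussed above, using the C99 exact-width types so that the 16 bit width is explicit. The function name to_unsigned16 is an assumption made for illustration. C99 section 6.3.1.3 defines conversion to an unsigned type by value (one more than the type's maximum is added until the result is in range), so -9 maps to 65527 here.

```c
#include <stdint.h>

/* Converts a 16 bit signed value to a 16 bit unsigned value.
   For a negative input the result is the input plus 65536. */
uint16_t to_unsigned16(int16_t value)
{
    return (uint16_t)value;
}
```

For example, to_unsigned16(-9) yields 0xFFF7, and to_unsigned16(6) yields 6 because a value that already fits is left unchanged.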

What’s more efficient – a signed integer or an unsigned integer?

The short answer – unsigned integers are more efficient. See here for a more detailed explanation.

When should I use an unsigned integer?

In my opinion, you should always use unsigned integers, except in the following cases:

  • When the entity you are representing with your variable is inherently a signed value.
  • When dealing with standard C library functions that require an int to be passed to them.
  • In certain weird cases such as I documented here.

Now be advised that many people strongly disagree with me on this topic. Naturally I don’t find their arguments persuasive.

Why should I use an unsigned integer?

Here are my top reasons:

  • By using an unsigned integer, you are conveying important information to a reader of your code concerning the expected range of values that a variable may take on.
  • They are more efficient.
  • Modulus arithmetic is completely defined.
  • Overflowing an unsigned data type is defined, whereas overflowing a signed integer type could result in World War 3 starting.
  • You can safely perform shift operations.
  • You get a larger dynamic range.
  • Register values should nearly always be treated as unsigned entities – and embedded systems spend a lot of time dealing with register values.
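To illustrate the point about defined overflow, here is a sketch of a common embedded idiom: measuring elapsed time with a free-running 16 bit timer. The function name elapsed_ticks and the 16 bit timer width are assumptions chosen for the example.

```c
#include <stdint.h>

/* Elapsed ticks between two readings of a free-running 16 bit timer.
   Converting the difference back to uint16_t reduces it modulo 65536,
   so the result is still correct if the counter rolled over once
   between the two readings. */
uint16_t elapsed_ticks(uint16_t start, uint16_t now)
{
    return (uint16_t)(now - start);
}
```

With start = 0xFFF0 and now = 0x0010 the counter has wrapped, yet the function still returns 0x20. The same calculation with signed types would rely on behavior that the language does not define.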

What happens when I mix signed and unsigned integers?

This is the real crux of the problem with having signed and unsigned data types. The C standard has an entire section on this topic that only a compiler writer could love, and that the rest of us read and wince at. Having said that, it is important to know that when an int and an unsigned int meet in an expression, the signed operand is converted to unsigned. If you think about it, this is the correct thing to happen. However, it can lead to some very interesting and unexpected results.

A number of years ago I wrote an article “A ‘C’ Test: The 0x10 Best Questions for Would-be Embedded Programmers” that was published in Embedded Systems Programming magazine. You can get an updated and corrected copy at my web site. My favorite question from this test is question 12, which is reproduced below together with its answer: What does the following code output and why?

#include <stdio.h>

void foo(void)
{
 unsigned int a = 6;
 int b = -20;
 (a+b > 6) ? puts("> 6") : puts("<= 6");
}

This question tests whether you understand the integer promotion rules in C, an area that I find is very poorly understood by many developers. Anyway, the answer is that this outputs “> 6”. The reason for this is that in an expression that mixes int and unsigned int, the int operand is converted to unsigned int. Thus -20 becomes a very large positive integer and the expression evaluates to greater than 6. This is a very important point in embedded systems where unsigned data types should be used frequently (see reference 2). If you get this one wrong, then you are perilously close to not being hired.
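The following sketch spells out the conversion that the compiler applies implicitly in question 12. The function names are hypothetical, and the explicit casts are there only to make the implicit step visible.

```c
#include <stdbool.h>

/* Same arithmetic as a + b in question 12, with the implicit
   conversion of b to unsigned int written out as a cast. */
unsigned int mixed_sum(unsigned int a, int b)
{
    return a + (unsigned int)b;
}

/* The comparison from question 12, performed entirely in unsigned int. */
bool mixed_sum_exceeds_six(unsigned int a, int b)
{
    return mixed_sum(a, b) > 6u;
}
```

With a = 6 and b = -20 the sum wraps to 14 below zero in unsigned arithmetic, which is a very large positive number, so the comparison against 6 is true.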

This is all well and good, but what should one do about this? Well you can pore over the C standard, run tests on your compiler to make sure it really does conform to the standard, and then write conforming code, or you can do the following: Never mix signed and unsigned integers in an expression. I do this by the use of intermediate variables. To show how to do this, consider a function that takes an int ‘a’ and an unsigned int ‘b’. Its job is to return true if b > a, otherwise it returns false. As you shall see, this is a surprisingly difficult problem… To solve this problem, we need to consider the following:

  • The signed integer a can be negative.
  • The unsigned integer b can be numerically larger than the largest possible value representable by a signed integer.
  • The integer promotion rules can really screw things up if you are not careful.

With these points in mind, here’s my stab at a robust solution:

#include <stdbool.h>

bool foo(int a, unsigned int b)
{
 bool res;

 if (a < 0)
 {
  res = true; /* If a is negative, it must be less than b */
 }
 else
 {
  unsigned int c;
  c = (unsigned int) a; /* Since a is non-negative, this cast is safe */
  if (b > c)            /* Now I'm comparing the same data types */
  {
   res = true;
  }
  else
  {
   res = false;
  }
 }
 return res;
}

Is this a lot of work – yes. Could I come up with a more compact implementation that is guaranteed to work for all possible values of a and b – probably. Would it be as clear – I doubt it. Perhaps regular readers of this blog would like to take a stab at producing a better implementation?
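As one possible answer, here is a sketch that contrasts what a direct comparison does with a more compact form of the robust check. Both function names are hypothetical, and the cast in the naive version is only there to make explicit what the compiler does implicitly for b > a.

```c
#include <stdbool.h>

/* Naive version: a is converted to unsigned int before the comparison,
   so a negative a is treated as a very large positive number. */
bool naive_b_greater(int a, unsigned int b)
{
    return b > (unsigned int)a;
}

/* Compact robust version: a negative a is always less than b; otherwise
   a is non-negative and the cast preserves its value. */
bool compact_b_greater(int a, unsigned int b)
{
    return (a < 0) || (b > (unsigned int)a);
}
```

For a = -1 and b = 0 the naive version returns false while the compact version returns true. Whether the one-line form is as clear as the longer one is a matter of taste.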


14 Responses to “A tutorial on signed and unsigned integers”

  1. Uhmmmm says:

    I'm pretty sure that the C99 types are specified as two's complement in the standard. Of course, the format of plain old int/signed int is still left unspecified.

  2. maxbuds says:

    What is the difference between "signed int" and "int"? According to section A8.2 of the ANSI C standard: "The signed specifier is useful for forcing char objects to carry a sign; it is permissible but redundant with other integral types." Is it different in C99?

  3. glovepm says:

    Is it really true that “No bits are changed” when converting signed to unsigned? According to section 6.3.1.3 of C99 standard –
    “if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type”.
    I think that it’s true only for computers using two’s complement representation for signed but not for computers using one’s complement.

    • Nigel Jones says:

      Interesting. I’ve never worked on a 1’s complement machine. However, thinking about it, you must be correct that it’s only on a 2’s complement machine that nothing changes.

  4. Akın Yılmaz says:

    thanks for your beneficial article

  5. David Šimáček says:

    Thank you for a really rewarding article. I have to say I didn’t know a lot of the facts mentioned.
    A few days ago I came upon a problem which I can’t solve even with the new knowledge from the article. Maybe somebody could try to explain it to me.

    unsigned short a = 0xFFF8;
    signed short result;

    result = ((((signed short) a)*7)+8)/16;

    I was expecting the result -3 but it was 28669!!! No clue why.

    • Jörg Seebohn says:

      ((signed short) 0xFFF8) is implementation-defined if bitsof(signed short) <= 16 !!

      See 6.3.1.3 Part 3

      6.3.1.3 Signed and unsigned integers
      1 When a value with integer type is converted to another integer type other than _Bool, if
      the value can be represented by the new type, it is unchanged.
      2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
      subtracting one more than the maximum value that can be represented in the new type
      until the value is in the range of the new type.
      —————————————————————
      3 Otherwise, the new type is signed and the value cannot be represented in it; either the
      result is implementation-defined or an implementation-defined signal is raised.
      —————————————————————

    • Ernst Rattenhuber says:

      What compiler did you use? When I try this in Visual Studio Express 2010, I get the result that you expected.
      A possible reason could be that short is greater than 16 bits in your environment.

      • Sideshow_B0B says:

        That is correct. Just put “int” instead of “short” in David’s expression and you will get 28669. That’s because in his environment “short” has 32 bits. That means that as long as the value is in the range from 0 to 2 147 483 647, the effect is as if there were no sign propagation. FFF8 does not convert to a negative number because it can be represented as a positive value in a 32 bit type.

        Signed type value ranges (32 bit size): FFF8 …

        if (p > q)
        printf("p is larger than q");
        else // cannot be equal, for obvious reasons
        printf("p is smaller than q");

        Just for this example I assumed that int and long are types that have different bit sizes (unsigned int = 16 bit, signed long = 32 bit).
        So in this example 15U (unsigned int) propagates to the larger type, which is signed long int. Let’s visualize:
                unsigned int           signed long
        value   33016                  15
        bits    1000 0000 1111 1000    0000 0000 0000 0000 0000 0000 0000 1111

        After propagation, one method:
        ------------------------------

        value   33016                                      15
        bits    0000 0000 0000 0000 1000 0000 1111 1000    0000 0000 0000 0000 0000 0000 0000 1111

        In this method we see a logical expansion of the bits, as we call it (I am not sure what the correct term is): zeros are added on the left side. So the unsigned int value 33016 is larger than the signed long int value 15.
        Another possible method:
        ------------------------
                unsigned int           signed long
        value   33016                  15
        bits    1000 0000 1111 1000    0000 0000 0000 0000 0000 0000 0000 1111

        After propagation:

        value   -32520                                     15
        bits    1111 1111 1111 1111 1000 0000 1111 1000    0000 0000 0000 0000 0000 0000 0000 1111

        This method uses arithmetic expansion: when the value with fewer bits is propagated to the type with more bits, its left-most (most significant) bit, here the sign bit 1, is copied into the new upper bits. My question is: is it possible that different computer architectures, which implement different instructions to “expand” bits, could cause positive values to sometimes propagate to negative values, or is that a concern of the past? I read about this in the book Programming in C, second edition.
        Sorry, my English is not superb; I hope that I was understandable/readable.

  6. Adam H. Peterson says:

    I’m pretty sure “int” and “signed int” are exactly the same type in C. In C++, if I try to overload on int and signed int, g++ gives me a redefinition error. I can even define main as “signed main(signed, char**){}” (in C or C++).

    This does not happen for “char”, though — “char”, “signed char”, and “unsigned char” are three distinct (and, in C++, overloadable) types. (And if I try to define main using “int main(int, signed char**)”, gcc warns that it’s wrong.)

  7. Valery Venedictov says:

    Another version:

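    bool foo(int a, unsigned int b)
    {
        if (a < 0)
            return true; /* A negative a is always less than b */
        unsigned int c = (unsigned int) a;
        return b > c; /* Now I’m comparing the same data types */
    }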

  8. Ravikumar.R says:

    Really your information is useful… good explanation good examples
