embedded software boot camp

Three byte integers

June 10th, 2009 by Nigel Jones

One of the enduring myths about the C language is that it is good for use on embedded systems. I’ve always been puzzled by this. While it is true that many other languages are dreadful for use on embedded systems, this merely means that C is less dreadful rather than ‘good’. While I have a host of issues with C, the one that constantly galls me is the lack of 3 byte integers. Using C99 notation these would be the uint24_t and int24_t data types. Now a quick web search indicates that there may be the odd compiler out there that supports 3 byte integers – but the vast majority do not.

So why exactly do I want a 3 byte integer? Well, there are two main reasons:

Intermediate results

When I look through my code, I find a huge number of instances where I am performing an arithmetic operation on a 16 bit value, where intermediate values overflow 16 bits, yet the final value is 16 bits. For example:

uint16_t a, b;

a = (b * 51) / 64;

In this case, the code will fail if (b * 51) overflows 16 bits. As a result, I am forced to write:

a = ((uint32_t)b * 51) / 64;

However, examination of this code shows that (b * 51) can never overflow 24 bits for any 16 bit b. Thus I’d much rather write:

a = ((uint24_t)b * 51) / 64;

Now obviously on a 32 bit processor there would be zero benefit to doing this (indeed there may be a penalty). However on an 8 bit (and probably a 16 bit) processor, there would be a dramatic benefit to such a construct.

Real world values

I regularly find myself needing a static variable that requires more than 16 bits of range. However when I look at these variables they almost never require the staggering range of a 32 bit variable. Instead 24 bits would do very nicely. Needless to say I am forced to allocate 32 bits even though I know that the most significant byte will never take on anything other than zero. This is particularly galling when these variables are stored in EEPROM – with its associated cost and long write times.

Taking these two together across all the 8/16 bit embedded systems out there, the cost in wasted instruction cycles, memory, stack size and energy must be truly staggering. We could probably save a power plant or two world wide with all the energy being wasted!

So why don’t most compiler vendors support a 24 bit integer? I don’t know for sure, but I suspect it is some combination of:

  • No one has been asking for it.
  • They are more concerned with being C89 / C99 compliant than they are with being useful.
  • No one has ever implemented a compiler benchmark where support for a 3 byte integer would be useful.

If you happen to agree with me that a 3 byte integer would be very useful, then next time you see your friendly compiler vendor – complain (or at least point them to this blog). Who knows, change may yet come!

Division of integers by constants

June 5th, 2009 by Nigel Jones

An issue that comes up frequently in embedded systems is division of an integer by a constant. Of course most of the time we try to arrange things such that the divisor is a power of two, so that the division may be performed by shift operations. However, all too often we have to divide an integer by some non power of two value. Divisors that seem to crop up a lot are 10 & 100 (for obvious reasons), 3 (for no good reason), 60 (when dealing with time) and of course various combinations of pi and root 2. In cases like these you can of course just code it ‘normally’ and let the compiler do the work for you. However, when you feel the need for speed, there are other techniques that are spectacularly good.

I learned about this subject in dribs and drabs over the years without ever coming across a good summary – until I located this paper by Douglas Jones (no relation). It does a nice job of explaining most of what you need to know in order to perform division of an integer by a constant. I particularly like the fact that he has algorithms for CPUs that contain barrel shifters – and those that do not. I strongly recommend that you read the paper. One note of caution however – Jones, like many academics, is used to working on CPUs with 32 bit word lengths. As such, his code assumes that integers are 32 bits. If you use his code as is, then it will fail on 16 bit word length machines. It’s for reasons such as this that I really recommend everyone use the C99 data types.

For those of you too lazy to read the paper, its basic premise is based upon the fact that division by a constant is equivalent to multiplication by the reciprocal of that constant. There is nothing of course earth shattering about this observation. However, Jones then goes ahead and explains about binary points, rounding etc in order to achieve the desired result. Since I had to reduce his paper to practice, I thought I’d go ahead and share the ‘recipe’ with you. Before doing so I should note that I work mostly with 8 & 16 bit CPUs that do not contain barrel shifters. As a result I am most interested in the techniques that use multiplication. If you are working with a 32 bit processor with a barrel shifter and an instruction cache then you should seriously look at his other implementations.

Division of a uint16_t by a constant K

In the steps that follow, there is no requirement that K be integer. It must however be greater than 1.
There are two recipes. The first works for many divisors (but not all) and is the faster of the two. The second recipe works for all divisors – but produces less efficient code. While I am sure that there is some analytical way of making the determination ahead of time, I’ve found it easier to use the first recipe and exhaustively test it. If it works – great. If not, then switch to the second recipe.

In the following descriptions, Q is the quotient (i.e. the result) of dividing an unsigned integer A by the constant K.

Recipe #1
  1. Convert 1 / K into binary. There is a nice web based calculator here that will do the job.
  2. Take all the bits to the right of the binary point, and left shift them until the bit to the right of the binary point is 1. Record the required number of shifts S.
  3. Take the most significant 17 bits and add 1 and then truncate to 16 bits. This effectively rounds the result.
  4. Express the remaining 16 bits to the right of the binary point as a 4 digit hexadecimal number M of the form hhhh.
  5. Q = (((uint32_t)A * (uint32_t)M) >> 16) >> S
  6. Perform an exhaustive check for all A & Q. If necessary adjust M or try recipe #2.

Incidentally, you may be wondering why I don’t use the form espoused by Jones, namely:

Q = (((uint32_t)A * (uint32_t)M) >> (16 + S))

The answer is that this requires a right shift of a 32 bit integer by 16 + S places. By splitting the shift into two as shown and by making use of the C integer promotion rules, the expression becomes:

  1. Right shift a 32 bit integer 16 places and convert to a 16 bit integer. This effectively means just use the top half of the 32 bit integer.
  2. Right shift the 16 bit integer S places.

This is dramatically more efficient on an 8 or 16 bit processor. On a 32 bit processor it probably is not.

Recipe #2

  1. Convert 1 / K into binary.
  2. Take all the bits to the right of the binary point, and left shift them until the bit to the right of the binary point is 1. Record the required number of shifts S.
  3. Take the most significant 18 bits and add 1 and then truncate to 17 bits. This effectively rounds the result.
  4. Express the 17 bit result as 1hhhh. Denote the hhhh portion as M
  5. Q = (((((uint32_t)A * (uint32_t)M) >> 16) + A) >> 1) >> S;
  6. Perform an exhaustive check for all A & Q. If necessary adjust M.

Again I split the shifts up as shown for efficiency on an 8 / 16 bit machine.

Example 1 – Divide by 30

In this case I wish to divide a uint16_t by 30.

  1. Convert to binary. 1 / 30 = 0.000010001000100010001000100010001000100010001000100010001
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 4 shifts and we get 0.10001000100010001000100010001000100010001000100010001. S is thus 4.
  3. Take the most significant 17 bits: 1000 1000 1000 1000 1
  4. Add 1: giving 1000 1000 1000 1000 1 + 1 = 1000 1000 1000 1001 0
  5. Truncate to 16 bits: 1000 1000 1000 1001
  6. Express in hexadecimal: M = 0x8889
  7. Q = (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 4

An exhaustive check confirms that this expression does indeed do the job for all 16 bit values of A. It is also about 10 times faster than the compiler division routine on an AVR processor.

Example 2 – Divide by 100

In this case I wish to divide a uint16_t by 100. This is one of those cases where we need 17 bit resolution.

  1. Convert to binary. 1 / 100 = 0.00000010100011110101110000101000111101011100001010001111011
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 6 shifts and we get 0.10100011110101110000101000111101011100001010001111011. S is thus 6.
  3. Take the most significant 18 bits: 1 0100 0111 1010 1110 0
  4. Add 1: 1 0100 0111 1010 1110 0 + 1 = 1 0100 0111 1010 1110 1
  5. Truncate to 17 bits: 1 0100 0111 1010 1110
  6. Express in hexadecimal: M = 1 47AE
  7. Q = (((((uint32_t)A * (uint32_t)0x47AE) >> 16) + A) >> 1) >> 6;

An exhaustive check shows that the division is not exact for all A. I thus incremented M to 0x47AF and got exact results for all A. This code was about twice as fast as the compiler division routine on an AVR processor.

Example 3 – Divide by π

This is an example where the resultant expression results in an approximate result. The approximation is very good though, with a quotient that is off by at most 1 for all A.

  1. Convert to binary: 1 / π = 0.010100010111110011000001101101110010011100100010001001
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 1 shift and we get 0.10100010111110011000001101101110010011100100010001001. S is thus 1.
  3. Take the most significant 18 bits: 1 0100 0101 1111 0011 0
  4. Add 1: 1 0100 0101 1111 0011 0 + 1 = 1 0100 0101 1111 0011 1
  5. Truncate to 17 bits: 1 0100 0101 1111 0011
  6. Express in hexadecimal: M = 1 45F3
  7. Q = (((((uint32_t)A * (uint32_t)0x45F3) >> 16) + A) >> 1) >> 1;

An exhaustive check that compared the result of this expression to (float)A * 0.31830988618379067153776752674503f showed that the match was exact for all but 263 values in the range 0 – 0xFFFF. Where there was a mismatch it is off by at most 1. It’s also 23 times faster than converting to floating point. Not a bad trade off.

Example 4 – Divide by 10 on an 8 bit value

This technique is obviously usable on 8 bit values. One just has to adjust the number of bits. Here’s an example:

  1. Convert to binary. 1 / 10 = 0.0001100110011001100110011001100110011001100110011001101
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 3 shifts and we get 0.1100110011001100110011001100110011001100110011001101. S is thus 3.
  3. Take the most significant 9 bits: 1100 1100 1
  4. Add 1: giving 110011001 + 1 = 110011010
  5. Truncate to 8 bits: 1100 1101
  6. Express in hexadecimal: M = 0xCD
  7. Q = (((uint16_t)A * (uint16_t)0xCD) >> 8) >> 3

An exhaustive check confirms that this expression does indeed do the job for all 8 bit values of A. It is also about 8 times faster than the compiler division routine on an AVR processor.

Summary

Using the values generated by Jones, together with some of the values I have computed, here’s a summary of some common divisors for unsigned 16 bit integers.

Divide by 3: (((uint32_t)A * (uint32_t)0xAAAB) >> 16) >> 1
Divide by 5: (((uint32_t)A * (uint32_t)0xCCCD) >> 16) >> 2
Divide by 6: (((uint32_t)A * (uint32_t)0xAAAB) >> 16) >> 2
Divide by 7: (((((uint32_t)A * (uint32_t)0x2493) >> 16) + A) >> 1) >> 2
Divide by 9: (((uint32_t)A * (uint32_t)0xE38F) >> 16) >> 3
Divide by 10: (((uint32_t)A * (uint32_t)0xCCCD) >> 16) >> 3
Divide by 11: (((uint32_t)A * (uint32_t)0xBA2F) >> 16) >> 3
Divide by 12: (((uint32_t)A * (uint32_t)0xAAAB) >> 16) >> 3
Divide by 13: (((uint32_t)A * (uint32_t)0x9D8A) >> 16) >> 3
Divide by 14: (((((uint32_t)A * (uint32_t)0x2493) >> 16) + A) >> 1) >> 3
Divide by 15: (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 3
Divide by 30: (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 4
Divide by 60: (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 5
Divide by 100: (((((uint32_t)A * (uint32_t)0x47AF) >> 16U) + A) >> 1) >> 6
Divide by PI: (((((uint32_t)A * (uint32_t)0x45F3) >> 16) + A) >> 1) >> 1
Divide by √2: (((uint32_t)A * (uint32_t)0xB505) >> 16) >> 0

Hopefully you have spotted the relationship between divisors that are multiples of two. For example compare the expressions for divide by 15, 30 & 60.

If someone has too much time on their hands and would care to write a program to compute the values for all integer divisors, then I’d be happy to post the results for everyone to use.

Update

Alan Bowens has risen to the challenge and has generated some nifty programs for generating coefficients for arbitrary 8 and 16 bit values. He’s also generated header files for all 8 and 16 bit integer divisors that you can just include and use. You’ll find it all at his blog. Nice work Alan.

Efficient C Tips #9 – Use lookup tables

May 28th, 2009 by Nigel Jones

This is the ninth in a series of tips on how to make your C code more efficient.

(Note if you are looking for basic information on lookup tables, you should read this).

Typically the fastest way to compute something on a microcontroller is to not compute it at all – but to simply read the result from a lookup table. For example this is regularly done as part of CRC calculations. Despite this I’ve noticed over the years what I’ll call the ‘look up tables are boring’ syndrome. What do I mean by this? Well when having to code a solution to a problem, it seems that most of us would rather code something that involves crunching numbers than generate a table where we just look up the result. I’m sure that many of you are thinking that I’m dead wrong and that you use lookup tables all the time. Well I’m sure many of you do. However the question is whether you make full use of this capability.

What started me thinking about this is the person who ended up on this blog looking for an efficient algorithm for determining the day of the year. I have no idea if they were coding for an embedded system or not, nor whether they were looking for a fast solution, a minimal memory solution, or something in between. However, it did make me realize that it would be a simple albeit slightly contrived way of demonstrating my point about look up tables.

First off, I imposed some constraints

  • The Gregorian calendar is to be used.
  • Days, months and years are numbered from 1 and not zero, such that January 1 is day 1 and not day 0.

Here’s my first solution, that makes use of a small lookup table:

#define JAN_DAYS (31)
#define FEB_DAYS (28)
#define LY_FEB_DAYS (29)
...
#define NOV_DAYS (30)
#define MONTHS_IN_A_YEAR (12+1)

uint16_t day_of_year(uint8_t day, uint8_t month, uint16_t year)
{
 static const uint16_t days[2][MONTHS_IN_A_YEAR] =
 {
  { /* Non leap year table */
   0,    /* Padding because first month is not zero */
   0,    /* If month is january, then no days before it */
   JAN_DAYS,  JAN_DAYS + FEB_DAYS,...JAN_DAYS + FEB_DAYS + ... + NOV_DAYS
  },
  { /* Leap year lookup table */
   0,    /* Padding because first month is not zero */
   0,    /* If month is January, then no days before it */
   JAN_DAYS,  JAN_DAYS + LY_FEB_DAYS,  ...JAN_DAYS + LY_FEB_DAYS + ... + NOV_DAYS
  }
 };

 uint16_t day_of_year;

 if (((year % 4 == 0) && (year % 100 != 0)) || (year % 400 == 0))
 {
   /* Leap year */
   day_of_year = days[1][month] + day;
 }
 else
 {
  /* Non leap year */
  day_of_year = days[0][month] + day;
 }
 return day_of_year;
}

For most applications I think this is an optimal solution in that it handles a very wide range of dates, uses a small amount of storage for the lookup tables and requires minimal computational effort to achieve the result. (On an ARM7 it requires 128 bytes of code space, 64 bytes for the lookup table and executes in about 40 cycles). However, what about if the code had to run as fast as possible? I’d guess that most folks would work on optimizing the details of the implementation and leave it at that. I’m not sure that many people would consider a gigantic look up table so that the code looks like this:

#define LAST_YEAR (2400 + 1)/* Last year to worry about */
uint16_t day_of_year(uint8_t day, uint8_t month, uint16_t year)
{
 static const uint16_t days[LAST_YEAR][MONTHS_IN_A_YEAR] =
 {
  {
   /* Padding because first year is not zero */ 0,
   /* Padding because first month is zero */ 0,
   /* If month is January, then no days before it */
   JAN_DAYS,  JAN_DAYS + FEB_DAYS,...JAN_DAYS + FEB_DAYS + ... + NOV_DAYS
  },

  {
    /* Year 1 - non leap year */0,
    /* Padding because first month is zero */0,
    /* If month is January, then no days before it */
    JAN_DAYS,  JAN_DAYS + FEB_DAYS,...JAN_DAYS + FEB_DAYS + ... + NOV_DAYS
  },

 ... 

  {
   /* Last year - a leap year */ 0,
   /* Padding because first month is zero */ 0,
   /* If month is January, then no days before it */
   JAN_DAYS,  JAN_DAYS + LY_FEB_DAYS,...JAN_DAYS + LY_FEB_DAYS + ... + NOV_DAYS
  },
};

return days[year][month] + day;
}

The lookup table would of course require at least 2401 * 13 * 2 = 62426 bytes. Evidently this would likely be unreasonable on an 8 bit processor. On a 32 bit processor with 8 Mbytes of Flash – not so unreasonable.

I first learned this lesson many years ago in an application that required an 8051 processor to perform a complicated refresh of multiplexed LEDs at about 1 kHz (a significant load for the 8051). The initial implementation consisted of pure code. Over the next year or so the two of us working on it realized that we could speed it up by using lookup tables. We started off with a small look up table, and by the time we were done, the table was 48K (out of the 64K available to the 8051) while the execution time was a fraction of what it had been before. Thus next time you are faced with making something run faster consider using a look up table – even if it is huge. Sometimes it’s just the best way to go.


Checking the fuse bits in an Atmel AVR at run time

May 15th, 2009 by Nigel Jones

In general I try and post on topics that have broad appeal in the embedded world. Today I’m going to partially break with that tradition to show how to check the fuse bits in an Atmel AVR class processor. However, before I do so, I’d like to discuss my motivations for wanting to do this.

The AVR processor family, together with the PIC and other processor families, contains fuse / configuration bits. These bits are settable only at program time and are used to configure the behavior of the processor at run time. Typical parameters that are configured are oscillator types, brown out voltage detect levels and memory partitioning. Now as I lamented in this post, there is no great way of communicating to the production staff how you want these fuse bits programmed. As a result I consider there to be a very high probability that a mistake will be made in production – and that all my efforts on crafting perfect code will thus be for naught. Thus while it is much better to prevent mistakes, if you can’t do so, then the next best thing to do is to detect them. As a result on one of the products that I am working on, I have as one of the startup tests a check to ensure that the fuse bits are indeed what they are supposed to be. While I recognize that if the fuse settings are dreadfully wrong it is unlikely that my code will run, I’m actually more concerned with the case where the fuse bits are set mostly correct – and thus that the code works most of the time.

So how do I do this on an AVR? Well if you are using an IAR compiler the work is mostly done for you. Here it is:

#include <intrinsics.h>

/* Macros to read the various fuse bytes */
#define _SPM_GET_LOW_FUSEBITS()  __AddrToZByteToSPMCR_LPM((void __flash*)0x0000U, 0x09U)
#define _SPM_GET_HIGH_FUSEBITS()  __AddrToZByteToSPMCR_LPM((void __flash*)0x0003U, 0x09U)
#define _SPM_GET_EXTENDED_FUSEBITS()  __AddrToZByteToSPMCR_LPM((void __flash*)0x0002U, 0x09U)

/* Structure to store the fuse bytes */
typedef struct{
uint8_t  fuse_low;      /* The low fuse setting */
uint8_t  fuse_high;     /* The high fuse setting */
uint8_t  fuse_extended; /* The extended fuse setting */
uint8_t  lockbits;      /* The lockbits */
} FUSE_SETTINGS;

/* Storage for the fuse settings will be in EEPROM */
static __eeprom __no_init FUSE_SETTINGS Fuse_Settings @ FUSE_VALUES; 

void fuses_Read(void)
{
 FUSE_SETTINGS value;

 value.fuse_low = _SPM_GET_LOW_FUSEBITS();
 value.fuse_high = _SPM_GET_HIGH_FUSEBITS();
 value.fuse_extended = _SPM_GET_EXTENDED_FUSEBITS();
 value.lockbits = _SPM_GET_LOCKBITS();
 __no_operation();

 Fuse_Settings = value;
}

The macro __AddrToZByteToSPMCR_LPM() is defined in intrinsics.h. Essentially it takes care of all the necessary finicky register usage required to read the fuse bits. You’ll also notice that I have used a macro _SPM_GET_LOCKBITS() to read the lockbits. This macro is also found in intrinsics.h. The really observant reader may wonder why there isn’t a macro in intrinsics.h for reading the fuse bits? Well there is – it’s just for reading the low fuse byte – which is all the early AVR processors had. I’ve pointed this out to IAR and they have promised to address this in the next release (thanks Steve!).

Before I leave this topic, I’ll also point out that I don’t read the fuse settings directly into EEPROM. Instead I read them into RAM and then copy the entire structure to EEPROM. I do this because writing to EEPROM messes with the same registers used for reading the fuse bits – and thus bad things happen. This also explains the __no_operation() statement before the data are copied to EEPROM.

Incidentally, I don’t know of a way to read the configuration bits of a PIC at run time. Chalk this up as one more reason why an AVR is superior to a PIC!

Signed versus unsigned integers

May 9th, 2009 by Nigel Jones

If you are looking for some basic information on signed versus unsigned integers, you may also find this post useful. That being said, on to the original post…

Jack Ganssle’s latest newsletter arrived the other day. Within it is an extensive set of comments from John Carter, in which he talks about and quotes from a book by Derek Jones (no relation of mine). The topic is unsigned versus signed integers. I have to say I found it fascinating in the same way that watching a train wreck is fascinating. Here’s the entire extract – I apologize for its length – but you really have to read it all to understand my horror.

“Suppose you have a “Real World (TM)” always and forever positive value. Should you represent it as unsigned?

“Well, that’s actually a bit of a step that we tend to gloss over…

“As Jones points out in section 6.2.5 the real differences as far as C is concerned between unsigned and signed are…

” * unsigned has a larger range.

” * unsigned does modulo arithmetic on overflow (which is hardly ever what you intend)

” * mixing signed and unsigned operands in an expression involves arithmetic conversions you probably don’t quite understand.

“For example I have a bit of code that generates code … and uses __LINE__ to tweak things so compiler error messages refer to the file and line of the source code, not the generated code.

“Thus I must do integer arithmetic with __LINE__ include subtraction of offsets and multiplication.

“* I do not care if my intermediate values go negative.

“* It’s hard to debug (and frightening) if they suddenly go huge.

“* the constraint is the final values must be positive.

“Either I must be _very_ careful to code and test for underflows _before_ each operation to ensure intermediate results do not underflow. Or I can say tough, convert to 32bit signed int’s and it all just works. I.e. Line numbers are constrained to be positive, but that has nothing to do representation. Use the most convenient representation.

“C’s “unsigned” representation is useless as a “constrain this value to be positive” tool. E.g. A device that can only go faster or slower, never backwards:

unsigned int speed; // Must be positive.
unsigned int brake(void)
{
--speed;
}

“Was using “unsigned” above any help to creating robust error free code? NO! “speed” may now _always_ be positive… but not necessarily meaningful!

“The main decider in using “unsigned” is storage. Am I going to double my storage requirements by using int16_t’s or pack them all in an array of uint8_t’s?

“My recommendation is this…

” * For scalars use a large enough signed value. eg. int_fast32_t
” * Treat “unsigned” purely as a storage optimization.
” * Use typedef’s (and splint (or C++)) for type safety and accessor functions to ensure constraints like strictly positive. E.g.

typedef int_fast32_t velocity; // Can be negative
typedef int_fast32_t speed; // Must be positive.
typedef uint8_t dopplerSpeedImage_t[MAX_X][MAX_Y]; // Storage optimization

I read this, and quite frankly my jaw dropped. Now the statements made by Carter / Jones concerning differences between signed and unsigned are correct – but to call them the real differences is completely wrong. To make my point, I’ll first of all address his specific points – and then I’ll show you where the real differences are:

Unsigned has a larger range

Yes it does. However, if this is the reason you are using an unsigned type you’ve probably got other problems.

Unsigned does modulo arithmetic on overflow (which is hardly ever what you intend)

Yes it does, and au contraire – this is frequently what I want (see for example this). However, far more important is the question – what does a signed integer do on overflow? The answer is that the behavior is undefined. That is, if you overflow a signed integer, the generated code is at liberty to do anything – including deleting your program or starting world war 3. I found this out the hard way many years ago. I had some PC code written for Microsoft’s Version 7 compiler. The code was inadvertently relying upon signed integer overflow to work a certain way. I then moved the code to Watcom’s compiler (Version 10 I think) and the code failed. I was really ticked at Watcom until I realized what I had done and that Watcom was perfectly within their rights to do what they did.

Note that this was not a case of porting code to a different target. This was the same target – just a different compiler.

Now let’s address his comment about modulo arithmetic. Consider the following code fragment:

uint16_t a,b,c, res;

a = 0xFFFF; //Max value for a uint16_t
b = 1;
c = 2;

res = a;
res += b; //Overflow
res -= c;

Does res end up with the expected value of 0xFFFE? Yes it does – courtesy of the modulo arithmetic. Furthermore it will do so on every conforming compiler.

Now if we repeat the exercise using signed data types.

int16_t a,b,c, res;

a = 32767; //Max value for a int16_t
b = 1;
c = 2;

res = a;
res += b; //Overflow - WW3 starts
res -= c;

What happens now? Who knows? On your system you may or may not get the answer you expect.

Mixing signed and unsigned operands in an expression involves arithmetic conversions you probably don’t quite understand

Well whether I understand them or not is really between me and Lint. However, the key thing to know is that if you use signed integers by default, then it is really hard to avoid combining signed and unsigned operands. How is this you ask? Well consider the following partial list of standard ‘functions’ that return an unsigned integral type:

  • sizeof()
  • offsetof()
  • strcspn()
  • strlen()
  • strspn()

In addition memcpy(), memset(), strncpy() and others also use unsigned integral types in their parameter lists. Furthermore in embedded systems, most compiler vendors typedef IO registers as unsigned integral types. Thus any expression involving a register also includes unsigned quantities. Thus if you use any of these in your code, then you run a very real risk of running into signed / unsigned arithmetic conversions. Thus IMHO the usual arithmetic conversions issue is actually an argument for avoiding signed types – not the other way around! So what are the real reasons to use unsigned data types? I think these reasons are high on my list:

  • Modulus operator
  • Shifting
  • Masking

Modulus Operator

One of the relatively unknown but nasty corners of the C language concerns the modulus operator. In a nutshell, using the modulus operator on signed integers when one or both of the operands is negative produces an implementation defined result in C90 (C99 pins it down to truncation toward zero – which still yields a negative remainder for a negative dividend, and still trips people up). Here’s a great example in which they purport to show how to use the modulus operator to determine if a number is odd or even. The code is reproduced below:

int main(void)
{
 int i;

 printf("Enter a number: ");
 scanf("%d", &i);

 if( ( i % 2 ) == 0) printf("Even");
 if( ( i % 2 ) == 1) printf("Odd");

 return 0;
}

When I run it on one of my compilers, and enter -1 as the argument, nothing gets printed, because on my system -1 % 2 = -1. The bottom line – using the modulus operator with signed integral types is a disaster waiting to happen.

Shifting

Performing a right shift on a negative signed integer is implementation defined. What this means is that when you shift right you have no idea whether the sign bit is propagated (an arithmetic shift) or zeros are shifted in (a logical shift). The implications of this are quite profound. For example, if foo is an unsigned integral type, then a shift right is equivalent to a divide by 2. However, if foo is a signed type, then a shift right is most certainly not the same as a divide by 2 – and will generate different code. It’s for this reason that Lint, MISRA and most good coding standards will reject any attempt to right shift a signed integral type. BTW while left shifts on signed types are safer, I really don’t recommend them either.

Masking

A similar class of problems occur if you attempt to perform masking operations on a signed data type.

Finally…

Before I leave this post, I just have to comment on this quote from Carter

“Either I must be _very_ careful to code and test for underflows _before_ each operation to ensure intermediate results do not underflow. Or I can say tough, convert to 32bit signed int’s and it all just works”.

Does anyone else find this scary? He seems to be advocating that rather than think about the problem at hand, he’d rather switch to a large signed data type – and trust that everything works out OK. He obviously thinks he’s on safe ground. However consider the case where he has a 50,000 line file (actually 46342 to be exact). Is this an unreasonably large file – well yes for a human generated file. However for a machine generated file (e.g. an embedded image file), it is not unreasonable at all. Furthermore let’s assume that his computations involve for some reason a squaring of the number of lines in the file: i.e. we get something like this:

int32_t lines, result;

lines = 46342;
result = lines * lines + some_other_expression;

Well 46342 * 46342 overflows a signed 32 bit type – and the result is undefined. The bottom line – using a larger signed data type to avoid thinking about the problem is not recommended. At least if you use an unsigned type you are guaranteed a consistent answer.
