Archive for the ‘General C issues’ Category

A ‘C’ Test: The 0x10 Best Questions for Would-be Embedded Programmers (reprised)

Tuesday, September 15th, 2009 Nigel Jones

In May 2000 Embedded Systems Programming magazine (now Embedded Systems Design) published an article I had written entitled “A ‘C’ Test: The 0x10 Best Questions for Would-be Embedded Programmers.”

A revised version is posted at: A ‘C’ Test: The 0x10 Best Questions for Would-be Embedded Programmers.

I received a lot of mail about it at the time (including a decent amount of hate mail), and, much to my amazement, I continue to get mail about it to this day. The article has been shamelessly copied all over the web and its title is a popular search term that drives people to this blog.

Be aware that this test has been widely publicized, so be very suspicious of someone that does really well on it! To illustrate my point, when I wrote the article I was doing some work at a large company and was sharing an office with a fellow consultant, Nelson. Naturally I had Nelson proof read the article. Fast forward a few months when Nelson went off to an interview with a new potential client. Well it so happens that the interview occurred on the same day that my article was published, and the interviewer proceeded to use it verbatim on Nelson. Nelson, of course, aced the test leaving the interviewer astounded. Needless to say, we both found this to be very amusing! Alas, I’ve never had anyone in the intervening 9+ years hit me with it. Maybe next week…

If you would like a version of this test in Word format, or could use some expertise from an embedded systems consultant, please contact me.

Observations on the relevance of C++ to embedded systems

Thursday, September 10th, 2009 Nigel Jones

My fellow blogger Mike Barr recently wrote an article entitled ‘Real men program in C’. Given that his blogs are cross posted at embedded.com, it was soon picked up by reddit et al and the usual language wars started – with all that these wars usually entail. Personally I don’t get very worked up on this subject and so I didn’t participate. However it did dove tail rather nicely with a conversation I had recently with Dan Saks. I had asked Dan for his thoughts on the difficulty (impossibility!) of inlining global functions in C. The conversation was interesting in its own right, but at the end Dan posed the question ‘Why don’t you program it in C++?’ (since for the uninitiated, C++ allows you to quite nicely inline a class’s public functions). I’ll leave for another day, my response and also my thoughts on C++. However, it did get me thinking a lot about this issue.

Now although I have many thoughts on this topic, the one that I’d like to share with you today is my observation that there is an incredible dearth of example C++ code for embedded systems. What do I mean by this? Well like most of you, I regularly download example code from vendors sites – and it’s nearly always written in C and not C++. I’d previously explained this away by assuming that it was because I do a lot of work in the 8/16 bit realm, and that smaller processors are more likely to be programmed in C than C++. However, yesterday I attended a seminar put on by TI. There were several things of interest in the seminar, including TI’s proprietary RF networking protocol SimpliciTI and also their recently acquired Cortex 3 line from Luminary. The FAE encouraged us to look at the code that was available for both of these entities – and so I did.

What I found is that the SimpliciTI code is all written in C as was all the Luminary code I looked at including their impressive graphics library. Hmmmm thought I – is this an aberration or is this norm? For my next stop I went over to the Micrium web site where they offer a fine array of products including an RTOS, a variety of protocol stacks, a graphics library and so on. All the ones I looked at were written in C. Same story over at Segger. OK, thought I, what about the compiler vendors? A sampling of the code examples at the IAR and Keil websites (for their respective ARM product lines) showed them to be all in C. Finally I headed over to the Greenhills website to check out their enormous Networking and Communications product line. I chose half a dozen products at random. In all cases where the language was specified, it was ANSI C.

Is this a true random sample – of course not. However it does suggest to me that the industry hasn’t exactly embraced C++. Now it’s debatable whether the tool vendors and silicon suppliers should lead the industry or whether they should reflect reality. Regardless of your perspective on this, it’s clear to me that I’ll know C++ has been embraced by the embedded community only when the majority of the publicly available code is written in C++. Personally, if it hasn’t happened by now, I don’t think it’s going to.

Home

A tutorial on signed and unsigned integers

Wednesday, August 5th, 2009 Nigel Jones

One of the interesting things about writing a blog is looking at the search terms that drive traffic to your blog. In my case, after I posted these thoughts on signed versus unsigned integers, I was amazed to see how many people were ending up here looking for basic information concerning signed and unsigned integers. In an effort to make these folks visits more successful, I thought I’d put together some basic information on this topic. I’ve done it in a question and answer format.

All of these questions have been posed to a search engine which has driven traffic to this blog. For regular readers of this blog looking for something a bit more advanced, you will find the last section more satisfactory.

Are integers signed or unsigned?

A standard C integer data type (‘int’) is signed. However, I strongly recommend that you do not use the standard ‘int’ data type and instead use the C99 data types. See here for an explanation.

How do I convert a signed integer to an unsigned integer?

This is in some ways a very elementary question and in other ways a very profound question. Let’s consider the elementary issue first. To convert a signed integer to an unsigned integer, or to convert an unsigned integer to a signed integer you need only use a cast. For example:

int  a = 6;
unsigned int b;
int  c;

b = (unsigned int)a;

c = (int)b;

Actually in many cases you can dispense with the cast. However many compilers will complain, and Lint will most certainly complain. I recommend you always explicitly cast when converting between signed and unsigned types.

OK, well what about the profound part of the question? Well if you have a variable of type int, and it contains a negative value such as -9 then how do you convert this to an unsigned data type and what exactly happens if you perform a cast as shown above? Well the basic answer is – nothing. No bits are changed, the compiler just treats the bit representation as unsigned. For example, let us assume that the compiler represents signed integers using 2’s complement notation (this is the norm – but is *not* mandated by the C language). If our signed integer is a 16 bit value, and has the value -9, then its binary representation will be 1111111111110111. If you now cast this to an unsigned integer, then the unsigned integer will have the value 0xFFF7 or 6552710. Note however that you cannot rely upon the fact that casting -9 to an unsigned type will result in the value 0xFFF7. Whether it does or not depends entirely on how the compiler chooses to represent negative numbers.

What’s more efficient – a signed integer or an unsigned integer?

The short answer – unsigned integers are more efficient. See here for a more detailed explanation.

When should I use an unsigned integer?

In my opinion, you should always use unsigned integers, except in the following cases:

  • When the entity you are representing with your variable is inherently a signed value.
  • When dealing with standard C library functions that required an int to be passed to them.
  • In certain weird cases such as I documented here.

Now be advised that many people strongly disagree with me on this topic. Naturally I don’t find their arguments persuasive.

Why should I use an unsigned integer?

Here are my top reasons:

  • By using an unsigned integer, you are conveying important information to a reader of your code concerning the expected range of values that a variable may take on.
  • They are more efficient.
  • Modulus arithmetic is completely defined.
  • Overflowing an unsigned data type is defined, whereas overflowing a signed integer type could result in World War 3 starting.
  • You can safely perform shift operations.
  • You get a larger dynamic range.
  • Register values should nearly always be treated as unsigned entities – and embedded systems spend a lot of time dealing with register values.

What happens when I mix signed and unsigned integers?

This is the real crux of the problem with having signed and unsigned data types. The C standard has an entire section on this topic that only a compiler writer could love – and that the rest of us read and wince at. Having said that, it is important to know that integers that are signed get promoted to unsigned integers. If you think about it, this is the correct thing to happen. However, it can lead to some very interesting and unexpected results. A number of years ago I wrote an article “A ‘C’ Test:The 0x10 Best Questions for Would-be Embedded Programmers” that was published in Embedded Systems Programming magazine. You can get an updated and corrected copy at my web site. My favorite question from this test is question 12 which is reproduced below – together with its answer: What does the following code output and why?

void foo(void)
{
 unsigned int a = 6;
 int b = -20;
 (a+b > 6) ? puts("> 6") : puts("<= 6");
}

This question tests whether you understand the integer promotion rules in C – an area that I find is very poorly understood by many developers. Anyway, the answer is that this outputs “> 6”. The reason for this is that expressions involving signed and unsigned types have all operands promoted to unsigned types. Thus -20 becomes a very large positive integer and the expression evaluates to greater than 6. This is a very important point in embedded systems where unsigned data types should be used frequently (see reference 2). If you get this one wrong, then you are perilously close to not being hired.

This is all well and good, but what should one do about this? Well you can pore over the C standard, run tests on your compiler to make sure it really does conform to the standard, and then write conforming code, or you can do the following: Never mix signed and unsigned integers in an expression. I do this by the use of intermediate variables. To show how to do this, consider a function that takes an int ‘a’ and an unsigned int ‘b’. Its job is to return true if b > a, otherwise it returns false. As you shall see, this is a surprisingly difficult problem… To solve this problem, we need to consider the following:

  • The signed integer a can be negative.
  • The unsigned integer b can be numerically larger than the largest possible value representable by a signed integer
  • The integer promotion rules can really screw things up if you are not careful.

With these points in mind, here’s my stab at a robust solution

bool foo(int a, unsigned int b)
{
 bool res;

 if (a < 0)
 {
  res = true; /* If a is negative, it must be less than b */
 }
 else
 {
  unsigned int c;
  c = (unsigned int) a; /* Since a is positive, this cast is safe */
  if (b > c)            /* Now I'm comparing the same data types */
  {
   res = true;
  }
  else
  {
   res = false;
  }
 }
 return res;
}

Is this a lot of work – yes. Could I come up with a more compact implementation that is guaranteed to work for all possible values of a and b – probably. Would it be as clear – I doubt it. Perhaps regular readers of this blog would like to take a stab at producing a better implementation?

Home

Three byte integers

Wednesday, June 10th, 2009 Nigel Jones

One of the enduring myths about the C language is that it is good for use on embedded systems. I’ve always been puzzled by this. While it is true that many other languages are dreadful for use on embedded systems, this merely means that C is less dreadful rather than ‘good’. While I have a host of issues with C, the one that constantly galls me is the lack of 3 byte integers. Using C99 notation these would be the uint24_t and int24_t data types. Now a quick web search indicates that there may be the odd compiler out there that supports 3 byte integers – but the vast majority do not.

So why exactly do I want a 3 byte integer? Well, there are two main reasons:

Intermediate results

When I look through my code, I find a huge number of incidences where I am performing an arithmetic operation on a 16 bit value, where intermediate values overflow 16 bits, yet the final value is 16 bits. For example:

uint16_t a, b;

a = (b * 51) / 64;

In this case, the code will fail if (b * 51) overflows 16 bits. As a result, I am forced to write:

a = ((uint32_t)b * 51) / 64;

However, examination of this code shows that (b * 51) could never overflow 24 bits for all 16 bit b. Thus I’d much rather write:

a = ((uint24_t)b * 51) / 64;

Now obviously on a 32 bit processor there would be zero benefit to doing this (indeed there may be a penalty). However on an 8 bit (and probably a 16 bit) processor, there would be a dramatic benefit to such a construct.

Real world values

I regularly find myself needing a static variable that requires more than 16 bits of range. However when I look at these variables they almost never require the staggering range of a 32 bit variable. Instead 24 bits would do very nicely. Needless to say I am forced to allocate 32 bits even though I know that the most significant byte will never take on anything other than zero. This is particularly galling when these variables are stored in EEPROM – with its associated cost and long write times.

Taking these two together across all the 8/16 bit embedded systems out there, the cost in wasted instruction cycles, memory, stack size and energy must be truly staggering. We could probably save a power plant or two world wide with all the energy being wasted!

So why don’t most compiler vendors support a 24 bit integer? I don’t know for sure, but I suspect it is some combination of:

  • No one has been asking for it.
  • They are more concerned with being C89 / C99 compliant than they are with being useful.
  • No one has ever implemented a compiler benchmark where support for a 3 byte integer would be useful.

If you happen to agree with me that a 3 byte integer would be very useful, then next time you see your friendly compiler vendor – complain (or at least point them to this blog). Who knows, change may yet come!

Home

Division of integers by constants

Friday, June 5th, 2009 Nigel Jones

An issue that comes up frequently in embedded systems is division of an integer by a constant. Of course most of the time we try and arrange things such that the divisor is a power of two such that the division may be performed by shift operations. However, all too often we have to divide an integer by some non power of two value. Divisors that seem to crop up a lot are 10 & 100 (for obvious reasons), 3 (for no good reason), 60 (when dealing with time) and of course various combination’s of pi and root 2. In cases like these you can of course just code it ‘normally’ and let the compiler do the work for you. However, when you feel the need for speed, there are other techniques that are spectacularly good.

I learned about this subject in dribs and drabs over the years without ever coming across a good summary – until I located this paper by Douglas Jones (no relationship). It does a nice job of explaining most of what you need to know in order to perform division of an integer by a constant. I particularly like the fact that he has algorithms for CPUs that contain barrel shifters – and those that do not. I strongly recommend that you read the paper. One note of caution however – Jones like many academics is used to working on CPUs with 32 bit word lengths. As such, his code assumes that integers are 32 bits. If you use his code as is, then it will fail on 16 bit word length machines. It’s for reasons such as this that I really recommend everyone would use the C99 data types.

For those of you too lazy to read the paper, its basic premise is based upon the fact that division by a constant is equivalent to multiplication by the reciprocal of that constant. There is nothing of course earth shattering about this observation. However, Jones then goes ahead and explains about binary points, rounding etc in order to achieve the desired result. Since I had to reduce his paper to practice, I thought I’d go ahead and share the ‘recipe’ with you. Before doing so I should note that I work mostly with 8 & 16 bit CPUs that do not contain barrel shifters. As a result I am most interested in the techniques that use multiplication. If you are working with a 32 bit processor with a barrel shifter and an instruction cache then you should seriously look at his other implementations.

Division of a uint16_t by a constant K

In the steps that follow, there is no requirement that K be integer. It must however be greater than 1.
There are two recipes. The first works for many divisors – but not all and is the faster of the two. The second recipe will give better results for all inputs – but produces less efficient code. While I am sure that there is some analytical way of making the determination ahead of time, I’ve found it easier to use the first recipe and exhaustively test it. If it works – great. If not then switch to the second recipe.

In the following descriptions, Q is the quotient (i.e. the result) of dividing an unsigned integer A by the constant K.

Recipe #1
  1. Convert 1 / K into binary. There is a nice web based calculator here that will do the job.
  2. Take all the bits to the right of the binary point, and left shift them until the bit to the right of the binary point is 1. Record the required number of shifts S.
  3. Take the most significant 17 bits and add 1 and then truncate to 16 bits. This effectively rounds the result.
  4. Express the remaining 16 bits to the right of the binary point as a 4 digit hexadecimal number M of the form hhhh.
  5. Q = (((uint32_t)A * (uint32_t)M) >> 16) >> S
  6. Perform an exhaustive check for all A & Q. If necessary adjust M or try recipe #2.

Incidentally, you may be wondering why I don’t use the form espoused by Jones, namely:Q = (((uint32_t)A * (uint32_t)M) >> (16 + S))
The answer is that this requires a left shift 16 + S places of a 32 bit integer. By splitting the shift into two as shown and by making use of the C integer promotion rules, the expression becomes:

  1. Right shift a 32 bit integer 16 places and convert to a 16 bit integer. This effectively means just use the top half of the 32 bit integer.
  2. Right shift the 16 bit integer S places.

This is dramatically more efficient on an 8 or 16 bit processor. On a 32 bit processor it probably is not.

Recipe #2

  1. Convert 1 / K into binary.
  2. Take all the bits to the right of the binary point, and left shift them until the bit to the right of the binary point is 1. Record the required number of shifts S.
  3. Take the most significant 18 bits and add 1 and then truncate to 17 bits. This effectively rounds the result.
  4. Express the 17 bit result as 1hhhh. Denote the hhhh portion as M
  5. Q = ((((uint32_t)A * (uint32_t)M) >> 16) + A) >> 1) >> S;
  6. Perform an exhaustive check for all A & Q. If necessary adjust M.

Again I split the shifts up as shown for efficiency on an 8 / 16 bit machine.

Example 1 – Divide by 30

In this case I wish to divide a uint16_t by 30.

  1. Convert to binary. 1 / 30 = 0.000010001000100010001000100010001000100010001000100010001
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 4 shifts and we get 0.10001000100010001000100010001000100010001000100010001. S is thus 4.
  3. Take the most significant 17 bits: 1000 1000 1000 1000 1
  4. Add 1: giving 1000 1000 1000 1000 1 + 1 = 1000 1000 1000 1001 0
  5. Truncate to 16 bits: 1000 1000 1000 1001
  6. Express in hexadecimal: M = 0x8889
  7. Q = (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 4

An exhaustive check confirms that this expression does indeed do the job for all 16 bit values of A. It is also about 10 times faster than the compiler division routine on an AVR processor.

Example 2 – Divide by 100

In this case I wish to divide a uint16_t by 100. This is one of those cases where we need 17 bit resolution

  1. Convert to binary. 1 / 100 = 0.00000010100011110101110000101000111101011100001010001111011
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 6 shifts and we get 0.10100011110101110000101000111101011100001010001111011. S is thus 6.
  3. Take the most significant 18 bits: 1 0100 0111 1010 1110 0
  4. Add 1: 1 0100 0111 1010 1110 0 + 1 = 1 0100 0111 1010 1110 1
  5. Truncate to 17 bits: 1 0100 0111 1010 1110
  6. Express in hexadecimal: M = 1 47AE
  7. Q = ((((uint32_t)A * (uint32_t)0x47AE) >> 16) + A) >> 1) >> 6;

An exhaustive check shows that the division is not exact for all A. I thus incremented M to 0x47AF and got exact results for all A. This code was about twice as fast as the compiler division routine on an AVR processor.

Example 3 – Divide by π

This is an example where the resultant expression results in an approximate result. The approximation is very good though, with a quotient that is off by at most 1 for all A.

  1. Convert to binary: 1 / π = 0.010100010111110011000001101101110010011100100010001001
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 1 shift and we get
    10100010111110011000001101101110010011100100010001001. S is thus 1.
  3. Take the most significant 18 bits: 1 0100 0101 1111 0011 0
  4. Add 1: 1 0100 0101 1111 0011 0 + 1 = 1 0100 0101 1111 0011 1
  5. Truncate to 17 bits: 1 0100 0101 1111 0011
  6. Express in hexadecimal: M = 1 45F3
  7. Q = ((((uint32_t)A * (uint32_t)0x45F3) >> 16) + A) >> 1) >> 1;

An exhaustive check that compared the result of this expression to (float)A * 0.31830988618379067153776752674503f showed that the match was exact for all but 263 values in the range 0 – 0xFFFF. Where there was a mismatch it is off by at most 1. It’s also 23 times faster than converting to floating point. Not a bad trade off.

Example 4 – Divide by 10 on an 8 bit value

This technique is obviously usable on 8 bit values. One just has to adjust the number of bits. Here’s an example

  1. Convert to binary. 1 / 10 = 0.0001100110011001100110011001100110011001100110011001101
  2. Left shift until there is a 1 to the right of the binary point. In this case it requires 3 shifts and we get 0.1100110011001100110011001100110011001100110011001101. S is thus 3.
  3. Take the most significant 9 bits: 1100 1100 1
  4. Add 1: giving 110011001 + 1 = 110011010
  5. Truncate to 8 bits: 1100 1101
  6. Express in hexadecimal: M = 0xCD
  7. Q = (((uint16_t)A * (uint16_t)0xCD) >> 8) >> 3

An exhaustive check confirms that this expression does indeed do the job for all 8 bit values of A. It is also about 8 times faster than the compiler division routine on an AVR processor.

Summary

Using the values generated by Jones, together with some of the values I have computed, here’s a summary of some common divisors for unsigned 16 bit integers.

Divide by 3: (((uint32_t)A * (uint32_t)0xAAAB) >> 16) >> 1
Divide by 5: (((uint32_t)A * (uint32_t)0xCCCD) >> 16) >> 2
Divide by 6: (((uint32_t)A * (uint32_t)0xAAAB) >> 16) >> 2
Divide by 7: ((((uint32_t)A * (uint32_t)0x2493) >> 16) + A) >> 1) >> 2
Divide by 9: (((uint32_t)A * (uint32_t)0xE38F) >> 16) >> 3
Divide by 10: (((uint32_t)A * (uint32_t)0xCCCD) >> 16) >> 3
Divide by 11: (((uint32_t)A * (uint32_t)0xBA2F) >> 16) >> 3
Divide by 12: (((uint32_t)A * (uint32_t)0xAAAB) >> 16) >> 3
Divide by 13: (((uint32_t)A * (uint32_t)0x9D8A) >> 16) >> 3
Divide by 14: ((((uint32_t)A * (uint32_t)0x2493) >> 16) + A) >> 1) >> 3
Divide by 15: (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 3
Divide by 30: (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 4
Divide by 60: (((uint32_t)A * (uint32_t)0x8889) >> 16) >> 5
Divide by 100: (((((uint32_t)A * (uint32_t)0x47AF) >> 16U) + A) >> 1) >> 6
Divide by PI: ((((uint32_t)A * (uint32_t)0x45F3) >> 16) + A) >> 1) >> 1
Divide by √2: (((uint32_t)A * (uint32_t)0xB505) >> 16) >> 0

Hopefully you have spotted the relationship between divisors that are multiples of two. For example compare the expressions for divide by 15, 30 & 60.

If someone has too much time on their hands and would care to write a program to compute the values for all integer divisors, then I’d be happy to post the results for everyone to use.

Update

Alan Bowens has risen to the challenge and has generated some nifty programs for generating coefficients for arbitrary 8 and 16 bit values. He’s also generated header files for all 8 and 16 bit integer divisors that you can just include and use. You’ll find it all at his blog. Nice work Alan.