Archive for August, 2009

Minimizing memory use in embedded systems Tip #1 – Eliminate unnecessary strings

Wednesday, August 26th, 2009 Nigel Jones

I already have a series of tips on efficient C, another on effective C and a third on lowering the power consumption of embedded systems. Today I’m introducing a fourth series of tips related to minimizing memory usage in embedded systems.

Now back when I was a lad the single biggest issue in an embedded system was nearly always a lack of memory, and as a result one had to quickly learn how to husband this resource with great care. Fast forward 20 years and this notion probably seems quite quaint to those of you programming ARM systems with 16 Mbytes of Flash and 64 Mbytes of RAM. So what’s the motivation for this post then? Well, despite the presence of gigantic memory systems in many embedded systems, it’s still surprisingly common for one to find oneself in a situation where memory is being gobbled up at an alarming rate. Anyone who has programmed an 8051 or an 8 bit PIC recently will know exactly what I’m talking about. So for those of you out there that find yourself in this situation, I hope that you’ll find this series informative.

Enough preamble – on to business. The first tip is quite simple – eliminate unnecessary strings. Even if your reaction is ‘well that’s useless – I don’t have any strings in my code’, then I still suggest you read on.

In order to eliminate unnecessary strings, the first step is to determine the list of strings in your code. You can of course pore over your source code. However a far better approach is to scan the binary image looking for strings. Somewhat amazingly I actually use a utility called ‘strings.exe’ that is supplied by Microsoft. It’s available here. I like this program because you can search for ASCII and/or Unicode strings, while also controlling the minimum number of matching characters. (Please note that this utility is intended to scan a pure binary file. Intel Hex, S records etc. don’t cut it.) If you do this, then you may of course find no strings – and I apologize for wasting your time.
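If no such utility is to hand, the scan itself is simple enough to sketch. The following is a minimal illustration of the idea, not a replacement for strings.exe: it handles ASCII only, and the function name `scan_strings` and its parameters are hypothetical. It assumes the image has already been loaded into memory as raw bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count runs of at least min_len printable ASCII
 * characters in a raw binary image. If out is not NULL, each run is
 * printed together with its offset in the image. */
static size_t scan_strings(const unsigned char *image, size_t len,
                           size_t min_len, FILE *out)
{
    size_t count = 0;
    size_t run_start = 0;
    size_t i;

    assert(image != NULL && min_len > 0);

    for (i = 0; i <= len; ++i) {
        /* Treat the end of the image as a run terminator. */
        int printable = (i < len) && isprint(image[i]);

        if (!printable) {
            size_t run_len = i - run_start;

            if (run_len >= min_len) {
                ++count;
                if (out != NULL) {
                    fprintf(out, "%06zx: %.*s\n", run_start, (int)run_len,
                            (const char *)&image[run_start]);
                }
            }
            run_start = i + 1;
        }
    }
    return count;
}
```

Calling it with `stdout` as the last argument lists every run it finds; raising `min_len` cuts down on false positives from code bytes that happen to be printable.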
However, even if your program is supposed to be string free, you may well find things such as:

  • Copyright notices
  • Strings associated with assert statements.
  • Other compiler artifacts such as path names.

The latter two tend to arise if any code references the __FILE__ macro or its brethren.

Of course working out how to eliminate these strings can be challenging – and in the case of copyright notices may violate the terms of a license agreement – so don’t get too aggressive.

If your code does contain intentional strings, then you have several opportunities to reduce their footprint. The obvious method of making the strings more terse is of course an excellent thing to do. Less obvious is that you may find that you have multiple strings that are very similar – particularly if multiple people are working on a project. For example, I’ve recently seen code that contained a dozen variations on the string “Malloc failed”, such as:

  • Malloc failed
  • Malloc Failed
  • Malloc error
  • Etc

Now, the robust way to handle this is of course to ban inline strings and instead place them all in a string file, so that someone needing to use a string can simply reuse one that already exists. If this strikes you as too much work, then you may be interested to know that there are some linkers out there that will recognize duplicate strings and collapse them down to a single entry. However, to get this benefit, the strings need to be absolutely identical. Searching the binary image as I have described is a great way of identifying strings which will benefit from this manual optimization.
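One way the string file idea might look in practice is sketched below. The identifiers (`string_id_t`, `string_table`, `get_string`) and the second message are hypothetical, and the designated initializers assume a C99 compiler; the point is only that each message exists in exactly one place and is referred to by ID everywhere else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string IDs; every message in the project gets one entry. */
typedef enum {
    STR_MALLOC_FAILED,
    STR_TIMEOUT,
    STR_COUNT
} string_id_t;

/* The only place in the code base where message text is defined. */
static const char *const string_table[STR_COUNT] = {
    [STR_MALLOC_FAILED] = "Malloc failed",
    [STR_TIMEOUT]       = "Timeout"
};

/* Look up a message by ID; returns an empty string for an invalid ID. */
static const char *get_string(string_id_t id)
{
    if ((unsigned int)id >= (unsigned int)STR_COUNT) {
        return "";
    }
    /* A NULL here means an ID was added without a table entry. */
    assert(string_table[id] != NULL);
    return string_table[id];
}
```

Code that previously contained its own spelling of the message would instead call `get_string(STR_MALLOC_FAILED)`, so the near-duplicates cannot creep back in.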

Consulting as a leading economic indicator

Thursday, August 20th, 2009 Nigel Jones

The IEEE has a rather depressing news release out that claims that EE unemployment more than doubled last quarter to a record high 8.6%. The previous quarterly record was a mere 7% in Q1 2003. Interestingly the unemployment rate for all engineers was a mere 5.5% which suggests that EEs are taking the brunt of engineering unemployment. If you are one of those unfortunate enough to be axed, then what’s the employment outlook for you?

Well I’m no economist and I certainly don’t have access to, or interest in, reams of economic data. What I can do is give you my micro-economic perspective. Over the 15 years I’ve been a consultant I’ve developed the notion that consultant activity is a leading economic indicator. That is, when companies need engineering help, but are unsure whether to take on employees, then they turn to consultants. Conversely when companies need to cut costs, the first to go are consultants and contractors. In short, consultants are the first to go in bad times and the first to be retained in good times. This hypothesis seems reasonable to me, and broadly reflects my experiences. So with this as a background, what can I tell you about the current economic state of affairs?

Well, firstly the current ‘slowdown’ came on so hard and so fast that my sense is that consultants and employees basically bit the dust simultaneously. OK, so what about the upside? Am I seeing an increase in demand for my services? In short – no. Having said that I almost never see an increase in demand for my services in July and August for the simple reason that too many people are on holiday. Notwithstanding this, my sense is that it is still very quiet.

So am I pessimistic? Actually – no. A large slice of the stimulus money has been funneled to organizations such as the NSF, which are only now getting around to doling out various grants. Thus I expect this to start having an effect on EE demand soon. I also have the sense that a lot of companies having weathered the financial storm are now looking ahead to see how they can best exploit the upturn when it comes. If I’m right, then the phone should start ringing again in September. I’ll post an update around the end of September and let you know if I’m right!


Effective C Tip #5 – Use pre-masking rather than post-masking

Monday, August 17th, 2009 Nigel Jones

This is the fifth in a series of tips on writing what I call effective C.

Today I’d like to offer a simple hint that can potentially make your buffer manipulation code a little more robust at essentially zero cost. I’d actually demonstrated the technique in this posting, but had not really emphasized its value.

Consider, for example, a receive buffer on a communications channel. The data are received a character at a time under interrupt and so the receive ISR needs to know where to place the next character. The question is how best to do this. Now for performance reasons I usually make my buffer size a power of 2 so that I can use a simple mask operation. I then use an offset into the buffer to dictate where the next byte should be written. Code to do this typically looks something like this:

#define RX_BUF_SIZE (32)
#define RX_BUF_MASK  (RX_BUF_SIZE - 1)
static uint8_t Rx_Buf[RX_BUF_SIZE]; /* Receive buffer */

static uint8_t RxHead = 0; /* Offset into Rx_Buf[] where next character should be written */

__interrupt void RX_interrupt(void)
{
 uint8_t rx_char;

 rx_char = HW_REG;         /* Get the received character */
 Rx_Buf[RxHead] = rx_char; /* Store the received char */
 ++RxHead;                 /* Increment offset */
 RxHead &= RX_BUF_MASK;    /* Mask the offset into the buffer */
}

In the last couple of lines, I increment the value of RxHead and then mask it, with the intention of ensuring that the next write into Rx_Buf[] will be in the requisite range. The operative word here is ‘intention’. To see what I mean, consider what would happen if RxHead gets corrupted in some way. Now if the corruption is caused by RFI or some other such phenomenon then you are probably out of luck. However, what if RxHead gets unintentionally manipulated by a bug elsewhere in your code? As written, the manipulation may cause a write to occur beyond the end of the buffer – with all the attendant chaos that would inevitably arise. You can prevent this by simply doing the masking before indexing into the array. That is, the code looks like this:

__interrupt void RX_interrupt(void)
{
 uint8_t rx_char;

 rx_char = HW_REG;         /* Get the received character */
 RxHead &= RX_BUF_MASK;    /* Mask the offset into the buffer */
 Rx_Buf[RxHead] = rx_char; /* Store the received char */
 ++RxHead;                 /* Increment offset */
}

What has this bought you? Well by coding it this way you guarantee that you will not index beyond the end of the array regardless of the value of RxHead when the ISR is invoked. Furthermore the guarantee comes at zero performance cost. Of course this hasn’t solved your problem with some other piece of code stomping on RxHead. However it does make finding the problem a lot easier because your problem will now be highly localized (i.e. data are received out of order) rather than the system crashing randomly. The former class of problem is considerably easier to locate than is the latter.
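The guarantee is easy to check off target. The sketch below is a host-side stand-in for the ISR, under the assumption that the received character can be passed in as a parameter instead of being read from HW_REG; the function name `rx_store` is hypothetical, and the preprocessor check is an addition that is not in the code above.

```c
#include <stdint.h>

#define RX_BUF_SIZE (32u)
#define RX_BUF_MASK (RX_BUF_SIZE - 1u)

/* The mask trick only works if the size is a power of 2. */
#if (RX_BUF_SIZE & RX_BUF_MASK) != 0
#error "RX_BUF_SIZE must be a power of 2"
#endif

static uint8_t Rx_Buf[RX_BUF_SIZE]; /* Receive buffer */
static uint8_t RxHead = 0;          /* Offset of the next write */

/* Hypothetical stand-in for the ISR body: rx_char replaces the HW_REG read. */
static void rx_store(uint8_t rx_char)
{
    RxHead &= RX_BUF_MASK;    /* Pre-mask: the index is now always 0..31 */
    Rx_Buf[RxHead] = rx_char; /* Store the received char */
    ++RxHead;                 /* Increment offset */
}
```

Setting RxHead to a nonsense value such as 200 before calling `rx_store()` simulates the corruption: the character lands at offset 200 & 31 = 8, inside the buffer, and RxHead is left at 9.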

So is this effective ‘C’? I think so. It’s a simple technique that adds a little robustness for free. I wouldn’t mind finding a few more like it.


A tutorial on signed and unsigned integers

Wednesday, August 5th, 2009 Nigel Jones

One of the interesting things about writing a blog is looking at the search terms that drive traffic to your blog. In my case, after I posted these thoughts on signed versus unsigned integers, I was amazed to see how many people were ending up here looking for basic information concerning signed and unsigned integers. In an effort to make these folks’ visits more successful, I thought I’d put together some basic information on this topic. I’ve done it in a question and answer format.

All of these questions have been posed to a search engine which has driven traffic to this blog. For regular readers of this blog looking for something a bit more advanced, you will find the last section more satisfactory.

Are integers signed or unsigned?

A standard C integer data type (‘int’) is signed. However, I strongly recommend that you do not use the standard ‘int’ data type and instead use the C99 data types. See here for an explanation.

How do I convert a signed integer to an unsigned integer?

This is in some ways a very elementary question and in other ways a very profound question. Let’s consider the elementary issue first. To convert a signed integer to an unsigned integer, or to convert an unsigned integer to a signed integer you need only use a cast. For example:

int  a = 6;
unsigned int b;
int  c;

b = (unsigned int)a;

c = (int)b;

Actually in many cases you can dispense with the cast. However many compilers will complain, and Lint will most certainly complain. I recommend you always explicitly cast when converting between signed and unsigned types.

OK, well what about the profound part of the question? Well if you have a variable of type int, and it contains a negative value such as -9 then how do you convert this to an unsigned data type and what exactly happens if you perform a cast as shown above? Well on nearly every machine you will encounter, the basic answer is – nothing. No bits are changed, the compiler just treats the bit representation as unsigned. For example, let us assume that the compiler represents signed integers using 2’s complement notation (this is the norm – but is *not* mandated by the C language). If our signed integer is a 16 bit value, and has the value -9, then its binary representation will be 1111111111110111. If you now cast this to an unsigned integer, then the unsigned integer will have the value 0xFFF7 or 65527 in decimal. Note that the C standard actually defines this conversion in terms of values rather than bits: a negative value is converted by adding one more than the largest value the unsigned type can hold, so -9 becomes 65536 - 9 = 65527 for a 16 bit unsigned int however the compiler chooses to represent negative numbers. It is the reverse conversion that you cannot rely upon: casting an unsigned value that is too large for the signed type gives an implementation-defined result.
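The -9 example can be reproduced with the C99 fixed-width types, which pin the width at 16 bits whatever the size of int on your target. This is a minimal sketch; the function name `to_unsigned16` is hypothetical.

```c
#include <stdint.h>

/* Convert a 16 bit signed value to a 16 bit unsigned value.
 * int16_t is required to be a 2's complement type, so for -9 the bit
 * pattern 1111111111110111 is simply reinterpreted as 65527 (0xFFF7). */
static uint16_t to_unsigned16(int16_t value)
{
    return (uint16_t)value;
}
```

Non-negative inputs come through unchanged, so `to_unsigned16(6)` is 6, while `to_unsigned16(-1)` is 0xFFFF.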

What’s more efficient – a signed integer or an unsigned integer?

The short answer – unsigned integers are more efficient. See here for a more detailed explanation.

When should I use an unsigned integer?

In my opinion, you should always use unsigned integers, except in the following cases:

  • When the entity you are representing with your variable is inherently a signed value.
  • When dealing with standard C library functions that require an int to be passed to them.
  • In certain weird cases such as I documented here.

Now be advised that many people strongly disagree with me on this topic. Naturally I don’t find their arguments persuasive.

Why should I use an unsigned integer?

Here are my top reasons:

  • By using an unsigned integer, you are conveying important information to a reader of your code concerning the expected range of values that a variable may take on.
  • They are more efficient.
  • Modulus arithmetic is completely defined.
  • Overflowing an unsigned data type is defined, whereas overflowing a signed integer type could result in World War 3 starting.
  • You can safely perform shift operations.
  • You get a larger dynamic range.
  • Register values should nearly always be treated as unsigned entities – and embedded systems spend a lot of time dealing with register values.
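Two of these points, defined wraparound and safe shifting, lend themselves to a short illustration. The sketch below assumes a free-running 16 bit timer and a 32 bit register value; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed ticks between two readings of a free-running 16 bit timer.
 * The result is reduced modulo 65536, so it is correct even when the
 * counter has rolled over once between 'then' and 'now'. */
static uint16_t elapsed_ticks(uint16_t now, uint16_t then)
{
    return (uint16_t)(now - then);
}

/* Extract 'width' bits starting at bit 'shift' from a register value.
 * Right shifting an unsigned value always shifts in zeros. */
static uint32_t extract_field(uint32_t reg, unsigned int shift,
                              unsigned int width)
{
    uint32_t mask;

    assert(width >= 1u && width <= 32u && shift <= 32u - width);

    /* Build the mask without ever shifting by 32, which is undefined. */
    mask = (width == 32u) ? UINT32_MAX : ((UINT32_C(1) << width) - 1u);
    return (reg >> shift) & mask;
}
```

For instance, a timer that read 65530 and now reads 5 has advanced by 11 ticks, and `elapsed_ticks(5, 65530)` returns exactly that without any special casing of the rollover.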

What happens when I mix signed and unsigned integers?

This is the real crux of the problem with having signed and unsigned data types. The C standard has an entire section on this topic that only a compiler writer could love – and that the rest of us read and wince at. Having said that, it is important to know that when a signed int and an unsigned int meet in an expression, the signed operand gets converted to unsigned. If you think about it, this is the correct thing to happen. However, it can lead to some very interesting and unexpected results. A number of years ago I wrote an article “A ‘C’ Test: The 0x10 Best Questions for Would-be Embedded Programmers” that was published in Embedded Systems Programming magazine. You can get an updated and corrected copy at my web site. My favorite question from this test is question 12 which is reproduced below – together with its answer:

What does the following code output and why?

void foo(void)
{
 unsigned int a = 6;
 int b = -20;
 (a+b > 6) ? puts("> 6") : puts("<= 6");
}

This question tests whether you understand the integer promotion rules in C – an area that I find is very poorly understood by many developers. Anyway, the answer is that this outputs “> 6”. The reason for this is that expressions involving signed and unsigned types have all operands promoted to unsigned types. Thus -20 becomes a very large positive integer and the expression evaluates to greater than 6. This is a very important point in embedded systems where unsigned data types should be used frequently (see reference 2). If you get this one wrong, then you are perilously close to not being hired.
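To put a number on that “very large positive integer”, the sketch below spells out what -20 becomes and what the sum evaluates to. The macro and function names are hypothetical; the arithmetic follows from the conversion being carried out modulo UINT_MAX + 1.

```c
#include <limits.h>

/* The value -20 takes once converted to unsigned int: UINT_MAX + 1 - 20. */
#define MINUS_20_AS_UNSIGNED (UINT_MAX - 19u)

/* The same mixed expression as in the quiz, returned instead of compared. */
static unsigned int mixed_sum(unsigned int a, int b)
{
    return a + b; /* b is converted to unsigned int before the addition */
}
```

With a = 6 and b = -20 the sum is UINT_MAX - 13, which on a target with a 32 bit int is 4294967282, comfortably greater than 6.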

This is all well and good, but what should one do about this? Well you can pore over the C standard, run tests on your compiler to make sure it really does conform to the standard, and then write conforming code, or you can do the following: Never mix signed and unsigned integers in an expression. I do this by the use of intermediate variables. To show how to do this, consider a function that takes an int ‘a’ and an unsigned int ‘b’. Its job is to return true if b > a, otherwise it returns false. As you shall see, this is a surprisingly difficult problem… To solve this problem, we need to consider the following:

  • The signed integer a can be negative.
  • The unsigned integer b can be numerically larger than the largest possible value representable by a signed integer
  • The integer promotion rules can really screw things up if you are not careful.

With these points in mind, here’s my stab at a robust solution:

bool foo(int a, unsigned int b)
{
 bool res;

 if (a < 0) {
  res = true; /* If a is negative, it must be less than b */
 } else {
  unsigned int c;
  c = (unsigned int) a; /* Since a is non-negative, this cast is safe */
  if (b > c) {          /* Now I'm comparing the same data types */
   res = true;
  } else {
   res = false;
  }
 }
 return res;
}

Is this a lot of work – yes. Could I come up with a more compact implementation that is guaranteed to work for all possible values of a and b – probably. Would it be as clear – I doubt it. Perhaps regular readers of this blog would like to take a stab at producing a better implementation?
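For anyone taking up that challenge, one possible compact version is sketched below. It is offered as a reader-style attempt rather than as the author's solution, and the name `b_greater_than_a` is hypothetical. It relies on two things: the || operator short-circuits, so the cast is only evaluated when a is non-negative, and every non-negative int value is representable as an unsigned int.

```c
#include <stdbool.h>

/* Returns true if b > a, for any combination of a and b. */
static bool b_greater_than_a(int a, unsigned int b)
{
    /* A negative a is less than every unsigned value. Otherwise the
     * cast is value preserving and both operands are unsigned. */
    return (a < 0) || (b > (unsigned int)a);
}
```

Whether this is as clear as the longer version is a matter of taste; the two should agree for every input.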