Archive for the ‘Compilers / Tools’ Category

Optimizing for the CPU / compiler

Sunday, June 3rd, 2012 Nigel Jones

It is well known that standard C language features map horribly on to the architecture of many processors. While the mapping is obvious and appalling for some processors (low end PICs, 8051 spring to mind), it’s still not necessarily great at the 32 bit end of the spectrum where processors without floating point units can be hit hard with C’s floating point promotion rules. While this is all obvious stuff, it’s essentially about what those CPUs are lacking. Where it gets really interesting in the embedded space is when you have a processor that has all sorts of specialized features that are great for embedded systems – but which simply do not map on to the C language view of the world. Some examples will illustrate my point.

Arithmetic vs. Logical shifting

The C language does of course have support for performing shift operations. However, these are strictly arithmetic shifts. That is when bits get shifted off the end of an integer type, they are simply lost. Logical shifting, sometimes known as rotation, is different in that bits simply get rotated back around (often through the carry bit but not always). Now while arithmetic shifting is great for, well arithmetic operations, there are plenty of occasions in which I find myself wanting to perform a rotation. Now can I write a rotation function in C – sure – but it’s a real pain in the tuches.

Saturated addition

If you have ever had to design and implement an integer digital filter, I am sure you found yourself yearning for an addition operator that will saturate rather than overflow. [In this form of arithmetic, if the integral type would overflow as the result of an operation, then the processor simply returns the minimum or maximum value as appropriate].  Processors that the designers think might be required to perform digital filtering will have this feature built directly into their instruction sets.  By contrast the C language has zero direct support for such operations, which must be coded using nasty checks and masks.

Nibble swapping

Swapping the upper and lower nibbles of a byte is a common operation in cryptography and related fields. As a result many processors include this ever so useful instruction in their instruction sets. While you can of course write C code to do it, it’s horrible looking and grossly inefficient when compared to the built in instruction.


If you look over the examples quoted I’m sure you noticed a theme:

  1. Yes I can write C code to achieve the desired functionality.
  2. The resultant C code is usually ugly and horribly inefficient when compared to the intrinsic function of the processor.

Now in many cases, C compilers simply don’t give you access to these intrinsic functions, other than resorting to the inline assembler. Unfortunately, using the inline assembler causes a lot of problems. For example:

  1. It will often force the compiler to not optimize the enclosing function.
  2. It’s really easy to screw it up.
  3. It’s banned by most coding standards.

As a result, the intrinsic features can’t be used anyway. However, there are embedded compilers out there that support intrinsic functions. For example here’s how to swap nibbles using IAR’s AVR compiler:

foo = __swap_nibbles(bar);

There are several things to note about this:

  1. Because it’s a compiler intrinsic function, there are no issues with optimization.
  2. Similarly because one works with standard variable names, there is no particular likelihood of getting this wrong.
  3. Because it looks like a function call, there isn’t normally a problem with coding standards.

This then leads to one of the essential quandaries of embedded systems. Is it better to write completely standard (and hence presumably portable) C code, or should one take every advantage of neat features that are offered by your CPU (and if it is any good), your compiler?

I made my peace with this decision many years ago and fall firmly into the camp of take advantage of every neat feature offered by the CPU / compiler – even if it is non-standard. My rationale for doing so is as follows:

  1. Porting code from one CPU to another happens rarely. Thus to burden the bulk of systems with this mythical possibility seems weird to me.
  2. End users do not care. When was the last time you heard someone extoll the use of standard code in the latest widget? Instead end users care about speed, power and battery life. All things that can come about by having the most efficient code possible.
  3. It seems downright rude not to use those features that the CPU designer built in to the CPU just because some purist says I should not.

Having said this, I do of course understand completely if you are in the business of selling software components (e.g. an AES library), where using intrinsic / specialized instructions could be a veritable pain. However for the rest of the industry I say use those intrinsic functions! As always, let the debate begin.


Why you really shouldn’t steal source code

Saturday, February 11th, 2012 Nigel Jones

As an embedded systems consultant, I spend a substantial part of my work time working on your typical embedded systems projects. However I also spend a significant amount of time working as an expert witness in legal proceedings. While the expert witness work is quite varied, one of the things I have noticed in the last few months is an increase in the number of cases related to source code theft. Typically these cases involve the plaintiff claiming that the defendant has stolen their source code and is using it in a competing product. These claims are often plausible, because as we all know, it’s trivial to walk out of a company with Gigabytes of information in your pocket. Even in companies with strong security measures, it’s normally the case that the engineers are smart enough to work out how to bypass the security systems, so the ‘there’s no way I could have got the code out of there’ defense isn’t usually very plausible.

Thus given how easy it is to steal source code, why shouldn’t you do it? Well let’s start with the obvious – it’s wrong. if you don’t understand this, go and have a chat with your mother – I’m sure she’ll spell it out for you. Notwithstanding the morality (and legality) of the issue, here’s another reason why you shouldn’t do it – there’s a great chance you’ll be found out. If that happens, you can find yourself in serious legal jeopardy.

So just how easy is it to show that someone has stolen your code?

Typically the first step is for the (future) plaintiff to have their suspicions aroused. If half the engineering department leaves and starts up a company with a competing product, then it’s hardly surprising that your ex-employer will be suspicious. Of course suspicions aren’t grounds for a lawsuit. The plaintiff needs at least some evidence of your malfeasance. Now sometimes this can be done purely by the functionality / look and feel of a product. However in other cases it’s necessary for the plaintiff to get at your code’s binary image. You can make this very hard (and hence expensive) to do. However for your typical microprocessor, this step is surprisingly easy. Indeed there are any number of organizations around the world that are quite adept at extracting binary images from processors.  So what you may ask? I took the code, moved stuff around, used a different compiler and compiled it for a different processor, so good luck with showing that I used your code. Well the trouble with this, is that using tools such as Ida-Pro, it’s easy to count the number of functions in the code, and the arguments they take. These metrics are a remarkably good signature. [BTW, there are other metrics as well, but I really don’t want to give the whole game away]. Thus if the original code base and the stolen code base have a very similar function call signature, then there’s an excellent chance that the plaintiffs have enough evidence to file a lawsuit.

It’s at this point that you are really in trouble. As part of the lawsuit, the plaintiffs are allowed to engage in discovery.  In a case like this, it means quite simply that the court will require you to turn your source code over to an expert that has been retained by the plaintiffs (i.e. someone like yours truly). At this point, I can use any number of tools that are available for comparing code bases. Some of the tools are designed expressly for litigation purposes, while others are just some of the standard tools we use as part of our everyday work. Anyway, the point is this: these tools are really good at finding all sorts of obfuscations, including things such as:

  • Renaming variables, constants functions etc.
  • Changing function parameter orders
  • Replacing comments
  • Adding / deleting white space
  • Splitting / merging files

In many cases, they can even detect the plagiarism (theft) even if you have switched languages. In other words, if you have indeed stolen the source code, then the chances of  it not being conclusively proven at this stage are pretty slim. In short, life is about to get very unpleasant.

Having said the above, I like to think that the readers of this blog are not the type that would engage in source code theft. However I suspect that some of you have been tempted to go into business competing against your current employer. If this describes you, then what should you do to ensure that you don’t get hit with a lawsuit a year or two after starting your own business?  Well clearly the best bet is not to go into a competing business. However if you must do this, then get some legal advice (please don’t rely on what is written here – I’m just an engineer!) before you start. You will probably be advised to do a ‘clean room’ design, which in a nutshell will require you to demonstrate that the code in your competing product was designed from scratch, using nothing from your former employer. Be advised that even in these cases, if you adopt the same algorithms, then you may still be in trouble.




An open letter to the developers of the MPLAB IDE

Tuesday, September 27th, 2011 Nigel Jones

I recently inherited a project that uses a Microchip PIC18 processor. Without going into too much detail, suffice it to say that I ended up using Microchip’s MPLAB IDE Version 8.73 together with Microchip’s C compiler. It had been a number of years since I last used MPLAB, in part because my experience back then was so painful. I was heartened to see that the version number had jumped from 6.X to 8.X and so I was fully expecting to find an IDE that was at the very least, decent. Boy was I disappointed. In no particular order, here are some of the head-slapping things I discovered. Disclaimer: I’m no MPLAB expert (and quite frankly after this experience I doubt I will ever become one). Thus it’s entirely possible that the issues listed below are a reflection of my incompetence.

No editor support for splitting the window

The title says it all. Text editors back in the DOS allowed you to open a file and look at various parts of the file at the same time via split windows. This is such a fundamental operation for text editors that I couldn’t believe it wasn’t supported. Do the developers of MPLAB use an editor with this limitation?

No single file compilation

There appears to be no way to simply compile a single file. Instead one is forced to perform a make. Not only is this really time consuming as make wades through all the files that aren’t relevant, it’s also a pain because:

  • It forces you to correct problems in the first file that make finds to process – rather than the one you are interested in.
  • If the compilation passes, it proceeds immediately to the link and download phase – regardless whether you want to or not.

A baffling debugger configuration interface

The first time I tried to download to the debugger, I received an error message telling me that there was a problem with the configuration bits. I beat my head against the issue for a few hours and in the end called a colleague who is a Microchip Consultant (Matt).  Matt told me that he avoided using MPLAB, but vaguely recollected that you had to configure the debugger to explicitly download to the target. Well Matt was spot on. I was just left wondering why anyone would want to debug code without having it downloaded first.

A joke of a simulator

While I was waiting to get hold of Matt, I experimented with the simulator. Despite the documentation saying that various peripherals were supported, I found that the simulator simply didn’t support them. Matt confirmed that the simulator was a piece of junk that no one bothered using.

Unnecessary variable scope limitation

I’m not sure if this is an IDE or a compiler issue. Anyway, when debugging within MPLAB, I discovered that if I had the temerity to declare a variable as static, then the only time I could examine its value was if the variable was in compiler scope. That is, if I was stopped in file foo.c, then the debugger would not let me see the values of static variables declared in bar.c. As a result, I was forced to declare variables as global simply so that I could look at them. Quite frankly, it blows my mind that in 2011 an IDE can force you to adopt lousy programming practices because of its limitations.

No pointer dereferencing

In a similar vein, I discovered that MPLAB only shows you the values of pointers, and not what they are pointing to. I thought that limitation disappeared at least ten years ago.

Limited support for ‘large ‘arrays

The PIC18 doesn’t handle arrays or structures greater than 256 bytes very well. However the Microchip compiler guys came up with a decent work around that is quite straight forward to use – until you want to look at the array in the debugger. In a nutshell all you seem to be able to do is examine the array / structure in the memory window. You can forget about looking at 16 bit integers, or other such ‘complex’ constructs.

Breakpoints that aren’t where I put them

I discovered that placing a breakpoint on a function call was highly problematic. In some cases the break would occur prior to the call, and in other cases it would occur after the call. If I wanted the breakpoint in the called function, I’d put it there myself, thank you.

Anyway, I could go on – but I think you get the picture. What I don’t get is this. Microchip is a big successful company. Why, after a decade of trying can’t they come up with a decent development environment? It has got to be costing them a lot of design wins when people such as myself cringe at the though of having to use their tools.


Formatted output when using C99 data types

Tuesday, February 1st, 2011 Nigel Jones

Regular readers of this blog will know that I am a proponent of using the C99 data types. They will also know that I’m no fan of formatted output. Notwithstanding this, I do use formatted output (particularly vsprintf) on larger systems. Well if you use the C99 data types and you use formatted output, you will quickly run into a problem – namely what modifier do you give printf()  to print say a uint16_t variable? Now if you are working on an 8 or 16 bit architecture, then you’d probably be OK guessing that %u would work quite nicely. However if you are working on a 32 bit architecture, what would you use for say a uint_fast8_t variable? Well it so happens that the C99 folks were aware of this problem and came up with just about the ugliest solution imaginable.


In order to solve this problem, you first of all need to #include a file inttypes.h. This header file in turn includes stdint.h so that you have access to the C99 data types. If you examine this file, you will find that it consists of a large number of definitions. An example definition might look like this:

#define PRId16 __INT16_SIZE_PREFIX__ "d"

If you are like me, when I first saw this I was a little puzzled. How exactly was this supposed to help? Well I’ll give you an example of its usage, and then explain how it works.

#include <inttypes.h>
#include <stdio.h>

void print_int16(int16_t value)
 printf("Value = %" PRId16, value);

So what’s going on here? Well let’s assume for now that __INT16_SIZE_PREFIX__ is in turn defined to be “h”.  Our code is converted by the preprocessor into the following:

#include <inttypes.h>
#include <stdio.h>

void print_int16(int16_t value)
 printf("Value = %" "h" "d", value);

At compile time, the successive strings “Value = %” “h” “d” are concatenated into the single string “Value = %hd”, so that we end up with:

#include <inttypes.h>
#include <stdio.h>

void print_int16(int16_t value)
 printf("Value = %hd", value);

This is legal syntax for printf. More importantly, the correct format string for this implementation is now being passed to printf () for an int16_t data type.

Thus the definitions in inttypes.h allow one to write portable formatted IO while still using the C99 data types.

Naming Convention

Examination of inttypes.h shows that a consistent naming convention has been used. For output, the constant names are constructed thus:

<PRI><printf specifier><C99 modifier><number of bits> where

<PRI> is the literal characters PRI.

<printf specifier> is the list of integer specifiers we all know so well {d, i, o, u, x, X}

<C99 modifier> is one of {<empty>, LEAST, FAST, MAX, PTR}

<number of bits> is one of {8, 16, 32,64 <empty>}. <empty> only applies to the MAX and PTR C99 modifiers.


To print a uint_fast8_t in lower case hexadecimal you would use PRIxFAST8.

To print a int_least64_t in octal you would use PRIoLEAST64.

Formatted Input

For formatted input, simply replace PRI with SCN.


While I applaud the C99 committee for providing this functionality, it can result in some dreadful looking format statements. For example here’s a string from a project I’m working on:

wr_vstr(1, 0, MAX_STR_LEN, "%-+*" PRId32 "%-+4" PRId32 "\xdf", tap_str_len, tap, angle);

Clearly a lot of this has to do with the inherently complex formatted IO syntax. The addition of the C99 formatters just makes it even worse.

Personally I’d have liked the C99 committee to have bitten the bullet and introduced a formatted IO function that had the following characteristics:

  1. Explicit support for the C99 data types.
  2. No support for octal. Does anyone ever use the octal formatter?
  3. Support for printing binary – this I do need to do from time to time.
  4. A standard defined series of reduced functionality formatted IO subsets. This way I’ll know that if I restrict myself to a particular set of format types I can use the smallest version of the formatted IO function.

PC Lint

Regular readers will also know that I’m a major proponent of using PC-Lint from Gimpel. I was surprised to discover that while Lint is smart enough to handle string concatenation with printf() etc, it doesn’t do it with user written functions that are designed to accept format strings. For example, the function wr_vstr() referenced above looks like this:

static void wr_vstr(uint_fast8_t row, uint_fast8_t col, uint_fast8_t width, char const * format, ...)
 va_list  args;
 char  buf[MAX_STR_LEN];

 va_start(args, format);
 (void)vsnprintf(buf, MAX_STR_LEN, format, args);     /* buf contains the formatted string */

 wr_str(row, col, buf, width);    /* Call the generic string writer */

 va_end(args);                    /* Clean up. Do NOT omit */

I described this technique here. Anyway, if you use the inttypes.h constants like I did above, then you will find that PC-Lint complains loudly.

Final Thoughts

Inttypes.h is very useful for writing portable formatted IO with the C99 data types. It’s ugly – but it beats the alternative. I recommend you add it to your bag of tricks.

DigiView Logic Analyzer

Wednesday, October 6th, 2010 Nigel Jones

Today is one of those rare days on which I recommend a product. I only do this when I find a product that has genuinely made my life easier, and which by extension I think will also make your life easier. The product in question is a  DigiView logic analyzer. Now the fact that logic analyzers are useful tools should not be news to you. Indeed if you have been in this business long enough you will no doubt remember the bad old days of debugging code by decoding execution traces on a logic analyzer. That being said, I almost stopped using logic analyzers because they were big, expensive, difficult to set up and highly oriented towards bus-based systems. Given that I had my own consulting company with limited cash, limited space and a propensity to work on non-bus based systems (i.e. single chip microcontrollers), it’s hardly surprising that a logic analyzer wasn’t part of my toolbox.

This state of affairs persisted for a number of years until I obtained via a convoluted route a DigiView DV1-100. This is a USB powered, hand-sized box, with 18 channels at 100 MHz. It’s successor (The DV3100) sells for $499. The device sat on my shelf for a while until I decided to give it a spin one day. Since then I have found it to be an indispensable tool. Interestingly I find I use it the most when implementing the myriad of synchronous protocols that seem to exist on peripheral ICs today. While it is of course very useful for getting the interfaces working, I also find it extremely useful in fine tuning the interfaces. Via the use of the logic analyzer I can really examine set-up and hold times, clock frequencies, transmission latencies and so on. Doing so has allowed me to dramatically improve the performance of these interfaces in many cases. Indeed, I have had such success in this area that I now routinely hook the analyzer up, even when the interface works first time. If nothing else it gives me a nice warm fuzzy feeling that the interface is working the way it was designed – and not by luck.

Another area where I find it very useful is when I need to reverse engineer a product. I do this a lot as part of my expert witness work – and it is really quite remarkable how much you can learn from looking at a logic analyzer trace.

Anyway, the bottom line is this. $499 gets you an 18 channel 100 MHz personal logic analyzer that can handle most of the circuitry most of us see on a daily basis. If you value your time at all, then the DigiView will pay for itself the first time you use it. Go hassle your boss to get one.