
Thoughts on BCCs, LRCs, CRCs and being experienced

Saturday, June 20th, 2009 Nigel Jones

Those of us who have been working in this field for a long time are referred to as ‘experienced’. Experienced is taken to mean that we have been doing this for long enough that we have experienced many of the problems common to embedded systems and thus know how to solve them. Although this is true for many things, I think there is a downside: because we’ve successfully solved a particular problem a number of times, we fall into the trap of thinking that our solution is optimal. To guard against this it is essential to be proactive in seeking out new solutions to old problems. To illustrate my point, I’ll take you on an abbreviated trip down the memory lane of my career when it comes to that most prosaic of problems – transmitting serial data between microcontrollers.

Back when I was a lad I was by definition naive and so I just transmitted the data without any thought to how to detect errors beyond the use of a parity bit on each byte. Well it didn’t take me long to work out that a simple parity bit wasn’t exactly a robust way of detecting errors, and so I started appending a simple additive checksum to the message.

Well that worked for a while until the day it dawned on me that an additive checksum without an initial seed value was vulnerable to a stuck channel (e.g. all zeros). From that day on I started seeding my checksum computations with initial values. I tended to favour 0x2B (with apologies to Hamlet).

Somewhere along the road I switched from performing an additive checksum to using an XOR operation. I can’t remember why I did this – but it just seemed ‘better’.
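
For illustration, both flavours can be sketched in a few lines of C. The function names and the 8-bit checksum width here are my choices, not a canonical implementation:

```c
#include <stdint.h>
#include <stddef.h>

/* Seeded additive checksum. The non-zero seed (0x2B, as above) means
 * that a stuck-at-zero channel no longer yields a 'valid' message. */
#define CHECKSUM_SEED (0x2BU)

uint8_t additive_checksum(uint8_t const *buf, size_t len)
{
    uint8_t sum = CHECKSUM_SEED;

    for (size_t i = 0U; i < len; i++)
    {
        sum += buf[i];      /* Wraps modulo 256 by design */
    }
    return sum;
}

/* The XOR variant - i.e. a longitudinal redundancy check (LRC). */
uint8_t xor_lrc(uint8_t const *buf, size_t len)
{
    uint8_t lrc = CHECKSUM_SEED;

    for (size_t i = 0U; i < len; i++)
    {
        lrc ^= buf[i];
    }
    return lrc;
}
```

Note that with the seed in place, an all-zeros message checks to 0x2B rather than zero, which is the whole point.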

This approach served me well for many years until I started investigating cyclic redundancy checks (CRCs). I’d known about CRCs for a long time of course. However, all the ones I knew about used 16 or 32 bit values and had certain wondrous but rather unspecified properties for detecting certain classes of errors. To put it bluntly, they seemed like complete overkill for sending a short message between two microprocessors – and so I didn’t entertain them. However, this all changed the day I came across an 8 bit CRC. This changed my perspective dramatically. An 8 bit CRC designed for protecting small messages – excellent! Henceforth I eschewed the use of an LRC and instead opted for an 8 bit CRC to protect my messages.

Well, this continued for a number of years. I learned more about CRCs and I got older, until one day I decided to ask myself the question – is the 8 bit CRC I am using optimal? For regular readers of this blog, you’ll probably have noticed that ‘optimal solutions’ is a recurring theme. Anyway, with this thought in mind, I set off on a hunt to determine whether in fact the 8 bit CRC I was using to protect small messages was indeed optimal. That’s when I came across this paper by Koopman and Chakravarty. It’s entitled ‘Cyclic Redundancy Code (CRC) Polynomial Selection for Embedded Networks’. It’s a highly readable and informative paper. They essentially investigate what constitutes ‘optimal’ for a CRC polynomial and then exhaustively explore optimal polynomials for different data lengths and different polynomial lengths. Most interestingly, they slay some sacred cows along the way, including the popular CRC-8 polynomial (x^8+x^7+x^6+x^4+x^2+1).

Having read the paper, I discovered that the CRC I was using (the so-called ATM-8 polynomial, x^8+x^2+x+1) wasn’t bad for my application – but it wasn’t optimal. Upon reflection this was hardly surprising, since I had essentially selected it on the basis that it was designed for an application similar to mine – and thus must be decent. However, as Koopman shows, this can be a very foolhardy assumption. I just got lucky.
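
By way of illustration, here is a minimal bitwise sketch of an 8 bit CRC using that ATM-8 polynomial. The zero initial value and table-less, byte-at-a-time structure are my choices for clarity; swapping in a different generator polynomial is a one-line change:

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise (table-less) CRC-8 over the ATM-8 polynomial x^8 + x^2 + x + 1,
 * i.e. 0x07 with the leading x^8 term implicit. MSB-first, no bit
 * reflection, zero initial value, no final XOR. */
uint8_t crc8_atm(uint8_t const *buf, size_t len)
{
    uint8_t crc = 0x00U;

    for (size_t i = 0U; i < len; i++)
    {
        crc ^= buf[i];
        for (uint8_t bit = 0U; bit < 8U; bit++)
        {
            if (crc & 0x80U)
            {
                crc = (uint8_t)((crc << 1) ^ 0x07U);
            }
            else
            {
                crc = (uint8_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

With these parameters the check value for the ASCII string ‘123456789’ is 0xF4, which is a handy sanity check for any implementation.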

More important from my perspective is that, using Koopman’s paper, I now have a logical methodology for determining the optimal CRC for any application. Thus, after close to 30 years of doing this, I think I’m finally homing in on the truly optimal solution to this problem.

Of course, the larger lesson to be learned here is that having done something a certain way for many years means nothing unless you know that it is the optimal way of doing it. That’s when you are truly ‘experienced’.

Doxygen

Saturday, May 2nd, 2009 Nigel Jones

Today’s post was inspired by a new version notice from Dimitri van Heesch concerning his great documentation generator tool doxygen. If you aren’t aware of doxygen, then I strongly recommend reading about it and then using it.

So what is Doxygen exactly? Well, it has a lot of capabilities, but in a nutshell it can parse your code (C, C++, Java and a host of other languages not usually used in the embedded space) and from it generate a very nice hyper-linked documentation set. It does this in part by looking for what I’ll call control directives embedded in comments. Now what I particularly like about Doxygen is that it allows you to trade off adding control directives against keeping your comments readable. For example, at one extreme you can do nothing special to your code and still end up with a reasonable documentation set. At the other extreme, you can embed so many control directives into your comments that the only sane way to read the comments is via Doxygen; however, the documentation will be truly impressive! In my case, I find control directives to be very distracting, and so I opt for a minimal set that doesn’t offend my sensibilities but still gives me very useful results.
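
As an example of what I mean by a minimal set: something like the following (a made-up function, shown here only to illustrate the markup) is enough for Doxygen to produce a fully hyper-linked entry, yet still reads perfectly well as a plain comment:

```c
#include <stdint.h>

/**
 * Compute the average of two 16 bit readings.
 *
 * @param[in] a  The first reading.
 * @param[in] b  The second reading.
 *
 * @return The average of the two readings, truncated towards zero.
 */
uint16_t avg_u16(uint16_t a, uint16_t b)
{
    /* Widen to 32 bits so the sum cannot overflow */
    return (uint16_t)(((uint32_t)a + b) / 2U);
}
```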

So why do I do this? Well while this documentation set is very nice in its own right, I actually find it very useful in improving my code. As remarkable a claim as this is, it’s easily substantiated. Here are a few examples:

Call Trees

One of the very nice add-ons to Doxygen is graphviz. Using graphviz, Doxygen will generate call trees for all of your functions. I often find this very illuminating – both at a macro level and at a micro level. At the macro level, if I see a call tree that looks like your average two-year-old’s artwork, then it’s a clear indication of muddled thinking – and impending doom. At the micro level, it allows you to spot certain errors. For example, consider this code fragment, which is intended to update a parameter in an EEPROM data structure, together with its backup copy:

void params_NosChargesSet(uint16_t nos_charges)
{
    Factory_Params1.n_charges = nos_charges;
    update_factory1_crc();
    Factory_Params2.n_charges = nos_charges;
    update_factory1_crc();  /* Bug: this should be update_factory2_crc() */
}

I found the bug in this code not by testing it, but simply by browsing the Doxygen documentation and noticing that the call tree for this function was incorrect. What I liked about this is that this kind of bug is very difficult to detect through testing, and will not be noticed by static analysis. It was, however, clear as day in the call tree.
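
For completeness, a corrected version might look like the following. Note that the parameter structures and CRC helpers here are hypothetical stand-ins, since the real ones aren’t shown in the fragment above:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the EEPROM parameter images and their
 * CRC helpers - the real definitions aren't shown in the fragment. */
typedef struct
{
    uint16_t n_charges;
    uint8_t  crc;
} factory_params_t;

factory_params_t Factory_Params1;
factory_params_t Factory_Params2;

/* Toy CRC update - a real system would compute a CRC over the structure. */
void update_factory1_crc(void)
{
    Factory_Params1.crc = (uint8_t)(Factory_Params1.n_charges ^ 0xFFU);
}

void update_factory2_crc(void)
{
    Factory_Params2.crc = (uint8_t)(Factory_Params2.n_charges ^ 0xFFU);
}

/* Corrected: each copy's CRC is refreshed by its own helper. */
void params_NosChargesSet(uint16_t nos_charges)
{
    Factory_Params1.n_charges = nos_charges;
    update_factory1_crc();
    Factory_Params2.n_charges = nos_charges;
    update_factory2_crc();  /* The buggy version called update_factory1_crc() here */
}
```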

Missing documentation

Sometimes when I’m anxious to solve ‘the real problem’, I find that I’m not as diligent as I should be about describing the use of manifest constants, variables etc. As a result I’ll sometimes end up with code that looks like this:

#define SHORT_TERM_BUF_SIZE (8U) /**< meaningful comment */
#define LONG_TERM_BUF_SIZE (32U)

You’ll notice that LONG_TERM_BUF_SIZE has no comment associated with it. However, it’s “obvious” what its use is because of the comment associated with SHORT_TERM_BUF_SIZE that immediately precedes it. Well when you generate the Doxygen documentation, and you click on the hyperlink associated with LONG_TERM_BUF_SIZE, guess what – no description. While some may think that this is a weakness in Doxygen, I actually think it’s a major strength. Here’s why:

  • My coding standard requires me to provide a comment for all manifest constants. Thus it is reminding me of the error of my ways.
  • Someone new coming to the code will typically be overwhelmed by what they are faced with. Having an ‘implicit comment’ is just one more hurdle for them to overcome. Thus Doxygen is accurately reflecting what someone will see when they read your code.

Is Doxygen perfect? No it’s not. It often hangs when I run it. However to be fair, that’s usually because I haven’t played by the rules. Despite this I find it a useful tool in my arsenal. I recommend you take a look at it.


PIC stack overflow

Saturday, April 25th, 2009 Nigel Jones

For regular readers of this blog, I apologize for turning once again to the topic of my nom de guerre. If you really don’t want to read about stack overflow again, then just skip to the second section of this posting, where I address the far more interesting topic of why anyone uses an 8-bit PIC in the first place.

Anyway, the motivation for this post is that the most common search term that drives folks to this blog is ‘PIC stack overflow’. While I’ve expounded on the topic of stacks in general here and here, I’ve never explicitly addressed the problem with 8 bit PICs. So to make my PIC visitors happy, I thought I’d give them all they need to know to solve the problem of stack overflow on their 8 bit PIC processors.

The key thing to understand about the 8 bit PIC architecture is that the stack size is fixed. It varies from a depth of 2 for the really low end devices to 31 for the high end 8 bit devices. The most popular parts (such as the 16F877) have a stack depth of 8. Every (r)call consumes a level, as does the interrupt handler. To add insult to injury, if you use the In Circuit Debugger (ICD) rather than a full blown ICE, then support for the ICD also consumes a level. So if you are using a 16 series part (for example) with an ICD and interrupts, then you have at most 6 levels available to you. What does this mean? Well, if you are programming in assembly language (which, when you get down to it, was always the intention of the PIC designers), it means that you can nest function calls no more than six deep. If you are programming in C then, depending on your compiler, you may not even be able to nest functions this deep, particularly if you are using size optimization.

So on the assumption that you are overflowing the call stack, what can you do? Here’s a checklist:

  • Switch from the ICD to an ICE. It’s only a few thousand dollars difference…
  • If you don’t really need interrupt support, then eliminate it.
  • If you need interrupt support, then don’t make any function calls from within the ISR (as this subtracts from your available levels).
  • Inline low level functions.
  • Use speed optimization (which effectively inlines functions).
  • Examine your call tree and determine where the greatest call depth occurs. At this point either restructure the code to reduce the call depth, or disable interrupts during the deepest point.
  • Structure your code such that calls can be replaced with jumps. You do this by only making calls at the very end of a function, so that the compiler can simply jump to the new function. (Yes, this is a really ugly technique.)
  • Buy a much better compiler.
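
To illustrate the call-into-jump technique from the list above (the function names and the trivial arithmetic here are hypothetical): when the inner call is the very last statement of a function, the compiler is free to replace the CALL/RET pair with a plain jump, so no hardware stack level is consumed:

```c
#include <stdint.h>

volatile uint16_t total;    /* Hypothetical shared state */

void process_stage2(uint16_t value)
{
    total = value;
}

/* Consumes a stack level for the inner call: work remains to be done
 * after process_stage2() returns, so a true CALL is required. */
void process_deep(uint16_t raw)
{
    process_stage2((uint16_t)(raw * 2U));
    total = (uint16_t)(total + 1U);
}

/* Tail-call friendly: all the local work is done first and the inner
 * call is the very last statement, so an optimising compiler can turn
 * the CALL/RET pair into a single jump. */
void process_flat(uint16_t raw)
{
    uint16_t cooked = (uint16_t)(raw * 2U + 1U);

    process_stage2(cooked);
}
```

Both functions compute the same result; only the second is a candidate for the jump optimisation.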

If you are still stuck after trying all these, then you really are in a pickle. You could seek paid expert help (e.g. from me or some of the other folks that blog here at embeddedgurus) or you could change CPU architectures. Which leads me to:

So why are you using a PIC anyway?

The popularity of 8 bit PICs baffles me. The architecture is awful – the limited call stack is just the first dreadful thing. Throw in the need for paging and banking together with the single interrupt vector and you have a nightmare of a programming model. It would be one thing if this were the norm for 8 bit devices – but it isn’t. The AVR architecture blows the PIC away, while the HC05 / HC08 are also streets ahead of the PIC. Given the choice I think I’d even take an 8051 over the PIC. I don’t see any cost advantages, packaging advantages (Atmel has just released a SOT23-6 AVR which is essentially instruction set compatible with their largest devices) or peripheral set advantages. In short, I don’t get it! Incidentally, this isn’t an indictment of Microchip – they are a great company and I really like a lot of their other products, their web site, tech support and so on (perhaps this is why the PIC is so widely used?). So to the (ir)regular readers of this blog – if you are using 8 bit PICs, perhaps you could use the comment section to explain why. Let the debate begin!


Unused interrupt vectors

Sunday, April 19th, 2009 Nigel Jones

With the exception of low end PIC microcontrollers, most microcontrollers have anywhere from quite a few to an enormous number of interrupt vectors. It’s a rare application that uses every single interrupt vector, and so the question arises: what, if anything, should one do with unused interrupt vectors? I have seen two approaches used – neither of which is particularly good.

Do nothing

I would say this is the most common approach. My guess is that when this approach is used, it’s not via conscious choice, but rather the result of inaction. So what’s the implication of this approach? Well, if an interrupt occurs for which you have not installed an interrupt handler, then the microcontroller will vector to the appropriate address and start executing whatever code happens to be there. It’s fair to say that this will ultimately cause a system crash – the only question is how much damage will be done in the process. Having said that, I don’t consider this approach to be always awful. For example, a reasonable argument might go something like this:

I know via design, code inspection, static analysis and testing that the probability of a coding error enabling the wrong interrupt is remote. Thus if it does happen it’s probably either via severe RF interference, or because the code has crashed. In either case the system has bigger problems than vectoring to an unsupported interrupt.

Of course, anybody who’s put this much thought into it will probably be conscientious enough to do something different.

Another valid argument on very memory constrained processors is that you need the unused interrupt vector space for the application. Indeed I have coded 8051 applications where this has been the case. Such is the price we sometimes have to pay on very small systems.

Install ‘RETI’ instructions at all unused vectors

In this approach, you arrange for there to be a ‘Return From Interrupt’ instruction at every unused interrupt vector. Indeed this approach is common enough that some compiler manufacturers offer it as a linker option. The concept with this approach is that if an unexpected interrupt occurs, then by executing a RETI instruction, the application will simply continue with very little harm done. All in all this isn’t a bad approach. However it has several weaknesses.

  • The biggest problem with this approach is that it doesn’t solve the problem of an interrupt source that keeps on interrupting. The most egregious example of this is a level triggered interrupt on a port pin. In this case, depending upon the CPU architecture, it is quite possible for the system to go into a mode whereby it essentially spends all its time vectoring to the interrupt and then returning. However, this is by no means the only example. Others that spring to mind are ‘transmit buffer empty’ interrupts, and timer overflow type interrupts. In the latter case the system probably wouldn’t spend all of its time interrupting; however, a certain fraction of the CPU bandwidth would be wasted, which, in a battery powered application for instance, would be a big deal.
  • If you do this at the start of a project, you lose the opportunity to discover errors in which an interrupt source has been erroneously enabled. In short this approach can mask problems, while what is really needed is an approach that can reveal problems.

Recommended Approach

What I do is the following.

  1. At the start of a project I create a file called vector.c. In vector.c I create an interrupt handler for every possible interrupt vector. Not only is this an essential first step in solving the problem, I also find it very illuminating, as it forces me to read about and understand all the CPU’s interrupt sources. This is always a useful step, as in many ways the interrupt sources for a CPU tell you a lot about its capabilities and the designers’ intent.
  2. Within each interrupt handler, I explicitly mask the interrupt source. This will prevent the interrupt from reoccurring in all but the most extreme of cases.
  3. If necessary, I also clear the interrupt flag. (In some CPU architectures this occurs automatically by vectoring to the interrupt. In others you have to do it manually).
  4. After masking the interrupt source, I then make a call to my trap function. What this means is that while I’m debugging the code, if any unexpected interrupt occurs, then I’ll know about it in a hurry. Conversely, of course, with a release build, the trap function compiles down to nothing, essentially removing it from the code.

Here’s a code fragment that shows what I mean. In this case it’s for an AVR processor and the IAR compiler. However it should be trivial to port this to other architectures / compilers. Note that for the AVR it is in general not necessary to clear the interrupt flag as it is cleared automatically upon vectoring to the ISR.

#include <stdbool.h>

static inline void trap(void);  /* Forward declaration – defined below */

#pragma vector=INT1_vect /* External Interrupt Request 1 */
__interrupt void int1_isr(void)
{
    EIMSK_INT1 = 0;  /* Disable the interrupt */
    /* Interrupt flag is cleared automatically */
    trap();
}

#pragma vector=PCINT0_vect /* Pin Change Interrupt Request 0 */
__interrupt void pcint0_isr(void)
{
    PCICR_PCIE0 = 0; /* Disable the interrupt */
    /* Interrupt flag is cleared automatically */
    trap();
}

...

#ifndef NDEBUG
/** Flag to allow us to exit the trap and see who caused the interrupt */
static volatile bool Exit_Trap = false;
#endif

static inline void trap(void)
{
#ifndef NDEBUG
    while (!Exit_Trap)
    {
        /* Spin here until the debugger sets Exit_Trap */
    }
#endif
}


Commuting is crazy!

Friday, April 3rd, 2009 Nigel Jones

A few posts back I suggested that (American) employers would benefit from giving their engineers a lot more time off. In the comments section, Brad opined that he would very much like to work four 10-hour days. One of the reasons he gave was to avoid the stress and hassle of his daily commute. I agree completely with him. However, I’d like to take this one step further. Why is it that (most) employers insist that their staff come to the office each day to work? This always strikes me as ludicrous. Of course there are days where one has to attend meetings, or where you need to use the specialized test equipment that your employer owns. In addition, there are many of us who work for employers whose secrecy requirements demand that you be at work. However, for the vast majority of engineers there is absolutely no need to be in the office every day. Instead, with a decent home computer, a broadband connection and a VPN, you are pretty much all set to do exactly what you’d do if you went into the office for the day.

Now notwithstanding that allowing / encouraging / demanding that staff work from home whenever possible has great benefits for the engineer and the environment, the real key is the boost in productivity that is possible. Any engineer I know will tell you that the best way to get a lot of (hard) work done in a hurry is to shut the door, turn off the telephone and block your email. Maybe it’s just me, but that’s exactly what can happen when you work from home.

But what about the staff who will go home and slack off for the day? Well, I’m sure they exist. I’m also sure that anyone who managed to get through an engineering degree program has enough brains to work out how to goof off at work without being caught, if that’s their inclination. In short, I don’t see being at work as evidence that you’re actually doing anything useful.

What’s maddening is to consider the list of jobs that don’t require you to come to an office each day. Examples that spring to mind include sales, truck driving and home-care health work. Apparently those employers somehow manage to come up with ways of determining whether their staff are productive or not.

So what to make of this? I think it’s largely inertia. Twenty years ago, the cost of engineering tools was so high that you had to go to work to use them. Today you can set up a well-equipped laboratory for $10K. Despite this, the notion of engineers having to go to work persists. If I’m correct, and there aren’t any substantive reasons for most of us to go to the office every day, then ultimately logic should overcome the inertia – and working from home several days a week will become the norm. However, it won’t start changing until more of us start pressuring management to explain why we shouldn’t do this.