Archive for the ‘Uncategorized’ Category

Demand more time off!

Sunday, March 22nd, 2009 Nigel Jones

I’ve been posting on a lot of technical issues lately, so I thought I’d turn to a less cerebral topic – but one I feel quite passionate about. First off, some background. I’m British by birth and was raised in Europe (UK & Germany) before moving to the USA in my early twenties. Upon arrival in the USA I was struck by many things; professionally, however, what amazed me was the number of hours the typical engineer works in the USA compared to their European counterparts. When I left the UK, the standard work week was 37.5 hours and the typical amount of paid time off was 4 weeks for new hires, quickly increasing to 6 weeks or more with length of service. To this were added 8 bank holidays. Perhaps more importantly, employers seemed to think that this was a good thing. For example, my employer at the time had the following policies in effect:

  • Employees were encouraged to work their 37.5 hours in such a way that the work week ended at lunchtime on Friday, effectively ensuring that employees had 2.5-day weekends.
  • Employees were strongly encouraged to take at least 2 weeks off as a block, thus ensuring that they got at least one long break from work every year.

By contrast, when I arrived in the USA, I discovered that the norm was quite different. Indeed the policies I encountered were as follows (and this from the American branch of the same firm as I had worked for in the UK):

  • Work week of 40 hours.
  • Engineers were routinely expected to put in unpaid overtime, with 10 hours being the norm.
  • Annual vacation of two weeks, which only started accruing after 6 months’ service.
  • Very long serving employees might get 3 weeks vacation a year.
  • Taking more than one week off at a time was actively discouraged.

So what to make of this? If you do the mathematics, a typical engineer in the USA would be working about 50 * 50 = 2500 hours a year (ignoring bank holidays – which are about the same), whereas a typical engineer in the UK would be working 37.5 * 48 = 1800 hours – a 39% difference. Now the question is: did I perceive the engineers in the USA to generate more output? I’d say yes, but only by a few percent, and certainly nowhere near the 39% more hours that they worked.
I’m sure other people’s experience will differ. However, it’s clear to me why there isn’t a big difference in productivity: I solve most of my toughest technical problems when I’m not at work. Indeed, there is nothing like taking a stroll, going for a bike ride, or even sitting down for a beer with friends for clearing the mind and allowing you to look at issues from a new perspective. I know this experience isn’t unique to me, so why don’t employers see the light and realize that everyone benefits from requiring engineers (and other professions – but that’s outside my bailiwick) to take more time off?

Maybe it’s just me, but a start in changing this situation could be for more engineers to start demanding more time off. Some companies are starting to see the light. For example, Netrino offers its employees 5 weeks of vacation. Let’s make that the norm – not the exception!

As a final note, I know I have regular readers from other parts of the world – South America, Australasia, and the former Eastern Bloc. I’d be interested to hear what your working conditions are like.


Computing your stack size

Sunday, March 1st, 2009 Nigel Jones

Many of the folks that come to this blog by way of search engines do so because they are having problems with stack overflow. I’ve already given my take on the likely causes of a stack overflow. Today I’d like to offer some hints on a related topic – how to set about computing the stack size for your application. This is an extremely difficult problem, which can be approached in one of three ways – experimentally, analytically, or randomly. The latter is by far the most common technique, and consists essentially of choosing a number and seeing whether it works! In an effort to reduce the use of the random approach, I’ll try to summarize the other two methods.

Experimentally

In the experimental method, a typically very large stack size is selected. The stack is then filled with an arbitrary bit pattern such as 0xFC. The code is then executed for a ‘suitable’ amount of time, and the stack is examined to see how far it has grown. Most people will take this number, add a safety margin, and call it the required stack size. The main advantage of this approach is that it’s easy to do (indeed many good debuggers have this feature built into them). It also has the advantage of being ‘experimental data’. However, there are two big problems with this approach, which will catch the unwary.

The biggest single problem with the experimental approach is the implicit assumption that the experiment that is run is representative of the worst case conditions. What do I mean by the worst case conditions? Well, in an embedded system the maximum stack usage occurs when the interrupt that uses the most stack fires at a point in the code where the foreground application is also using the maximum stack size. On the assumption that most interrupts are asynchronous to the foreground application, the problem should be clear. How exactly do you know after your testing whether or not the interrupt that uses the most stack did indeed trigger at the worst (best?) possible moment? Thus even if your testing had 100% code coverage, it still isn’t possible to know for sure whether you have covered all possible scenarios. If, as is the normal case, you don’t even begin to approach full code coverage, it should be clear to you that testing tends to reveal the typical ‘worst-case’ condition, rather than the genuine worst case condition.

The second major issue with testing is that it tends to be done when the code is close to being completed, rather than when it is completed. The problem is that small changes in the source code can have a huge impact on the required stack size. For example, let’s say that during testing it is discovered that an interrupt service routine is taking so long to complete that another interrupt is occasionally being missed. A ‘quick fix’ is to simply re-enable interrupts in the long interrupt handler, so that the other interrupts can do their thing. This one-line change can lead to a dramatic increase in stack usage. (If you aren’t cognizant of the stack usage of interrupt handlers, you should read this article I wrote).

Analytically

In the analytical approach, the idea is to examine the source code and from the analysis work out the maximum stack usage of the foreground application, and then to add to this the worst case interrupt handler usage. This is obviously a daunting task for anything but the simplest of applications. You will not be surprised to hear that computer programs have been written to perform this analysis. Indeed good quality linkers will now do this for you as a matter of course. Furthermore, my favorite third party tool, PC-Lint from Gimpel, will also now do this starting with version 9. However be warned that it takes a lot of work to set up PC-Lint to perform the analysis.

Although analysis can theoretically give an accurate answer, it does have several problems.

Recursion

It’s almost impossible for an analytical approach to compute the stack usage of a program that uses recursion. Indeed, it’s because of this unbounded effect on stack size that recursion is a really bad idea in embedded systems. MISRA bans it, and I personally banned it about twenty years ago.

Indirect Function Calls

Pointers to functions are something that I use extensively and heartily recommend (for a discussion see this article I wrote). Although they don’t have a deleterious effect on stack size, they do make it quite difficult for analysis programs to track what is going on. Indeed PC-Lint cannot handle pointers to functions when it comes to computing stack usage. Thus if you use an analytical approach and you use pointers to functions, then make absolutely sure that the analysis program can track all the indirect calls.

Optimizers

Code optimization can play havoc with stack usage. Some optimizations reduce stack usage (by, e.g., placing function parameters in registers), while others can increase it. I should note that it’s only third-party tools that may be bamboozled by the optimizer. The linker that makes up part of the compilation package should be aware of everything that the compiler has done.

Complexity

Even if you have a linker that will compute stack usage, interpreting the output of the linker is always a daunting task. For example, the linker from IAR will compute your stack usage. However, it isn’t nice enough to simply say: You need 279 bytes of stack space. Instead you have to study the linker output carefully to glean the requisite information.

A Practical Approach

It’s clear from the above that it isn’t easy to determine the stack size for an application. So how exactly does one set about this in practice? Well here’s what I do.

  1. Locate the stack at the beginning / end of memory (depending upon how the stack grows) and place all variables at the other end of the memory. This essentially means that you are implicitly allowing the maximum amount of memory possible for the stack. Note that many good compilers / linkers will do this automatically for you.
  2. As a starting point, I allocate 10% of the available memory for stack use. If I know I will be using functions that are huge users of the stack (such as printf, scanf and their brethren), then I’ll typically set it to 20% of available memory.
  3. I set up the debug environment from day 1 to monitor and report stack usage. This way as I progress through the development process I get a very good feel for the application’s stack consumption. This also helps in spotting changes to the code that have big impacts on the required stack size.
  4. Once I have ‘all the code written’, I start to make use of the information in the linker report. The tighter I am on memory, the more closely I examine the linker output. In particular, what I often find is that there is one and only one function call chain that leads to a stack usage much greater than that of all the other call chains. In that case, I look to see if I can restructure the call chain so as to bring the maximum stack usage more in line with the typical stack usage.

If you stumbled upon this blog courtesy of a search engine, then I hope you found the above useful. I invite you to check out some of my other posts, which you may find useful. If you are a regular reader, then as always, thanks for stopping by.


Electrical Engineers versus Computer Scientists

Friday, February 6th, 2009 Nigel Jones

Looking back at my various blog postings, I’ve noticed that although I may be controversial on technical topics, I haven’t to date written anything that is controversial on a, shall I say, human side. Well no more Mr. Nice Guy, since today I intend to wade in on the topic of whether Embedded Systems should be programmed by Electrical Engineers or Computer Scientists. Regular readers will know I’m an EE (actually my degree is in EE & ME – but that’s another story) and so you won’t be surprised to hear that my usual preference is for Electrical Engineers. Although I am a (very) opinionated person, I’d like to think that most of my opinions have some basis in reality, and so here’s my opinion and its supporting observations…

The more embedded a product is, the better off you are with an EE, the less embedded it is, the better off you are with a CS.

So what’s the basis for this overblown, sweeping generalization and what exactly do I mean by ‘more embedded’?

Well, I consider a product to be highly embedded if it meets one or more of the following criteria:

  • It has no or very simple user interfaces.
  • It performs a lot of hardware type functions in software. For example a DSP that performs a lot of signal processing is essentially doing in software what was once done in hardware.
  • It contains a lot of complicated hardware that needs extensive configuration and software support (For example a PowerQUICC processor).

By contrast, I consider a product to be lightly embedded if it meets either of the following criteria:

  • It has a sophisticated user interface (especially if the interface is web based)
  • It is database centric.

Evidently there exist products that meet the criteria for both sides of the dichotomy. For example, my new flat-screen TV has a very sophisticated user interface, but I’m sure it also does an extensive amount of signal processing.

If you accept this dichotomy, then it is evident that folks working on highly embedded systems really need to understand the hardware (since that’s what the product is about) whereas those working on lightly embedded systems need a good understanding of how to build large software systems. Having said this, my experience is that whereas EE’s (OK some EE’s) are able to quickly learn the principles of building large software systems, I’ve never yet met a CS major that had anything beyond a casual understanding of what’s really happening at the hardware level. I’ve seen this lack of knowledge (interest?) manifest itself in many ways. Examples include:

  • Not knowing / understanding the Nyquist Sampling theorem
  • Failure to realize that EEPROM / Flash have extraordinarily long write times
  • Not realizing that sampling jitter can destroy the performance of a digital filter

What about the other way? Have I seen EE’s write 1000 line functions, and be completely clueless about principles such as data encapsulation? Absolutely! However, I have also seen EE’s successfully craft very large systems. As a result I’ve come to two basic observations:

  • A deeply embedded system written entirely by a CS major will have major problems.
  • A lightly embedded system written entirely by an EE major may have major problems.

On this basis, I prefer (slightly) to have EE’s work on embedded systems.

It doesn’t take a rocket scientist to conclude that perhaps the best approach is to have a team where the EE’s handle the hardware centric stuff and the CS’s handle the computer centric stuff. Indeed, this is the approach I see taken in most organizations.

As a final thought, although it is common to find EE majors that have gone back to college to get a Masters in Computer Science, I haven’t yet met a CS major that has gone back to college to get a Masters in Electrical Engineering.


Improvements versus Features

Thursday, August 7th, 2008 Nigel Jones

I’m taking a slight detour from my usual topics to blather about what I see as an unfortunate trend that is making its way from the PC world to the embedded world. My perception is that as more embedded systems get sophisticated user interfaces, the desire to add features seems inescapable. While I don’t see adding features as bad, per se, doing so instead of improving the product is a bad thing. What do I mean by improving the product? Well, typically those things that most users don’t understand, for example noise floors, power consumption, SNR, software reliability and so on.

In the days before user interfaces, pretty much the only way to improve a product was to work on the “invisible” parameters. Today, it’s often far easier to add a new feature than it is to labor at, for example, wringing a few more dB of performance out of that digital filter while keeping the number of clock cycles unchanged.

Am I tilting at windmills? I don’t think so. Is my plea pointless? Probably. However, the next time someone comes along asking for a YANF (Yet Another New Feature), do them and yourself a favor and ask how time spent on the YANF compares to time spent on improving the product.


The perils of overloading

Monday, February 4th, 2008 Nigel Jones

This post is coming to you from Sweden – a very fine country that I heartily recommend visiting if you get the chance. (If you’re wondering why I’m in Sweden – I’m here on business, as one of my clients is located in Gothenburg.) Anyway, the fact that I’m in Sweden is relevant to this post, as to get here I had to put myself at the mercies of United Airlines. Now the fact that the flight over here was less than perfect wouldn’t be news to any of you that travel regularly. However, the reason that the flight was a disaster is relevant, as I’ll now try to explain…

Upon arrival at the United check-in desk at Dulles airport, I was greeted by an array of self-check-in kiosks, with a total of one real live human being to take care of baggage check-in. Thinking myself to be computer savvy, I negotiated the check-in kiosk with ease, only to be told that:

  1. I had to see the human in order to check my bags in, and
  2. The system was unable to assign me a seat and that seat assignment would be done at the gate.

The first instruction was par for the course, while the second I found very strange. Anyway, I shrugged my shoulders and went over to the sole person working the desk. There was one gentleman in front of me. This gentleman, not unreasonably, asked if he could use some of his frequent flier miles to upgrade to business class. No problem, said the United employee, who proceeded to rattle the keys. After 5 minutes, he announced that although the system was showing that seats were available in business class, the computer system refused to allow him to assign a seat. This was the second clue that things were heading south in a hurry. It then took the clerk another 10 minutes to wait-list the gentleman (giving a total processing time of 15 minutes). Although it’s possible the clerk was incompetent, I got the impression that he really knew what he was doing, and was just being stymied by the system.

Anyway, I checked my bag in and proceeded to the gate. When I got to the gate, I found another 100+ passengers who also had no seat assignments. When eventually I got called to the counter, I found a harried woman with a sea of boarding passes printed out in front of her. She was manually searching through them, trying to find my name. Eventually she found it and handed it over. My nature being what it is, I politely inquired as to the reason for this astonishingly strange system of assigning seats and issuing boarding passes. Apparently this was the opportunity that the clerk had been waiting for to vent her frustration, as she gladly explained to me that the powers that be had overbooked the flight. And so, my gentle reader, we come to the point of this post. It was apparent that the United system was unable to handle an overbooked flight correctly, and rather than degrade gracefully, had all but collapsed. At which point I started making some snarky comments to myself about database programmers: how surely all database programmers worked in that field because they couldn’t handle the rigors of the embedded / real time world, and how any half-decent embedded systems person would never make such an elementary mistake. It was then that I had my epiphany. We make the same mistake in the embedded world all the time. When was the last time you used RMA (rate monotonic analysis) to guarantee that all your tasks would meet their scheduling deadlines? How many failures of embedded systems are caused by overloading (or over-scheduling) and the failure to correctly assign task priorities? How many times do weird things happen in your code that you just shrug off as “one of those things”? In short, I found myself cutting a break to the poor sod that wrote United’s code. I was still ticked off though!
