Posts Tagged ‘rtos’

Is Reliable Multithreaded Software Possible?

Wednesday, December 23rd, 2009 Michael Barr

Until earlier this month, I’d overlooked a most interesting May 2006 article in Embedded Systems Design magazine by Mark Bereit titled “Escape the Software Development Paradigm Trap”. The article opines that the methods we use to design embedded software, particularly multitasked software with interrupt service routines and/or real-time operating systems, are fundamentally incompatible with reliability.

Here’s the critical analogy:

Imagine for a minute that I’ve invented the Universal Bolt. This is a metal object for joining threaded holes that can extend or collapse to fit a variety of lengths. It can expand or contract to fit holes of different diameters. The really cool feature is that I have replaced the bolt’s spiral ridge with a series of extendable probes that can accommodate different thread pitches. You no longer need to stock a variety of bolts of different sizes and lengths and thread spacings because my Universal Bolt can be used in place of any of them.

Because it’s able to change configurations extremely quickly, a single Universal Bolt can take the place of many conventional bolts simultaneously. What we do is rig up a clever and very fast dispatcher device that quickly moves the [Universal Bolt] from hole to hole. If the dispatcher is fast enough, my Universal Bolt can spend a moment in each hole in turn and get the whole way through your [mechanical] product so fast that it returns to each hole before the joint has had a chance to separate.

You’d have to be crazy to fly in an airplane designed this way. “If anything caused the dispatcher to derail, the entire product would collapse in a second.” Yet this analogy describes the design of most products powered by embedded computers.

A fast and complex thread dispatcher keeps moving one simple and stupid integer-computation unit all over a big system tending to tasks [and ISRs] rapidly enough that they all get done. And if that dispatcher ever once leads the CPU into an invalid memory address the whole thing crashes to a halt.

Clearly, we need a new paradigm for reliable embedded software architecture. My thoughts on that are coming to this space in 2010.

Embedded Java Lives!

Wednesday, December 16th, 2009 Michael Barr

Reading the latest embedded software market survey highlights from VDC Research I was surprised to note two data points indicating new upward momentum for Java as an embedded software development language.

First, of those survey respondents using an operating system on their current project, 11% indicated that a Java Virtual Machine is required in their product. Second, Java was selected as the fifth most used language for firmware development at 14% of respondents (behind C, assembly, C++, and Matlab, in that order).

This is an interesting trend.  My regular readers will note that I have written and spoken about Java in embedded systems since 1997 and that I declared Java “dead” in the embedded realm about 18 months ago.

Slack Scheduling vs. Rate Monotonic Analysis

Friday, October 2nd, 2009 Michael Barr

The “slack scheduling” technique described in a recent Embedded.com article by Bill Cronk is interesting to me for two reasons. First, a traditional priority-based preemptive RTOS used in conjunction with RMA priority assignment offers all of the pros and none of the cons of the described slack scheduling method. Second, slack scheduling may still be valuable when working within a corporate or industry regime, such as ARINC-653, that legislates a cyclic executive approach to achieving determinism.

I don’t think the original article addresses either of these points, so I will address them here.

I have written and spoken about RMA extensively in the past. For readers wanting background, start with Introduction to Rate Monotonic Scheduling. Then read Why RMA is Not Just for Academics, from which comes this important and relevant quote:

The principal benefit of RMA is that the performance of a set of tasks thus prioritized degrades gracefully. Your key “critical set” of tasks can be guaranteed (even proven a priori) to always meet its deadlines–even during periods of transient overload. Dynamic-priority operating systems cannot make this guarantee. Nor can static-priority RTOSes running tasks prioritized in other ways.

In plain English, RMA helps you prove (via math) that your interrupt service routines (ISRs) plus the subset of tasks with deadlines will always complete their work on time. Then you can do whatever the heck you want at lower priority levels without messing that up. Consider that with this architecture a ventilator could feature a low priority video game. If the CPU becomes overloaded, only the game’s performance is at risk. The rest (most) of the time the game gets to use all the “slack” CPU cycles left behind by the critical set.
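As a rough sketch of what that math can look like, the snippet below applies the classic Liu and Layland utilization bound, n(2^(1/n) - 1), to a small critical set. The task names and timing numbers are invented for illustration, and the test assumes independent periodic tasks with deadlines equal to their periods, no blocking on shared resources, and negligible context-switch overhead. It is a sufficient test only: a critical set that fails it may still be schedulable under a more exact analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    wcet_ms: float    # worst-case execution time per release
    period_ms: float  # minimum time between releases (deadline = period)


def utilization(tasks):
    """Total worst-case CPU utilization of the task set."""
    return sum(t.wcet_ms / t.period_ms for t in tasks)


def rma_bound(n):
    """Liu and Layland least upper bound for n tasks: n * (2^(1/n) - 1)."""
    return n * (2 ** (1.0 / n) - 1)


def rma_priorities(tasks):
    """Rate monotonic order: the shortest period gets the highest priority."""
    return sorted(tasks, key=lambda t: t.period_ms)


def passes_rma_test(tasks):
    """Sufficient (not necessary) schedulability test for the critical set."""
    return utilization(tasks) <= rma_bound(len(tasks))


# Hypothetical ventilator critical set; all names and numbers are made up.
critical_set = [
    Task("alarm_check", wcet_ms=5.0, period_ms=50.0),
    Task("pressure_isr", wcet_ms=0.1, period_ms=1.0),
    Task("valve_control", wcet_ms=2.0, period_ms=10.0),
]

print([t.name for t in rma_priorities(critical_set)])
print(f"U = {utilization(critical_set):.3f}, bound = {rma_bound(len(critical_set)):.3f}")
print("critical set schedulable:", passes_rma_test(critical_set))
```

Note that the video game never appears in the analysis. It simply runs at a priority below everything in the critical set.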

The pros of slack scheduling appear to be:

  • Non-critical tasks (i.e., those without deadline miss consequences) can’t steal CPU cycles needed by critical tasks.
  • Allows slack CPU cycles to be used by the non-critical tasks.
  • No need to ever “terminate any lower criticality [code] that exceeds” its time slot.
  • Client tasks never need to “wait for their server thread to be scheduled”.

All of these pros are common to the RMA+RTOS approach as well.

The cons of slack scheduling appear to be:

  • There is overhead associated with tracking, granting, and requesting slack time (the article doesn’t provide enough info to quantify these).
  • There is overhead associated with waking tasks (or polling instead of interrupts) to see if they even need to use any of their allocated time in this cycle.
  • Like all time-partitioned code structures, it is fragile in the face of late-added work items (“the introduction of a new thread may be difficult if there is no room available on the timeline”).

The RMA+RTOS approach suffers none of these downsides.

It is important to understand that the two really powerful and beautiful aspects of RMA are, first, that tasks outside the critical set need not be budgeted or analyzed at all (they don’t have deadlines anyway) and, second, that these low priority non-critical tasks have automatic, free (no-overhead) use of all the CPU cycles not used by those with deadlines. All of the critical set schedulability analysis is done on worst-case CPU use. Best case and average case may be substantially less. For example, an interrupt that could fire as often as every 1 ms in the worst case eats up a lot of CPU time on paper only (well, at least until that worst case actually occurs and it really does consume that much CPU); the actual implementation need not be changed to poll; and every last unused CPU cycle is available for other uses in the real system.
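To put hypothetical numbers on the gap between paper and practice, here is a small back-of-the-envelope calculation. The service times and the typical interrupt rate are assumptions chosen purely for illustration.

```python
def cpu_fraction(exec_time_ms, releases_per_second):
    """Fraction of one CPU consumed by work that runs this often."""
    return exec_time_ms * releases_per_second / 1000.0


# Assumed worst case: an interrupt every 1 ms, 0.2 ms to service each one.
budgeted = cpu_fraction(0.2, releases_per_second=1000)

# Assumed typical conditions: 50 interrupts per second, 0.1 ms each.
typical = cpu_fraction(0.1, releases_per_second=50)

print(f"budgeted in the schedulability analysis: {budgeted:.1%}")
print(f"consumed in a typical second: {typical:.1%}")
print(f"left over for lower priority tasks: {budgeted - typical:.1%}")
```

Under a priority-based preemptive scheduler, that difference reaches the lower priority tasks without any bookkeeping. They run whenever nothing in the critical set is ready.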

Unfortunately, few embedded software engineers apply RMA at all. And some of those who do may get the math wrong. There are assumptions and calculations to be handled properly. That’s where a time-partitioned code structure (a.k.a. cyclic executive or real-time executive) is demonstrably better than RMA+RTOS. And that is probably why ARINC-653 mandates time-partitioned scheduling.

Under a mandate of time-partitioned scheduling, the described slack scheduling approach may aid implementers.

Where Have All the RTOS Vendors Gone?

Wednesday, September 23rd, 2009 Michael Barr

I’m pleased to report that the Embedded Systems Conference (ESC) is alive and well here in Boston this year. This success is despite the recession and industry trends that have caused some other technical trade shows to fold this year. (That’s right, I’m talking about you, Software Development Conference.) There’s even apparently going to be an ESC Chicago in 2010!

However, the RTOS vendors are largely and notably absent from this year’s event. Of the major players, only Enea and Green Hills have booths.

Wind River has long been fickle about making camp at ESC, of course, with yearly vacillations between the largest booth at the show and none at all. Their new parent Intel has acted similarly regarding pitching chips to embedded system designers over the years. Thus it is not too surprising to me that neither is here while they sort through the post-acquisition marketing shifts and tactical planning.

But where are the booths for Micrium, Mentor (Nucleus and VRTX), Keil, QNX, Express Logic, LynuxWorks, Quadros and the others this year? Are Microsoft, Linux, Enea, and Green Hills eating your lunch?

I can’t help but connect their absence with the five-year downward trend of intention to buy a “Commercial OS” I noted in TechInsights’ 2009 Embedded Market Study. But is that simply because no one is marketing RTOSes to developers any more?

Take the Mutual Exclusion Challenge

Thursday, September 10th, 2009 Michael Barr

If you’ve been reading my articles or blog for a while, you’ve probably noticed a few pieces about the differences between mutexes and semaphores. The most concise presentation of these issues that I’ve made was published last year in Embedded Systems Design. That article, called Mutexes and Semaphores Demystified, is also available at http://www.netrino.com/Embedded-Systems/How-To/RTOS-Mutex-Semaphore.
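For readers who want the distinction in front of them, here is a minimal sketch that uses Python’s threading primitives as stand-ins for RTOS calls. The mutex (an RLock here) is locked and unlocked by the same thread to protect a shared resource, while the semaphore is given by one thread and taken by another as a signal. The variable and function names are hypothetical, and a real RTOS mutex typically also provides priority inheritance, which this sketch does not model.

```python
import threading

readings = []                        # shared resource
readings_mutex = threading.RLock()   # mutex: locked and unlocked by the same thread
data_ready = threading.Semaphore(0)  # semaphore: given by one thread, taken by another


def producer():
    with readings_mutex:             # protect the shared list
        readings.append(42)
    data_ready.release()             # signal: "a reading is available"


def consumer(received):
    data_ready.acquire()             # wait for the producer's signal
    with readings_mutex:
        received.append(readings[-1])


def foreign_unlock(outcome):
    try:
        readings_mutex.release()     # this thread does not own the mutex
    except RuntimeError:
        outcome.append("refused")


received, outcome = [], []
threads = [threading.Thread(target=consumer, args=(received,)),
           threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with readings_mutex:                 # the main thread owns the mutex here
    intruder = threading.Thread(target=foreign_unlock, args=(outcome,))
    intruder.start()
    intruder.join()

print(received, outcome)
```

The last few lines illustrate ownership: a thread that did not lock the mutex cannot unlock it, whereas any thread may give a semaphore.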

A new blogger in the embedded software area, Niall Cooling, is revisiting the mutex vs. semaphore subject, and reading his posts led me to a few other sources on the subject. (You can find his blog at http://www.feabhas.com/blog.) The “Toilet Example” that he cites via a link to another website contains one of the worst explanations of the use of semaphores I have seen. I don’t even know where to start rewriting it.

So I challenge you, dear RSS subscribers: can you, individually or collectively, (a) identify the flaws in the Toilet Example explanation at http://koti.mbnet.fi/niclasw/MutexSemaphore.html and (b) propose a proper implementable solution by way of a rewrite? I suggest we do this via the comment mechanism provided at the end of this blog post.