Archive for the ‘RTOS Multithreading’ Category

Firmware-Specific Bug #7: Deadlock

Wednesday, November 17th, 2010 Michael Barr

A deadlock is a circular dependency between two or more tasks. For example, if Task 1 has already acquired A and is blocked waiting for B while Task 2 has previously acquired B and is blocked waiting for A, neither task will awake. Circular dependencies can occur at several levels in the architecture of a multithreaded system (for example, each task is waiting for an event only the other will send) but here I am concerned with the common problem of resource deadlocks involving mutexes.

Best Practice: Two simple programming practices are each able to entirely prevent resource deadlocks in embedded systems. The first technique, which I recommend over the other, is to never attempt or require the simultaneous acquisition of two or more mutexes. Holding one mutex while blocking for another mutex turns out to be a necessary condition for deadlock. Holding one mutex is never, by itself, a cause of deadlock.[3]

In my view, the practice of acquiring only one mutex at a time is also consistent with an excellent architectural practice of always pushing the acquisition and release of mutexes into the leaf nodes of your code. The leaf nodes are the device drivers and reentrant libraries. This keeps the mutex acquisition and release code out of the task-level algorithmics and helps to minimize the amount of code inside critical sections.[4]

The second technique is to assign an ordering to all of the mutexes in the system (for example, alphabetical order by mutex handle variable name) and to always acquire multiple mutexes in that same order. This technique will definitely remove all resource deadlocks but comes with an execution-time price. I recommend removing deadlocks this way only when you’re dealing with large bodies of legacy code that can’t be easily refactored to eliminate the multiple-mutex dependency.

Footnotes

[3] In theory, the task that wants the mutex could starve while a series of higher priority tasks take turns with the mutex. However, the rate monotonic analysis can be used to ensure this doesn’t happen to tasks with deadlines that must be met.

[4] An additional benefit of this architectural pattern is that it reduces the number of programmers on the team who must remember to use and correctly use each mutex. Other benefits are that each mutex handle can be hidden inside the leaf node that uses it and that doing this allows for easier switches between interrupt disables and mutex acquisition as appropriate to balance performance and task prioritization. One of the most famous priority inversions happened on Mars in 1997. Glitches were observed in Earth-based testing that could not be reproduced and were not attributed to priority inversion until after the problems on Mars forced investigation. For more details, read Glenn Reave’s “What really happened on Mars?” account.

Firmware-Specific Bug #6

Firmware-Specific Bug #8

Upcoming Embedded Software Boot Camps

Thursday, October 21st, 2010 Michael Barr

Earlier this week, Netrino announced the dates and locations for a pair of upcoming public sessions of the popular hands-on Embedded Software Boot Camp workshops.  The dates and locations will be as follows:

These will be the 9th and 10th times, respectively, that the Embedded Software Boot Camp has been offered to the public since the first was held in 2008.  The workshop is a one-week skills strengthening program consisting of a series of lectures and hands-on exercises. This intense educational (yet fun!) program is guaranteed to quickly and dramatically raise the embedded programming skills of individuals and teams.

Everyone who attends learns a ton, including:

  • How to write portable device drivers and interrupt handlers in C
  • How to decide if an RTOS will benefit your application
  • How to architect real-time software to ensure all deadlines are met
  • More than fifty practical tips for reducing firmware bugs
  • How to find, fix, and prevent each of the top 10 firmware bugs

I will be the instructor for both of these sessions and hope you will make the commitment to become a Master Firmware Engineer in 2011 by taking this course!  You can find out more about pricing and other details here: http://www.netrino.com/Training-Calendar.

Get Your (RTOS Task and ISR) Priorities Straight!

Thursday, October 14th, 2010 Michael Barr

Get your priorities straight!  Or suffer the consequence of missed deadlines.  Of course, I’m talking here about the relative priorities of your real-time tasks and interrupt service routines.  In my travels around the embedded design community, I have learned that most real-time systems are designed with ad hoc priorities.

Unfortunately, mis-prioritized systems often “appear” to work fine without discernibly missing critical deadlines in testing.  The worst-case workload may have never yet happened in the field or there is sufficient CPU to accidentally succeed despite the lack of proper planning.  This has lead to a generation of embedded software developers being unaware of the proper technique.  There is simply too little feedback from non-reproducible deadline misses in the field to the team that designed it—unless a death and a lawsuit forces an investigation.

Truth be told: There is a science to the process of assigning relative priorities.  That science is associated with the “rate monotonic algorithm,” which provides a formulaic way to assign task priorities based on facts.  It is also associated with the “rate monotonic analysis,” which helps you prove that your correctly-prioritized tasks and ISRs will find sufficient available CPU bandwidth between them during extreme busy workloads called “transient overload.”   It’s too bad most engineers don’t know how to use these tools.

I’ve written on the why and how of RMA before and recommend you start with my article Introduction to Rate-Monotonic Scheduling and then read my blog post 3 Things Every Programmer Should Know About RMA.

3 Things Every Programmer Should Know About RMA

Wednesday, August 18th, 2010 Michael Barr

This post was originally posted in the wrong blog.  I’m reposting it here.

Real-time systems design and RMA go together like peanut butter and jelly.  So why is it that wherever I go in the embedded community, engineers are developing real-time systems without applying RMA?  This is a dangerous situation, but one that is easily remedied by ensuring every programmer knows three things about RMA.

In case you are entirely unfamiliar with RMA, there’s a handy primer on the technique athttp://www.netrino.com/Embedded-Systems/How-To/RMA-Rate-Monotonic-Algorithm/. I’ve tried to write this blog post in a way that you can read that before or after, at your option.

#1: RMA is Not Just for Academics

You have probably heard of RMA.  Maybe you can even expand the acronym.   Maybe you also know that the theoretical underpinnings of RMA were developed largely at Carnegie Mellon University’s Software Engineering Institute and/or that the technique has been known for about three decades.

If, however, you are like the vast majority of the thousands of firmware engineers I have communicated with on the subject during my years as a writer/editor, consultant, and trainer, you probably think RMA is just for academics.  I also thought that way years ago—but here’s the straight dope:

  • All of the popular commercial real-time operating systems (e.g., VxWorksThreadX, and MicroC/OS) are built upon fixed-priority preemptive schedulers.[1]
  • RMA is the optimal method of assigning fixed priorities to RTOS tasks.  That is to say that if a set of tasks cannot be scheduled using RMA, it can’t be scheduled using any fixed-priority algorithm.
  • RMA provides convenient rules of thumb regarding the percentage of CPU you can safely use while still meeting all deadlines.  If you don’t use RMA to assign priorities to your tasks, there is no rule of thumb that will ensure all of their deadlines will be met.[2]

A key feature of RMA is the ability to prove a priori that a given set of tasks will always meet its deadlines—even during periods of transient overload.  Dynamic-priority operating systems cannot make this guarantee.  Nor can fixed-priority RTOSes running tasks prioritized in other ways.

Too many of today’s real-time systems built with an RTOS are working by luck. Excess processing power may be masking design and analysis sins or the worst-case just hasn’t happened—yet.

Bottom line: You’re playing with fire if you don’t use RMA to assign priorities to critical tasks; it might be just a matter of time before your product’s users get burned.[3]

#2: RMA Need Not Be Applied to Every Task

As any programmer that’s already put RMA into practice will tell you, the hardest part of the analysis phase is establishing an upper bound for the worst-case execution time of each task. The CPU utilization of each task is computed as the ratio of its worst-case execution time to its worst-case period. [4]

There are three ways to place an upper bound on execution time: (1) by measuring the actual execution time during the tested worst-case scenario; (2) by performing a top-down analysis of the code in combination with a cycle-counter; or (3) by making an educated guess based on big-O notation.  I call these alternatives measuring, analyzing, and budgeting, respectively, and note that the decision of which to use involves tradeoffs of precision vs. level of effort. Measurement can be extremely precise, but requires the ability to instrument and test the actual working code—which must be remeasured after every code change.  Budgeting is easiest and can be performed even at the beginning of the project, but it is necessarily imprecise (in the conservative direction of requiring more CPU bandwidth than is actually required).

But there is at least some good news about the analysis.  RMA need not be performed across the entire set of tasks in the system.  It is possible to define a smaller (often much smaller, in my experience) critical set of tasks on which RMA needs to be performed, with the remaining non-critical tasks simply assigned lower priorities.

This critical set of tasks should contain all of the tasks with deadlines that can’t be missed or else.  In addition, it should contain any other tasks the former set either share mutexes with or from which they require timely semaphore or message queue posts.  Every other task is considered non-critical.

RMA can be meaningfully applied to the critical set tasks only, so long as we ensure that all of the non-critical tasks have priorities below the entire critical set.  We then need only determine worst-case periods and worst-case execution times for the critical set.  Furthermore, we need only follow the rate monotonic algorithm for assignment of priorities within the critical set.

Bottom line: Anything goes at lower priorities where there are no deadlines.

#3: RMA Applies to Interrupt Service Routines Too

With few exceptions, books, articles, and papers that mention RMA describe it as a technique for prioritizing the tasks on a preemptive fixed-priority operating system.  But the technique is also essential for correctly prioritizing interrupt handlers.

Indeed, even if you have designed a real-time system that consists only of interrupt service routines (plus a do-nothing background loop in main), you should use the rate monotonic algorithm to prioritize them with respect to their worst-case frequency of occurrence.  Then you can use rate monotonic analysis to prove that they will all meet their real-time deadlines even during transient overload.

Furthermore, if you have a set of critical tasks in addition to interrupt service routines the prioritization and analysis associated with RMA need to be performed across the entire set of those entities.[5] This can be complicated, as there is an arbitrary “priority boundary” imposed by the CPU hardware: even the lowest priority ISR is deemed more important than the highest priority task.

For example, consider the conflict in the set of ISRs and tasks in Table 1.  RMA dictates that the priority of Task A should be higher than the priority of the ISR, because Task A can occur more frequently.  But the hardware demands otherwise, by limiting our ability to move ISRs down in priority.  If we leave things as they are, we cannot simply sum the CPU utilization of this set of entities to see if they are below the schedulable bound for four entities.

Runnable Entity Priority by RMA Worst-Case Execution Time Worst-Case Period CPU Utilization
ISR 2 500 us 10 ms 5%
Task A 1 750 us 3 ms 25%
Task C 3 300 us 30 ms 1%
Task B 4 8 ms 40 ms 20%

Table 1.  A Misprioritized Interrupt Handler

So what should we do in a conflicted scenario like this?  There are two options.  Either we change the program’s structure, by moving the ISR code into a polling loop that operates as a 10 ms task at priority 2—in which case total utilization is 51%.  Or we treat the ISR, for purposes of proof via rate monotonic analysis anyway, as though it actually has a worst-case period of 3 ms.  In the latter option, the ISR has an appropriate top priority by RMA but the CPU bandwidth dedicated to the ISR increases from 5% to 16.7%–bringing the new total up to 62.7%  Either way, the full set is provably schedulable.[6]

Bottom line: Interrupt handlers must be considered part of the critical set, with RMA used to prioritize them in relation to the tasks they might steal the CPU away from.

Conclusion

Every programmer should know three key things about RMA.  First, RMA is a technique that should be used to analyze any preemptive system with deadlines; it is not just for academics after all.  Second, the amount of effort involved in RMA analysis can be reduced by ignoring tasks outside the critical set; non-critical tasks can be assigned an arbitrary pattern of lower priorities and need not be analyzed.  Finally, if interrupts can preempt critical set tasks or even just each other, RMA should be used to analyze those too.


[1] Schedulers that tweak task priorities dynamically, as desktop flavors of Windows and Linux do, may miss deadlines indiscriminately during transient overload periods.   They should thus not be used in the design of safety-critical real-time systems.

[2] For example, it is widely rumored that a system less than 50% loaded will always meet its deadlines.  Unfortunately, there is no such rule of thumb that’s correct.  By contrast, when you do use RMA there is a simple rule of thumb ranging from a high of 82.8% for 2 tasks to a low of 69.2% for N tasks.

[3] Perhaps your failure to use RMA to prioritize tasks and prove they’ll meet deadlines explains one or more of those “glitches” your customers have been complaining about?

[4] Establishing the worst-case period of a task is both easier and more stable.

[5] Note this is necessary even if one or more of the interrupts doesn’t have a real-time deadline of its own.  That’s because the interrupts may occur during the transient overload and thus prevent one or more critical set tasks from meeting its real-time deadline.

[6] However, switching the code to use polling actually consumes cycles that are only reserved for the worst-case in the other solution.  That could mean failing to find CPU time for low priority non-critical tasks in the average case.

Rate Monotonic Analysis and Round Robin Scheduling

Friday, January 22nd, 2010 Michael Barr

Rate Monotonic Analysis (RMA) is a way of proving a priori via mathematics (rather than post-implementation via testing) that a set of tasks and interrupt service routines (ISRs) will always meet their deadlines–even under worst-case timing.  In this blog, I address the issue of what to do if two or more tasks or ISRs have equal priority and whether round robin scheduling is necessary in an RTOS to deal with that special case.

First a little background.  In order for the schedulability analysis portion of the RMA mathematics to provide meaningful results, the following assumptions must hold:

Under RMA, the relative priorities are assigned according to a simple rule: “The more often a task or ISR runs (in the worst-case), the higher its priority.” Put another way, the task or ISR with the longest period between iterations (interarrival time, if you prefer) is least important. This is because an infrequent but high-priority task could prevent a more frequent task from missing an entire iteration.

So what happens if you are using RMA to assign priorities and you wind up with two (or more) tasks or ISRs assigned equal priority? (Translation: they have the same worst-case interarrival times). Must they be assigned equal priority in the real system? What if the RTOS (in the case of tasks) or hardware (in the case of interrupts) doesn’t support round-robin scheduling–or even equal priorities with run-to-completion?

Interestingly, it turns out not to matter a bit whether you:

  1. Merge the two tasks into one (i.e., executed code for Task A then Task B).
  2. Give them equal priority, either with round robin or run-to-completion behavior.
  3. Give them adjacent unequal priorities (in either relative order).

If you run through the timing diagrams for each of the above scenarios, you’ll see that all three are equivalent. Except that the equal priority with round robin potentially suffers a performance impact from unnecessary additional context switches.