Posts Tagged ‘safety’

Verification vs. Validation

Tuesday, December 15th, 2009 Michael Barr

The FDA 510(k) guidelines for medical device software do a poor job of differentiating two important and distinct software development practices: verification and validation.  In particular, the FDA often uses the word ‘validation’ to describe both types of activities.  (See, for example, the General Principles of Software Validation; Final Guidance for Industry and FDA Staff.)

Put simply, software validation is a set of activities that together demonstrate that you “made the correct product” (or, as others have put it, “built the right thing”) for the customer’s needs.  Validation tests that the product’s behavior is consistent with the requirements, safe, and efficacious.

By contrast, software verification is a set of activities that together demonstrate that the implementation matches the design.  That is, verification tests that you “made the product correctly” (“built it right”).

In the larger context, verification should come before validation.  It doesn’t make sense to check that the product does what it is supposed to unless you first confirm that it does what you programmed it to.  If only the many engineers and organizations that talk about software verification and validation (a.k.a. V&V) could grasp this simple distinction.  It wouldn’t hurt, of course, if the FDA rewrote the above document.

Breathalyzer Source Code Analysis

Thursday, November 5th, 2009 Michael Barr

Firmware bugs seem to be everywhere these days. So much so that firmware source code analysis is even entering the courtroom in criminal cases involving data collection devices with software inside. Consider the precedent-setting case of the Alcotest 7110. After a two-year legal fight, seven defendants in New Jersey DUI cases successfully won the right to have their experts review the source code for the Alcotest firmware.

The state and the defendants both ultimately produced expert reports evaluating the quality of the firmware source code. Though each side’s experts reached divergent opinions as to the overall code quality, several facts seem to have emerged as a result of the analysis:

– Of the available 12 bits of A/D precision, just the 4 most-significant bits are used in the actual calculation. This sorts each raw blood-alcohol reading into one of 16 buckets. (I wonder how they biased the rounding on that.)
– Out of range A/D readings are forced to the high or low limit. This must happen with at least 32 consecutive readings before any flags are raised.
– There is no feedback mechanism for the software to ensure that actuated devices, such as an air pump and infrared sensor, are actually on or off when they are supposed to be.
– The software first averages the initial two readings. Then it averages the third reading with that average. Then the fourth reading is averaged in, etc. No comments or documentation explains the use of this formula, which causes the final reading to have a weight of 0.5 in the final value and the one before that to have a weight of 0.25, etc.
– Out of range averages are forced to the high or low limit too.
– Static analysis with lint produced over 19,000 warnings about the code (that’s about three warnings for every five lines of source code).

What would you infer about the reliability of a defendant’s blood-alcohol level if you were on that jury? If you’re so inclined, you can read the full expert reports for yourself: the defendants’ and the state’s.

This Code Stinks! The Worst Embedded Code Ever

Thursday, November 5th, 2009 Michael Barr

At the Embedded Systems Conference Boston in September, I gave a popular ESC Theater talk titled “This Code Stinks! The Worst Embedded Code Ever” that used lousy code from real products as a teaching tool. The example code was gathered by a number of engineers from a broad swath of companies over several years. (Minor details, including variable names and function names, were changed as needed to hide the specifics of applications, companies, or programmers.)

Here’s just one example of the bad code in that presentation:

y = (x + 305) / 146097 * 400 + (x + 305) % 146097 / 36524 * 100 + (x + 305) % 146097 % 36524 / 1461 * 4 + (x + 305) % 146097 % 36524 % 1461 / 365;

I don’t know if the above snippet contains any bugs, as most of the other examples were found to contain. And that’s a problem. Where are we supposed to begin an analysis of the above? What is this code supposed to do when it works? What range of input values is appropriate to test? What are the correct output values for a given input? Is this code responsible for handling out-of-range inputs gracefully? In the original listing, there were no comments on or around this line to help.

I eventually learned that this code computes the year, accounting for extra days in leap years, given the number of days since a known reference date (e.g., January 1, 1970). But note that we still don’t know if it works in all cases, despite it being present in an FDA-regulated medical device. I note too that the Microsoft Zune bug was buried in a much better formatted snippet of code that performed a very similar calculation.

Here’s another example, this time in C++, with the bug finding left as an exercise for the reader:

bool Probe::getParam(uint32_t param_id, int32_t idx)
{
    int32_t val = 0;
    int32_t ret = 0;

    ret = m_pParam->readParam(param_id, idx, &val);

    if (!ret)
    {
        logMsg("attempt to read parameter failed\n");
        exit(1);
    }
    else …

Hint: This code was embedded in a piece of factory automation equipment.

I’ve placed the full set of slides online at http://bit.ly/badcode.

Help Bring the Embedded Software Boot Camp to Your City

Friday, October 30th, 2009 Michael Barr

When Netrino announced the first public offering of the Embedded Software Boot Camp a year and a half ago, I had no idea how popular it would be. Or just how much I would love teaching the intensive, hands-on, week-long version of the training we had developed over many years.

At this point, we have educated hundreds of engineers about embedded software architecture and related best practices through the topics of Hardware Interfacing in C, Multithreaded RTOS Programming, and RTOS Alternatives, along with ARM-based programming exercises.

Here’s just a small sampling of feedback from recent attendees:

“I would like to thank you again for the Embedded Software Boot Camp. I brought all the books back to the company and showed my boss the slides and all the handouts and all that good stuff and he was very impressed. Needless to say he was happy with the investment he made in Netrino.” — Garrett

“A better use of time and money than the Wind River VxWorks training course I took last month!” — David, IBM

“Hands on exercises are well thought out.” — Mahesh

“This is one of the best trainings I have ever attended.” — H., Hughes Network Systems

“Fabulous, pertinent, comprehensive and articulate collection of the most important things needed practically. Awesome!” — Sourabh

“Complete and correct embedded software training.” — P. Sipika

For 2010, we are planning a multi-city worldwide road-show for this popular event. I plan to teach as many of them personally as I can. I’d love to have you join us, but we first need your input to select the best cities and dates. If you’ve got a minute, please take our quick 5-question online survey at:

http://survey.constantcontact.com/survey/a07e2m7qx6ig1dxfq9v/start

No personally identifying information is gathered in the survey, so we can’t follow up with you directly. If you want to be the first to know what cities and dates we choose, be sure to sign up for our mailing list or bookmark our public training calendar.

Slack Scheduling vs. Rate Monotonic Analysis

Friday, October 2nd, 2009 Michael Barr

The “slack scheduling” technique described in a recent Embedded.com article by Bill Cronk is interesting to me for a few reasons. First, because a traditional priority-based preemptive RTOS used in conjunction with RMA priority assignment offers all of the pros and none of the cons of the described slack scheduling method. And second, because slack scheduling may still be valuable when working within a corporate or industry regime, such as ARINC-653, that legislates a cyclic executive approach to achieving determinism.

I don’t think the original article addresses either of these points, so I will address them here.

I have written and spoken about RMA extensively in the past. For readers wanting background, start with Introduction to Rate-Monotonic Scheduling. Then read Why RMA is Not Just for Academics, from which comes this important and relevant quote:

The principal benefit of RMA is that the performance of a set of tasks thus prioritized degrades gracefully. Your key “critical set” of tasks can be guaranteed (even proven a priori) to always meet its deadlines–even during periods of transient overload. Dynamic-priority operating systems cannot make this guarantee. Nor can static-priority RTOSes running tasks prioritized in other ways.

In plain English, RMA helps you prove (via math) that your interrupt service routines (ISRs) plus the subset of tasks with deadlines will always complete their work on time. Then you can do whatever the heck you want at lower priority levels without messing that up. Consider that with this architecture a ventilator could feature a low priority video game. If the CPU becomes overloaded, only the game’s performance is at risk. The rest (most) of the time the game gets to use all the “slack” CPU cycles left behind by the critical set.

The pros of slack scheduling appear to be:

  • Non-critical tasks (i.e., those without deadline miss consequences) can’t steal CPU cycles needed by critical tasks.
  • Allows slack CPU cycles to be used by the non-critical tasks.
  • No need to ever “terminate any lower criticality [code] that exceeds” its time slot.
  • Client tasks never need to “wait for their server thread to be scheduled”.

All of these pros are common to the RMA+RTOS approach as well.

The cons of slack scheduling appear to be:

  • There is overhead associated with tracking, granting, and requesting slack time (the article doesn’t provide enough info to quantify these).
  • There is overhead associated with waking tasks (or polling instead of interrupts) to see if they need even use any of their allocated time in this cycle.
  • Like all time-partitioned code structures, it is fragile in the face of late-added work items (“the introduction of a new thread may be difficult if there is no room available on the timeline”).

The RMA+RTOS approach suffers none of these downsides.

It is important to understand the two really powerful and beautiful aspects of RMA: first, that tasks outside the critical set need not be budgeted or analyzed at all (they don’t have deadlines anyway); and second, that these low-priority non-critical tasks get automatic, free (no-overhead) use of all the CPU cycles not used by those with deadlines. All of the critical set’s schedulability analysis is done on worst-case CPU use; best case and average case may be substantially less. For example, an interrupt that could fire worst-case every 1 ms eats up a lot of CPU time on paper only (well, at least until that worst case arrives and it really does consume that much CPU); the actual implementation need not be changed to poll; and every last unused CPU cycle is available for other uses in the real system.

Unfortunately, few embedded software engineers apply RMA at all. And some of those who do may get the math wrong; there are assumptions and calculations that must be handled properly. That’s where a time-partitioned code structure (a.k.a. cyclic executive or real-time executive) is demonstrably better than RMA+RTOS. And that is probably why ARINC-653 mandates time-partitioned scheduling.

Under a time-partitioning mandate, the described slack scheduling approach may aid implementers.