Posts Tagged ‘security’

Trends in Embedded Software Design

Wednesday, April 18th, 2012 Michael Barr

In many ways, the story of my career as an embedded software developer is intertwined with the history of the magazine Embedded Systems Design. When it was launched in 1988, under the original title Embedded Systems Programming (ESP), I was finishing high school. Like the vast majority of people at that time, I had never heard the term “embedded system” or thought much about the computers hidden away inside other kinds of products. Six years later I was a degreed electrical engineer who, like many EEs by that time in the mid-90′s, had a job designing embedded software rather than hardware. Shortly thereafter I discovered the magazine on a colleague’s desk, and became a subscriber and devotee.

The Early Days

In the early 1990s, as now, the specialized knowledge needed to write reliable embedded software was mostly not taught in universities. The only class I’d had in programming was in FORTRAN; I’d taught myself to program in assembly and C through a pair of hands-on labs that were, in hindsight, my only formal education in writing embedded software. It was on the job and from the pages of the magazine, then, that I first learned the practical skills of writing device drivers, porting and using operating systems, meeting real-time deadlines, implementing finite state machines, the pros and cons of languages other than C and assembly, remote debugging and JTAG, and so much more.

In that era, my work as a firmware developer involved daily interactions with Intel hex files, device programmers, tubes of EPROMs with mangled pins, UV erasers, mere kilobytes of memory, 8- and 16-bit processors, in-circuit emulators, and ROM monitors. Databooks were actual books; collectively, they took up whole bookshelves. I wrote and compiled my firmware programs on an HP-UX workstation on my desk, but then had to go downstairs to a lab to burn the chips, insert them into the prototype board, and test and debug via an attached ICE. I remember that on one especially daunting project eight miles separated my compiler and device programmer from the only instance of the target hardware; a single red LED and a dusty oscilloscope were the extent of my debugging toolbox.

Like you I had the Internet at my desk in the mid-90s, but it did not yet provide much useful or relevant information to my work other than via certain FTP sites (does anyone else remember FTPing into sunsite.unc.edu? or Gopher?). The rest was mostly blinking headlines and dancing hamster; and Amazon was merely the world’s biggest river. There was not yet an Embedded.com or EETimes.com. To learn about software and hardware best practices, I pursued an MSEE and CS classes at night and traveled to the Embedded Systems Conferences.

At the time, I wasn’t aware of any books about embedded programming. And every book that I had found on C started with “Hello, World”, only went up in abstraction from there, and ended without ever once addressing peripheral control, interrupt service routines, interfacing to assembly language routines, and operating systems (real-time or other). For reasons I couldn’t explain years later when Jack Ganssle asked me, I had the gumption to think I could write that missing book for embedded C programmers, got a contract from O’Reilly, and did–ending, rather than starting, mine with “Hello, World” (via an RS-232 port).

In 1998, a series of at least three twists of fate spanning four years found me taking a seat next to an empty chair at the speaker’s lunch at an Embedded Systems Conference. The chair’s occupant turned out to be Lindsey Vereen, who was then well into his term as the second editor-in-chief of the magazine. In addition to the book, I’d written an article or two for ESP by that time and Lindsey had been impressed with my ability to explain technical nuances. When he told me that he was looking for someone to serve as a technical editor, I didn’t realize it was the first step towards my role in that position.

Future Trends

Becoming and then staying involved with the magazine, first as technical editor and later as editor-in-chief and contributing editor, has been a highlight of my professional life. I had been a huge fan of ESP and of its many great columnists and other contributors in its first decade. And now, looking back, I believe my work helped make it an even more valuable forum for the exchange of key design ideas, best practices, and industry learning in its second decade. And, though I understand the move away from print towards online publishing and advertising, I am nonetheless saddened to see the magazine come to an end.

Reflecting back on these days long past reminds me that a lot truly has changed about embedded software design. Assembly language is used far less frequently today; C and C++ much more. EPROMs with their device programmers and UV erasers have been supplanted by flash memory and bootloaders. Bus widths and memory sizes have increased dramatically. Expensive in-circuit emulators and ROM monitors have morphed into inexpensive JTAG debug ports. ROM-DOS has been replaced with whatever Microsoft is branding embedded Windows this year. And open-source Linux has done so well that it has limited the growth of the RTOS industry as a whole–and become a piece of technology we all want to master if only for our resumes.

So what does the future hold? What will the everyday experiences of embedded programmers be like in 2020, 2030, or 2040? I see three big trends that will affect us all over those timeframes, each of which has already begun to unfold.

Trend 1: Volumes Finally Shift to 32-bit CPUs

My first prediction is that inexpensive, low-power, highly-integrated microcontrollers–as best exemplified by today’s ARM Cortex-M family–will bring 32-bit CPUs into even the highest volume application domains. The volumes of 8- and 16-bit CPUs will finally decline as these parts become truly obsolete.

Though you may be programming for a 32-bit processor already, it’s still true that 8- and 16-bit processors still drive CPU chip sales volumes. I’m referring, of course, to microcontrollers such as those based on 8051, PIC, and other instruction set architectures dating back 30-40 years. These older architectures remain popular today only because certain low-margin, high-volume applications of embedded processing demand squeezing every penny out of BOM cost.

The limitations of 8- and 16-bit architectures impact the embedded programmers in a number of ways. First, there are the awkward memory limitations resulting from limited address bus widths–and the memory banks, segmenting techniques, and other workarounds to going beyond those limitations. Second, these CPUs are much better at decision making than mathematics–they lack the ability to manipulate large integers efficiently and have no floating-point capability. Finally, these older processors frequently lack modern development tools, are unable to run larger Internet-enabled operating systems, such as Linux, and don’t feature the security and reliabiltiy protections afforded by an MMU.

There will, of course, always be many applications that are extremely cost-conscious, so my prediction is not that they will disappear completely, but that the overall price (including BOM cost as well as power consumption) of 32-bit micro controllers, with their improved instruction set architectures and transistor geometries, will win on price. That will put the necessary amount of computing power into the hands of some designers and make our work easier for all of us. It also helps programmers accomplish more in less time.

Trend 2: Complexity Forces Programmers Beyond C

My second prediction is that the days of the C programming language’s dominance in embedded systems are numbered.

Don’t get me wrong, C is a language I know and love. But, as you may know firsthand, C is simply not up to the task of building systems requiring over a million lines of code. Nonetheless, the demanded complexity of embedded software has been driving our systems towards more than a million lines of code. At this level of complexity, something has to give.

Additionally, our industry is facing a crisis: the average age of an embedded developer is rapidly increasing and C is generally not taught in universities anymore. Thus, even as the demand for embedded intelligence in every industry continues to increase, the population of skilled and experienced C programmers is on the decline. Something has to give on this front too.

But what alternative language can be used to build real-time software, manipulate hardware directly, and be quickly ported to numerous instruction set architectures? It’s not going to be C++ or Ada or Java, for sure–as those have already been tried and found lacking. A new programming language is probably not the answer either, across so many CPU families and with so many other languages already tried.

Thus I predict that tools that are able to reliably generate those millions of lines of C code automatically for us, based on system specifications, will ultimately take over. As an example of a current tool of this sort that could be part of the trend, I direct your attention to Miro Samek’s dandy open source Quantum Platform (QP) framework for event-driven programs and his (optional) free Quantum Modeler (QM) graphical modeling tool. You may not like the idea of auto-generated code today, but I guarantee that once you push a button to generate consistent and correct code from an already expressive statechart diagram, you will see the benefits of the overall structure and be ready to move up a level in programming efficiency.

I view C as a reasonable common output language for such tools (given that C can manipulate hardware registers directly and that every processor ever invented has a C compiler). Note that I do expect there to be continued demand for those of us with the skills and interest to fine tune the performance of the generated code or write device drivers to integrate it more closely to the hardware.

Trend 3: Connectivity Drives Importance of Security

We’re increasingly connecting embedded systems–to each other and to the Internet. You’ve heard the hype (e.g., “Internet of things” and “ubiquitous computing”) and you’ve probably already also put TCP/IP into one or more of your designs. But connectivity has a lot of implications that we are only starting to come to terms with. The most obvious of these is security.

A connected device cannot hide for long behind “security through obscurity” and, so, we must design security into our connected devices from the start. In my travels around our industry I’ve observed that the majority of embedded designers are largely unfamiliar with security. Sure some of you have read about encryption algorithms and know the names of a few. But mostly the embedded community is shooting in the dark as security designers, within organizations that aren’t of much help. And security is only as strong as the weakest link in the chain.

This situation must change. Just as Flash memory has supplanted UV-erasable EPROM, so too will over-the-net patches and upgrades take center stage as a download mechanism in coming years and decades. We must architect our systems first to be secure and then to accepted trusted downloads so that our products can keep up in the inevitable arms race against hackers and attackers.

And That’s a Wrap

Whatever the future holds, I am certain that embedded software development will remain an engaging and challenging career. And you’ll still find me writing about the field at http://embeddedgurus.com/barr-code and http://twitter.com/embeddedbarr.

Building Reliable and Secure Embedded Systems

Tuesday, March 13th, 2012 Michael Barr

In this era of 140 characters or less, it has been well and concisely stated that, “RELIABILITY concerns ACCIDENTAL errors causing failures, whereas SECURITY concerns INTENTIONAL errors causing failures.” In this column I expand on this statement, especially as regards the design of embedded systems and their place in our network-connected and safety-concious modern world.

As the designers of embedded systems, the first thing we must accomplish on any project is to make the hardware and software work. That is to say we need to make the system behave as it was designed to. The first iteration of this is often flaky; certain uses or perturbations of the system by testers can easily dislodge the system into a non-working state. In common parlance, “expect bugs.”

Given time, tightening cycles of debug and test can get us past the bugs and through to a shippable product. But is a debugged system good enough? Neither reliability nor security can be tested into a product. Each must be designed in from the start. So let’s take a closer look at these two important design aspects for modern embedded systems and then I’ll bring them back together at the end.

Reliable Embedded Systems

A product can be stable yet lack reliability. Consider, for example, an anti-lock braking computer installed in a car. The software in the anti-lock brakes may be bug-free, but how does it function if a critical input sensor fails?

Reliable systems are robust in the face of adverse run-time environments. Reliable systems are able to work around errors encountered as they occur to the system in the field–so that the number and impact of failures are minimized. One key strategy for building reliable systems is to eliminate single-points-of-failure. For example, redundancy could be added around that critical input sensor–perhaps by adding a second sensor in parallel with the first.

Another aspect of reliability that is under the complete control of designers (at least when they consider it from the start) are the “fail-safe” mechanisms. Perhaps a suitable but lower-cost alternative to a redundant sensor is detection of the failed sensor with a fall back to mechanical braking.

Failure Mode and Effect Analysis (FMEA) is one of the most effective and important design processes used by engineers serious about designing reliability into their systems. Following this process, each possible failure point is traced from the root failure outward to its effects. In an FMEA, numerical weights can be applied to the likelihoods of each failure as well as the seriousness of consequences. An FMEA can thus help guide you to a cost effective but higher reliability design by highlighting the most valuable places to insert the redundancy, fail-safes, or other elements that reinforce the system’s overall reliability.

In certain industries, reliability is a key driver of product safety. And that is why you see these techniques and FMEA and other design for reliability processes being applied by the designers of safety-critical automotive, medical, avionics, nuclear, and industrial systems. The same techniques can, of course, be used to make any type of embedded system more reliable.

Regardless of your industry, it is typically difficult or impossible to make your product as reliable via patches. There’s no way to add hardware like that redundant sensor, so your options may reduce to a fail-safe that is helpful but less reliable overall. Reliability cannot be patched or tested or debugged into your system. Rather, reliability must be designed in from the start.

Secure Embedded Systems

A product can also be stable yet lack security. For example, an office printer is the kind of product most of us purchase and use without giving a minute of thought to security. The software in the printer may be bug-free, but is it able to prevent a would-be eavesdropper from capturing a remote electronic copy of everything you print, including your sensitive financial documents?

Secure systems are robust in the face of persistent attack. Secure systems are able to keep hackers out by design. One key strategy for building secure systems is to validate all inputs, especially those arriving over an open network connection. For example, security could be added to a printer by ensuring against buffer overflows and encrypting and digitally signing firmware updates.

One of the unfortunate facts of designing secure embedded systems is that the hackers who want to get in only need to find and exploit a single weakness. Adding layers of security is good, but if even any one of those layers remains fundamentally weak, a sufficiently motivated attacker will eventually find and breach that defense. But that’s not an excuse for not trying.

For years, the largest printer maker in the world apparently gave little thought to the security of the firmware in its home/office printers, even as it was putting tens of millions of tempting targets out into the world. Now the security of those printers has been breached by security researchers with a reasonable awareness of embedded systems design. Said one of the lead researchers, “We can actually modify the firmware of the printer as part of a legitimate document. It renders correctly, and at the end of the job there’s a firmware update. … In a super-secure environment where there’s a firewall and no access — the government, Wall Street — you could send a résumé to print out.”

Security is a brave new world for many embedded systems designers. For decades we have relied on the fact that the microcontrollers and Flash memory and real-time operating systems and other less mainstream technologies we use will protect our products from attack. Or that we can gain enough “security by obscurity” by keeping our communications protocols and firmware upgrade processes secret. But we no longer live in that world. You must adapt.

Consider the implications of an insecure design of an automotive safety system that is connected to another Internet-connected computer in the car via CAN; or the insecure design of an implanted medical device; or the insecure design of your product.

Too often, the ability to upgrade a product’s firmware in the field is the very vector that’s used to attack. This can happen even when a primary purpose for including remote firmware updates is motivated by security. For example, as I’ve learned in my work as an expert witness in numerous cases involving reverse engineering of the techniques and technology of satellite television piracy, much of that piracy has been empowered by the same software patching mechanism that allowed the broadcasters to perform security upgrades and electronic countermeasures. Ironically, had the security smart cards in those set-top boxes had only masked ROM images the overall system security may have been higher. This was certainly not what the designers of the system had in mind. But security is also an arms race.

Like reliability, security must be designed in from the start. Security can’t be patched or tested or debugged in. You simply can’t add security as effectively once the product ships. For example, an attacker who wished to exploit a current weakness in your office printer or smart card might download his hack software into your device and write-protect his sectors of the flash today so that his code could remain resident even as you applied security patches.

Reliable and Secure Embedded Systems

It is important to note at this point that reliable systems are inherently more secure. And that, vice versa, secure systems are inherently more reliable. So, although, design for reliability and design for security will often individually yield different results–there is also an overlap between them.

An investment in reliability, for example, generally pays off in security. Why? Well, because a more reliable system is more robust in its handling of all errors, whether they are accidental or intentional. An anti-lock braking system with a fall back to mechanical braking for increased reliability is also more secure against an attack against that critical hardware input sensor. Similarly, those printers wouldn’t be at risk of fuser-induced fire in the case of a security breach if they were never at risk of fire in the case of any misbehavior of the software.

Consider, importantly, that one of the first things a hacker intent on breaching the security of your embedded device might do is to perform a (mental, at least) fault tree analysis of your system. This attacker would then target her time, talents, and other resources at one or more single points of failure she considers most likely to fail in a useful way.

Because a fault tree analysis starts from the general goal and works inward deductively toward the identification of one or more choke points that might produce the desired erroneous outcome, attention paid to increasing reliability such as via FMEA usually reduces choke points and makes the attacker’s job considerably more difficult. Where security can break down even in a reliable system is where the possibility of an attacker’s intentionally induced failure is ignored in the FMEA weighting and thus possible layers of protection are omitted.

Similarly, an investment in security may pay off in greater reliability–even without a directed focus on reliability. For example, if you secure your firmware upgrade process to accept only encrypted and digitally signed binary images you’ll be adding a layer of protection against an inadvertently corrupted binary causing an accidental error and product failure. Anything you do to improve the security of communications (i.e., checksums, prevention of buffer overflows, etc.) can have a similar effect on reliability.

The Only Way Forward

Each year it becomes increasingly important for all of us in the embedded systems design community to learn to design reliable and secure products. If you don’t, it might be your product making the wrong kind of headlines and your source code and design documents being poured over by lawyers. It is no longer acceptable to stick your head in the sand on these issues.

Firmware Forensics: Best Practices in Embedded Software Source Code Discovery

Tuesday, September 27th, 2011 Michael Barr

Software has become ubiquitous, embedded as it is into the fabric of our lives in literally billions of new (non-computer) products per year, from microwave ovens to electronic throttle controls. When products controlled by software are the subject of litigation, whether for infringement of intellectual property rights or product liability, it is imperative to analyze the embedded software (a.k.a., firmware) properly and thoroughly. This article enumerates five best practices for embedded software source code discovery and the rationale for each.

In February 2011, the U.S. government’s National Highway Traffic Safety Administration and a team from NASA’s Engineering and Safety Center published reports of their joint investigation into the causes of unintended acceleration in Toyota vehicles. While NHTSA led the overall effort and examined recall records, accident reports, and complaint statistics, the more technically focused team from NASA performed reviews of the electronics and embedded software at the heart of Toyota’s “electronic throttle control subsystem” (ETCS). Redacted public versions of the official reports from each agency, together with a number of related documents, can be found at http://www.nhtsa.gov/UA.

These reports are very interesting in what they have to say about the quality of Toyota’s firmware and NASA’s review of the same. However, of greater significance is what they are not able to say about unintended acceleration. It appears that NASA did not follow a number of best practices for reviewing embedded software source code that might have identified useful evidence. In brief, NASA failed to find a firmware cause of unintended acceleration—but their review also fails to rule out firmware causes entirely.

This article describes a set of five recommended practices for firmware source code review that are based on my experiences as both an embedded software developer and as an expert witness. Each of the recommendations will consider what more could have been done to determine whether Toyota’s ETCS firmware played a role in any of the unintended acceleration. The five recommended practices are: (1) ask for the bug list; (2) insist on an executable; (3) reproduce the development environment; (4) try for the version control repository; and (5) remember the hardware. The relative value and importance of the individual practices will vary by type of litigation, so the recommendations are presented in the order that is most readable.

Ask for the Bug List

Any serious litigation involving embedded software will require an expert review of the source code. The source code should be requested early in the process of discovery. Owners of source code tend to strenuously resist such requests but procedures limiting access to the source code to only certain named and pre-approved experts and only under physical security (often a non-networked computer with no removable storage in a locked room) tend to be agreed upon or ordered by a judge.

Software development organizations commonly keep additional records that may prove more important or useful than a mere copy of the source code. Any reasonably thorough software team will maintain a bug list (a.k.a., defect database) describing most or all of the problems observed in the software along with the current status of each (e.g., “fixed in v2.2” or “still under investigation”). The list of bugs fixed and known—or the company’s lack of such a list—is germane to issues of software quality. Thus the bug list should be routinely requested and supplied in discovery. (It is also recommended that a request be made for copies of software design documents, coding standards, build logs and associated tool outputs, testing logs, and other artifacts of the embedded software design and development process.)

Very nearly every piece of software ever written has defects, both known and unknown. Thus the bug list provides helpful guidance to a reviewer of the source code. Often, for example, bugs cluster in specific source files in need of major rework. To ignore the company’s own records of known bugs, as the NASA reviewers apparently did, is to examine a constitution without considering the historical reasons for the adoption of each section and amendment. Indeed, a simple search of the text in Toyota’s bug list for the terms “stuck” and “fuel valve” might yet provide some useful information about unintended acceleration.

Insist on an Executable

In software parlance, the “executable” program is the binary version of the program that’s actually executed in the product. The machine-readable executable is constructed from a set of human-readable source code files using software build tools such as compilers and linkers. It is important to recognize that one set of source code files may be capable of producing multiple executables, based on tool configuration and options.

Though not human-readable, an executable program may provide valuable information to an expert reviewer. For example, one common technique is to extract the human-readable “strings” within the executable. The strings in an executable program include information such as on-screen messages to the user (e.g., “Press the ‘?’ button for help.”). In a copyright infringement case in which I once consulted several strings in the defendant’s executable helpfully contained a phrase similar to “Copyright Plaintiff”! You may not be so lucky, but isn’t it worth a try?

It may also be possible to reverse engineer or disassemble an executable file into a more human-readable form. Disassembly could be important in cases of alleged patent infringement, for example, where what looks like an infringement of a method claim in the source code might be unused code or not actually part of the executable in the product as used by customers.

Sometimes it is easy to extract the executable directly from the product for expert examination—in which case the expert should engage in this step. For instance, software running on Microsoft Windows consists of an executable file with the extension .EXE, which is easily extracted. However, the executable programs in most embedded systems are difficult, at best, to extract. (Note that if it is possible for the expert to extract an executable from one or more exemplars of the product, an automated comparison should always be made between the installed and produced binary files. You never know what you may find and any difference could have important implications for the facts underlying the case.) Extraction of Toyota’s ETCS firmware might not be physically possible. Thus the legal team should insist on production of the executable(s) actually used by the relevant customers.

Reproduce the Development Environment

The dichotomy between source code and executable code and the inability of even most software experts to make much sense of binary code can create problems in the factual landscape of litigation. For example, suppose that the source code produced by Toyota was inadvertently incomplete in that it was missing two or three source code files. Even an expert reviewer looking at the source code might not know about the absent files. For example, if the bug the expert is looking for is related to fuel valve control and the code related to that subject doesn’t reference the missing files, the reviewer may not notice their absence. No expert can spot a bug in a missing file.

Fortunately, there is a reliable way for an expert to confirm that she has been provided with all of the source code. The objective is simply stated: reproduce the software build tools setup and compile the produced source code. To do this it is necessary to have a copy of the development team’s detailed build settings, such as make files, preprocessor defines, and linker control files. If the build process completes and produces an executable, it is certain the other party has provided a complete copy of the source code. (Further additional technical details include the need to start with a “clean” set of files that contains no object files or libraries. It may also be necessary to obtain third-party header files or libraries.)

Furthermore, if the executable as built matches the executable as produced (actually, ideally, the executable as extracted from the product) bit by binary bit, it is certain that the other party has provided a true and correct version of the source code. Unfortunately, trying to prove this part may take longer than just completing a build; the build could fail to produce the desired proof for a variety of reasons. The details here get complicated: to get exactly the same output executable, it is necessary to use all of the following: precisely the same version of the compiler, linker, and each other build tool as the original developers; precisely the same configuration of each of those tools; and precisely the same set of build instructions. Even a slight variation in just one of these details will generally produce an executable that doesn’t match the other binary image at all—just as the wrong version of the source code would.

Try for the Version Control Repository

Embedded software source code is never created in an instant. All software is developed one layer at a time over a period of months or years in the same way that a bridge and the attached roadways exist in numerous interim configurations during their construction. The version control repository for a software program is like a series of time-lapse photos tracking the day-by-day changes in the construction of the bridge. But there is one considerable difference: it is possible to go back to one of those source code snapshots and rebuild the executable of that particular version. This becomes critically important when multiple software versions will be deployed over a number of years. In the automotive industry, for example, it must be possible to give one customer a bug fix for his v2.1 firmware while also working on the new v3.0 firmware to be released the following model year.

Consider, for the sake of discussion, that the executable version of Toyota’s ETCS v2.1 firmware that was installed in the factory in one million cars around the world had an undiscovered bug that could result in unintended acceleration under certain rare operating conditions. Now further suppose that this bug was (perhaps unintentionally) eliminated in the v2.2 source code, from which a subsequent executable was created and installed at the factory into millions more cars with the same model names—and also as an upgrade into some of the original one million cars as they visited dealers for scheduled maintenance. In this scenario, an examination of the v2.2 source code proves nothing about the safety of the hundreds of thousands of cars still with v2.1 under the hood.

Gaining access to the entire version control repository containing all of the past versions of a company’s firmware source code through discovery may be out of the question. For example, a judge in a source code copyright and trade secrets case I consulted in would only allow the plaintiff to choose one calendar date and to then receive a snapshot of the defendant’s source code from that specific date. If the plaintiff was lucky it would find evidence of their proprietary code in that specific snapshot. But the observed absence of their proprietary code from that one specific snapshot doesn’t prove the alleged theft didn’t happen earlier or later in time.

There are some problems with examination of an entire version control repository. It may be difficult to make sense of the repository’s structure. Or, if the structure can be understood, it might take many times as long to perform a thorough review of the major and minor versions of the various source code files as it would to just review one snapshot in time. At first glance, many of those files would appear the same or similar in every version—but subtle differences could be important to making a case. To really be productive with that volume of code, it may be necessary to obtain a chronological schedule provided by a bug list and/or other production documents describing the source code at various points in time.

Remember the Hardware

Embedded software is always written with the hardware platform in mind and should be reviewed in the same manner. For example, it is only possible to properly reverse engineer or disassemble an executable program once the specific microprocessor (e.g., Pentium, PowerPC, or ARM) is known. But knowing the processor is just the beginning, because the hardware and software are intertwined in complex ways in such embedded systems.

Only one or more features of the hardware are enabled or active when the hardware is in a particular configuration. For instance, consider an embedded system with a network interface, such as an Ethernet jack that is only powered when a cable is mechanically inserted. Some or all of the software required to send and receive messages over this network may be not be executed until a cable is inserted. A proper analysis of the software needs to keep hardware-software interactions like this in perspective. Ideally, testing of the firmware should be done on the hardware as configured in exemplars of the units at issue—so it is useful to ask for hardware during discovery, if you are not able to acquire exemplars in other ways. It is not clear from the redacted reports if NHTSA’s testing of certain Toyota Camrys was done using the same firmware version on exactly the same hardware as the owners who experienced unintended acceleration. Hardware interactions can be one of the most important considerations of all when analyzing embedded software.

Sometimes a bug is not visible in the software itself. Such a bug may result from a combination of hardware and software behaviors or multi-processor interactions. For example, one motor control system I’m familiar with had a dangerous race condition. The bug, though, was the result of an unforeseen mismatch between the hardware reaction time and the software reaction time around a sequence of commands to the motor.

Additional Analysis Required

As you can see, the review of embedded software can be complicated. This is partly because the hardware of each embedded system is unique. In addition, the system as a whole generally involves complex interactions between hardware, software, and user. An expert in embedded software should typically have a degree in electrical engineering, computer engineering, or computer science plus years of relevant experience designing embedded systems and programming in the relevant language(s).

The five best practices presented here are meant to establish the critical importance of making certain specific requests early in the legal discovery process. They are by no means the only types of analysis that should be performed on the source code. For example, in any case involving the quality or reliability of embedded software, the source code should be tested via static analysis tools. This and other types of technical analysis should be well understood by any expert witness or litigation consultant with the proper background.

In the case of Toyota’s unintended acceleration issues, I hope that expert review in the class action litigation against Toyota will include these and other additional types of analysis to identify all of the potential causes and determine if embedded software played any role. Though government funds for analysis by NASA are understandably limited, it is suggested that transportation safety organizations, such as NHTSA, should establish rules that ensure that future investigations are more thorough and that safety-related technical findings in litigation cannot be hidden behind the veil of secrecy of a settlement agreement.

How to Enforce Coding Standards (Automatically)

Wednesday, May 25th, 2011 Michael Barr

Coding standards can be an important tool in the fight to keep bugs out of embedded software. Unfortunately, too many well-intentioned (especially, corporate) coding standards are ineffective and gather more dust than followers. The hard truth is that enforcement of coding standards too often depends on programmers already under deadline pressure to be disciplined while they code and/or to make time to perform peer code reviews. And when peer reviews are done in this scenario, they too easily devolve into compliance discussions that miss the forest for the trees.

To ensure your selected coding standard is followed and thus effective, your team should find as many automated ways to enforce as many of its rules as possible. And you should make such automated rule checking part of the everyday software build process. Ideally, you would also restrict version control check-ins to just code that has passed all the automated checks. No code that breaks any of these automatable rules should be allowed in peer code reviews. That way, code reviewers can focus their limited hours on (1) what the code is supposed to do, (2) whether it does so correctly, and (3) whether the code and comments together are easily understood by all involved.

One of the ways Netrino has found to increase compliance with its Embedded C Coding Standard is by configuring static analysis tools we already use to automatically enforce individual rules.

Perhaps the best option is to use a static analysis tool that includes built-in support for your chosen coding standard and/or the ability to be customized to proprietary rules. LDRA‘s static analysis engine, a screenshot from which is shown in the image below, comes preconfigured to check about 80% of the “NETRINO” coding standard rules, as well as a number of other widely used coding standards.

The LDRA Tool Suite Enforces Netrino's Embedded C Coding Standard

But there are other, less expensive, options available as well. Two tools that we have used are RSM and PC-Lint from M Squared Technologies and Gimpel Software, respectively. Neither can easily and independently enforce all of our Embedded C Coding Standard rules. But used together and properly configured, these two tools offer a low-cost automated coding standards enforcement mechanism that covers a large percentage of the rules.

Configuring RSM

The following RSM outputs can be examined and configuration options set (in version 7.75) to assist in the automated enforcement of the identified rules from the Embedded C Coding Standard. Pricing for the RSM tool, which runs on Windows, Linux, and other versions of Unix (including Mac OS X), is online at http://msquaredtechnologies.com/m2rsm/order/.

Rule # Brief Description Tool Configuration Notes
1.2.a Line length limited to 80 characters. Quality Notice 1
1.3.a Braces surround all blocks of code. Quality Notice 22
1.7.c Keyword goto not used. Quality Notice 9
1.7.d Keyword continue not used. Quality Notice 43
1.7.e Keyword break not used outside switch. Quality Notice 44
2.2 Location and content of comments. Quality Notices 17, 20, 51; Options -Es -Ec -EC
3.1 Use of white space. Quality Notices 16, 19
3.5.a No tab characters. Quality Notice 30; Option -Dt
3.6.a UNIX-style (single-character) linefeeds. Option -Du
4.1.a-d Module naming conventions. Option -Rn
4.2.a Precisely one header file per source file. Option -Rn
6.1.a-e Precisely one header file per source file. Quality Notice 2; Option -l
6.2 Cyclomatic complexity of functions. Quality Notices 10, 18, 27; Option -c
6.3.b.i-iii Preprocessor macro safety mechanisms. Option -m
8.2.d If-else if statements always have else. Quality Notice 22
8.3.b Switch statements always have default. Quality Notices 13, 14, 56
8.5.a No unconditional jumps. Quality Notices 9, 43, 444

Configuring PC-Lint

In a future blog post, I will similarly identify PC-Lint configurations that can be used (in version 9.0) to assist in the automated enforcement of specific rules from the Embedded C Coding Standard. Pricing for the PC-Lint tool, which runs on Windows, Linux, and other versions of Unix (including Mac OS X), is online at http://gimpel.com/html/order.htm.

Firmware-Specific Bug #10: Jitter

Thursday, December 2nd, 2010 Michael Barr

Some real-time systems demand not only that a set of deadlines be always met but also that additional timing constraints be observed in the process. Such as managing jitter.

An example of jitter is shown in Figure 1. Here a variable amount of work (blue boxes) must be completed before every 10 ms deadline. As illustrated in the figure, the deadlines are all met. However, there is considerable timing variation from one run of this job to the next. This jitter is unacceptable in some systems, which should either start or end their 10 ms runs more precisely.

Jitter Figure 1

If the work to be performed involves sampling a physical input signal, such as reading an analog-to-digital converter, it will often be the case that a precise sampling period will lead to higher accuracy in derived values. For example, variations in the inter-sample time of an optical encoder’s pulse count will lower the precision of the velocity of an attached rotation shaft.

Best Practice: The most important single factor in the amount of jitter is the relative priority of the task or ISR that implements the recurrent behavior. The higher the priority the lower the jitter. The periodic reads of those encoder pulse counts should thus typically be in a timer tick ISR rather than in an RTOS task.

Figure 2 shows how the interval of three different 10 ms recurring samples might be impacted by their relative priorities. At the highest priority is a timer tick ISR, which executes precisely on the 10 ms interval. (Unless there are higher priority interrupts, of course.) Below that is a high-priority task (TH), which may still be able to meet a recurring 10-ms start time precisely. At the bottom, though, is a low priority task (TL) that has its timing greatly affected by what goes on at higher priority levels. As shown, the interval for the low priority task is 10 ms +/- approximately 5 ms.

Jitter Figure 2

Firmware-Specific Bug #9