Unintended Acceleration and Other Embedded Software Bugs

Tuesday, March 1st, 2011 by Michael Barr

Last month, NHTSA and the NASA Engineering and Safety Center (NESC) published reports of their joint investigation into the causes of unintended acceleration in Toyota vehicles. NASA’s multi-disciplinary NESC technical team was asked, by Congress, to assist NHTSA by performing a review of Toyota’s electronic throttle control and the associated embedded software. In a carefully worded concluding statement, NASA stated that it “found no electronic flaws in Toyota vehicles capable of producing the large throttle openings required to create dangerous high-speed unintended acceleration incidents.” (The official reports and a number of supporting files are available for download at http://www.nhtsa.gov/UA.)

The first thing you will notice if you join me in trying to judge the technical issues for yourself is the redactions: pages and pages of them. In parts, and entirely for unexplained reasons, this report on automotive electronics reads like the public version of a CIA Training Manual. I’ve observed that approximately 193 of the 1,061 pages released so far feature some level of redaction (via black boxes, which obscure anything from a single number, word, or phrase to a full table, page, or section). The redactions are at their worst in NASA’s Appendix A, which describes NASA’s review of Toyota’s embedded software in detail. More than half of all the pages with redactions (including the vast majority of fully redacted tables, pages, and sections) are in that Appendix.

Despite the redactions, we can still learn some interesting facts about Toyota’s embedded software and NASA’s technical review of the same. The bulk of what follows outlines what I’ve been able to make sense of in about two days of reading. Throughout, my focus is on embedded software inside the electronic throttle control, so I’m leaving out considerations of other potential causes, including EMI (which NASA also investigated). First, a little background on the investigation.

Background

Although the inquiry was undertaken to examine unintended acceleration reports across all Toyota, Scion, and Lexus models, NASA focused its technical inquiry almost entirely on Toyota Camry models equipped with the Electronic Throttle Control System, Intelligent (ETCS-i). The Camry has long been among the top cars bought in the U.S., so this choice probably made finding relevant complaint data and affected vehicles easier for NHTSA. (BTW, NASA says the voluntary complaint database shows both that unintended accelerations were reported before the introduction of electronic throttle control and that press coverage and Congressional hearings can increase the volume of complaints.)

According to a press release by the company made upon publication of the NHTSA and NASA reports, Toyota’s ETCS-i has been installed in “more than 40 million cars and trucks sold around the world, including more than 16 million in the United States.” Undoubtedly, ETCS-i has also “made possible significant safety advances such as vehicle stability control and traction control.” But as with any other embedded system there have been refinements made through the years to both the electronics and the embedded software.

Though Toyota apparently made available, under agreed terms and via its attorneys, schematics, design documents, and source code “for multiple Camry years and versions” (Appendix A, p. 9) as well as many of the Japanese engineers involved in its design and evolution, NASA only closely examined one version. In NASA’s words, “The area of emphasis will be the 2005 Toyota Camry because this vehicle has a consistently high rate of reported ‘UA events’ over all Toyota models and all years, when normalized to the number of each model and year, according to NHTSA data.” (p. 7) Except as otherwise stated, everything else in this column concerns the electronics and firmware found in that year, make, and model.

Event Data Recorders

Event Data Recorder (EDR) is the generic term for the automotive equivalent of an aircraft black box flight data recorder. EDRs were first installed in cars in the early 1990s and have increased in use as well as sophistication in the years since. Generally speaking, the event data recorder is an embedded system residing within the airbag control module located in the front center of the engine compartment. The event data recorder is connected to other parts of the car’s electronics via the CAN bus and is always monitoring vehicle speed, the position of the brake and accelerator pedals, and other key parameters.

In the event of an impossibly high (for the vehicle operating normally) acceleration or deceleration sensor reading, Toyota’s latest event data recorders save the prior five 1Hz samples of these parameters in a non-volatile memory area. Once saved, an event record can be read over the car’s On-Board Diagnostics (OBD) port (or, in the event of a more severe accident, directly from the airbag control module) via a special cable and PC software. If the airbag actually deploys, the event record will be permanently locked. The last 2 or 3 (depending on version) lesser “bump” records are also stored, but may be overwritten in a FIFO manner.
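The pre-trigger capture described above is commonly built as a small ring buffer that is overwritten once per sample period and copied out when a trigger fires. What follows is only a minimal sketch under that assumption: the record layout (`edr_sample_t`), the field names, and the functions `edr_push` and `edr_snapshot` are hypothetical and are not taken from Toyota's recorder.

```c
#include <stdint.h>

#define PRE_TRIGGER_SAMPLES 5   /* five 1 Hz samples, as described in the text */

/* Hypothetical sample layout; the real record format is not given in the reports. */
typedef struct {
    uint8_t speed_mph;
    uint8_t accel_pedal_pct;
    uint8_t brake_on;
} edr_sample_t;

typedef struct {
    edr_sample_t ring[PRE_TRIGGER_SAMPLES];
    uint8_t next;    /* slot that the next sample will overwrite */
    uint8_t count;   /* number of valid samples, saturates at the ring size */
} edr_buffer_t;

/* Called once per sample period: overwrite the oldest sample. */
void edr_push(edr_buffer_t *b, edr_sample_t s)
{
    b->ring[b->next] = s;
    b->next = (uint8_t)((b->next + 1u) % PRE_TRIGGER_SAMPLES);
    if (b->count < PRE_TRIGGER_SAMPLES) {
        b->count++;
    }
}

/* Called on a trigger: copy the valid samples, oldest first, into out[].
   Returns the number of samples copied. */
uint8_t edr_snapshot(const edr_buffer_t *b, edr_sample_t out[PRE_TRIGGER_SAMPLES])
{
    uint8_t start = (uint8_t)((b->next + PRE_TRIGGER_SAMPLES - b->count)
                              % PRE_TRIGGER_SAMPLES);
    for (uint8_t i = 0; i < b->count; i++) {
        out[i] = b->ring[(start + i) % PRE_TRIGGER_SAMPLES];
    }
    return b->count;
}
```

In a real recorder the snapshot would then be written to non-volatile memory and, per the text, locked if the airbag deploys. That part is left out here because the reports do not describe it.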

This investigation of Toyota’s unintended acceleration marked the first time that anyone from NHTSA had ever read data from a Toyota event data recorder. (Toyota representatives apparently testified in Congress that there had previously just been one copy of the necessary PC software in the U.S.) As part of this study, NHTSA validated and used tools provided by Toyota to extract historical data from 52 vehicles involved in incidents of unintended acceleration, with acknowledged bias toward geographically reachable recent events. After reviewing driver and other witness statements and examining said black box data, NHTSA concluded that 39 of these 52 events were explainable as “pedal misapplications.” That’s a very nice way of saying that whenever the driver reported “stepping on the brake” he or she had pressed the accelerator pedal by mistake. Figure 5 of a supplemental report describing these facts portrays an increasing likelihood of such incidents with driver age vs. the bell curve of Camry ownership by age.

Note that no record is apparently ever made, in the event data recorder or elsewhere, of any events or state changes within the ETCS-i firmware. So-called “Diagnostic Trouble Codes” concerning sensor and other hardware failures are recorded in non-volatile memory and the presence of one or more such codes enables the “Check Engine” light on the dashboard. But no logging is done of significant software faults, including but not limited to watchdog-initiated resets.
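For contrast, here is a rough sketch of what minimal logging of software resets could look like in a generic embedded system. It is not based on anything in the ETCS-i. The cause list, the `fault_log_t` record, and `fault_log_record` are all hypothetical, and the non-volatile storage is represented by an ordinary struct.

```c
#include <stdint.h>

/* Hypothetical reset causes; real hardware usually reports these in a status register. */
typedef enum {
    RESET_POWER_ON = 0,
    RESET_WATCHDOG,
    RESET_SOFTWARE,
    RESET_CAUSE_COUNT
} reset_cause_t;

/* Stand-in for a record kept in non-volatile memory (EEPROM or flash). */
typedef struct {
    uint16_t count[RESET_CAUSE_COUNT];
    reset_cause_t last_cause;
} fault_log_t;

/* Call once early in boot with the cause read from the hardware. */
void fault_log_record(fault_log_t *flog, reset_cause_t cause)
{
    if ((unsigned)cause >= (unsigned)RESET_CAUSE_COUNT) {
        return;                      /* ignore values outside the known set */
    }
    if (flog->count[cause] < UINT16_MAX) {
        flog->count[cause]++;        /* saturate rather than wrap to zero */
    }
    flog->last_cause = cause;
}
```

Even a handful of counters like these, readable through the diagnostic port, would let an investigator see whether a given vehicle had ever experienced a watchdog-initiated reset.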

Engine Control Module

ETCS-i is a collection of components and features that was changed in the basic engine design when Toyota switched from mechanical to electronic throttle control. (Electronic throttle control is also known as “throttle-by-wire”.) Toyota has used two different types of pedal sensors in the ETCS-i system, always in a redundant fashion. The earlier design, pre-2007, using potentiometers was susceptible to current leakage via growth of tin whiskers. Though this type of failure was not known to cause sudden high-speed behaviors, it did seem to be associated with a higher number of warranty claims. The newer pedal sensor design uses Hall effect sensors.

Importantly, the brakes are not a part of the ETCS-i system. In the 2005 Camry, Toyota’s brake pedal was mechanically controlled. (It may still be.) It appears this is one of the reasons the NASA team felt comfortable with their conclusion that driver reports of wide open throttle behavior that could not be stopped with the brakes were not caused by software failures (alone). “The NESC team did not find an electrical path from the ETCS-i that could disable braking.” (NASA Report, p. 15) It is clear, though, that power assisted brakes lose the enabling vacuum pressure when the throttle is wide open and the driver subsequently pumps the brakes; thus any system failure that opened the throttle could indirectly make bringing the vehicle to a stop considerably harder.

The Engine Control Module at the heart of the ETCS-i consists of a Main-CPU and a Sub-CPU located within a pair of ASICs. The Sub-CPU contains a set of A/D converters that translates raw sensor inputs, such as voltages VPA and VPA1 from the accelerator pedal, into digital position values and sends them to the Main-CPU via a serial interface. In addition, the Sub-CPU monitors the outputs of the Main-CPU and is able to reset (in the manner of a watchdog timer) the Main-CPU.

The Main-CPU is reported to be a V850E1 microcontroller, which is “a 32-bit RISC CPU core for ASIC” designed by Renesas (née NEC). The V850E1 processor has a 64MB program address space, which is part of an overall 4GB linear address space. The Main-CPU also keeps tabs on the Sub-CPU and can reset it if anything is found wrong.

NASA reports that the embedded software in the Main-CPU is written (mostly) in ANSI C and compiled using a Green Hills C compiler (Appendix A, p. 14). Furthermore, an OSEK-compliant real-time operating system with fixed-priority preemptive scheduling is used to manage a redacted (but apparently larger than ten, based on the size of the redaction) number of real-time tasks. The actual firmware development (design, coding and unit testing) was outsourced to Denso (p. 19). Toyota apparently performed integration testing and ran several commercial and in-house static analysis tools, including QAC (p. 20). The code was written in English, with Japanese comments and design documents, and follows a proprietary Toyota naming convention/coding standard that predates but half overlaps with the 1998 version of MISRA-C.

Are There Bugs in Toyota’s Firmware?

In the NASA Report’s executive summary it is made clear that “because proof that the ETCS-i caused the reported UAs was not found does not mean it could not occur.” (NASA Report, p. 17) The report also states that NASA’s analysis was time-limited and top-down, remarking “The Toyota Electronic Throttle Control (ETC) was far more complex than expected involving hundreds of thousands of lines of software code” and that this affected the quality of a planned peer review.

It’s stated that “Reported [Unintended Accelerations (UAs)] are rare events. Typically, the reporting of UAs is about 1/100,000 vehicles/year.” But there are millions of cars on the road, and so NHTSA has collected some “831 UA reports for Camry” alone. “Over one-half of the reported events described large (greater than 25 degrees) high-throttle opening UAs of unknown cause” (NASA Report, p. 14), the causes of which are never fully explained in these reports.

NASA apparently identified some lesser firmware bugs themselves, saying “[our] logic model verifications identified a number of potential issues. All of these issues involved unrealistic timing delays in the multiprocessing, asynchronous software control flow.” (Appendix A, p. 11) NASA also spent time simulating possible race conditions due to worrisome “recursively nested interrupt masking” (pp. 44-46); note, though, that simulation success is not a sufficient proof of lack of races. As well, the NASA team seems to recommend “reducing the amount of global data” (p. 38) and eliminating “dead code” (p. 40).
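The report's details on the interrupt masking are redacted, so the following is only a generic illustration of how nested masking is usually made safe: each critical section saves the previous interrupt state and restores exactly that state on exit, rather than unconditionally re-enabling interrupts. The interrupt-enable bit is simulated with a plain variable so the sketch runs on a host, and all of the names are hypothetical.

```c
#include <stdbool.h>

/* Host-side stand-in for the CPU's global interrupt-enable bit.
   On a real target this would be a status register access. */
static bool irq_enabled = true;
static int shared_counter = 0;

bool irq_is_enabled(void) { return irq_enabled; }

/* Mask interrupts and return the state that was in effect before the call. */
bool irq_save(void)
{
    bool was_enabled = irq_enabled;
    irq_enabled = false;
    return was_enabled;
}

/* Restore the state captured by the matching irq_save(). */
void irq_restore(bool was_enabled)
{
    irq_enabled = was_enabled;
}

void inner_update(void)
{
    bool state = irq_save();   /* may already be masked by the caller */
    shared_counter++;
    irq_restore(state);        /* leaves interrupts as the caller had them */
}

void outer_update(void)
{
    bool state = irq_save();
    shared_counter++;
    inner_update();            /* nested critical section */
    shared_counter++;          /* still masked here, so still protected */
    irq_restore(state);
}
```

The failure this pattern guards against is an inner routine that re-enables interrupts on exit, which silently ends the outer critical section early. Whether Toyota's code had that problem cannot be determined from the public report.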

Additionally, the redacted text in other parts of Appendix A seems to be obscuring that:

  • The standard gcc compiler version 4 generated a redacted number of warnings (probably larger than 100) about the code, in 11 different warning categories. (p. 25)
  • Coverity version 4.2 generated a redacted number of warnings (probably larger than 154) about the code, in 10 different warning categories. (p. 27)
  • CodeSonar version 3.6p1 generated a redacted number of warnings (probably larger than 136) about the code, in 10 different warning categories.
  • Uno version 2.12 generated a redacted number of warnings (probably larger than 72) about the code, in 9 different warning categories.
  • The code contained at least 347 deviations from a subset of 14 of the MISRA-C rules.
  • The code contained at least 243 violations of a subset of 9 of the 10 rules in “The Power of 10: Rules for Developing Safety-Critical Code,” which was published in IEEE Computer in 2006 by NASA team member Gerard Holzmann.

It looks to me like Figure 6.2.3-1 of the NASA Report (p. 30) shows that UA complaints filed with NHTSA increased in the year of introduction of electronic throttle control for the vast majority of Toyota, Scion, and Lexus models–and that complaint counts have remained higher but generally declined over time since those transition years. Such a complaint data pattern is perhaps consistent with firmware bugs. (Note to NHTSA: It would be helpful to see this same chart normalized by number of vehicles sold by model year and with the rows sorted by the year of ETC introduction. It would also be nice to see a chart of ETCS-i firmware versions and updates, which vehicles they apply to, and the dates on which each was put into new production vehicles or distributed through dealers.)

Final Thoughts

I am not privy to all of the facts considered by the NHTSA or NASA review teams and thus cannot say if I agree or disagree with their overall conclusion that embedded software bugs are not to blame for reports of unintended acceleration in Toyota vehicles. How about you? If you’ve spotted something I missed in the reports from NHTSA or NASA, please send me an e-mail or leave a comment below. Let’s keep the conversation going.

30 Responses to “Unintended Acceleration and Other Embedded Software Bugs”

  1. Dave Telling says:

    Michael,
    Very interesting! I am curious as to how you arrived at the estimated numbers of the redacted sections that listed the error/warning messages in Appendix A?
    The other thing that caught my eye was “hundreds of thousands of lines” of C code, just to run the throttle? Maybe they just over-complicated things?

    • Michael Barr says:

      In most places, the redactions are made via “appropriately sized” black boxes to just cover the text deemed proprietary. So in a table, you can still tell how many rows and columns there are.

      To check my numbers, first download the NESC Report’s Appendix A (http://www.nhtsa.gov/staticfiles/nvs/pdf/NASA_FR_Appendix_A_Software.pdf). Then take the MISRA example and turn to page 28. Read the introduction to section A.8.1 and then turn to the table at the top of the next page.

      I calculate a minimum of 347 deviations of a subset of 14 MISRA-C rules as follows:

      – There are 14 rows in the table, each with an obscured MISRA rule number in the middle column.
      – The first three rows contain first column values between 100-999 each (minimum 100).
      – The next four rows contain first column values between 10-99 each (minimum 10).
      – The final seven rows contain first column values between 1-9 each (minimum 1).
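The lower-bound arithmetic in the list above (3 × 100 + 4 × 10 + 7 × 1 = 347) can be written as a short sketch. The function names are hypothetical, and the only input is the apparent digit width of each redacted cell.

```c
#include <stdint.h>

/* Smallest value with the given number of decimal digits
   (1 -> 1, 2 -> 10, 3 -> 100). Assumes digits >= 1. */
uint32_t min_for_digits(unsigned digits)
{
    uint32_t v = 1;
    while (digits > 1) {
        v *= 10;
        digits--;
    }
    return v;
}

/* Lower bound on a column total, given how many digits each redacted cell appears to hold. */
uint32_t min_total(const unsigned digits_per_row[], unsigned rows)
{
    uint32_t total = 0;
    for (unsigned i = 0; i < rows; i++) {
        total += min_for_digits(digits_per_row[i]);
    }
    return total;
}
```

Passing the 14 row widths from the MISRA table (three 3-digit cells, four 2-digit cells, seven 1-digit cells) gives 347. The true total could be much higher, since each cell is bounded only from below.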

      The table for the “Power of 10” coding rules follows on the next page. And if you look back a few pages, I believe you find similar redacted tables for the outputs from Coverity, CodeSonar, and Uno–in that order.

      I would be interested to know if you disagree with my approach to this.

  2. MikeG says:

    Of course the document was “carefully worded” – what else would one expect, something vague and misleading?

    Having lived through the unintended acceleration scam of the 1980s in which Audi automobiles were erroneously blamed, I do not believe that Toyota is at fault. As far as I am concerned, this says it all:
    NHTSA concluded that 39 of these 52 events were explainable as “pedal misapplications.”

    • C.S. Mauro, Jr. says:

      Because 39 is not 100% of the recorded events, code bugs can not be automatically excluded as a possible cause.

      • Michael Barr says:

        NHTSA identified 831 unintended acceleration voluntary complaints regarding Toyota Camry in its database; there may have been unreported incidents as well. Some of these incidents were “pedal misapplications” (and some even pre-dated electronic throttle control). But the sample of 52 cars that had their black boxes read were all recent (mostly 2010) events, with cooperative owners and accessible vehicles.

        Suppose, hypothetically speaking, that Toyota had a firmware bug that caused occasional unintended accelerations. And suppose they developed a patch or upgrade for the ETCS-i that fixed the problem and that this patch would be automatically applied on the owner’s next service call. Then we would expect that more recent events would be increasingly likely to be the result of other causes.

        This is just a hypothetical scenario meant to show why 39 of 52 is not a complete explanation.

      • MikeG says:

        No, of course, code bugs cannot be automatically eliminated. Can they really EVER be?

        I said that I “believed” that it wasn’t a bug. Yes, I know, faith has no place in ENGINEERING. Considering the broad range of driver capabilities, and the fact that in the face of an accelerating car there are many options a driver can use to stop (shut off the motor, shift to neutral or a lower gear, mash the brakes) and many drivers apparently DID NOT or COULD NOT, and that drivers have a vested interest in blaming the car, I can have little faith in their eyewitness testimony.

        As far as I can tell, the static code analysis employed does not find bugs, only finds constructs that are more prone to bugs. Evidence of these constructs in the code is no smoking gun. When someone finds that the code used Metric and English units haphazardly, I’ll believe it!

  3. C.S. Mauro, Jr. says:

    I will point to the experience Steve Wozniak (Woz) had last year, when testing his Toyota cars for UA. He was able to initiate a UA event via manipulation of the cruise control system. It was his hypothesis that UA might occur due to a bug lurking in the cruise control firmware portion of the ETCS-i, and his limited exploratory testing seems to bear this likelihood out. He claimed to develop a speed well beyond 70 mph before he disengaged the system by turning the motor off. Perhaps that portion of the code needs to be heavily validated. Something is still very wrong here – and the heavily redacted report seems to indicate much more investigation is needed to truly uncover the heart of the matter.

    • Michael Barr says:

      Unfortunately, NASA apparently didn’t look at the Prius code at all. Do we know if it’s the same ETCS-i hardware and firmware in Wozniak’s Prius as in the 2005 Camry?

      NASA seems to be ruling out software bugs as a cause of unintended >25 degree openings of the throttle. But they don’t seem to be ruling out software bugs as a cause of smaller unintended openings of the throttle. That may be consistent with Wozniak’s observations, which I believe involved cruise control use at speeds well above the legal limit.

  4. Smith says:

    This is a nice peek into a major automaker’s software tools and process, which is a rare opportunity. I was surprised that the throttle controller has a much larger processor and s/w project than I would have anticipated, but when I think about what all it is doing, I guess it makes sense.

    The dual processor arrangement is not surprising, and it would be nice to hear more about the code tools (only one was mentioned by name).

    After the Congressional kerfuffle, I went out and got a new Camry, and I’ve been extremely happy with it.

    • Michael Barr says:

      In one place the name of Toyota’s chosen static analysis tool (QAC) is redacted. But on pages 19-20 of the NESC Report’s Appendix A (http://www.nhtsa.gov/staticfiles/nvs/pdf/NASA_FR_Appendix_A_Software.pdf), there is this list:

      “Toyota reported the use of a number of tools and checks to assess the quality of their software.

      * QAC, a tool from British company Programming Research, is the primary tool used to catch common coding defects and to verify compliance with some commonly used coding rules.
      * CAST is an in-house TMC tool to verify type conversions and potential cases of value overflow and some common cases of coding defects. The tool checks compliance with 25 specific rules (e.g., possible incorrect use of precedence rules in expressions with binary shift operations, or the potential loss of precision in an expression with mixed variable types).
      * Careless, another in-house TMC tool for verifying different coding patterns and coding style, perhaps compliance with naming conventions (not much else is know [sic] about this tool and the team was not able to obtain a manual with more information).
      * A Mode Transition checker, to verify corrected transitions between normal and failure modes.
      * A Stack overflow checker. The system stack is limited to just 4096 bytes, it is therefore important to secure that no execution can exceed the stack limit. This type of check is normally simple to perform in the absence of recursive procedures, which is standard in safety critical embedded software.
      * A Task interference check, which is a mostly manual check, assisted by a tool-generated list of cases to be inspected.”

      NESC used Coverity, CodeSonar, and Uno for the static analysis portion of their work. They stated they purposefully wanted to use tools not used by Toyota. (It’d be a good idea to run Toyota’s tools also, to ensure Toyota actually paid attention to their outputs.)

      • Lundin says:

        “The system stack is limited to just 4096 bytes, it is therefore important to secure that no execution can exceed the stack limit.”

        Are you serious?! If you manage a stack overflow in this kind of application when you had 4k stack at your disposal, then you should probably give up embedded programming and look for work at McDonalds, or if they don’t have any jobs available, a job in PC programming.
        🙂

        But yeah I suppose they have to test it still, though…

        • Michael Barr says:

          One thing I can’t yet understand from the report is why there is just one stack in this multitasking system. In one place in Appendix A the stack is said to be 4096 bytes and completely analyzable. But another part that I’ve just read (page 130) says there is recursion (!) of the A() calls B() calls C() calls A() sort in the ETCS-i source code and that:

          “Toyota used a tool gstack from Green Hills Software to determine the maximum stack size requirements. The tool gstack is a static analyzer that computes a upper bound on the stack size required by a given executable. For the ETCS-i, it reported the maximum possible stack size as 1688 bytes. This result comes with a caveat, however: gstack cannot account for recursion. … Faced with this limitation, Toyota added an extra margin of safety to the predicted bound by allocating [redacted but 4096 would fit] bytes for the ETCS-i stack–more than double the original prediction.”

          And what about interrupt service routines? Tools like gstack can’t account for their use of the stack either, but the report states that there are interrupts in this system.
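When static analysis cannot bound stack use (because of recursion or interrupts, as discussed above), a common complementary technique is to measure it at run time by "painting" the stack with a known pattern and later checking how much of the pattern has been overwritten. The sketch below simulates this on a plain byte array. It is not drawn from the report; the fill value, the function names, and the assumption of a downward-growing stack are all illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5u   /* arbitrary paint pattern (an assumption) */

/* Fill the whole stack region with a known pattern before the system starts. */
void stack_paint(uint8_t *stack, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Assuming the stack grows downward from stack[size - 1] toward stack[0],
   return the deepest usage observed so far: the distance from the first
   overwritten byte to the top of the region. */
size_t stack_high_water(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL) {
        untouched++;
    }
    return size - untouched;
}
```

A measurement like this reports only the deepest usage that actually occurred during testing, not the theoretical worst case, and it can slightly under-report if the deepest bytes written happen to equal the fill pattern. It complements a static bound but does not replace one.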

          • Antonio Arena says:

            I’m a big fan of yours (gosh, I actually started my embedded career because of a book you wrote).

            Single stack implementations are actually very common with OSEK basic tasks (as defined in http://portal.osek-vdx.org/files/pdf/specs/os223.pdf ) because they are single-shot tasks. Toyota appears to be using an OSEK RTOS, so they probably use basic tasks.

            Intuitively, undesirable firmware features, such as the ones pointed out by NASA, correlate to a certain defect density, but we don’t know how strong this correlation is (or whether any of those defects would be observable to a user given the amount of fail-safe built into cars).

            Also, we don’t know how Toyota firmware would compare against any other automotive manufacturer’s firmware for the same model year.

  5. I currently own a Toyota Tacoma and have owned Toyotas since 1986.

    The brakes are mechanical so – “The NESC team did not find an electrical path from the ETCS-i that could disable braking.”

    Most of these Toyotas in question have anti lock brakes that are interconnected via the CAN. So there is at some level an electronic/logical interconnect of sorts.

    Anti lock brakes engage and disable braking when brakes are applied and there is a significant difference in wheel speed (spinning under acceleration???)

    I am not convinced that the overall vehicle control system does not have some unrealized mechanical and logical feedback circuits in play along with “other” software/system bugs.

  6. Phil Platt says:

    Couple of things from the viewpoint of someone completely devoid of programming knowledge.

    I have encountered acceleration due to “floor mat interruption” of the accelerator pedal.

    I have a 2006 Honda, and it has a “sorta” intuitive transmission/cruise control system. I believe it takes the angle of an incline into account when attempting to maintain a given speed. It can at times drop into a gear that would seem inappropriate, and if that function were to somehow receive false information, it could indeed cause a grossly inappropriate rate of acceleration.

    Also in the 80’s the USAF Thunderbirds flew four or five brand new F-16’s into the ground. I do not know if a cause was ever established, however I believe the F-16 is completely “fly-by-wire.” I guess I am trying to draw a correlation between two unrelated events.

    Lastly I find it strange that the people responsible for doing the testing here, have to use software/testing equipment provided by Toyota….

    • Smith says:

      The F-16 is fly-by-wire, but the Thunderbirds fly it in a way that is special. The equivalent driving would be seen in something like NASCAR, and they crash all the time. The latest Tbird crash that I know of was due to an incorrect MSL setting, meaning the ground was closer than the pilot thought, although he punched out in time.

    • Michael Barr says:

      NHTSA could and probably should set industry standards for event data recorders (e.g., what data is recorded, when/how often, how many events, when to overwrite vs. lock down, and how to get the data out), but they’ve not done that yet.

      A big side project for NHTSA in this investigation was to validate Toyota’s black box and the associated PC software. To do this, they did a whole Mythbusters-style track test with one vehicle bumping another and higher precision recording equipment installed in the passenger seat of the vehicle that had its black box read. (You can read all about that here: http://www.nhtsa.gov/staticfiles/nvs/pdf/NHTSA-Toyota_EDR_pre-crash_validation.pdf)

      It’s interesting to note that Toyota was actively improving the EDR Reader software as NHTSA was working, so NHTSA also had to validate that the same results were read by all those different versions.

  7. Eric Smith says:

    “The code was written in English, with Japanese comments”

    Does “the code was written in English” refer to the variable and function names? Does it mean that these variable and function names make sense to any English-speaking programmer?

    • Michael Barr says:

      All I know is what it says at the bottom of page 20 of Appendix A: “the source lines were readable as C, and the comments were in Japanese.” This is mentioned in the context of redacted counts of source lines of code (SLOC), non-commented source lines, and overall comment density.

    • Another Smith (I guess the 3rd?) says:

      As someone who’s recently been in a similar situation, I suspect it means exactly that: the C code (of course), function names, variable names, etc. are in English, but all the comments look like chicken scratches to Westerners.

      Two comments on this — I’ve worked on ECU firmware, BTW… 1) often times the teams are international, so it makes sense to use English where possible (often comments are the hardest to write clearly in a foreign language, hence the Japanese), and 2) code like this is likely riddled with English acronyms (ECU, RPM, etc.) so all the functions & variables pertaining to these are recognizable.

  8. Charley Moore says:

    Very nice analysis, Michael. Thank-you for doing that.

    Although I also remember the Audi UA incidents of the 80’s and was therefore skeptical about the Toyota incidents, the symptoms that I heard about made me conclude that there were almost certainly some defects in the embedded code. Having seen numerous embedded systems and seen the “don’t touch that module” mentality that permeates so much corporate engineering management, I find the fingerprints in what you describe all too familiar.

    A scary thought: If this is what the code from Toyota looks like, can you imagine what the code would look like from a company that does not have Toyota’s attention to quality? Even if one doesn’t have those type of cars – they’re out there on the road.

  9. Very interesting article. Caught my attention by the title.

    There are billions of electronics components in automobile/industrial equipment. But the point of the discussion clearly elevates the importance since this is life-critical and always better to point out and discuss aloud.
    As you rightly point, there could be potential bugs lurking in the firmware.
    How reliable is their testing? Did Toyota give any statistics on the testing they have done?
    Electronics add sophistication to the device/equipment, but poor designs could cause performance degradation or, worse, be life threatening.

  10. Lundin says:

    Interesting reading indeed. Without viewing a single line of their code, I would also suspect plenty of bugs in it. Scientific population studies of production code written in C show that “state of the art” code typically has 1 bug in 10k lines of code. State of the art in this case would be something that has gone through extreme testing, i.e. aerospace software (NASA & friends). All other production code is going to contain a higher bug frequency than that.

    So by statistics, Toyota should have >10 bugs in that program, assuming they have as tough software testing as NASA/avionics. Statistics also shows that the number of bugs, as well as the severity of them, increases with program complexity.

    I can’t understand why they would need 100k LOC for a brake system alone. If I do a rough comparison with a similar application of my own, which is also a safety-critical control system, then they would need: an advanced brake algorithm, approx 20-30k LOC, a CAN driver <2k LOC, system diagnostics <1k LOC, NVM routines <2k LOC, and some supervising/safety-related code <5k LOC (checksums, memory integrity tests, communication with 2nd CPU etc). Conveniently, they don't need any form of user or programming interface (a program without any pesky humans involved is the sw engineer's wet dream).

    That would make it a rough total of 50k LOC with numbers exaggerated. Makes me wonder what the other 50k code is for.

  11. Brent Goldberg says:

    I don’t think CodeSonar warnings and compiler warnings are, by themselves, proof of anything whatsoever. I’ve seen CodeSonar spit out a bunch of potential errors that were perfectly fine, and compiler warnings can be extremely innocuous. By themselves, they are evidence of little-to-nothing.

  12. Vani says:

    Some things stand out like red flags. I have worked as a software architect in a safety critical automotive application, one with an EDR.

    >>This investigation of Toyota’s unintended acceleration marked the first time that anyone from NHTSA had ever read data from a Toyota event data recorder. (Toyota representatives apparently testified in Congress that there had previously just been one copy of the necessary PC software in the U.S.)

    This just sounds wrong to me. In most cases, data from an EDR is reported back via a CAN bus. The testing by the supplier must include using the OEM’s tool to ensure data correctness. An EDR is quite complex with regard to data, and often the storage and reporting data formats vary. When we worked on ours, every one of us had a running copy of the OEM tool on our PC.

    >>NASA also spent time simulating possible race conditions due to worrisome “recursively nested interrupt masking”

    This is downright scary. Nested interrupt masking??? In a safety-critical application?
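    To see why nested interrupt masking is easy to get wrong, here is a small host-side simulation in C. It is not Toyota's code and not what NASA analyzed; `irq_enabled` is a stand-in for a CPU interrupt-enable bit, and all function names are hypothetical. It contrasts an unconditional disable/enable pair with a save/restore pair when one critical section is nested inside another.

```c
#include <stdbool.h>

/* Stand-in for the CPU's global interrupt-enable bit (simulation only). */
static bool irq_enabled = true;

/* Naive pair: always disable on entry, always enable on exit. */
void naive_lock(void)   { irq_enabled = false; }
void naive_unlock(void) { irq_enabled = true; }

/* Save/restore pair: remember the previous state and put it back on exit. */
bool critical_enter(void)
{
    bool prev = irq_enabled;
    irq_enabled = false;
    return prev;
}

void critical_exit(bool prev) { irq_enabled = prev; }

/* Returns true if interrupts are still masked after the inner section
 * exits while the outer section is still open. */
bool naive_nested_stays_masked(void)
{
    irq_enabled = true;
    naive_lock();                 /* outer section begins */
    naive_lock();                 /* inner section begins */
    naive_unlock();               /* inner exit unmasks interrupts too early */
    bool masked = !irq_enabled;
    naive_unlock();               /* outer section ends */
    return masked;
}

bool saved_nested_stays_masked(void)
{
    irq_enabled = true;
    bool outer = critical_enter();
    bool inner = critical_enter();
    critical_exit(inner);         /* restores "disabled", outer is still protected */
    bool masked = !irq_enabled;
    critical_exit(outer);         /* restores the original "enabled" state */
    return masked;
}
```

    In the naive version the outer section runs unprotected after the inner one returns, which is exactly the kind of window in which a race condition can hide. The save/restore version keeps the mask in place until the outermost exit.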

    >>All of these issues involved unrealistic timing delays in the multiprocessing, asynchronous software control flow.

    One rather simple rule of development that I like to follow: keep your code simple. Do a worst-case timing analysis and use that as the basis for your overall system timing.
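    A minimal sketch of the first step of such a worst-case timing analysis, assuming a set of periodic tasks with known worst-case execution times. The `struct task` layout, the function names, and any task figures are hypothetical. The check sums worst-case execution time over period for every task and compares the result against a utilization budget. Staying at or below 100% is a necessary condition on a single CPU, not a full schedulability proof.

```c
#include <stdbool.h>
#include <stddef.h>

/* One periodic task: worst-case execution time and period, in microseconds. */
struct task {
    unsigned long wcet_us;
    unsigned long period_us;
};

/* Total worst-case CPU utilization: sum of wcet/period over all tasks.
 * Precondition: every period_us is non-zero. */
double worst_case_utilization(const struct task *tasks, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; ++i) {
        u += (double)tasks[i].wcet_us / (double)tasks[i].period_us;
    }
    return u;
}

/* True if all periods are valid and the worst-case load fits the budget
 * (for example 0.7 to leave 30% headroom). */
bool fits_budget(const struct task *tasks, size_t n, double max_utilization)
{
    for (size_t i = 0; i < n; ++i) {
        if (tasks[i].period_us == 0) {
            return false; /* a zero period makes the load undefined */
        }
    }
    return worst_case_utilization(tasks, n) <= max_utilization;
}
```

    For example, three tasks of 200 µs every 1 ms, 500 µs every 5 ms, and 1 ms every 10 ms load the CPU to 40% in the worst case, so they fit a 70% budget with room to spare.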

    Some more about an EDR: this is a pretty time-sensitive application. It is also tricky, because if you are in a crash, then you are about to blow a bag and that gets the highest priority. So the writes to the NVRAM really are a lower priority. There is almost always a time constraint on when the data should be written by, because of finite capacitor charge. Throw in CAN message periodicity, plus bus data reading and processing that has to be matched with acceleration triggers, and the timing requirements can get complex; it is easy to mess up the timing.

    I am not very familiar with the analysis of acceleration data, but I am wondering how much of the “driver error” decision was based on talking to people vs a hard look at the data. An EDR will store data when certain acceleration parameters are met, but it stores the vehicle acceleration as detected by the airbag sensors, which does not differentiate between ‘who’ is causing the acceleration. If one is talking about throttle information received over CAN, then that is transmitted by the Engine Control Module, which is the module under investigation. The CAN data is sent at a much lower rate than the data actually generated inside the Engine Control Module.

    What I am reading from the article seems to indicate disorganised code without a cohesive structure. Yes, I have seen plenty of code like that and have first-hand experience of the “unexpected, this is just not possible” bugs that come out of it.

    I would need to be more convinced before I can believe that it is not a ‘software’ bug.

  13. tomkawal says:

    My own comments are based on what I see every day as a designer.
    SW authors rarely understand the error distribution in physical systems.
    SW is deterministic, i.e. predictable; HW is stochastic. The problem is that not every HW glitch is anticipated by the SW guru. Add the fact that logical reasoning is not an important part of modern SW development training, and that math skills are really poor. (My two Japanese cars from the late 90s behaved really badly when accelerating because of naive SW.) Another point is that many automotive systems are just designed to ‘pass the test’; then comes the ‘design freeze’, and the designer cannot fix a bug even if it is discovered and important.

  14. NASA reviewed the redactions and released a new version of the document containing “certain material previously deemed confidential” on April 15. There are still many, many areas which are blacked out.

  15. Jeff Domer says:

    Is it possible that we’re looking in the wrong place? 2002+ Lexus and, I assume, high-end Camry models are equipped with ABS Brakes, Traction Control, Electronic Brake Force Distribution, and Vehicle Stability Control. These are all software-based systems that share ABS sensors and components, plus other components, to modify the driver’s braking. The VSC system is the Lexus/Toyota anti-skid/rollover control system. In addition, the VSC uses yaw, lateral motion, and inertia sensors to control independent or selective wheel braking and throttle control, independent of driver input. Starting with the 2006 models, Toyota added DBW power steering and Vehicle Dynamics Integrated Management, which is supposed to manage all this SW.

    I am suspicious of the VSC system because Lexus has had a number of strange accidents and near-accident claims where drivers have claimed that, while doing slow-speed, parking-lot-type maneuvers, the car suddenly accelerates, and they claim to lose brakes and steering too. A few drivers noticed that the “VSC” indicator was lit on the dash, meaning that it was engaged. These incidents are blamed on foot misplacement, especially when they involve a pre-2006 car with conventional steering. But Toyota’s own literature describes the ability of the VSC to apply selective wheel braking to steer a car out of a skid. There are other strange incidents, such as a car at highway speed that suddenly skidded into the other lane with the “VSC” indicator on. According to the dealer, extremely low tire pressure on one side caused too much drag and lean on that side, and the VSC was too sensitive. A SW upgrade took care of it.

    Lexus has issued a number of TSBs on the VSC, mostly for SW upgrades and “enhancements”, including at least one TSB to replace the lateral sensor under the rear deck that can be affected by “cell phone interference”.

    I suspect some acceleration problems might be caused by one of these other systems that can control the throttle.
