Archive for the ‘Uncategorized’ Category

Tools to Detect Software Copyright Infringement

Thursday, September 23rd, 2010 Michael Barr

An emerging class of tools makes it easy to automatically detect copying of copyrighted software source code, even if it came from one of the hundreds of thousands of open source packages.

I am presently providing litigation support in a case of alleged software copyright infringement.  In a nutshell, the plaintiff brought suit against the defendant for allegedly continuing to use plaintiff’s copyrighted software source code in defendant’s products after termination of a license agreement between the parties.  Fortunately, automated tools are making it easier than ever to quickly and inexpensively detect copying of software source code.

Some of the most powerful tools for doing direct comparisons between a pair of source code sets are from S.A.F.E. Their CodeMatch tool works by comparing each file of source code in the first set with every file of code in the second set.  Results are presented in a table that is sorted by the relative amount of matching code in the files.  And CodeMatch is clever enough to detect copying in which variable and function names and other details were subsequently changed; CodeMatch can even detect code that was copied from one programming language into another.  The only weakness of CodeMatch is that you have to have the source code for each product, which is not always possible early in litigation.

Other tools from S.A.F.E. provide additional help.  For example, BitMatch can compare a pair of executable binary programs or one party’s source code against another’s executable code.  It works by matching strings that appear in both programs.  Meanwhile, SourceDetective helps rule out that the two programs are only similar because they both borrowed from some third program—by automatically searching the Internet for hundreds or thousands of matching phrases.  CodeMatch, BitMatch, and SourceDetective are part of a suite of related tools called CodeSuite.  CodeSuite is a free download that runs on Microsoft Windows, with license keys sold based on the amount of code to be compared.

Of course, sometimes code may be copied from open source software.  Much open source software is subject to so-called copyleft licenses, a special type of copyright license that requires the source code to remain open to the public.  Copyleft language is drafted to ensure that the source code for certain categories of derived work is also open to the public.  This creates problems for companies that wish to keep their source code private but also rely upon open source software.

Fortunately, there are also tools to detect the presence of part or all of an open source software package within a proprietary program.  I have used such tools from Black Duck Software and Protecode.  Both work similarly: each company maintains a database of hundreds of thousands of known open source packages against which the source code you provide is tested. Results are presented as a list of open source packages from which code may have been copied. This testing can be done entirely on a personal computer running Microsoft Windows, so that proprietary source code need not be sent outside a trusted network.  Both tools are generally licensed for an expected level of use on an annual basis.

Unfortunately, the precision of CodeMatch is lost in trying to cast such a broad net for potential copying.  The tools from Black Duck and Protecode don’t actually compare your code against each and every one of the millions of source code files in their databases.  Instead, they reduce each file of your source code to a simpler representation of its structure and then compute a mathematical signature for that reduced form.  This signature is subsequently compared to the signatures of the files in their database, which were computed the same way.  In plain English, this means that you get lots of false positives: some open source packages that weren’t actually copied usually turn up in the results list.
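To make the signature idea concrete, here is a minimal sketch in C. It is not how Black Duck or Protecode actually work (their methods are not described here); the function name structural_signature, the crude normalization rules, and the choice of the 32-bit FNV-1a hash are all assumptions made purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only: reduce source text to a crude structural
 * form (every identifier or keyword -> 'I', every number -> 'N', whitespace
 * dropped, punctuation kept) and hash that form with 32-bit FNV-1a. */
uint32_t structural_signature(const char *src)
{
    assert(src != NULL);

    uint32_t hash = 2166136261u;              /* FNV-1a offset basis */
    size_t i = 0;

    while (src[i] != '\0') {
        unsigned char c = (unsigned char)src[i];
        unsigned char token;

        if (isspace(c)) {                     /* layout is ignored */
            i++;
            continue;
        }
        if (isalpha(c) || c == '_') {         /* identifier or keyword */
            while (isalnum((unsigned char)src[i]) || src[i] == '_') {
                i++;
            }
            token = 'I';
        } else if (isdigit(c)) {              /* numeric literal */
            while (isalnum((unsigned char)src[i])) {
                i++;
            }
            token = 'N';
        } else {                              /* punctuation kept as-is */
            token = c;
            i++;
        }

        hash ^= token;                        /* FNV-1a: xor, then multiply */
        hash *= 16777619u;                    /* FNV prime */
    }
    return hash;
}
```

Because names and layout are thrown away, "int sum = a + b;" and "long total=x+y ;" produce the same signature in this sketch. That is what lets renamed copies be caught, and it is also why unrelated code with the same shape can show up as a false positive.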

When searching for potential copying of open source code, I recommend searching the database from Black Duck or Protecode first.  Then, to eliminate the false positives, a more thorough analysis should be performed by obtaining the listed open source packages and using CodeMatch to compare the proprietary code against them file-by-file.

With the help of tools like those mentioned here, it is possible to quickly ascertain whether source code copying has taken place.  Prior to the appearance of these tools, it was necessary for an expert in software development to manually perform dozens of searching and comparison steps.  This strategy can be used early in litigation with the benefit of dramatically reducing the cost of such analysis.  The same tools can also be employed proactively by companies seeking to reduce their risks of copyright infringement litigation.

Design for the Worst Case

Wednesday, August 11th, 2010 Michael Barr

In real-time systems, as in life, anything that can go wrong will! A nurse could be using a GUI task to change system parameters on a ventilator just as the attached patient’s lungs demand the most help from another task. Or an interrupt signal could start acting funny, generating a stream of unexpected ISR invocations. Or all of those at once. And something else.

The designers of hard real-time systems must design for such a worst-case. They must ensure that sufficient CPU and memory bandwidth are present to handle the worst-case demands that could be placed on the software—simultaneously. In simple terms, we must size the processor bandwidth to the worst-case scenario.

Safety for the users of our products emerges as a side effect of buying a faster (read “higher priced”) CPU. Rate Monotonic Analysis helps ensure we’ve specified the right processor clock rate, so the users are safe. RMA is also the optimal fixed-priority scheduling algorithm, which prevents us from over-paying for clock rate. If a set of tasks cannot be scheduled using RMA, it can’t be scheduled using any fixed-priority algorithm.

The basics of RMA are well covered in many places, including my article Introduction to Rate Monotonic Scheduling. In summary, Rate Monotonic Analysis gives us mathematics to prove all deadlines are always met when you’ve followed the Rate Monotonic Algorithm to assign priorities.

The Rate Monotonic Algorithm is a procedure for assigning fixed priorities to tasks and ISRs to maximize their schedulability. A particular set of tasks and ISRs is considered schedulable if all deadlines will be met even in the worst-case scenario.  The algorithm is simple: “Assign the priority of each task and ISR according to its worst-case period, so that the shorter the period the higher the priority.” For example, if Task 1 and Task 2 have periods of 50 ms and 100 ms, respectively, then Task 1 is given higher priority. This ensures that a long-running Task 2 job can’t cause Task 1 to miss its more frequent deadline.
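As a rough sketch of how this might look in C, the code below sorts tasks by period to assign priorities and then applies the classic Liu and Layland utilization test, U <= n(2^(1/n) - 1). That test is sufficient but not necessary: a task set that fails it may still be schedulable and needs a more detailed analysis. The type rm_task_t, both function names, and the execution times used in the usage note are hypothetical; only the 50 ms and 100 ms periods come from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    double period_ms;   /* worst-case (shortest) period */
    double wcet_ms;     /* worst-case execution time per period */
    int    priority;    /* assigned below; 0 = highest priority */
} rm_task_t;

/* qsort comparator: shorter period sorts first. */
static int by_period(const void *a, const void *b)
{
    double x = ((const rm_task_t *)a)->period_ms;
    double y = ((const rm_task_t *)b)->period_ms;
    return (x > y) - (x < y);
}

/* Rate monotonic assignment: the shorter the period, the higher the priority. */
void rm_assign_priorities(rm_task_t *tasks, size_t n)
{
    qsort(tasks, n, sizeof tasks[0], by_period);
    for (size_t i = 0; i < n; i++) {
        tasks[i].priority = (int)i;
    }
}

/* Liu and Layland test, U <= n(2^(1/n) - 1), rewritten as (1 + U/n)^n <= 2
 * so that no math library call is needed. Returns 1 if the task set is
 * guaranteed schedulable, 0 if this test alone cannot prove it. */
int rm_liu_layland_ok(const rm_task_t *tasks, size_t n)
{
    assert(n > 0);

    double u = 0.0;                       /* total CPU utilization */
    for (size_t i = 0; i < n; i++) {
        u += tasks[i].wcet_ms / tasks[i].period_ms;
    }

    double lhs = 1.0;
    for (size_t i = 0; i < n; i++) {
        lhs *= 1.0 + u / (double)n;
    }
    return lhs <= 2.0;
}
```

With assumed execution times of 10 ms for the 50 ms task and 20 ms for the 100 ms task, utilization is 0.4, comfortably under the two-task bound of about 0.828, so the test passes.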

Too many of today’s real-time systems built with an RTOS are working by luck. Excess processing power may be masking design and analysis sins or the worst-case simply hasn’t happened—yet.  Bottom line: You’re playing with fire if you don’t use RMA to assign priorities to safety-critical tasks; it might be just a matter of time before your product’s users get burned.  Perhaps your failure to use RMA to prioritize tasks and prove they’ll meet deadlines explains one or more of those “glitches” your customers have been complaining about?

How to Set the Size of your C Stack

Wednesday, March 24th, 2010 Michael Barr

A reader of my monthly Firmware Update newsletter recently sent an e-mail to ask:

I am a firmware engineer. I read your recent blog post regarding the C stack, about which I have two questions: First, how can I increment or decrement the size of the stack in my code? Second, what size should I choose?

Here’s what I told him:

The size of the stack is set either in the linker command file or in the C or C++ startup code. You should be able to learn more about how to change the stack size from your specific compiler vendor’s manual or customer support.

Identifying the minimum stack size required for your specific application is made challenging by these stubborn facts:

- MEASURING the maximum stack growth during testing may not be sufficient. If you test for half a year, the product is sure to be run for a year or longer in the field. Have you really tested all possible cases? What about all possible series of interrupt service routines on top of that worst case use by main()?

- TOP DOWN ANALYSIS of the compiled code can be done to determine the number of function calls and interrupt service routines at maximum depth; their individual parameter and local variable use, etc. Unfortunately, these things may keep changing whenever you change the code and recompile.

The best approach is usually to perform a conservative top down analysis of the source code; when in doubt, always round up. Don’t forget about nested interrupt service routines. Double that conservative estimate to set your initial stack budget. Then measure actual stack utilization during testing, preferably with code coverage analysis tools running, to ensure that you’ve tested all possible paths (except interrupts, which may run at different times in the field).
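The arithmetic behind that initial budget can be sketched as follows. The function name stack_budget_bytes and the idea of supplying per-function and per-ISR frame sizes as arrays are assumptions for illustration; in practice those numbers would come from your own conservative reading of the code or from compiler output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add up the frame sizes along the deepest call chain,
 * add one frame for each level of interrupt nesting that could land on top
 * of it, then double the total to get the initial stack budget. */
size_t stack_budget_bytes(const size_t *call_chain_frames, size_t depth,
                          const size_t *isr_frames, size_t isr_levels)
{
    assert(call_chain_frames != NULL || depth == 0);
    assert(isr_frames != NULL || isr_levels == 0);

    size_t worst_case = 0;

    for (size_t i = 0; i < depth; i++) {
        worst_case += call_chain_frames[i];   /* deepest path from main() */
    }
    for (size_t i = 0; i < isr_levels; i++) {
        worst_case += isr_frames[i];          /* worst-case nested ISRs */
    }
    return 2 * worst_case;                    /* double the estimate */
}
```

For example, a three-deep call chain of 64, 32, and 48 bytes plus two nested ISRs of 40 and 24 bytes gives a worst-case estimate of 208 bytes and an initial budget of 416 bytes.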

Then if you need to reclaim memory to ship the product, start shrinking the stack. But also put into place a high water mark system (e.g., 0xDEADBEEF) complete with supervisor code to put the product into a failsafe state if more than, for example, 80% of the stack is ever used.
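A high water mark system of this kind might be sketched as below. The assumptions are mine: the stack is treated as an array of 32-bit words that grows downward (highest index used first), and the function names are hypothetical. On a real target the array would be the stack region defined by your linker command file, and the painting would happen in startup code before main() runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu   /* pattern painted over the unused stack */

/* Paint the entire stack region with the fill pattern at startup. */
void stack_paint(uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Assumes a descending stack: stack[words - 1] is used first and stack[0]
 * last. Words still holding the pattern at the low end were never touched,
 * so everything above them counts as the high water mark. */
size_t stack_words_used(const uint32_t *stack, size_t words)
{
    size_t untouched = 0;

    while (untouched < words && stack[untouched] == STACK_FILL) {
        untouched++;
    }
    return words - untouched;
}

/* Returns 1 if more than `percent` of the stack has ever been used. */
int stack_over_limit(const uint32_t *stack, size_t words, unsigned percent)
{
    return stack_words_used(stack, words) * 100u > words * (size_t)percent;
}
```

Supervisor code would call stack_over_limit(stack, words, 80) periodically and put the product into its failsafe state when it returns nonzero. One caveat of this sketch: a stack word that happens to be written with the value 0xDEADBEEF looks untouched, so the measurement can slightly undercount.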

Toyota’s Embedded Software Image Problem

Friday, March 19th, 2010 Michael Barr

It remains unclear whether Toyota’s higher-than-industry-average number of complaints regarding sudden unintended acceleration (SUA) is caused (in whole or in part) by an embedded software problem. But whether it is or it isn’t actually firmware, the company has clearly denied it and yet still developed an embedded software “image problem”. They’ve brought some of this on themselves.

Side Note: I think it is a net positive that journalists, the mass media, and a broader swath of the general public are increasingly aware that there is software embedded inside cars, airplanes, medical devices, and just about everything else with a power supply or batteries. Firmware has been inside these products for many years, of course. But as I wrote in a recent article in Electronic Design, my experience working with companies across many industries leads me to believe there is a looming firmware quality crisis. Greater public awareness is sure to bring litigation. This will force engineering management to care more about firmware quality than they currently do.

Toyota’s Firmware Image Problem

Long before the “floor-mat recall”, NHTSA had logged a higher number of unintended acceleration complaints (4.51 complaints per 100,000 cars sold for the 2005 to 2010 model years) for Toyota than for any other company. (A recent Washington Post graphic has more data.) Apparently, NHTSA and Toyota were investigating the reports but hadn’t yet taken any action.

It seems that what set that first Toyota recall in motion was a high-profile fatal August 2009 crash involving an off-duty California Highway Patrol officer, his family, a runaway Lexus, and a disturbing 911 call.  Given the context of that specific crash, I’m not convinced the floor mat recall made much sense. In particular, I find it hard to believe that a police officer, with adrenaline pumping through his veins and his family’s lives on the line, wouldn’t just rip a stuck floor mat out of the way like the Incredible Hulk. (Or that he would choose running off the road at 125 mph vs. shutting the vehicle off entirely.)  But I don’t have all the facts about either that specific accident or the reasoning behind the floor mat recall.

The broader recalls since then have focused on adding mechanical strength to the accelerator pedals in a number of different makes and models. To this day, Toyota categorically denies any sort of electrical problem.  Yet some cars that have been modified in this way have since been reported to experience unintended acceleration!  Besides which, mechanical parts generally fail visibly or entirely once they first fail, rather than intermittently.  Intermittent failures are far more common with electronics (think EMI) and firmware.

Toyota’s firmware image problem stems from two things:  First, they have separately recalled the Prius for a braking-related firmware upgrade.  Other possible Prius software issues have been identified by Steve Wozniak and Jim Sikes, but these have not yet been confirmed.  Second, the continued reliance (by Toyota and NHTSA) on theories such as “we can’t reproduce the problem and we haven’t been able to see it during testing” as proof that there’s not a software bug is simply unbelievable.

Anyone who works with software knows from experience that lots of bugs can’t be easily reproduced.  The fact that these incidents can’t be reproduced is not a proof of anything.

Software in Cars: The Future

Don’t get me wrong.  I want more software in my car, not less.  I very much look forward to the day that an in-car computer takes over the driving for me.  After all, some cars already have more sensor data to make decisions on than the driver does.  Imagine what a car with an integrated GPS navigation system, auto-follow cruise control, and collision avoidance systems could do.  While I can only guess that I should move left one lane to avoid a crash, the computer is capable of seeing in all directions at once and calculating the trajectories of all nearby cars, including instantaneous changes in their acceleration or deceleration.

Additionally, I suspect that even with bugs in a car’s drive-by-wire software the car may be much safer overall for its electronic traction control and anti-lock braking systems.

I just wish that Toyota would own up to the fact that the inability to reproduce a problem doesn’t rule out a software (or EMI) flaw.

Embedded Gurus – Site Redesign

Tuesday, March 2nd, 2010 Michael Barr

I am pleased to announce that the EmbeddedGurus website has been redesigned. Among the new features of the site are:

1. A dynamically updating home page, featuring the most recent posts from all of our bloggers. If you prefer, you may view these posts by category.

2. A common look and feel to all of the individual blogs.

3. The ability to search individual blogs, as well as to easily browse from one post to the next and via tags and categories.

4. A sixth guru named Gary Stringham with a blog called Embedded Bridge.

A number of other minor improvements have also been made.

We hope you like the new look and continue to find our blogs about embedded systems design both readable and informative.