Posts Tagged ‘copyright’

Intellectual Property Protections for Embedded Software: A Primer

Tuesday, June 11th, 2013 Michael Barr

My experiences as a testifying expert witness in numerous lawsuits involving software and source code have taught me a thing or two about the various intellectual property protections that are available to the creators of software. These are areas of the law that you, as an embedded software engineer, should probably know at least a little about. Hence, this primer.

Broadly speaking, software is protectable under three areas of intellectual property law: patent law, copyright law, and trade secret law. Each of these areas of the law protects your software in a different way and you may choose to rely on none, some, or all three such protections. (The name of your product may also be protectable by trademark law, though that has nothing specifically to do with software.)

Embedded Software and Patent Law

Patent law can be used to protect one or more innovative IDEAS that your product uses to get the job done. If you successfully patent a mathematical algorithm specific to your product domain (e.g., an algorithm for detecting or handling a specific arrhythmia used in your pacemaker) then you own a (time-limited) monopoly on that idea. If you believe another company is using the same algorithm in their product then you have the right to bring an infringement suit (e.g., in the ITC or U.S. District Court).

In the process of such a suit, the competitor’s schematics, source code, and design documents will generally be made available to independent expert witnesses (i.e., not to you directly). The expert(s) will then spend time reviewing the competitor’s source code to determine if one or more of the claims of the asserted patent(s) is infringed. It is a useful analogy to think of the claims of a patent as a description of the boundaries of real property and of infringement of the patent as trespassing.

Patents protect ideas regardless of how they are expressed. For example, you may have heard about (purely) “software patents” being new and somewhat controversial. However, the patents that protect most embedded systems typically cover a combination of at least electronics and software. Patent protection is typically broad enough to cover purely hardware, purely software, as well as hardware-software. Thus the protection can span a range of hardware vs. software decompositions and provides protection within software even when the programming languages and/or function and variable names differ.

To apply for a patent on your work you must file certain paperwork with and pay registration fees to the U.S. Patent and Trademark Office. This process generally begins with a prior art search conducted by an attorney and takes at least several years to complete. You should expect the total cost (not including your own time), per patent, to be measured in the tens of thousands of dollars.

Embedded Software and Copyright Law

Copyright law can be used to protect one or more creative EXPRESSIONS that the authors of the source code employed to get the job done. Unlike patent law, copyright law cannot be used to protect ideas or algorithms. Rather, copyright can only protect the way that you specifically creatively choose to implement those ideas. Indeed if there is only one or a handful of ways to implement a particular algorithm, or only one way to do so efficiently or in your chosen language, you may not be able to protect that aspect of your software with copyright.

The attorneys in a source code copyright infringement lawsuit wind up arguing over two primary issues. First, they argue which individual parts of the source code (e.g., function prototypes in an API) are protectable because they are sufficiently creative. The judge generally decides this issue, based on expert analysis. Second, they argue how the selection and arrangement of these individually protectable “islands” together shows a pattern of “substantial similarity”. The jury decides that.

Source code copyright infringement is easiest to prove when the two programs have source code that looks similar in some important way. That is, when the programming languages are the same and the function and variable names are similar. However, it is rare that the programs are identical in every detail. Thus, due to the possibility of the accused software developers independently creating something similar by coincidence rather than malfeasance, the legal standard for proving copyright infringement is much higher when it cannot be shown that the defendants had “access” to some version of the source code.

Unlike patents, copyrights do not need to be awarded. You, or your employer, own a copyright in your work merely by creating it. (Whether you write “Copyright (c) 2013 by MyCompany, Inc.” at the top of every source code file or not.) However, there are some advantages to registering your copyright (by submitting a sample) in a work of software with the U.S. Copyright Office before any alleged infringement occurs. Even if you outsource it to an attorney, the cost of registering a copyright should only be about a thousand dollars at most.

As source code frequently changes and new versions will inevitably be released, you should be reassured that a single copyright extends to “derivative works”, which generally includes later versions of the software. You don’t have to keep registering every minor release with the Copyright Office. And, very importantly, the binary executable version of your software (e.g., the contents of Flash or a library of object code) is extended copyright protection as a derivative work of the source code. Thus someone who copies your binary can also be found to have infringed your copyright.

Interestingly, both patent law and copyright law are called for in the U.S. Constitution. However, of course, the extension of these areas of law to software is a modern development.

Embedded Software and Trade Secret Law

Unlike patent and copyright law, which each at best protects only a portion (“islands”) of your source code, trade secret law can be used to protect the entirety of the SECRETS within the source code. Secrets need not be innovative ideas nor creative expressions. The key requirement for this area of law to apply is that you take reasonable steps to keep the source code “secret”. So, for example, though open source software may be protectable by patent law and copyright law it cannot be protected by trade secret law due to the lack of secrecy.

You may think that there is a fundamental conflict between registering the copyright in your software, which requires submitting a copy to the government, and keeping your source code secret. However, the U.S. Copyright Office only requires that a small portion of the source code of your program be filed to successfully identify the copyrighted software and its owner; the vast majority of the source code need not be submitted.

Preserving this secrecy is one of the reasons for the inconveniences software developers often encounter at the companies that employ them (e.g., not being able to take source code home), as well as for certain terms of their employment agreements. Protecting software like the secret formula for Coca-Cola or Krabby Patties helps an owner prove that the source code is a trade secret and thus opens the door to this additional legal basis for bringing a lawsuit against a competitor. Trade secrets cases I have been involved with as an expert have involved allegations that one or more insiders left a company and subsequently misappropriated its software secrets to compete via a startup or existing competitor.

Final Thoughts

In my work as an expert, I always look to the attorneys for more precise definitions of legal terms. Importantly, there are many terms and concepts I have purposefully avoided using here to keep this at an introductory level of detail. You should, of course, always consult with an attorney about your specific situation. You should never simply rely on what you read on the Internet. Hopefully, there is enough information in this primer to help you at least understand the types of protections potentially available to you and to find a lawyer who specializes in the right field.

Dead Code, the Law, and Unintended Consequences

Wednesday, February 6th, 2013 Michael Barr

Dead code is source code that is not executed in the final system. It comes in two forms. First, there is dead code that is commented out or removed via #ifdef’s. That dead code has no corresponding form in the binary. Other dead code is present in the binary but cannot be or is never invoked. Either way, dead code is a vestige or unnecessary part of the product.

One of the places I have seen a lot of dead code is in my work as an expert witness. And I’ve observed that the presence of dead code can have unintended legal consequences. In at least one case I was involved in it is likely that strings in certain dead code in the binary were a major cause of a lawsuit being brought against a maker of embedded system products. I have also observed several scenarios in which dead code (at least in part) heightened the probability of a loss in court.

One way that dead code can increase the probability of a loss in court is if the dead code implements part (or all) of a patented algorithm. When a patent infringement suit is brought against your company, one or more versions of your source code–when potentially relevant–must be produced to the other side’s legal team. The patent owner’s expert(s) will pore over this source code for many hours, seeking to identify portions of the code that implement each part of the algorithm. If one of those parts is implemented in dead code that becomes part of the binary, the product may still infringe an asserted claim of the patent–even if it is never invoked. (I’m not a lawyer and not sure if dead code does legally infringe, but consider it at least possible that neither side’s expert will notice it is dead or that the judge or jury won’t be convinced by a dead code defense.)

Another potential consequence of dead code is that an expert assessing the quality of your source code (e.g., in a product liability suit involving injury or death) may use as one basis of her opinion of poor quality that the source code she examined is overly-complex and riddled with commented out code and/or preprocessing directives. As you know, source code that is hard to read is harder than it needs to be to maintain. And, I think most experts would agree, code that is hard to read and maintain is more likely to contain bugs. In such a scenario, your engineering team may come off as sloppy or incompetent to the jury, which is not exactly the first impression you want to make when your product is alleged to have caused injury or death. Note that overly-complex code also increases the cost of litigation–as both sides’ experts will need to spend more time reviewing the source code to understand it fully.

In a source code copyright (or copyleft) suit the mere presence of another party’s source code may be sufficient to prove infringement–even if it isn’t actually built into the binary! Consider the risks of your code containing files or functions of open source software that, by their mere existence in your source code, attach an open source license to all of your proprietary code.

Bottom line advice: If source code is dead, remove it. If you think you might need to refer to that code again later, well, that is what version control systems are for–make a searchable comment about what you’ve removed at such a checkin. Do this as soon as you are certain it won’t be in a release version of your firmware.
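As a starting point for finding the first kind of dead code, here is a minimal Python sketch that lists the outermost `#if 0` ... `#endif` regions in a C source file. The function name and the sample source are hypothetical, made up for illustration. It deliberately ignores `#else` branches, code disabled by other macros, commented out code, and functions that are compiled but never called, so treat its output as a list of places to look rather than a complete inventory.

```python
import re

def find_disabled_blocks(source: str):
    """Return (start_line, end_line) for each outermost '#if 0' ... '#endif' region."""
    blocks = []
    stack = []  # one entry per open conditional: (line number, is it an '#if 0'?)
    for lineno, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if re.match(r"#\s*if\s+0\b", text):
            stack.append((lineno, True))
        elif re.match(r"#\s*(if|ifdef|ifndef)\b", text):
            stack.append((lineno, False))
        elif re.match(r"#\s*endif\b", text) and stack:
            start, disabled = stack.pop()
            # Report the region only if no enclosing '#if 0' is still open.
            if disabled and not any(flag for _, flag in stack):
                blocks.append((start, lineno))
    return blocks

# Hypothetical firmware fragment used only to exercise the scanner.
sample = """\
int scale(int x)
{
#if 0
    x = legacy_scale(x);
#endif
    return x * 2;
}
"""

print(find_disabled_blocks(sample))
```

Each reported line range is a candidate for deletion once you have confirmed that version control still holds the removed code.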

Where in the World is Michael Barr?

Friday, November 9th, 2012 Michael Barr

Dear reader, it has been over six months since my last blog post. My apologies for being absent without leave from this blog and from my Firmware Update e-newsletter. I have never been as busy, professionally, as over the past 14 months.

I recognize I have been quiet for too long for many of you and note that several readers have written to ask if I am okay or what is keeping me so busy. I am thankful for your concern and also for your patience. I hope this will be the first of several blog posts I will write in coming weeks and that I will resume a normal pace in coming months. I have quite a backlog of ideas.

In addition to launching the new company, Barr Group, and bringing on CEO Andrew Girson earlier this year, here’s a quick summary of just some of what’s been keeping me so busy:

  • Toyota Unintended Acceleration. You may be aware of the NHTSA investigation and associated NASA Report on software. About a year ago, I was retained by plaintiffs in the consolidated personal injury and economic loss claims against Toyota in U.S. District Court (note: I am not involved in any of the state court cases). I am honored to have had the chance to review Toyota’s engine control source code with the assistance of a very talented team of Barr Group and other engineers. We were able to push the source code analysis deeper than NASA and also across many more vehicle years and models.
  • Smartphone vs Apple (and LG). I have also been working as an expert witness in the smartphone wars. In this matter, Barr Group’s client, Smartphone, is the holder of a number of patents originally awarded to now-defunct Palm as it first added cellular phone capabilities to its popular handheld PDA products. My team has been working the infringement side of this patent dispute, which has required me to review Apple’s iOS source code as well as LG’s Android source code.
  • Madden Football. Another client is the original author of the popular Madden football games for Apple II, Commodore 64, and IBM PC. He is suing the game’s publisher (a little company called Electronic Arts) for breach of contract and past royalties. In a nutshell, the issue is whether the move of the early PC game’s code to consoles like Sega Genesis and Super Nintendo was cleanroom or a port. Reviewing so much assembly code for decades old 8- and 16-bit CPUs has reminded me how wonderful it is to program even in the “relatively low-level high-level language” of C.
  • Printers and Set-Top Boxes. Lest the above give you the impression I only work with plaintiffs, in this time I have also been helping: Samsung defend against allegations that it misused a former partner’s software in its printers; Motorola Mobility (now Google) defend against allegations that its cable TV set-top boxes infringe a pair of Microsoft patents; and a Canadian satellite TV company defend against allegations that it allowed its service to be pirated to the detriment of a rival cable TV company’s business.

Fun stuff!

While I never expected or planned to work with so many lawyers when I majored in electrical engineering and practiced embedded software, I do very much enjoy working as an expert witness. For one thing, I enjoy the required range of having to understand the technical issues as well as find ways to explain them to less technical audiences, including judges and juries. For another, reading so much source code written by others and doing related forms of reverse engineering has continued to inform my view on best practices in embedded software process and architecture. As these and other cases wind down in coming months and years, I expect to be able to share some of these lessons with you in this blog and in my other work as a consultant and trainer.

Firmware Forensics: Best Practices in Embedded Software Source Code Discovery

Tuesday, September 27th, 2011 Michael Barr

Software has become ubiquitous, embedded as it is into the fabric of our lives in literally billions of new (non-computer) products per year, from microwave ovens to electronic throttle controls. When products controlled by software are the subject of litigation, whether for infringement of intellectual property rights or product liability, it is imperative to analyze the embedded software (a.k.a., firmware) properly and thoroughly. This article enumerates five best practices for embedded software source code discovery and the rationale for each.

In February 2011, the U.S. government’s National Highway Traffic Safety Administration and a team from NASA’s Engineering and Safety Center published reports of their joint investigation into the causes of unintended acceleration in Toyota vehicles. While NHTSA led the overall effort and examined recall records, accident reports, and complaint statistics, the more technically focused team from NASA performed reviews of the electronics and embedded software at the heart of Toyota’s “electronic throttle control subsystem” (ETCS). Redacted public versions of the official reports from each agency, together with a number of related documents, can be found at http://www.nhtsa.gov/UA.

These reports are very interesting in what they have to say about the quality of Toyota’s firmware and NASA’s review of the same. However, of greater significance is what they are not able to say about unintended acceleration. It appears that NASA did not follow a number of best practices for reviewing embedded software source code that might have identified useful evidence. In brief, NASA failed to find a firmware cause of unintended acceleration—but their review also fails to rule out firmware causes entirely.

This article describes a set of five recommended practices for firmware source code review that are based on my experiences as both an embedded software developer and as an expert witness. Each of the recommendations will consider what more could have been done to determine whether Toyota’s ETCS firmware played a role in any of the unintended acceleration. The five recommended practices are: (1) ask for the bug list; (2) insist on an executable; (3) reproduce the development environment; (4) try for the version control repository; and (5) remember the hardware. The relative value and importance of the individual practices will vary by type of litigation, so the recommendations are presented in the order that is most readable.

Ask for the Bug List

Any serious litigation involving embedded software will require an expert review of the source code. The source code should be requested early in the process of discovery. Owners of source code tend to strenuously resist such requests but procedures limiting access to the source code to only certain named and pre-approved experts and only under physical security (often a non-networked computer with no removable storage in a locked room) tend to be agreed upon or ordered by a judge.

Software development organizations commonly keep additional records that may prove more important or useful than a mere copy of the source code. Any reasonably thorough software team will maintain a bug list (a.k.a., defect database) describing most or all of the problems observed in the software along with the current status of each (e.g., “fixed in v2.2” or “still under investigation”). The list of bugs fixed and known—or the company’s lack of such a list—is germane to issues of software quality. Thus the bug list should be routinely requested and supplied in discovery. (It is also recommended that a request be made for copies of software design documents, coding standards, build logs and associated tool outputs, testing logs, and other artifacts of the embedded software design and development process.)

Very nearly every piece of software ever written has defects, both known and unknown. Thus the bug list provides helpful guidance to a reviewer of the source code. Often, for example, bugs cluster in specific source files in need of major rework. To ignore the company’s own records of known bugs, as the NASA reviewers apparently did, is to examine a constitution without considering the historical reasons for the adoption of each section and amendment. Indeed, a simple search of the text in Toyota’s bug list for the terms “stuck” and “fuel valve” might yet provide some useful information about unintended acceleration.
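The kind of text search described above can be sketched in a few lines of Python. This is only an illustration: the record layout (`id`, `status`, `summary`) and the three sample bugs are invented here, since real defect databases export in many different formats.

```python
def search_bug_list(bugs, terms):
    """Return the bug records whose summary mentions any search term (case-insensitive)."""
    lowered = [term.lower() for term in terms]
    return [bug for bug in bugs
            if any(term in bug["summary"].lower() for term in lowered)]

# Invented records, standing in for an exported defect database.
bugs = [
    {"id": 101, "status": "fixed in v2.2",
     "summary": "Fuel valve command not cleared on reset"},
    {"id": 102, "status": "still under investigation",
     "summary": "Display flickers at startup"},
    {"id": 103, "status": "fixed in v2.1",
     "summary": "Pedal position reading stuck after sensor glitch"},
]

hits = search_bug_list(bugs, ["stuck", "fuel valve"])
print([bug["id"] for bug in hits])
```

The matching records, together with their status fields, tell the reviewer which source files and which firmware versions deserve attention first.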

Insist on an Executable

In software parlance, the “executable” program is the binary version of the program that’s actually executed in the product. The machine-readable executable is constructed from a set of human-readable source code files using software build tools such as compilers and linkers. It is important to recognize that one set of source code files may be capable of producing multiple executables, based on tool configuration and options.

Though not human-readable, an executable program may provide valuable information to an expert reviewer. For example, one common technique is to extract the human-readable “strings” within the executable. The strings in an executable program include information such as on-screen messages to the user (e.g., “Press the ‘?’ button for help.”). In a copyright infringement case in which I once consulted, several strings in the defendant’s executable helpfully contained a phrase similar to “Copyright Plaintiff”! You may not be so lucky, but isn’t it worth a try?
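String extraction is simple enough to sketch. The Python below pulls out runs of printable ASCII characters from a binary image, much as the Unix `strings` utility does by default. The function name, the four-character minimum, and the sample bytes are assumptions for illustration, and the sketch does not handle wide-character or compressed strings.

```python
import re

def extract_strings(image: bytes, min_len: int = 4):
    """Return runs of at least min_len printable ASCII characters found in image."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(image)]

# Made-up bytes standing in for a firmware image.
image = (b"\x00\x01Press the '?' button for help.\x00\xff\x10abc\x00"
         b"Copyright Example Corp\x00")

for text in extract_strings(image):
    print(text)
```

In practice you would read the image from the produced or extracted binary file and then search the resulting list for company names, copyright notices, and version identifiers.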

It may also be possible to reverse engineer or disassemble an executable file into a more human-readable form. Disassembly could be important in cases of alleged patent infringement, for example, where what looks like an infringement of a method claim in the source code might be unused code or not actually part of the executable in the product as used by customers.

Sometimes it is easy to extract the executable directly from the product for expert examination—in which case the expert should engage in this step. For instance, software running on Microsoft Windows consists of an executable file with the extension .EXE, which is easily extracted. However, the executable programs in most embedded systems are difficult, at best, to extract. (Note that if it is possible for the expert to extract an executable from one or more exemplars of the product, an automated comparison should always be made between the installed and produced binary files. You never know what you may find and any difference could have important implications for the facts underlying the case.) Extraction of Toyota’s ETCS firmware might not be physically possible. Thus the legal team should insist on production of the executable(s) actually used by the relevant customers.

Reproduce the Development Environment

The dichotomy between source code and executable code and the inability of even most software experts to make much sense of binary code can create problems in the factual landscape of litigation. For example, suppose that the source code produced by Toyota was inadvertently incomplete in that it was missing two or three source code files. Even an expert reviewer looking at the source code might not know about the absent files. For example, if the bug the expert is looking for is related to fuel valve control and the code related to that subject doesn’t reference the missing files, the reviewer may not notice their absence. No expert can spot a bug in a missing file.

Fortunately, there is a reliable way for an expert to confirm that she has been provided with all of the source code. The objective is simply stated: reproduce the software build tools setup and compile the produced source code. To do this it is necessary to have a copy of the development team’s detailed build settings, such as make files, preprocessor defines, and linker control files. If the build process completes and produces an executable, it is certain the other party has provided a complete copy of the source code. (Additional technical details include the need to start with a “clean” set of files that contains no object files or libraries. It may also be necessary to obtain third-party header files or libraries.)
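Before the full build environment is available, a quick pre-check can flag some obviously absent files. The Python sketch below scans C sources for quoted `#include` directives and reports any header that is not in the produced set. The function name and file names are hypothetical. It cannot detect missing `.c` files or files pulled in only by the build scripts, so it complements the build test described above and does not replace it.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)

def missing_includes(files):
    """Given {path: source text}, return quoted-include names absent from the produced set."""
    produced = {path.rsplit("/", 1)[-1] for path in files}
    missing = set()
    for text in files.values():
        for name in INCLUDE_RE.findall(text):
            # Compare by base file name, since include paths vary with build settings.
            if name.rsplit("/", 1)[-1] not in produced:
                missing.add(name)
    return sorted(missing)

# Hypothetical production: throttle.c refers to a header that was never produced.
files = {
    "src/throttle.c": '#include "throttle.h"\n#include "valve.h"\n#include <stdint.h>\n',
    "inc/throttle.h": "void throttle_init(void);\n",
}

print(missing_includes(files))
```

Angle-bracket includes such as `<stdint.h>` are skipped on the assumption that they come from the compiler or a third-party library.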

Furthermore, if the executable as built matches the executable as produced (actually, ideally, the executable as extracted from the product) bit by binary bit, it is certain that the other party has provided a true and correct version of the source code. Unfortunately, trying to prove this part may take longer than just completing a build; the build could fail to produce the desired proof for a variety of reasons. The details here get complicated: to get exactly the same output executable, it is necessary to use all of the following: precisely the same version of the compiler, linker, and each other build tool as the original developers; precisely the same configuration of each of those tools; and precisely the same set of build instructions. Even a slight variation in just one of these details will generally produce an executable that doesn’t match the other binary image at all—just as the wrong version of the source code would.
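The bit-by-bit comparison itself is straightforward to automate. Here is a minimal Python sketch that reports whether two images are identical, the offset of the first difference, and a SHA-256 digest of each for the record. The function name and sample bytes are illustrative assumptions; real images would be read from the built and produced (or extracted) files.

```python
import hashlib

def compare_images(built: bytes, produced: bytes):
    """Compare two binary images byte by byte and report the first difference, if any."""
    first_diff = next(
        (offset for offset, (a, b) in enumerate(zip(built, produced)) if a != b),
        None,
    )
    if first_diff is None and len(built) != len(produced):
        # One image is a strict prefix of the other; they diverge where the shorter ends.
        first_diff = min(len(built), len(produced))
    return {
        "identical": first_diff is None,
        "first_difference": first_diff,
        "sha256": (hashlib.sha256(built).hexdigest(),
                   hashlib.sha256(produced).hexdigest()),
    }

# Made-up images: the second differs from the first in a single byte.
result = compare_images(b"\x01\x02\x03\x04", b"\x01\x02\xff\x04")
print(result["identical"], result["first_difference"])
```

A mismatch does not by itself say which cause is responsible, so the offset is only a place to start looking (for example, an embedded build timestamp versus a genuinely different code section).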

Try for the Version Control Repository

Embedded software source code is never created in an instant. All software is developed one layer at a time over a period of months or years in the same way that a bridge and the attached roadways exist in numerous interim configurations during their construction. The version control repository for a software program is like a series of time-lapse photos tracking the day-by-day changes in the construction of the bridge. But there is one considerable difference: it is possible to go back to one of those source code snapshots and rebuild the executable of that particular version. This becomes critically important when multiple software versions will be deployed over a number of years. In the automotive industry, for example, it must be possible to give one customer a bug fix for his v2.1 firmware while also working on the new v3.0 firmware to be released the following model year.

Consider, for the sake of discussion, that the executable version of Toyota’s ETCS v2.1 firmware that was installed in the factory in one million cars around the world had an undiscovered bug that could result in unintended acceleration under certain rare operating conditions. Now further suppose that this bug was (perhaps unintentionally) eliminated in the v2.2 source code, from which a subsequent executable was created and installed at the factory into millions more cars with the same model names—and also as an upgrade into some of the original one million cars as they visited dealers for scheduled maintenance. In this scenario, an examination of the v2.2 source code proves nothing about the safety of the hundreds of thousands of cars still with v2.1 under the hood.

Gaining access to the entire version control repository containing all of the past versions of a company’s firmware source code through discovery may be out of the question. For example, a judge in a source code copyright and trade secrets case I consulted in would only allow the plaintiff to choose one calendar date and to then receive a snapshot of the defendant’s source code from that specific date. If the plaintiff was lucky it would find evidence of their proprietary code in that specific snapshot. But the observed absence of their proprietary code from that one specific snapshot doesn’t prove the alleged theft didn’t happen earlier or later in time.

There are some problems with examination of an entire version control repository. It may be difficult to make sense of the repository’s structure. Or, if the structure can be understood, it might take many times as long to perform a thorough review of the major and minor versions of the various source code files as it would to just review one snapshot in time. At first glance, many of those files would appear the same or similar in every version—but subtle differences could be important to making a case. To really be productive with that volume of code, it may be necessary to obtain a chronological schedule provided by a bug list and/or other production documents describing the source code at various points in time.
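One way to cut the review effort is to hash every file in two snapshots and look closely only at what changed. The Python sketch below assumes each snapshot has already been loaded into a dictionary of path to file contents; the function name, version labels, and file names are invented for illustration.

```python
import hashlib

def changed_files(old_snapshot, new_snapshot):
    """Classify paths as added, removed, or modified between two {path: bytes} snapshots."""
    def digests(snapshot):
        return {path: hashlib.sha256(data).hexdigest() for path, data in snapshot.items()}

    old, new = digests(old_snapshot), digests(new_snapshot)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }

# Invented snapshots standing in for two tagged firmware versions.
v2_1 = {"throttle.c": b"int gain = 3;\n", "valve.c": b"void open(void);\n",
        "old_diag.c": b"/* diag */\n"}
v2_2 = {"throttle.c": b"int gain = 4;\n", "valve.c": b"void open(void);\n",
        "watchdog.c": b"void kick(void);\n"}

print(changed_files(v2_1, v2_2))
```

Files whose digests match in both versions can be reviewed once, leaving the expert free to concentrate on the added, removed, and modified files.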

Remember the Hardware

Embedded software is always written with the hardware platform in mind and should be reviewed in the same manner. For example, it is only possible to properly reverse engineer or disassemble an executable program once the specific microprocessor (e.g., Pentium, PowerPC, or ARM) is known. But knowing the processor is just the beginning, because the hardware and software are intertwined in complex ways in such embedded systems.

Some features of the hardware may be enabled or active only when the hardware is in a particular configuration. For instance, consider an embedded system with a network interface, such as an Ethernet jack that is only powered when a cable is mechanically inserted. Some or all of the software required to send and receive messages over this network may not be executed until a cable is inserted. A proper analysis of the software needs to keep hardware-software interactions like this in perspective. Ideally, testing of the firmware should be done on the hardware as configured in exemplars of the units at issue, so it is useful to ask for hardware during discovery if you are not able to acquire exemplars in other ways. It is not clear from the redacted reports if NHTSA’s testing of certain Toyota Camrys was done using the same firmware version on exactly the same hardware as the owners who experienced unintended acceleration. Hardware interactions can be one of the most important considerations of all when analyzing embedded software.

Sometimes a bug is not visible in the software itself. Such a bug may result from a combination of hardware and software behaviors or multi-processor interactions. For example, one motor control system I’m familiar with had a dangerous race condition. The bug, though, was the result of an unforeseen mismatch between the hardware reaction time and the software reaction time around a sequence of commands to the motor.

Additional Analysis Required

As you can see, the review of embedded software can be complicated. This is partly because the hardware of each embedded system is unique. In addition, the system as a whole generally involves complex interactions between hardware, software, and user. An expert in embedded software should typically have a degree in electrical engineering, computer engineering, or computer science plus years of relevant experience designing embedded systems and programming in the relevant language(s).

The five best practices presented here are meant to establish the critical importance of making certain specific requests early in the legal discovery process. They are by no means the only types of analysis that should be performed on the source code. For example, in any case involving the quality or reliability of embedded software, the source code should be tested via static analysis tools. This and other types of technical analysis should be well understood by any expert witness or litigation consultant with the proper background.

In the case of Toyota’s unintended acceleration issues, I hope that expert review in the class action litigation against Toyota will include these and other additional types of analysis to identify all of the potential causes and determine if embedded software played any role. Though government funds for analysis by NASA are understandably limited, it is suggested that transportation safety organizations, such as NHTSA, should establish rules that ensure that future investigations are more thorough and that safety-related technical findings in litigation cannot be hidden behind the veil of secrecy of a settlement agreement.

Tools to Detect Software Copyright Infringement

Thursday, September 23rd, 2010 Michael Barr

An emerging class of tools makes it easy to automatically detect copying of copyrighted software source code, even if it came from one of the hundreds of thousands of open source packages.

I am presently providing litigation support in a case of alleged software copyright infringement.  In a nutshell, the plaintiff brought suit against the defendant for allegedly continuing to use plaintiff’s copyrighted software source code in defendant’s products after termination of a license agreement between the parties.  Fortunately, automated tools are making it easier than ever to quickly and inexpensively detect copying of software source code.

Some of the most powerful tools for doing direct comparisons between a pair of source code sets are from S.A.F.E. Their CodeMatch tool works by comparing each file of source code in the first set with every file of code in the second set.  Results are presented in a table that is sorted by the relative amount of matching code in the files.  And CodeMatch is clever enough to detect copying in which variable and function names and other details were subsequently changed; CodeMatch can even detect code that was copied from one programming language into another.  The only weakness of CodeMatch is that you have to have the source code for each product, which is not always possible early in litigation.
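To make the every-file-against-every-file idea concrete, here is a minimal sketch in Python. It is not how CodeMatch works internally; the function names, the tiny keyword list, and the use of the standard library's difflib are all my own assumptions for illustration. Each file is reduced to a token list in which identifiers are replaced by a placeholder, so that renamed variables and functions still match, and the file pairs are then sorted by similarity.

```python
import difflib
import re

# Small, illustrative keyword set (an assumption); a real tool would use a
# full lexer for each supported programming language.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "char", "void"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Tokenize source text and replace every non-keyword identifier with 'ID'."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            tokens.append("ID")
        else:
            tokens.append(tok)
    return tokens


def compare_sets(set_a, set_b):
    """Compare every file in set_a with every file in set_b.

    set_a and set_b map file names to source text. Returns a list of
    (score, name_a, name_b) tuples sorted from most to least similar.
    """
    results = []
    for name_a, src_a in set_a.items():
        tokens_a = normalize(src_a)
        for name_b, src_b in set_b.items():
            matcher = difflib.SequenceMatcher(
                None, tokens_a, normalize(src_b), autojunk=False
            )
            results.append((matcher.ratio(), name_a, name_b))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    plaintiff = {"a.c": "int add(int x, int y) { return x + y; }"}
    defendant = {
        "b.c": "int sum(int p, int q) { return p + q; }",
        "c.c": "void f(void) { while (1) { } }",
    }
    for score, name_a, name_b in compare_sets(plaintiff, defendant):
        print(f"{score:.2f}  {name_a}  {name_b}")
```

Note that the work grows with the product of the two file counts, which is one reason this kind of exhaustive comparison is practical for a pair of code bases but not for a database of millions of files.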

Other tools from S.A.F.E. provide additional help.  For example, BitMatch can compare a pair of executable binary programs or one party’s source code against another’s executable code.  It works by matching strings that appear in both programs.  Meanwhile, SourceDetective helps rule out that the two programs are only similar because they both borrowed from some third program—by automatically searching the Internet for hundreds or thousands of matching phrases.  CodeMatch, BitMatch, and SourceDetective are part of a suite of related tools called CodeSuite.  CodeSuite is a free download that runs on Microsoft Windows, with license keys sold based on the amount of code to be compared.
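The string-matching idea behind a binary comparison can be sketched in a few lines of Python. This is only an illustration under my own assumptions (printable ASCII runs of at least four bytes, much like the Unix strings utility), with hypothetical function names; it is not BitMatch's actual algorithm.

```python
import re


def extract_strings(blob, min_len=4):
    """Return the set of printable-ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return {match.decode("ascii") for match in pattern.findall(blob)}


def shared_strings(blob_a, blob_b, min_len=4):
    """Return the sorted list of strings that appear in both binary images."""
    return sorted(extract_strings(blob_a, min_len) & extract_strings(blob_b, min_len))


if __name__ == "__main__":
    # Hypothetical firmware images; in practice these would be read from files
    # opened in binary mode.
    image_a = b"\x00\x01Sensor fault: %d\x00abc\x00\xffCopyright Acme\x00"
    image_b = b"\x7fELF\x00Sensor fault: %d\x00xyz\x00"
    for text in shared_strings(image_a, image_b):
        print(text)
```

A shared string is only a lead: common library messages and format strings will appear in unrelated programs, which is why a follow-up check against third-party sources matters.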

Of course, sometimes code may be copied from open source software.  Much open source software is subject to so-called copyleft licenses, which are a special type of copyright license that keeps the source code open to the public.  Copyleft language is drafted to ensure that the source code for certain categories of derived work is also open to the public.  This creates problems for companies that wish to keep their source code private but also rely upon open source software.

Fortunately, there are also tools to detect the presence of part or all of an open source software package within a proprietary program.  I have used such tools from Black Duck Software and Protecode.  Both work similarly: each company maintains a database of hundreds of thousands of known open source packages against which the source code you provide is tested. Results are presented as a list of open source packages from which code may have been copied. This testing can be done entirely on a personal computer running Microsoft Windows, so that proprietary source code need not be sent outside a trusted network.  Both tools are generally licensed for an expected level of use on an annual basis.

Unfortunately, the precision of CodeMatch is lost in trying to cast such a broad net for potential copying.  The tools from Black Duck and Protecode don’t actually compare your code against each and every one of the millions of source code files in their database.  Instead, they reduce each file of your source code to a simpler representation of its structure and then compute a mathematical signature for that reduced form.  This signature is subsequently compared to the signatures of the files in their database.  In plain English, this means that you get lots of false positives.  Some open source packages that weren’t actually copied usually turn up in the results list.
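The following Python sketch shows the general shape of a signature lookup and why it produces false positives. The vendors do not publish their algorithms, so everything here is an assumption made for illustration: the crude structural reduction, the choice of SHA-256, and the function and package names are all hypothetical.

```python
import hashlib
import re

TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def signature(source):
    """Reduce source text to a coarse structural form and hash it."""
    # Strip C-style comments (this simple pattern ignores string literals).
    no_comments = re.sub(r"/\*.*?\*/|//[^\n]*", "", source, flags=re.S)
    shape = []
    for tok in TOKEN_RE.findall(no_comments):
        if tok[0].isalpha() or tok[0] == "_":
            shape.append("W")  # any word: keyword or identifier
        elif tok[0].isdigit():
            shape.append("N")  # any number
        else:
            shape.append(tok)  # punctuation and operators kept as-is
    return hashlib.sha256(" ".join(shape).encode("utf-8")).hexdigest()


def build_index(packages):
    """Map each file signature to the set of packages that contain it.

    packages maps a package name to a dict of {file name: source text}.
    """
    index = {}
    for package, files in packages.items():
        for source in files.values():
            index.setdefault(signature(source), set()).add(package)
    return index


def candidate_packages(index, your_files):
    """Return the sorted names of packages sharing a signature with your files."""
    hits = set()
    for source in your_files.values():
        hits |= index.get(signature(source), set())
    return sorted(hits)


if __name__ == "__main__":
    database = {
        "libfoo": {"min.c": "int min(int a, int b) { return a < b ? a : b; } // helper"},
        "libbar": {"crc.c": "unsigned crc(unsigned x) { while (x) { x >>= 1; } return x; }"},
    }
    mine = {"util.c": "/* ours */ int clamp_lo(int v, int lo) { return v < lo ? lo : v; }"}
    print(candidate_packages(build_index(database), mine))
```

Each lookup is a single dictionary access rather than a file-by-file comparison, which is what makes a search across an enormous database feasible. The price is visible in the example: clamp_lo() is a different function from min(), yet the two reduce to the same structure, so libfoo is reported even though nothing was copied from it.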

When searching for potential copying of open source code, I recommend searching the database from Black Duck or Protecode first.  Then, to eliminate the false positives, a more thorough analysis should be performed by obtaining the listed open source packages and using CodeMatch to compare the proprietary code against them file-by-file.

With the help of tools like those mentioned here, it is possible to quickly ascertain whether source code copying has taken place.  Prior to the appearance of these tools, it was necessary for an expert in software development to manually perform dozens of searching and comparison steps.  The automated approach can be used early in litigation, with the benefit of dramatically reducing the cost of such analysis.  The same tools can also be employed proactively by companies seeking to reduce their risks of copyright infringement litigation.