The Limits of Knowledge

Friday, November 7th, 2003 Michael Barr

The practice of engineering has often been likened to a form of art. It is, I think, the art of making scientific tradeoffs. As scientists with a practical, rather than academic or theoretical, focus, we are often challenged to build things on the basis of information at or very near the boundaries of what is known to man.

In virtually all endeavors of engineers, there are unknowns, subtleties, and complexities over which we exercise limited control. The cost, in engineering time and resources, to fully comprehend everything about a system is in some cases unbounded; such a thorough analysis is, at the very least, cost-prohibitive. If the product works, we often can't afford to do much more than ship it and move on to the next project.

Just as tradeoffs are made in the area of features or implementation techniques, so too must tradeoffs be made in the area of knowledge. It is rarely possible to build a saleable product (one that will also earn our employer a profit) while also completely understanding all of the possible implications of our numerous design and implementation decisions.

Simply put: components fail. And when individual components fail, they can take even carefully designed systems down with them. Such system failures sometimes take the lives of their operators or of other people. Catastrophes like these are unfortunate, and they are bound to become more common as people rely ever more heavily on technological solutions to everyday problems.

The designers of each system must decide how much time and money to spend investigating the dark corners. Those designing pacemakers and airplanes, for example, are responsible for shining the light of knowledge brightly into every corner of their designs, whereas the designers of stereos and televisions can leave a great deal more to chance.

There are, of course, areas of engineering that demand thorough analysis but are not profit-driven. Manned missions to space, such as those conducted by NASA, are of this nature. The engineers at NASA make tremendous efforts to understand all of the complexities and potential failure points of the Space Shuttles. Unfortunately, there is likely an unbounded amount of work to be done; these systems have millions of individual components and operate in unforgiving and poorly understood environments. And there is only limited time to show results.

As the losses of the Challenger and Columbia have demonstrated, sometimes the part of a design thought to be reasonably well understood is actually the most dangerous. In both cases, very similar past failures had been observed, documented, and discussed by engineers, yet the true problem and the danger it posed were not fully comprehended until after each catastrophe struck.

I don’t blame the engineers at NASA for the loss of either shuttle; in both cases they knew there was a problem but had too many other, seemingly more important, concerns. I’m willing to let NASA administrators and their overseers decide whether managerial mistakes were made and, if so, how to correct them. But all engineers everywhere should learn from NASA’s mission failures: What is the true source of the problem in your system? What danger does it pose? How can you overcome organizational challenges to see the proper solution through?