Defect Prevention Process
The expected impact may be strongly affected not only by whether or not a risk becomes a problem, but also by how long it takes for the problem to be recognized and how long it takes to be fixed once recognized. In one reported example, a telephone company had an error in its billing system that caused it to underbill its customers by about $30 million. By law, the telephone company had to issue corrected bills within thirty days or write off the underbilling. By the time the telephone company recognized it had a problem, it was too late to collect much of the revenue.
Expected impact is also affected by the action that is taken once a problem is recognized. Once Johnson & Johnson realized it had a problem with Tylenol tampering, it greatly reduced the impact of the problem by quickly notifying doctors, hospitals, distributors, retail outlets, and the public. While the tampering itself was not related to a software defect, Johnson & Johnson had developed software systems to respond quickly to drug-related problems. In this case, the key to Johnson & Johnson's successful management of the problem was how it minimized the impact once the problem was discovered.
Minimizing expected impact involves a combination of the following three strategies:
Eliminate the Risk: While this is not always possible or desirable, there are situations where the best strategy is simply to eliminate the risk altogether. For example, reducing the scope of a system or deciding not to use the latest unproven technology removes certain risks entirely.
Reduce the Probability of a Risk Becoming a Problem: Most strategies will fall into this category. Inspections and testing are examples of approaches that reduce, but do not eliminate, the probability of problems.
Reduce the Impact if there is a Problem: In some situations, the risk cannot be eliminated, and even when the probability of a problem is low, the expected impact is high. In these cases, the best strategy may be to explore ways to reduce the impact if there is a problem. Contingency plans and disaster recovery plans are examples of this strategy.
From a conceptual viewpoint, there are two ways to minimize the risk. Both are deduced from the annual loss expectation formula, which expresses the expected annual loss as the loss per event multiplied by the frequency of the event. The two ways are to reduce the expected loss per event or to reduce the frequency of the event. If either of these can be reduced to zero, the risk is eliminated. If the frequency is reduced, the probability of a risk becoming a problem is reduced. If the loss per event is reduced, the impact is reduced when the problem occurs.
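The annual loss expectation relationship can be sketched in a few lines of Python. This is only an illustration: the function name `annual_loss_expectation` and every dollar figure and frequency below are hypothetical and do not come from the examples in the text.

```python
def annual_loss_expectation(loss_per_event: float, events_per_year: float) -> float:
    """Expected annual loss = loss per event x frequency of the event per year."""
    if loss_per_event < 0 or events_per_year < 0:
        raise ValueError("loss and frequency must be non-negative")
    return loss_per_event * events_per_year


# Hypothetical figures, chosen only for illustration.
baseline = annual_loss_expectation(50_000, 4)      # 200000
fewer_events = annual_loss_expectation(50_000, 1)  # lower frequency -> 50000
smaller_loss = annual_loss_expectation(10_000, 4)  # lower loss per event -> 40000
eliminated = annual_loss_expectation(50_000, 0)    # zero frequency -> 0

print(baseline, fewer_events, smaller_loss, eliminated)
```

Lowering either factor lowers the expected loss, and driving either one to zero removes the expected loss entirely, which corresponds to eliminating the risk.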
There is a well-known engineering principle that says that if you have a machine with a large number of components, even if the probability that any given component will fail is small, the probability that one or more components will fail may be unacceptably high. Because of this phenomenon, engineers are careful to estimate the mean time between failures of the machine. If the machine cannot be designed with a sufficiently large mean time between failures, the machine should not be built. Applied to software development, this principle says that unless the overall expected impact of the system can be made sufficiently low, the system should not be developed.
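The arithmetic behind this principle can be sketched as follows. The sketch assumes that components fail independently and that each has the same failure probability; both are simplifying assumptions introduced here, and the function name `prob_any_failure` and the numbers used are hypothetical.

```python
def prob_any_failure(p_component: float, n_components: int) -> float:
    """Probability that at least one of n components fails.

    Assumes independent failures with identical probability p_component:
    P(at least one failure) = 1 - (1 - p) ** n.
    """
    if not 0.0 <= p_component <= 1.0:
        raise ValueError("p_component must be between 0 and 1")
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    return 1.0 - (1.0 - p_component) ** n_components


# Hypothetical machine: each component fails with probability 0.001.
for n in (1, 100, 1000):
    print(n, round(prob_any_failure(0.001, n), 3))
```

Under these assumptions, a 0.1% failure probability per component grows to roughly a 63% chance of at least one failure once the machine has 1,000 components.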