Troubleshooting: A Technician's Guide, 2nd Edition

What failure is
How hardware fails
How software fails
How environment affects failure rates
Functional failures
Systematic failures
Common cause failures
Root cause analysis
Failure is the condition of not achieving a desired state or function. Everything is subject to failure; it is only a matter of when and how. Dealing with failures is a troubleshooter's business, and to troubleshoot successfully, we must first understand how failures occur. Failures can occur due to factors such as a faulty component (hardware), an incorrect line of programming code (software), or a human error (systematic). A system can even have a functional failure, in which it is working properly but is asked to do something it was not designed to do, or is exposed to a transient condition that causes a momentary failure. Consequently, we can classify failures according to four general types:
Hardware failures
Software failures
Systematic failures
Functional failures
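The four-way classification above can be written down as a small lookup table. The Python sketch below is illustrative only. The cause labels (such as `faulty_component` and `transient_condition`) and the `classify` helper are hypothetical names chosen to restate the examples in the text; they do not come from any standard or plant system.

```python
from enum import Enum


class FailureType(Enum):
    """The four general failure types named in the text."""
    HARDWARE = "hardware"      # e.g. a faulty component
    SOFTWARE = "software"      # e.g. an incorrect line of programming code
    SYSTEMATIC = "systematic"  # e.g. a human error
    FUNCTIONAL = "functional"  # working properly, but misapplied or hit by a transient


# Hypothetical cause labels, for illustration only.
CAUSE_TO_TYPE = {
    "faulty_component": FailureType.HARDWARE,
    "incorrect_code": FailureType.SOFTWARE,
    "human_error": FailureType.SYSTEMATIC,
    "outside_design_limits": FailureType.FUNCTIONAL,
    "transient_condition": FailureType.FUNCTIONAL,
}


def classify(cause: str) -> FailureType:
    """Return the failure type for a known cause label, or raise ValueError."""
    try:
        return CAUSE_TO_TYPE[cause]
    except KeyError:
        raise ValueError(f"unknown cause label: {cause!r}") from None
```

For example, `classify("transient_condition")` returns `FailureType.FUNCTIONAL`. Two different causes map to the functional type because the text describes two ways a properly working system can fail functionally.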
The troubleshooter's primary purpose in an operating plant is to find what has failed so that it can be repaired and returned to service. Keeping the process running properly is the primary concern. At its heart, this means identifying the root cause of a failure.
Failures can have internal or external causes. If the cause is internal to an instrument, that is generally the root cause; the instrument is repaired or replaced and that is the end of the problem. But the root cause may be outside the instrument itself. If a failure happens too...