Safety Instrumented Systems: Design, Analysis, and Justification, 2nd Edition

To measure and compare the performance of different systems, one needs a common frame of reference: a common, understandable set of terms. A number of different performance terms have been used over the years, such as availability and reliability. Unfortunately, these seemingly trivial terms have caused problems. If you ask four people what they mean, you'll likely get four different answers. The main reason is the two different failure modes of safety systems discussed earlier. If there are two failure modes, there should be two different performance terms, one for each failure mode. How can one term, such as availability, be used to describe the performance of two different failure modes? If someone says a valve fails once every 10 years, what does that actually mean? What if 10% of the failures are "safe" (e.g., fail closed) and 90% are "dangerous" (e.g., fail stuck)? What if the numbers are switched? What if they're both the same? One overall number simply doesn't tell you enough.
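The valve question can be made concrete with a little arithmetic. The sketch below assumes a constant failure rate, so that "fails once every 10 years" corresponds to an overall rate of 0.1 failures per year. That assumption, and the function names, are illustrative and not taken from the text. The code splits the overall rate into its safe and dangerous parts for the three cases raised in the paragraph above.

```python
import math


def split_failure_rate(mean_years_to_failure, safe_fraction):
    """Split an overall failure rate into (safe, dangerous) rates per year.

    Assumes a constant failure rate, so rate = 1 / mean time to failure.
    """
    if mean_years_to_failure <= 0:
        raise ValueError("mean_years_to_failure must be positive")
    if not 0.0 <= safe_fraction <= 1.0:
        raise ValueError("safe_fraction must be between 0 and 1")
    total_rate = 1.0 / mean_years_to_failure
    safe_rate = total_rate * safe_fraction
    dangerous_rate = total_rate * (1.0 - safe_fraction)
    return safe_rate, dangerous_rate


def mean_years_between(rate):
    """Mean time between failures of one mode, in years."""
    return math.inf if rate == 0 else 1.0 / rate


# The three cases from the text: 10% safe, 90% safe, and an even split.
for safe_fraction in (0.1, 0.9, 0.5):
    safe_rate, dangerous_rate = split_failure_rate(10.0, safe_fraction)
    print(
        f"safe fraction {safe_fraction:.0%}: "
        f"one safe failure per {mean_years_between(safe_rate):.1f} years, "
        f"one dangerous failure per {mean_years_between(dangerous_rate):.1f} years"
    )
```

Under these assumptions, the same "once every 10 years" valve has a dangerous failure roughly every 11 years in the first case and only once every 100 years in the second. The single overall figure hides that difference.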
Another reason the term availability causes confusion is the typical range of numbers encountered. Anything over 99% sounds impressive. Whenever PLC vendors give performance figures, the result always seems to be a virtually endless string of nines. Stop and consider whether there's a significant difference between 99% and 99.99% availability. It's less than 1%, right? True, but the corresponding unavailabilities (1% versus 0.01%) differ by two orders of magnitude! It can be confusing!
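The comparison is easier to see when the two figures are turned around and expressed as unavailability. The following is a minimal sketch. The conversion to hours per year assumes continuous operation over an 8,760-hour year, which is an illustrative assumption and not something stated in the text.

```python
import math

HOURS_PER_YEAR = 8760  # assumes continuous, year-round operation


def unavailability(availability):
    """Fraction of time the system is not available."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return 1.0 - availability


def downtime_hours_per_year(availability):
    """Expected unavailable hours in one year of continuous operation."""
    return unavailability(availability) * HOURS_PER_YEAR


low, high = 0.99, 0.9999

difference = high - low
ratio = unavailability(low) / unavailability(high)

print(f"difference in availability: {difference:.2%}")
print(f"ratio of unavailabilities:  {ratio:.0f}")
print(f"orders of magnitude:        {math.log10(ratio):.1f}")
for a in (low, high):
    print(f"{a:.2%} available -> {downtime_hours_per_year(a):.2f} hours down per year")
```

The two availabilities differ by less than one percentage point, yet the first corresponds to roughly 87.6 unavailable hours per year and the second to less than one hour.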
In terms of safety, some prefer to the...