10.5: Fault Injection

As mentioned previously in this chapter, simulating a system to obtain its reliability or similar attributes requires knowledge of parameters such as the components' failure rates. These can be obtained either through lengthy observations or, much faster, through fault injection experiments. In such experiments, various faults are injected either into a simulation model of the target system or into a hardware and software prototype of the system. The behavior of the system in the presence of each fault is then observed and classified. Parameters that can be estimated from such experiments include the probability that a fault will cause an error and the probability that the system will successfully perform the actions required to recover from that error (the latter probability is often called the coverage factor; see Chapter 2). These actions consist of detecting the fault, identifying the system component affected by the fault, and taking an appropriate recovery action, which may involve system reconfiguration. The time each of these actions takes is not constant: it may change from one fault to another and may also depend on the current workload. Thus, fault injection experiments, in addition to providing estimates for the coverage factor, can also be used to estimate the distribution of the delay associated with each of the above actions.
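To make the estimation step concrete, the following is a minimal sketch of how the outcomes of an injection campaign might be tallied. It is not taken from the book: the function `inject_fault` is a hypothetical stand-in for one run against a simulation model or prototype, and the probabilities (0.6 and 0.95) and mean delays in it are invented purely for illustration. Only the bookkeeping in `summarize` reflects the text: the fraction of faults that cause an error, the coverage factor as the fraction of errors recovered from, and the observed delays of the three recovery actions.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Outcome:
    caused_error: bool   # did the injected fault produce an error?
    recovered: bool      # did detection, identification and recovery all succeed?
    detect_ms: float     # delay of fault detection
    identify_ms: float   # delay of identifying the affected component
    recover_ms: float    # delay of the recovery action


def inject_fault(rng: random.Random, workload: float) -> Outcome:
    """Hypothetical stand-in for one injection run (all numbers are assumptions)."""
    if rng.random() >= 0.6:          # assumed: 60% of faults cause an error
        return Outcome(False, False, 0.0, 0.0, 0.0)
    scale = 1.0 + workload           # assumed: delays grow with the workload
    detect = rng.expovariate(1.0 / (5.0 * scale))     # mean 5 ms at zero load
    identify = rng.expovariate(1.0 / (2.0 * scale))   # mean 2 ms at zero load
    recover = rng.expovariate(1.0 / (20.0 * scale))   # mean 20 ms at zero load
    recovered = rng.random() < 0.95  # assumed: true coverage of 0.95
    return Outcome(True, recovered, detect, identify, recover)


def run_campaign(n: int, seed: int = 0) -> list[Outcome]:
    """Inject n faults, each under a workload drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    return [inject_fault(rng, rng.random()) for _ in range(n)]


def summarize(outcomes: list[Outcome]) -> dict[str, float]:
    """Estimate P(error | fault), the coverage factor, and mean action delays."""
    errors = [o for o in outcomes if o.caused_error]
    recovered = [o for o in errors if o.recovered]
    if not errors or not recovered:
        raise ValueError("campaign too small to estimate coverage")
    return {
        "p_error": len(errors) / len(outcomes),
        "coverage": len(recovered) / len(errors),
        "mean_detect_ms": statistics.mean(o.detect_ms for o in recovered),
        "mean_identify_ms": statistics.mean(o.identify_ms for o in recovered),
        "mean_recover_ms": statistics.mean(o.recover_ms for o in recovered),
    }


if __name__ == "__main__":
    for name, value in summarize(run_campaign(5000, seed=1)).items():
        print(f"{name}: {value:.3f}")
```

In a real campaign the per-action delays would be kept as full samples rather than reduced to means, since the text calls for their distributions; the means here only keep the sketch short.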

In addition, fault injection experiments can be used to evaluate and validate the system's dependability. For example, errors in the implementation of fault tolerance mechanisms can be discovered, and system components whose failure is...

Fault-Tolerant Systems
