Chapter 1 - Introduction

Before designing a dependable system, we need sufficient knowledge of the system’s
faults, errors, and failures; of dependable techniques, including coding techniques; and of
the design process for practical codes. This chapter provides the background on code design
for dependable systems.

1.1 FAULTS AND FAILURES

First, we need to make clear the difference between three frequently encountered technical
terms in designing dependable systems: faults, errors, and failures. These terms
are fully defined in [LAPR92, AVIZ04]. Faults are primarily identified as the generic
sources of abnormalities that alter the operation of circuits, devices, modules, systems,
and/or software products. A failure can arise from any type of fault. Faults are often
called defects when they occur in hardware and bugs when they occur in software.

1.1.1 Faults

As causes of failure, faults are sometimes predictable but difficult to identify. Faults can occur
during any stage in a system’s or product’s life cycle: during specification, design, production,
or operation. Faults are characterized by their origin and their nature [LAPR92, GEFF02].

Origin of Faults. Timing is a factor because a fault that provokes a failure in the operation
phase may originate in any of the system’s life phases: specification, design, production, or
operation.

During the specification phase, for example, an incomplete definition of services may
lead to different interpretations by the client, the designer, and the user. Eventually, in the
operation phase, the failure becomes evident when the services provided differ from the
user’s expectations.

During the design and the production phases, for example, a designer’s insufficient
knowledge of architectural levels, structural levels, and the like may result in a
type of physical defect that induces short or open circuits.

During the operation phase, for example, an elevation of ambient temperature can cause
electronic devices and products to malfunction.

Nature of Faults. Faults that occur during the specification and the design phases are called
human-made faults. During the production and the operation phases, physical faults,
hardware faults, or solid faults may occur. Each type is due to some physical abnormality in the
component arising from aging or defective materials. By duration, faults are of two types:

Permanent. These faults arise, for example, from a power supply breakdown,
defective open or short circuits, bridging or open lines, electro-migration, and so
forth. The defects in the input / output of the logical circuits or lines are called
stuck-at ‘1’ faults or stuck-at ‘0’ faults.
Temporal. These faults can be transient or intermittent. Transient faults occur
randomly and originate externally: they are caused by environmental noise, such as
electromagnetic waves, and by particles such as α-particles and
neutrons. Intermittent faults occur randomly but originate internally, because of unstable or
marginally stable hardware, varying hardware or software state as a function of load
or activity, or signal coupling (i.e., crosstalk) between adjacent signal lines. Some
intermittent faults may be due to glitches [LO05], which are unpredictable spike
noise pulses occurring and propagated especially in large exclusive-OR (XOR) tree
networks (see Chapter 8). Parallel decoding circuits of error control codes with
large code lengths require large exclusive-OR tree networks, so glitches can become
serious problems. This topic will be covered in more detail in Section 8.3.
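
The permanent and transient fault types described above can be illustrated with a small simulation. The following Python sketch is an illustration only and is not taken from the cited references; the function names (stuck_at, transient_flip, xor_tree) and the 8-bit example word are hypothetical. It models a stuck-at fault as a line forced to a constant value, a transient fault as a single random bit inversion, and a parity check as a level-by-level exclusive-OR tree of the kind mentioned for parallel decoding circuits. It does not model glitches or any other timing behavior.

```python
import random


def stuck_at(bits, line, value):
    """Permanent fault model: the chosen line always reads `value` (0 or 1)."""
    faulty = list(bits)
    faulty[line] = value
    return faulty


def transient_flip(bits, rng):
    """Transient fault model: one randomly chosen line is inverted once."""
    faulty = list(bits)
    line = rng.randrange(len(faulty))
    faulty[line] ^= 1
    return faulty


def xor_tree(bits):
    """Combine the inputs pairwise, level by level, as an XOR tree would."""
    level = list(bits)
    while len(level) > 1:
        nxt = [level[i] ^ level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd input is passed through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Hypothetical 8-bit word with an even number of 1s, so its parity is 0.
word = [1, 0, 1, 1, 0, 0, 1, 0]
reference_parity = xor_tree(word)

# Stuck-at '0' on line 0 (correct value 1): the parity changes, so it is detected.
detected = xor_tree(stuck_at(word, 0, 0)) != reference_parity

# Stuck-at '0' on line 1 (correct value already 0): no visible effect for this input.
masked = xor_tree(stuck_at(word, 1, 0)) == reference_parity

# A single transient inversion always changes the parity.
flipped = xor_tree(transient_flip(word, random.Random(0))) != reference_parity
```

In this sketch, a stuck-at fault changes the parity only when the forced value differs from the correct one, so the same permanent fault may be visible for some inputs and masked for others, whereas a single inverted bit always changes the parity.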
