Server Architectures: Multiprocessors, Clusters, Parallel Systems, Web Servers, and Storage Solutions

In this chapter, we present some examples of high-availability solutions offered both by manufacturers of information processing systems and by suppliers of operating systems.
We can divide these solutions into two main categories:
Hardware-based
Software-based
We should note that, in theory, these two types of solutions are not mutually exclusive, and the two approaches can be combined within a single system. In practice, however, systems tend to fall into one category or the other, for reasons of development cost. We then proceed to a comparison of these solutions. Here we must note that a software-based solution requires the hardware platform to have a number of properties, in particular in error detection, but also in the areas of redundancy and repair tolerance. That is, a software-only approach cannot make just any hardware platform capable of continuous service.
Hardware-based solutions aim at tolerating hardware failures, and rely on redundancy to do so. Their effect is to make hardware failures invisible to the applications (although there may be some perceptible slow-down immediately following the detection of a hardware failure, the system rapidly resumes normal performance levels). A pure hardware-based approach cannot hide software failures.
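As a rough illustration of how redundancy can hide a failure from the application, consider the following minimal sketch. It is a software simulation only, not a description of any vendor's mechanism: the names UnitFailure, run_with_redundancy, healthy_unit and broken_unit are hypothetical, and each "unit" is assumed to detect its own failure and signal it.

```python
class UnitFailure(Exception):
    """Raised when a (simulated) hardware unit detects its own failure."""


def run_with_redundancy(units, request):
    """Try each redundant unit in turn and return the first successful result.

    Returns a pair (result, masked), where masked counts the unit failures
    that were hidden from the caller. Raises RuntimeError only if every
    unit has failed.
    """
    masked = 0
    for unit in units:
        try:
            return unit(request), masked
        except UnitFailure:
            masked += 1  # failure detected: switch to the next redundant unit
    raise RuntimeError("all redundant units failed")


def healthy_unit(x):
    # Hypothetical working unit: performs the requested computation.
    return x * 2


def broken_unit(x):
    # Hypothetical faulty unit: always reports a detected hardware fault.
    raise UnitFailure("simulated hardware fault")


# The first unit fails, the second takes over; the caller only sees the result.
result, masked = run_with_redundancy([broken_unit, healthy_unit], 21)
print(result, masked)
```

In this sketch the caller pays only the cost of the extra attempt, which loosely corresponds to the brief slow-down mentioned above. An error in the computation itself (for example, if healthy_unit contained a bug) would be reproduced identically on every unit, which is why redundancy of this kind does not hide software failures.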
Software-based solutions aim at tolerating both hardware and software failures. For software, whether system or application, faults tend to be transient, that is, Heisenbugs. To provide Bohrbug tolerance would require a solution that, in effect, ran multiple different software solutions concurrently, using (for example) majority voting to select actions at...