Chapter 4: When Exchange Servers Fall Down

In my earlier discussion of the black box of downtime, I was adamant that downtime is not a single event but a series of individual outage components. My point was that, by identifying and evaluating each component of a downtime occurrence, we can look for ways to reduce the overall duration of the event. Looking inside each outage point may reveal process improvements that substantially shorten, or even eliminate, periods that are unnecessary or too lengthy. In Chapter 2, I identified seven points or components of a typical outage: prefailure errors, the failure point, the notification point, the decision point, the recovery action point, the postrecovery point, and the normal operational point. Each of these components contains many subcomponents, and within them we may find errors or oversights that, once addressed, substantially reduce or eliminate that portion of the outage.

This chapter focuses on the recovery action point. I believe that this component is responsible for the majority of the chargeable time within a downtime event. For example, I have seen many organizations rack up hours and hours of downtime simply because they did not have (or could not find) a good backup, or because they interfered with Exchange Server's own recovery measures. I believe that lack of knowledge and poor operational procedures can create...

Mission Critical Microsoft Exchange 2003
