February 20, 2011
Georedundancy and Service Availability Proposal
A best practice for mitigating the risk that site destruction, denial, or unavailability causes a disastrous loss of critical services is to deploy redundant systems at a geographically separated location; this practice is called geographic redundancy, or georedundancy. Enterprises deploying a geographically redundant system may spend twice as much up front as for a standalone configuration, and will incur higher ongoing operating expenses to maintain the redundant recovery site. While the business continuity benefits of georedundancy are easy to understand, the feasible and likely service availability benefits are not generally well understood. This paper considers the high-level question of what service availability improvement is feasible and likely with georedundancy. The service availability benefit is characterized for both product-attributable failures and non-product-attributable failures, including site disasters. The paper also considers several related questions: whether georedundancy can or should be used to reduce planned downtime for activities such as hardware growth and software update; whether it is better to fail over only the failed element or the entire cluster that contains it; what availability-related georedundancy requirements should apply to each network element and to clusters of elements; and what network element and cluster testing is appropriate to assure the expected service availability benefits of georedundancy.