Effects of replication on availability in distributed database systems
Publisher:
  • Dartmouth College
  • Computer and Information Systems Dept. Nathan Smith Building Hanover, NH
  • United States
Order Number: UMI Order No. GAX92-26549
Abstract

The goal of this thesis is to characterize the effects of replication in distributed databases subject to component faults. Specifically, we characterize the benefits of replication with respect to data availability and duration of failure. Replication is the technique of placing copies of data items at various sites in the network in order to improve database performance. We model a network as a collection of sites and links, each failing and recovering independently according to a Poisson process. We compare the performance of various replication schemes, not only to each other as in previous research, but also to the performance of nonreplicated database systems. We show that known and rather simple protocols for the maintenance of multiple copies are essentially the best possible by comparing them against an unrealizable protocol that knows the future. One of our results is an upper bound on the update availability benefits of replication in any network. The bound states that if a well-placed single copy provides availability $A$, where $0 \leq A \leq 1$, no scheme can achieve availability greater than $\sqrt{A}$ for update requests in the same network. This is the best possible bound for any network with single copy availability greater than 0.25. We verify these results experimentally for large database systems, including systems with significant proportions of read requests. We also present a number of bounds on the benefits of replication with respect to duration of failure. These bounds also show that, for many realistic networks, the duration of failure incurred using a nonreplicated data object is nearly as short as that incurred using replication. Since a single copy is effective only when well-placed, we investigate the single copy placement problem. We establish the complexity of a number of network reliability and data placement problems and give practical, efficient methods of approximating on-line the optimal copy placement in general networks.
Thus this thesis shows that, for many database systems, a single copy can be carefully placed and can thereby achieve availability and duration of failure close to that of the best replication schemes.
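
As a numeric illustration of the $\sqrt{A}$ bound stated in the abstract, the following minimal sketch tabulates the bound and the largest possible improvement over a single copy, $\sqrt{A} - A$, for a few sample availabilities. The function names and the sample values are hypothetical and chosen only for illustration; nothing here is taken from the thesis beyond the bound itself.

```python
import math


def update_availability_bound(single_copy_availability: float) -> float:
    """Upper bound sqrt(A) on update availability, as stated in the abstract.

    The argument is the availability A of a well-placed single copy.
    """
    if not 0.0 <= single_copy_availability <= 1.0:
        raise ValueError("availability must lie in [0, 1]")
    return math.sqrt(single_copy_availability)


def max_gain(single_copy_availability: float) -> float:
    """Largest improvement any replication scheme could offer over A."""
    return update_availability_bound(single_copy_availability) - single_copy_availability


if __name__ == "__main__":
    # Sample single-copy availabilities (illustrative values only).
    for a in (0.25, 0.81, 0.90, 0.99):
        print(f"A = {a:.2f}  bound = {update_availability_bound(a):.3f}  "
              f"max gain = {max_gain(a):.3f}")
```

Under this bound, a single copy with availability 0.81 can be improved to at most 0.9 for updates, and the possible gain shrinks as $A$ approaches 1, which is consistent with the abstract's conclusion that a well-placed single copy comes close to the best replication schemes.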

Contributors
  • Dartmouth College
