Reliable Communication in the Presence of Failures

Reliable Communication in the Presence of FailuresJuly 1985

July 1985

1985 Technical Report

Publisher:

Cornell University
PO Box 250, 124 Roberts Place Ithaca, NY
United States

Published:01 July 1985

Bibliometrics

Abstract

We report on the design and correctness of a communication facility for a distributed computer system. The facility provides support for fault tolerant process groups in the form of a family of reliable multicast protocols that can be used both in local and wide-area networks. These protocols attain high levels of concurrency while respecting application-specific delivery ordering constraints, and have varying cost and performance that depends on the degree of ordering desired. In particular, a protocol that enforces causal delivery orderings is introduced, and shown to be a valuable alternative to conventional asynchronous communication protocols. The facility also ensures that the processes belonging to a fault tolerant process group will observe consistent orderings of events affecting the group as a whole, including process failures, recoveries, migration, and dynamic changes to group properties like member rankings. A review of several uses for the protocols in the ISIS system, which supports fault-tolerant resilient objects and bulletin boards, illustrates the significant simplification of higher-level algorithms; made possible by our approach.

Cited By

Birman K A history of the virtual synchrony replication model Replication, (91-120)

Contributors

Ken Birman
Cornell University
- Publication Years1982 - 2023
- Publication counts204
- Citation count6,310
- Available for Download67
- Downloads (cumulative)69,344
- Downloads (12 months)4,199
- Downloads (6 weeks)763
- Average Downloads per Article1,035
- Average Citation per Article31
View Full Profile
Thomas A Joseph
Cornell University
- Publication Years1984 - 2000
- Publication counts28
- Citation count1,419
- Available for Download5
- Downloads (cumulative)10,174
- Downloads (12 months)603
- Downloads (6 weeks)98
- Average Downloads per Article2,035
- Average Citation per Article51
View Full Profile

Recommendations

Reliable communication in the presence of failures

The design and correctness of a communication facility for a distributed computer system are reported on. The facility provides support for fault-tolerant process groups in the form of a family of reliable multicast protocols that can be used in both ...
Read More
On Quiescent Reliable Communication

We study the problem of achieving reliable communication with quiescent algorithms (i.e., algorithms that eventually stop sending messages) in asynchronous systems with process crashes and lossy links. We first show that it is impossible to solve this ...
Read More
Reliable Broadcast in Radio Networks with Locally Bounded Failures

This paper studies the reliable broadcast problem in a radio network with locally bounded failures. We present a sufficient condition for achievability of reliable broadcast in a general graph subject to Byzantine/crash-stop failures. We then consider ...
Read More

Comments

Browse Reports

Sections

Cited By

Reliable communication in the presence of failures

On Quiescent Reliable Communication

Reliable Broadcast in Radio Networks with Locally Bounded Failures

Save to Binder

Sections

Cited By

Save to Binder

Recommendations

Reliable communication in the presence of failures

On Quiescent Reliable Communication

Reliable Broadcast in Radio Networks with Locally Bounded Failures