skip to main content
article
Free Access

Consensus in the presence of partial synchrony

Authors Info & Claims
Published:01 April 1988Publication History
Skip Abstract Section

Abstract

The concept of partial synchrony in a distributed system is introduced. Partial synchrony lies between the cases of a synchronous system and an asynchronous system. In a synchronous system, there is a known fixed upper bound Δ on the time required for a message to be sent from one processor to another and a known fixed upper bound Φ on the relative speeds of different processors. In an asynchronous system no fixed upper bounds Δ and Φ exist. In one version of partial synchrony, fixed bounds Δ and Φ exist, but they are not known a priori. The problem is to design protocols that work correctly in the partially synchronous system regardless of the actual values of the bounds Δ and Φ. In another version of partial synchrony, the bounds are known, but are only guaranteed to hold starting at some unknown time T, and protocols must be designed to work correctly regardless of when time T occurs. Fault-tolerant consensus protocols are given for various cases of partial synchrony and various fault models. Lower bounds that show in most cases that our protocols are optimal with respect to the number of faults tolerated are also given. Our consensus protocols for partially synchronous processors use new protocols for fault-tolerant “distributed clocks” that allow partially synchronous processors to reach some approximately common notion of time.

References

  1. 1 ATTIYA, A., DOLEV, D., AND GIL, J. Asynchronous Byzantine consensus. In Proceedings of the 3rd ACM Symposium on Principles of Distributed Computing (Vancouver, B.C., Canada, Aug. 27-29). ACM, New York, 1984, pp. 119-133. Google ScholarGoogle Scholar
  2. 2 BRACHA, G., AND TOUEG, S. Asynchronous consensus and broadcast protocols. J. ACM 32, 4 (Oct. 1985), 824-840. Google ScholarGoogle Scholar
  3. 3 DOLEV, D., AND STRONG, H. R. Authenticated algorithms for Byzantine agreement. SIAM J. Comput. 12 (1983), 656-666.Google ScholarGoogle Scholar
  4. 4 DOLEV, D., DWORK, C., AND STOCKMEYER, L. On the minimal synchronism needed for distributed consensus, d. ACM 34, 1 (Jan. 1987), 77-97. Google ScholarGoogle Scholar
  5. 5 DOLEV, D., FISCHER, i. J., FOWLER, R., LYNCH, N. A., AND STRONG, H.R. Efficient Byzantine agreement without authentication. Inf. Control 52 (1982), 257-274.Google ScholarGoogle Scholar
  6. 6 DOLEV, D., LYNCH, N. A., PINTER, S. S., STARK, E. W., AND WEIHL, W.E. Reaching approximate agreement in the presence of faults. J. ACM 33, 3 (July 1986), 499-516. Google ScholarGoogle Scholar
  7. 7 DWORK, C., AND MOSES, Y. Knowledge and common knowledge in a Byzantine environment I: Crash failures. In Proceedings of the 1986 Conference on Theoretical Aspects of Reasoning about Knowledge (Monterey, Calif., Mar. 19-22). Kaufmann, Los Altos, Calif., 1986, pp. 149-169. Google ScholarGoogle Scholar
  8. 8 FISCHER, M.J. The consensus problem in unreliable distributed systems (a brief survey). Rep. YALEU/DSC/RR-273. Dept. of Computer Science, Yale Univ., New Haven, Conn., June 1983.Google ScholarGoogle Scholar
  9. 9 FISCHER, M. J., AND LAMPORT, L. Byzantine generals and transaction commit protocols. Tech. Rep. Op. 62, SRI International, Menlo Park, Calif., 1982.Google ScholarGoogle Scholar
  10. 10 FISCHER, i. J., LYNCH, N. A., AND PATERSON, M.S. Impossibility of distributed consensus with one faulty process. J. ACM 32, 2 (Apr. 1985), 374-382. Google ScholarGoogle Scholar
  11. 11 GARCIA-MOLINA, H., PITI"ELLI, F., AND DAVIOSON, S. Is Byzantine agreement useful in a distributed database? In Proceedings of the 3rd SIGACT-SIGMOD Symposium on Principles of Database Systems (Waterloo, Ont., Canada, Apr. 2-4). ACM, New York, 1984, pp. 61-69. Google ScholarGoogle Scholar
  12. 12 GRAY, J. N. Notes on database operating systems. In Operating Systems: An Advanced Course. Lecture Notes in Computer Science, vol. 60. Springer-Verlag, New York, 1978, pp. 393-481. Google ScholarGoogle Scholar
  13. 13 LAMPORT, L. Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21, 7 (July 1978), 558-565. Google ScholarGoogle Scholar
  14. 14 LAMPORT, L. The weak Byzantine generals problem. J. ACM 30, 3 (July 1983), 668-676. Google ScholarGoogle Scholar
  15. 15 LAMPORT, L., SHOSTAK, R., AND PEASE, M. The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4, 3 (July 1982), 382-401. Google ScholarGoogle Scholar
  16. 16 PINTER, S. Distributed computation systems: Modelling, verification and algorithms. Ph.D. dissertation. Dept. of Computer Science, Boston Univ., Boston, Mass., 1984. Google ScholarGoogle Scholar
  17. 17 REISCHUK, R. A new solution for the Byzantine generals problem. Inf. Control 64 (1985), 23-42. Google ScholarGoogle Scholar
  18. 18 SCHNEIDER, F.B. Byzantine generals in action: Implementing fail-stop processors. ACM Trans. Comput. Syst. 2, 2 (May 1984), 145-154. Google ScholarGoogle Scholar
  19. 19 SKEEN, D. A quorum based commit protocol. Tech. Rep. TR 82-483, Computer Science Dept., Cornell Univ., Ithaca, N.Y., Feb. 1982. Google ScholarGoogle Scholar
  20. 20 SRIKANTH, T. K., AND TOUEG, S. Simulating authenticated broadcasts to derive simple faulttolerant algorithms. Rep. 84-623, Computer Science Dept., Cornell Univ., Ithaca, N.Y., 1984.Google ScholarGoogle Scholar

Index Terms

  1. Consensus in the presence of partial synchrony

              Recommendations

              Reviews

              Jason Gait

              A distributed set of processors reaches consensus on a value when the correctly performing processors decide on the same value. This outcome is subject to the conditions that if those correct processors begin with the same value, then they must agree on the result, and that the presence of faulty processors does not prevent an agreement among the correct ones. A synchronous system has known upper bounds on message transit time and processor speed. For a partially synchronous system, either the upper bounds exist but are not a priori known, or the bounds are known but are only valid at some undetermined future time. For processor synchronous systems, it is known that the existence of a moment after which the message transit time will respect a known bound is equivalent to the conditions of safety (i.e., correct processors do not disagree and do make valid decisions) and termination (i.e., correct processors will eventually decide). The authors study fault-tolerant consensus protocols for partially synchronous systems and for various fault models. These protocols are optimal with respect to the number of faults tolerated. The resiliency of a synchronization model is the maximum number of faults that can be tolerated by any protocol in the model. If either communication or processor operation is asynchronous, then it is known that no resilient consensus protocol exists. It is thus meaningful to study partially synchronous systems. In the general method that the authors use, each processor sends messages to the others to see if they agree with the value it has found. A processor decides on a value when it receives sufficiently many acknowledgements from other processors. The protocols depend on distributed clocks maintained by a number of processors; this number is linear in the number of faults tolerated. The authors' protocols reach consensus in time polynomial in the system parameters, and the communication overhead to reach consensus is also polynomial in the system parameters.

              Access critical reviews of Computing literature here

              Become a reviewer for Computing Reviews.

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in

              Full Access

              • Published in

                cover image Journal of the ACM
                Journal of the ACM  Volume 35, Issue 2
                April 1988
                205 pages
                ISSN:0004-5411
                EISSN:1557-735X
                DOI:10.1145/42282
                Issue’s Table of Contents

                Copyright © 1988 ACM

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 1 April 1988
                Published in jacm Volume 35, Issue 2

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • article

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader