ABSTRACT
Failure detectors are fundamental building blocks in distributed systems. Multi-node failure detectors, where the detector is tasked with monitoring N other nodes, play a critical role in overlay networks and peer-to-peer systems. In such networks, failures need to be detected quickly and with low overhead. Achieving these properties simultaneously poses a difficult tradeoff between detection latency and resource consumption.
In this paper, we examine this central tradeoff, formalize it as an optimization problem and analytically derive the optimal closed form formulas for multi-node failure detectors. We provide two variants of the optimal solution for optimality metrics appropriate for two different deployment scenarios. √s-LM is a latency-minimizing optimal failure detector that achieves the lowest average failure detection latency given a fixed bandwidth constraint for system maintenance. √s-BM is a bandwidth-minimizing optimal failure detector that meets a desired detection latency target with the least amount of bandwidth consumed. We evaluate our optimal results with node lifetimes chosen from bimodal and Pareto distributions, as well as real-world trace data from PlanetLab hosts, web sites and Microsoft PCs. Compared to standard failure detectors in wide use, √s failure detectors reduce failure detection latencies by 40% on average for the same bandwidth consumption, or conversely, reduce the amount of bandwidth consumed by 30% for the same failure detection latency.
- M. K. Aguilera, W. Chen, and S. Toueg. Heartbeat: a Timeout-free Failure Detector for Quiescent Reliable Communication. In Proceedings of the International Workshop on Distributed Algorithms, Saarbrücken, Germany, Sept. 1997. Google ScholarDigital Library
- M. K. Aguilera, W. Chen, and S. Toueg. Failure Detection and Consensus in the Crash-Recovery Model. In Proceedings of the International Symposium on Distributed Computing, Andros, Greece, Sept. 1998. Google ScholarDigital Library
- Y. Amir, D. Dolev, S. Kramer, and D. Malkhi. Transis: A Communication Sub-system for High Availability. In Proceedings of the International Symposium on Fault-Tolerant Computing, Boston, Massachussetts, July 1992.Google Scholar
- M. Bakkaloglu, J. J. Wylie, C. Wang, and G. R. Ganger. On Correlated Failures in Survivable Storage Systems. Technical Report CMU-CS-02-129, School of Computer Science, Carnegie Mellon University, May 2002.Google ScholarCross Ref
- M. Bertier, O. Marin, and P. Sens. Implementation and Performance Evaluation of an Adaptable Failure Detector. In Proceedings of the International Conference on Dependable Systems and Networks, Washington, DC, June 2002. Google ScholarDigital Library
- M. Bertier, O. Marin, and P. Sens. Performance Analysis of Hierarchical Failure Detector. In Proceedings of the International Conference on Dependable Systems and Networks, San Francisco, California, June 2003.Google ScholarCross Ref
- W. J. Bolosky, J. R. Douceur, D. Ely, and M. Theimer. Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs. In Proceedings of the SIGMETRICS Conference, Santa Clara, California, June 2000. Google ScholarDigital Library
- M. Castro, M. Costa, and A. I. T. Rowstron. Performance and Dependability of Structured Peer-to-Peer Overlays. In Proceedings of the International Conference on Dependable Systems and Networks, Florence, Italy, June 2004. Google ScholarDigital Library
- T. D. Chandra and S. Toueg. Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43(2):225--267, Mar. 1996. Google ScholarDigital Library
- W. Chen. On the Quality of Service of Failure Detectors. PhD thesis, Cornell University, May 2000. Google ScholarDigital Library
- W. Chen, S. Toueg, and M. K. Aguilera. On the Quality of Service of Failure Detectors. IEEE Transactions on Computers, 51(5), May 2002. Google ScholarDigital Library
- X. Defago, P. Felber, and A. Schiper. Optimization Techniques for Replicating CORBA Objects. In Proceedings of the IEEE International Workshop on Object-oriented Real-time Dependable Systems, Santa Barbara, California, Jan. 1999. Google ScholarDigital Library
- X. Defago, P. Urban, N. Hayashibara, and T. Katayama. Definition and Specification of Accrual Failure Detectors. In Proceedings of the International Conference on Dependable Systems and Networks, Yokohama, Japan, June 2005. Google ScholarDigital Library
- J. Dunagan, N. J. A. Harvey, M. B. Jones, D. Kostic, M. Theimer, and A. Wolman. FUSE: Lightweight Guaranteed Distributed Failure Notification. In Proceedings of the Symposium on Operating System Design and Implementation, San Francisco, California, Dec. 2004. Google ScholarDigital Library
- S. A. Fakhouri, G. S. Goldszmidt, and I. Gupta. Gulfstream - A System for Dynamic Topology Management in Multi-domain Server Farms. Technical Report RC21954, IBM T.J. Watson Research Center, Feb. 2001.Google ScholarCross Ref
- C. Fetzer, M. Raynal, and F. Tronel. An Adaptive Failure Detection Protocol. In Proceedings of the Pacific Rim Symposium on Dependable Computing, Seoul, Korea, Dec. 2001. Google ScholarDigital Library
- B. Glade, K. P. Birman, R. Cooper, and R. van Renesse. Light-Weight Process Groups in the ISIS System. Distributed Systems Engineering, 1(1):29--36, Sept. 1993.Google ScholarCross Ref
- Gnutella. The Gnutella Protocol Specification. http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf.Google Scholar
- P. B. Godfrey, S. Shenker, and I. Stoica. Minimizing Churn in Distributed Systems. In Proceedings of the SIGCOMM Conference, Pisa, Italy, Sept. 2006. Google ScholarDigital Library
- I. Gupta, K. Birman, P. Linga, A. Demers, and R. van Rennesse. Kelips: Building an Efficient and Stable P2P DHT Through Increased Memory and Background Overhead. In Proceedings of the International Workshop on Peer-to-Peer Systems, Berkeley, California, Feb. 2003.Google ScholarCross Ref
- I. Gupta, T. D. Chandra, and G. S. Goldszmidt. On Scalable and Efficient Distributed Failure Detectors. In Proceedings of the ACM Symposium on Principles of Distributed Computing, Newport, Rhode Island, Aug. 2001. Google ScholarDigital Library
- N. Hayashibara, X. Defago, and T. Katayama. Two-ways Adaptive Failure Detection with the ψ-failure Detector. In Proceedings of the International Workshop on Adaptive Distributed Systems, Italy, Oct. 2003.Google Scholar
- M. G. Hayden. The Ensemble System. PhD thesis, Cornell University, Jan. 1998. Google ScholarDigital Library
- M. Larrea, S. Arevalo, and A. Fernandez. Efficient Algorithms to Implement Unreliable Failure Detectors in Partially Synchronous Systems. In Proceedings of the International Symposium on Distributed Computing, Bratislava, Slovak Republic, Sept. 1999. Google ScholarDigital Library
- M. Larrea, A. Fernandez, and S. Arevalo. Optimal Implementation of the Weakest Failure Detector for Solving Consensus. In Proceedings of the ACM Symposium on Principles of Distributed Computing, Portland, Oregon, July 2000. Google ScholarDigital Library
- J. Li, J. Stribling, R. Morris, and M. F. Kaashoek. Bandwidth-Efficient Management of DHT Routing Tables. In Proceedings of the Symposium on Networked System Design and Implementation, Boston, Massachusetts, May 2005. Google ScholarDigital Library
- L. E. Moser, P. M. Melliar-Smith, D. A. Argarwal, R. K. Budhia, and C. A. Lingley-Papadopoulos. Totem: A Fault-tolerant Multicast Group Communication System. Communications of the ACM, 39(4):54--63, Apr. 1996. Google ScholarDigital Library
- V. Paxson. End-to-End Internet Packet Dynamics. In Proceedings of the SIGCOMM Conference, France, Sept. 1997. Google ScholarDigital Library
- V. Ramasubramanian, R. Peterson, and E. G. Sirer. Corona: A High Performance Publish-Subscribe System for the World Wide Web. In Proceedings of the Symposium on Networked System Design and Implementation, San Jose, California, May 2006. Google ScholarDigital Library
- V. Ramasubramanian and E. G. Sirer. The Design and Implementation of a Next Generation Name Service for the Internet. In Proceedings of the SIGCOMM Conference, Portland, Oregon, Aug. 2004. Google ScholarDigital Library
- R. V. Renesse, Y. Minsky, and M. Hayden. A Gossip-Style Failure Detection Service. In Proceedings of the International Conference and Distributed Systems Platforms and Open Distributed Processing, Vienna, Austria, Sept. 1998.Google ScholarCross Ref
- S. Rhea, D. Geels, T. Roscoe, and J. Kubiatowicz. Handling Churn in a DHT. In Proceedings of the USENIX Technical Conference, Boston, Massachussetts, June 2004. Google ScholarDigital Library
- A. Rowstorn and P. Druschel. Pastry: Scalable, Decentralized Object Location and Routing for Large-scale Peer-to-Peer Systems. In Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms, Heidelberg, Germany, Nov. 2001. Google ScholarDigital Library
- S. Saroiu, K. Gummadi, and S. D. Gribble. Measuring and Analyzing the Characteristics of Napster and Gnutella Hosts. Multimedia Systems Journal, 9(2):170--184, Aug. 2003. Google ScholarDigital Library
- Y. J. Song, V. Ramasubramanian, and E. G. Sirer. Optimal Resource Utilization in Content Distribution Networks. Technical Report TR2005-2004, Cornell University, Computing and Information Science, Ithaca, New York, Nov. 2005.Google Scholar
- P. Stelling, I. Foster, C. Kesselman, C. Lee, and G. von Laszewski. A Fault Detection Service for Wide Area Distributed Computations. In Proceedings of the IEEE Symposium on High Performance Distributed Computing, Chicago, Illinois, July 1998. Google ScholarDigital Library
- J. Stribling. PlanetLab All Pairs Ping Data. http://pdos.csail.mit.edu/~strib/pl_app.Google Scholar
- V. Vishnumurthy and P. Francis. On Heterogeneous Overlay Construction and Random Node Selection in Unstructured P2P Networks. In Proceedings of the INFOCOM Conference, Barcelona, Spain, Apr. 2006.Google ScholarCross Ref
- H. Weatherspoon, B.-G. Chun, C. W. So, and J. Kubiatowicz. Long-Term Data Maintenance in Wide-Area Storage Systems: A Quantitative Approach. Technical Report UCB/CSD-05-1404, University of California-Berkeley, July 2005.Google Scholar
- M. Yajnik, S. B. Moon, J. F. Kurose, and D. F. Towsley. Measurement and Modeling of the Temporal Dependence in Packet Loss. In Proceedings of the INFOCOM Conference, pages 345--352, New York, New York, Mar. 1999.Google Scholar
- S. Zhuang, D. Geels, I. Stoica, and R. Katz. On Failure Detection Algorithms in Overlay Networks. In Proceedings of the INFOCOM Conference, Miami, Florida, Mar. 2005.Google ScholarCross Ref
Index Terms
- Latency and bandwidth-minimizing failure detectors
Recommendations
Latency and bandwidth-minimizing failure detectors
EuroSys'07 Conference ProceedingsFailure detectors are fundamental building blocks in distributed systems. Multi-node failure detectors, where the detector is tasked with monitoring N other nodes, play a critical role in overlay networks and peer-to-peer systems. In such networks, ...
On the Implementation of Unreliable Failure Detectors in Partially Synchronous Systems
Unreliable failure detectors were proposed by Chandra and Toueg as mechanisms that provide information about process failures. Chandra and Toueg defined eight classes of failure detectors, depending on how accurate this information is, and presented an ...
Unreliable failure detectors for reliable distributed systems
We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures. We characterise unreliable failure detectors in terms of two properties—completeness and accuracy. We ...
Comments