Abstract
This article explains the full reconfigurable multidecree Paxos (or multi-Paxos) protocol. Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. We provide pseudocode and explain it guided by invariants. We initially avoid optimizations that complicate comprehension. Next we discuss liveness, list various optimizations that make the protocol practical, and present variants of the protocol.
- Peter Alvaro, Tyson Condie, Neil Conway, Joseph M. Hellerstein, and Russell Sears. 2010. I do declare: Consensus in a logic language. SIGOPS Operating Systems Review 43, 4, 25--30. Google ScholarDigital Library
- Hagit Attiya, Amotz Bar-Noy, and Danny Dolev. 1995. Sharing memory robustly in message-passing systems. Journal of the ACM 42, 1, 124--142. Google ScholarDigital Library
- Romain Boichat, Partha Dutta, Svend Frølund, and Rachid Guerraoui. 2003. Deconstructing Paxos. SIGACT News 34, 1, 47--67. Google ScholarDigital Library
- William J. Bolosky, Dexter Bradshaw, Randolph B. Haagens, Norbert P. Kusters, and Peng Li. 2011. Paxos replicated state machines as the basis of a high-performance data store. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI'11). 141--154. Google ScholarDigital Library
- Navin Budhiraja, Keith Marzullo, Fred B. Schneider, and Sam Toueg. 1993. The primary-backup approach. In Distributed Systems (2nd ed.), S. Mullender (Ed.). ACM Press/Addison-Wesley, New York, NY, 199--216. Google ScholarDigital Library
- Mike Burrows. 2006. The chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI'06). 335--350. Google ScholarDigital Library
- Tushar Deepak Chandra and Sam Toueg. 1991. Unreliable failure detectors for asynchronous systems (preliminary version). In Proceedings of the 10th Annual ACM Symposium on Principles of Distributed Computing (PODC'91). ACM, New York, NY, 325--340. Google ScholarDigital Library
- Roberto De Prisco, Butler W. Lampson, and Nancy Lynch. 2000. Revisiting the Paxos algorithm. Theoretical Computer Science 243, 1--2, 35--91. Google ScholarDigital Library
- Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. 1985. Impossibility of distributed consensus with one faulty process. Journal of the ACM 32, 2, 374--382. Google ScholarDigital Library
- Eli Gafni and Leslie Lamport. 2003. Disk Paxos. Distributed Computing 16, 1, 1--20. Google ScholarDigital Library
- Cary G. Gray and David R. Cheriton. 1989. Leases: An efficient fault-tolerant mechanism for distributed file cache consistency. In Proceedings of the 12th ACM Symposium on Operating Systems Principles (SOSP'89). ACM, New York, NY, 202--210. Google ScholarDigital Library
- Jonathan Kirsch and Yair Amir. 2008a. Paxos for System Builders. Technical Report CNDS-2008-2. Johns Hopkins University, Baltimore, MD.Google Scholar
- Jonathan Kirsch and Yair Amir. 2008b. Paxos for system builders: An overview. In Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware (LADIS'08). ACM, New York, NY, Article No. 3. Google ScholarDigital Library
- Leslie Lamport. 1978. Time, clocks, and the ordering of events in a distributed system. Communations of the ACM 21, 7, 558--565. Google ScholarDigital Library
- Leslie Lamport. 1998. The part-time parliament. ACM Transactions on Computer Systems 16, 2, 133--169. Google ScholarDigital Library
- Leslie Lamport. 2001. Paxos made simple. ACM SIGACT News 32, 4, 51--58.Google Scholar
- Leslie Lamport. 2005. Generalized Consensus and Paxos. Technical Report MSR-TR-2005-33. Microsoft Research, Mountain View, CA. Available at http://research.microsoft.com/pubs/64631/tr-2005-33.pdfGoogle Scholar
- Leslie Lamport. 2006. Fast Paxos. Distributed Computing 19, 2, 79--103.Google ScholarDigital Library
- Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. 2008. Stoppable Paxos. Technical Report. Microsoft Research, Mountain View, CA. Available at http://research.microsoft.com/apps/pubs/default.aspx?id=101826Google Scholar
- Leslie Lamport, Dahlia Malkhi, and Lidong Zhou. 2009. Vertical Paxos and primary-backup replication. In Proceedings of the 28th ACM Symposium on Principles of Distributed Computing (PODC'09). ACM, New York, NY, 312--313. Google ScholarDigital Library
- Leslie Lamport and Mike Massa. 2004. Cheap Paxos. In Proceedings of the 2004 International Conference on Dependable Systems and Networks (DSN'04). IEEE, Los Alamitos, CA, 307--315. Google ScholarDigital Library
- Butler W. Lampson. 1996. How to build a highly available system using consensus. In Proceedings of the 10th International Workshop on Distributed Algorithms (WDAG'96). 1--17. Google ScholarDigital Library
- Butler W. Lampson. 2001. The ABCD's of Paxos. In Proceedings of the 20th Annual ACM Symposium on Principles of Distributed Computing (PODC'01). ACM, New York, NY, 13--14. Google ScholarDigital Library
- Harry C. Li, Allen Clement, Amitanand S. Aiyer, and Lorenzo Alvisi. 2007. The Paxos register. In Proceedings of the 26th IEEE International Symposium on Reliable Distributed Systems (SRDS'07). IEEE, Los Alamitos, CA, 114--126. Google ScholarDigital Library
- Jacob R. Lorch, Atul Adya, William J. Bolosky, Ronnie Chaiken, John R. Douceur, and Jon Howell. 2006. The SMART way to migrate replicated stateful services. In Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006 (EuroSys'06). ACM, New York, NY, 103--115. Google ScholarDigital Library
- Yanhua Mao, Flavio P. Junqueira, and Keith Marzullo. 2008. Mencius: Building efficient replicated state machines for WANs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI'08). 369--384. Google ScholarDigital Library
- David Mazières. 2007. Paxos Made Practical. Technical Report. Stanford University, Stanford, CA. Available at http://www.scs.stanford.edu/∼dm/home/papers/paxos.pdfGoogle Scholar
- Hein Meling and Leander Jehl. 2013. Tutorial summary: Paxos explained from scratch. In Principles of Distributed Systems. Lecture Notes in Computer Science, Vol. 8304. Springer, 1--10. Google ScholarDigital Library
- C. Mohan, Don Haderle, Bruce Lindsay, Hamid Pirahesh, and Peter Schwarz. 1992. ARIES: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. ACM Transactions on Database Systems 17, 1, 94--162. Google ScholarDigital Library
- Iulian Moraru, David G. Andersen, and Michael Kaminsky. 2013. There is more consensus in egalitarian parliaments. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP'13). ACM, New York, NY, 358--372. Google ScholarDigital Library
- Brian M. Oki and Barbara H. Liskov. 1988. Viewstamped replication: A new primary copy method to support highly-available distributed systems. In Proceedings of the 7th Annual ACM Symposium on Principles of Distributed Computing (PODC'88). ACM, New York, NY, 8--17. Google ScholarDigital Library
- Diego Ongaro and John Ousterhout. 2014. In search of an understandable consensus algorithm. In Proceedings of the 2014 USENIX Annual Technical Conference (ATC'14). 305--319. Google ScholarDigital Library
- Richard D. Schlichting and Fred B. Schneider. 1983. Fail-stop processors: An approach to designing fault-tolerant computing systems. ACM Transactions on Computer Systems 1, 3, 222--238. Google ScholarDigital Library
- Fred B. Schneider. 1990. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Computing Surveys 22, 4, 299--319. Google ScholarDigital Library
- Robert H. Thomas. 1979. A majority consensus approach to concurrency control for multiple copy databases. ACM Transactions on Database Systems 4, 2, 180--209. Google ScholarDigital Library
- Robbert van Renesse, Nicolas Schiper, and Fred B. Schneider. 2014. Vive la différence: Paxos vs. Viewstamped Replication vs. Zab. IEEE Transactions on Dependable and Secure Computing PP, 99, 1.Google Scholar
- Robbert van Renesse, Fred B. Schneider, and Johannes Gehrke. 2011. Nerio: Leader Election and Edict Ordering. Technical Report. Cornell University, Ithaca, NY. Available at http://hdl.handle.net/1813/23631Google Scholar
Index Terms
- Paxos Made Moderately Complex
Recommendations
Paxos Made Wireless: Consensus in the Air
EWSN '19: Proceedings of the 2019 International Conference on Embedded Wireless Systems and NetworksMany applications in low-power wireless networks require complex coordination between their members. Swarms of robots or sensors and actuators in industrial closed-loop control need to coordinate within short periods of time to execute tasks. Failing to ...
Seamless Paxos coordinators
The Paxos algorithm requires a single correct coordinator process to operate. After a failure, the replacement of the coordinator may lead to a temporary unavailability of the application implemented atop Paxos. So far, this unavailability has been ...
When paxos meets erasure code: reduce network and storage cost in state machine replication
HPDC '14: Proceedings of the 23rd international symposium on High-performance parallel and distributed computingPaxos-based state machine replication is a key technique to build highly reliable and available distributed services, such as lock servers, databases and other data storage systems. Paxos can tolerate any minority number of node crashes in an ...
Comments