ABSTRACT
Amazon Aurora is a high-throughput cloud-native relational database offered as part of Amazon Web Services (AWS). One of the more novel differences between Aurora and other relational databases is how it pushes redo processing to a multi-tenant scale-out storage service, purpose-built for Aurora. Doing so reduces network traffic, avoids checkpoints and crash recovery, enables failovers to replicas without loss of data, and enables fault-tolerant storage that heals without database involvement. Traditional implementations that leverage distributed storage would use distributed consensus algorithms for commits, reads, replication, and membership changes, and would amplify the cost of the underlying storage. In this paper, we describe how Aurora avoids distributed consensus under most circumstances by establishing invariants and leveraging local transient state. Doing so improves performance, reduces variability, and lowers costs.
Index Terms
- Amazon Aurora: On Avoiding Distributed Consensus for I/Os, Commits, and Membership Changes