DOI: 10.1145/3035918.3056101
research-article

Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases

Published: 09 May 2017

ABSTRACT

Amazon Aurora is a relational database service for OLTP workloads offered as part of Amazon Web Services (AWS). In this paper, we describe the architecture of Aurora and the design considerations leading to that architecture. We believe the central constraint in high throughput data processing has moved from compute and storage to the network. Aurora brings a novel architecture to the relational database to address this constraint, most notably by pushing redo processing to a multi-tenant scale-out storage service, purpose-built for Aurora. We describe how doing so not only reduces network traffic, but also allows for fast crash recovery, failovers to replicas without loss of data, and fault-tolerant, self-healing storage. We then describe how Aurora achieves consensus on durable state across numerous storage nodes using an efficient asynchronous scheme, avoiding expensive and chatty recovery protocols. Finally, having operated Aurora as a production service for over 18 months, we share the lessons we have learnt from our customers on what modern cloud applications expect from databases.


Published in

SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
May 2017
1810 pages
ISBN: 9781450341974
DOI: 10.1145/3035918

Copyright © 2017 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 785 of 4,003 submissions, 20%
