skip to main content
10.1145/3064176.3064210acmconferencesArticle/Chapter ViewAbstractPublication PageseurosysConference Proceedingsconference-collections
research-article

Saturn: a Distributed Metadata Service for Causal Consistency

Authors Info & Claims
Published:23 April 2017Publication History

ABSTRACT

This paper presents the design, implementation, and evaluation of Saturn, a metadata service for geo-replicated systems. Saturn can be used in combination with several distributed and replicated data services to ensure that remote operations are made visible in an order that respects causality, a requirement central to many consistency criteria.

Saturn addresses two key unsolved problems inherent to previous approaches. First, it eliminates the tradeoff between throughput and data freshness, when deciding what metadata to use for tracking causality. Second, it enables genuine partial replication, a key property to ensure scalability when the number of geo-locations increases. Saturn addresses these challenges while keeping metadata size constant, independently of the number of clients, servers, data partitions, and locations. By decoupling metadata management from data dissemination, and by using clever metadata propagation techniques, it ensures that the throughput and visibility latency of updates on a given item are (mostly) shielded from operations on other items or locations.

We evaluate Saturn in Amazon EC2 using realistic benchmarks under both full and partial geo-replication. Results show that weakly consistent datastores can lean on Saturn to upgrade their consistency guarantees to causal consistency with a negligible penalty on performance.

References

  1. M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto. Causal memory: definitions, implementation, and programming. Distributed Computing, 9(1):37--49, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. P. Ajoux, N. Bronson, S. Kumar, W. Lloyd, and K. Veeraraghavan. Challenges to adopting stronger consistency at scale. In Proceeding of the 15th Workshop on Hot Topics in Operating Systems, HotOS '15, 2015.Google ScholarGoogle Scholar
  3. D. D. Akkoorath, A. Z. Tomsic, M. Bravo, Z. Li, T. Crain, A. Bieniusa, N. Preguia, and M. Shapiro. Cure: Strong semantics meets high availability and low latency. In Proceeding of the IEEE 36th International Conference on Distributed Computing Systems, ICDCS '16, pages 405--414, 2016. Google ScholarGoogle ScholarCross RefCross Ref
  4. M. Al-Fares, A. Loukissas, and A. Vahdat. A scalable, commodity data center network architecture. In Proceedings of the ACM SIGCOMM Conference on Data Communication, SIGCOMM '08, pages 63--74, Seattle, WA, USA, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. S. Almeida, J. a. Leitão, and L. Rodrigues. ChainReaction: A causal+ consistent datastore based on chain replication. In Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pages 85--98, Prague, Czech Republic, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. D. G. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan, and V. Vasudevan. Fawn: A fast array of wimpy nodes. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP '09, pages 1--14, Big Sky, Montana, USA, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. T. G. Armstrong, V. Ponnekanti, D. Borthakur, and M. Callaghan. Linkbench: A database benchmark based on the facebook social graph. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '13, pages 1185--1196, New York, New York, USA, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. H. Attiya, F. Ellen, and A. Morrison. Limitations of highly-available eventually-consistent data stores. In Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC '15, pages 385--394, Donostia-San Sebastián, Spain, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. The potential dangers of causal consistency and an explicit solution. In Proceedings of the 3rd ACM Symposium on Cloud Computing, SoCC '12, pages 22:1--22:7, San Jose, California, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Bolton causal consistency. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '13, pages 761--772, New York, New York, USA, 2013.Google ScholarGoogle Scholar
  11. P. Bailis, A. Fekete, M. J. Franklin, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Feral concurrency control: An empirical investigation of modern application integrity. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1327--1342, Melbourne, Victoria, Australia, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. V. Balegas, S. Duarte, C. Ferreira, R. Rodrigues, N. Preguiça, M. Najafzadeh, and M. Shapiro. Putting consistency back into eventual consistency. In Proceedings of the 10th European Conference on Computer Systems, EuroSys '15, pages 6:1--6:16, Bordeaux, France, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Basho. Basho Bench. http://github.com/basho/basho_bench,.Google ScholarGoogle Scholar
  14. Basho. Riak core. http://github.com/basho/riak_core,.Google ScholarGoogle Scholar
  15. F. Benevenuto, T. Rodrigues, M. Cha, and V. Almeida. Characterizing user behavior in online social networks. In Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, IMC '09, pages 49--62, Chicago, Illinois, USA, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. K. Birman, A. Schiper, and P. Stephenson. Lightweight causal and atomic group multicast. ACM Trans. Comput. Syst., 9(3), Aug. 1991.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. M. Bravo, N. Diegues, J. Zeng, P. Romano, and L. Rodrigues. On the use of clocks to enforce consistency in the cloud. IEEE Data Engineering Bulleting, 38(1):18--31, 2015.Google ScholarGoogle Scholar
  18. E. A. Brewer. Towards robust distributed systems. In Keynote at the ACM Symposium on Principles of Distributed Computing, PODC, 2000.Google ScholarGoogle Scholar
  19. A. Brodersen, S. Scellato, and M. Wattenhofer. Youtube around the world: Geographic popularity of videos. In Proceedings of the 21st International Conference on World Wide Web, WWW '12, pages 241--250, Lyon, France, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. Castro and B. Liskov. Practical byzantine fault tolerance. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation, OSDI '99, pages 173--186, New Orleans, Louisiana, USA, 1999.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. B. Charron-Bost. Concerning the size of logical clocks in distributed systems. Information Processing Letters, 39(1): 11--16, July 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. D. R. Cheriton and D. Skeen. Understanding the limitations of causally and totally ordered communication. In Proceedings of the 14th ACM Symposium on Operating Systems Principles, SOSP '93, pages 44--57, Asheville, North Carolina, USA, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12, pages 251--264, Hollywood, CA, USA, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. M. Dahlin, L. Gao, A. Nayate, A. Venkataramana, P. Yalagandula, and J. Zheng. Practi replication. In Proceedings of the 3rd USENIX Symposium on Networked Systems Design and Implementation, NSDI '06, 2006.Google ScholarGoogle Scholar
  25. J. Du, S. Elnikety, A. Roy, andW. Zwaenepoel. Orbe: Scalable causal consistency using dependency matrices and physical clocks. In Proceedings of the 4th Annual Symposium on Cloud Computing, SOCC '13, pages 11:1--11:14, Santa Clara, California, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. J. Du, C. Iorgulescu, A. Roy, and W. Zwaenepoel. Gentlerain: Cheap and scalable causal consistency with physical clocks. In Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC '14, pages 4:1--4:13, Seattle, WA, USA, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. R. Escriva, A. Dubey, B. Wong, and E. G. Sirer. Kronos: The design and implementation of an event ordering service. In Proceedings of the 9th European Conference on Computer Systems, EuroSys '14, pages 3:1--3:14, Amsterdam, The Netherlands, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. S. Gilbert and N. Lynch. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News, 33(2):51--59, June 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. Vl2: A scalable and flexible data center network. In Proceedings of the ACM SIGCOMM Conference on Data Communication, SIGCOMM '09, pages 51--62, Barcelona, Spain, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. R. Guerraoui and A. Schiper. Genuine atomic multicast in asynchronous distributed systems. Theoretical Computer Science, 254(1):297--316, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. R. Guerraoui, M. Pavlovic, and D.-A. Seredinschi. Trade-offs in replicated systems. IEEE Data Engineering Bulletin, 39: 14--26, 2016.Google ScholarGoogle Scholar
  32. C. Gunawardhana, M. Bravo, and L. Rodrigues. Unobtrusive deferred update stabilization for efficient geo-replication. Arxiv preprint arXiv:1702.01786, Feb. 2017.Google ScholarGoogle Scholar
  33. M. P. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 12(3):463--492, July 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu, J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat. B4: Experience with a globally-deployed software defined wan. In Proceedings of the ACM SIGCOMM Conference, SIGCOMM '13, pages 3--14, Hong Kong, China, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. R. M. Karp. Reducibility among Combinatorial Problems, pages 85--103. Springer US, Boston, MA, 1972. Google ScholarGoogle ScholarCross RefCross Ref
  36. R. Ladin, B. Liskov, L. Shrira, and S. Ghemawat. Providing high availability using lazy replication. ACM Trans. Comput. Syst., 10(4):360--391, Nov. 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. L. Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7), July 1978. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. C. Li, D. Porto, A. Clement, J. Gehrke, N. Preguiça, and R. Rodrigues. Making geo-replicated systems fast as possible, consistent when necessary. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI '12, pages 265--278, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don't settle for eventual: Scalable causal consistency for wide-area storage with cops. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP '11, pages 401--416, Cascais, Portugal, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Stronger semantics for low-latency geo-replicated storage. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI '13, pages 313--328, 2013.Google ScholarGoogle Scholar
  41. P. Mahajan, L. Alvisi, and M. Dahlin. Consistency, availability, and convergence. Technical Report TR-11-21, University of Texas at Austin, Austin, Texas, 2011.Google ScholarGoogle Scholar
  42. S. A. Mehdi, C. Littley, N. Crooks, L. Alvisi, N. Bronson, and W. Lloyd. I can't believe it's not causal! In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI '17, 2017.Google ScholarGoogle Scholar
  43. NTP. The network time protocol. http://www.ntp.org.Google ScholarGoogle Scholar
  44. OscaR Team. OscaR: Scala in OR. https://bitbucket.org/oscarlib/oscar.Google ScholarGoogle Scholar
  45. K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and A. J. Demers. Flexible update propagation for weakly consistent replication. In Proceedings of the 16th ACM Symposium on Operating Systems Principles, SOSP '97, pages 288--301, Saint Malo, France, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. J. M. Pujol, V. Erramilli, G. Siganos, X. Yang, N. Laoutaris, P. Chhabra, and P. Rodriguez. The little engine(s) that could: Scaling online social networks. In Proceedings of the ACM SIGCOMM Conference, SIGCOMM '10, pages 375--386, New Delhi, India, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. F. B. Schneider. Byzantine generals in action: Implementing fail-stop processors. ACM Trans. Comput. Syst., 2(2):145--154, May 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv., 22(4):299--319, Dec. 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional storage for geo-replicated systems. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP '11, pages 385--400, Cascais, Portugal, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. D. B. Terry, A. J. Demers, K. Petersen, M. J. Spreitzer, M. M. Theimer, and B. B. Welch. Session guarantees for weakly consistent replicated data. In Proceedings of 3rd International Conference on Parallel and Distributed Information Systems, PDIS '94, pages 140--149, Sep 1994. Google ScholarGoogle ScholarCross RefCross Ref
  51. R. Van Renesse and F. B. Schneider. Chain replication for supporting high throughput and availability. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation, OSDI '04, 2004.Google ScholarGoogle Scholar
  52. B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi. On the evolution of user interaction in facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks, WOSN '09, pages 37--42, Barcelona, Spain, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. M. Zawirski, N. Preguiça, S. Duarte, A. Bieniusa, V. Balegas, and M. Shapiro. Write fast, read in the past: Causal consistency for client-side applications. In Proceedings of the 16th Annual Middleware Conference, Middleware '15, pages 75--87, Vancouver, BC, Canada, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    EuroSys '17: Proceedings of the Twelfth European Conference on Computer Systems
    April 2017
    648 pages
    ISBN:9781450349383
    DOI:10.1145/3064176

    Copyright © 2017 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 23 April 2017

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate241of1,308submissions,18%

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader