ABSTRACT
This paper presents the design, implementation, and evaluation of Saturn, a metadata service for geo-replicated systems. Saturn can be used in combination with several distributed and replicated data services to ensure that remote operations are made visible in an order that respects causality, a requirement central to many consistency criteria.
Saturn addresses two key unsolved problems inherent to previous approaches. First, it eliminates the tradeoff between throughput and data freshness, when deciding what metadata to use for tracking causality. Second, it enables genuine partial replication, a key property to ensure scalability when the number of geo-locations increases. Saturn addresses these challenges while keeping metadata size constant, independently of the number of clients, servers, data partitions, and locations. By decoupling metadata management from data dissemination, and by using clever metadata propagation techniques, it ensures that the throughput and visibility latency of updates on a given item are (mostly) shielded from operations on other items or locations.
We evaluate Saturn in Amazon EC2 using realistic benchmarks under both full and partial geo-replication. Results show that weakly consistent datastores can lean on Saturn to upgrade their consistency guarantees to causal consistency with a negligible penalty on performance.
- M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto. Causal memory: definitions, implementation, and programming. Distributed Computing, 9(1):37--49, 1995. Google ScholarDigital Library
- P. Ajoux, N. Bronson, S. Kumar, W. Lloyd, and K. Veeraraghavan. Challenges to adopting stronger consistency at scale. In Proceeding of the 15th Workshop on Hot Topics in Operating Systems, HotOS '15, 2015.Google Scholar
- D. D. Akkoorath, A. Z. Tomsic, M. Bravo, Z. Li, T. Crain, A. Bieniusa, N. Preguia, and M. Shapiro. Cure: Strong semantics meets high availability and low latency. In Proceeding of the IEEE 36th International Conference on Distributed Computing Systems, ICDCS '16, pages 405--414, 2016. Google ScholarCross Ref
- M. Al-Fares, A. Loukissas, and A. Vahdat. A scalable, commodity data center network architecture. In Proceedings of the ACM SIGCOMM Conference on Data Communication, SIGCOMM '08, pages 63--74, Seattle, WA, USA, 2008. Google ScholarDigital Library
- S. Almeida, J. a. Leitão, and L. Rodrigues. ChainReaction: A causal+ consistent datastore based on chain replication. In Proceedings of the 8th ACM European Conference on Computer Systems, EuroSys '13, pages 85--98, Prague, Czech Republic, 2013. Google ScholarDigital Library
- D. G. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan, and V. Vasudevan. Fawn: A fast array of wimpy nodes. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP '09, pages 1--14, Big Sky, Montana, USA, 2009. Google ScholarDigital Library
- T. G. Armstrong, V. Ponnekanti, D. Borthakur, and M. Callaghan. Linkbench: A database benchmark based on the facebook social graph. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '13, pages 1185--1196, New York, New York, USA, 2013. Google ScholarDigital Library
- H. Attiya, F. Ellen, and A. Morrison. Limitations of highly-available eventually-consistent data stores. In Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC '15, pages 385--394, Donostia-San Sebastián, Spain, 2015. Google ScholarDigital Library
- P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. The potential dangers of causal consistency and an explicit solution. In Proceedings of the 3rd ACM Symposium on Cloud Computing, SoCC '12, pages 22:1--22:7, San Jose, California, 2012. Google ScholarDigital Library
- P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Bolton causal consistency. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '13, pages 761--772, New York, New York, USA, 2013.Google Scholar
- P. Bailis, A. Fekete, M. J. Franklin, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Feral concurrency control: An empirical investigation of modern application integrity. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD '15, pages 1327--1342, Melbourne, Victoria, Australia, 2015. Google ScholarDigital Library
- V. Balegas, S. Duarte, C. Ferreira, R. Rodrigues, N. Preguiça, M. Najafzadeh, and M. Shapiro. Putting consistency back into eventual consistency. In Proceedings of the 10th European Conference on Computer Systems, EuroSys '15, pages 6:1--6:16, Bordeaux, France, 2015. Google ScholarDigital Library
- Basho. Basho Bench. http://github.com/basho/basho_bench,.Google Scholar
- Basho. Riak core. http://github.com/basho/riak_core,.Google Scholar
- F. Benevenuto, T. Rodrigues, M. Cha, and V. Almeida. Characterizing user behavior in online social networks. In Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, IMC '09, pages 49--62, Chicago, Illinois, USA, 2009. Google ScholarDigital Library
- K. Birman, A. Schiper, and P. Stephenson. Lightweight causal and atomic group multicast. ACM Trans. Comput. Syst., 9(3), Aug. 1991.Google ScholarDigital Library
- M. Bravo, N. Diegues, J. Zeng, P. Romano, and L. Rodrigues. On the use of clocks to enforce consistency in the cloud. IEEE Data Engineering Bulleting, 38(1):18--31, 2015.Google Scholar
- E. A. Brewer. Towards robust distributed systems. In Keynote at the ACM Symposium on Principles of Distributed Computing, PODC, 2000.Google Scholar
- A. Brodersen, S. Scellato, and M. Wattenhofer. Youtube around the world: Geographic popularity of videos. In Proceedings of the 21st International Conference on World Wide Web, WWW '12, pages 241--250, Lyon, France, 2012. Google ScholarDigital Library
- M. Castro and B. Liskov. Practical byzantine fault tolerance. In Proceedings of the 3rd Symposium on Operating Systems Design and Implementation, OSDI '99, pages 173--186, New Orleans, Louisiana, USA, 1999.Google ScholarDigital Library
- B. Charron-Bost. Concerning the size of logical clocks in distributed systems. Information Processing Letters, 39(1): 11--16, July 1991. Google ScholarDigital Library
- D. R. Cheriton and D. Skeen. Understanding the limitations of causally and totally ordered communication. In Proceedings of the 14th ACM Symposium on Operating Systems Principles, SOSP '93, pages 44--57, Asheville, North Carolina, USA, 1993. Google ScholarDigital Library
- J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's globally-distributed database. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI'12, pages 251--264, Hollywood, CA, USA, 2012.Google ScholarDigital Library
- M. Dahlin, L. Gao, A. Nayate, A. Venkataramana, P. Yalagandula, and J. Zheng. Practi replication. In Proceedings of the 3rd USENIX Symposium on Networked Systems Design and Implementation, NSDI '06, 2006.Google Scholar
- J. Du, S. Elnikety, A. Roy, andW. Zwaenepoel. Orbe: Scalable causal consistency using dependency matrices and physical clocks. In Proceedings of the 4th Annual Symposium on Cloud Computing, SOCC '13, pages 11:1--11:14, Santa Clara, California, 2013. Google ScholarDigital Library
- J. Du, C. Iorgulescu, A. Roy, and W. Zwaenepoel. Gentlerain: Cheap and scalable causal consistency with physical clocks. In Proceedings of the 5th ACM Symposium on Cloud Computing, SOCC '14, pages 4:1--4:13, Seattle, WA, USA, 2014. Google ScholarDigital Library
- R. Escriva, A. Dubey, B. Wong, and E. G. Sirer. Kronos: The design and implementation of an event ordering service. In Proceedings of the 9th European Conference on Computer Systems, EuroSys '14, pages 3:1--3:14, Amsterdam, The Netherlands, 2014. Google ScholarDigital Library
- S. Gilbert and N. Lynch. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News, 33(2):51--59, June 2002. Google ScholarDigital Library
- A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. Vl2: A scalable and flexible data center network. In Proceedings of the ACM SIGCOMM Conference on Data Communication, SIGCOMM '09, pages 51--62, Barcelona, Spain, 2009. Google ScholarDigital Library
- R. Guerraoui and A. Schiper. Genuine atomic multicast in asynchronous distributed systems. Theoretical Computer Science, 254(1):297--316, 2001. Google ScholarDigital Library
- R. Guerraoui, M. Pavlovic, and D.-A. Seredinschi. Trade-offs in replicated systems. IEEE Data Engineering Bulletin, 39: 14--26, 2016.Google Scholar
- C. Gunawardhana, M. Bravo, and L. Rodrigues. Unobtrusive deferred update stabilization for efficient geo-replication. Arxiv preprint arXiv:1702.01786, Feb. 2017.Google Scholar
- M. P. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 12(3):463--492, July 1990. Google ScholarDigital Library
- S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu, J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat. B4: Experience with a globally-deployed software defined wan. In Proceedings of the ACM SIGCOMM Conference, SIGCOMM '13, pages 3--14, Hong Kong, China, 2013. Google ScholarDigital Library
- R. M. Karp. Reducibility among Combinatorial Problems, pages 85--103. Springer US, Boston, MA, 1972. Google ScholarCross Ref
- R. Ladin, B. Liskov, L. Shrira, and S. Ghemawat. Providing high availability using lazy replication. ACM Trans. Comput. Syst., 10(4):360--391, Nov. 1992. Google ScholarDigital Library
- L. Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7), July 1978. Google ScholarDigital Library
- C. Li, D. Porto, A. Clement, J. Gehrke, N. Preguiça, and R. Rodrigues. Making geo-replicated systems fast as possible, consistent when necessary. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI '12, pages 265--278, 2012.Google ScholarDigital Library
- W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don't settle for eventual: Scalable causal consistency for wide-area storage with cops. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP '11, pages 401--416, Cascais, Portugal, 2011. Google ScholarDigital Library
- W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Stronger semantics for low-latency geo-replicated storage. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI '13, pages 313--328, 2013.Google Scholar
- P. Mahajan, L. Alvisi, and M. Dahlin. Consistency, availability, and convergence. Technical Report TR-11-21, University of Texas at Austin, Austin, Texas, 2011.Google Scholar
- S. A. Mehdi, C. Littley, N. Crooks, L. Alvisi, N. Bronson, and W. Lloyd. I can't believe it's not causal! In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, NSDI '17, 2017.Google Scholar
- NTP. The network time protocol. http://www.ntp.org.Google Scholar
- OscaR Team. OscaR: Scala in OR. https://bitbucket.org/oscarlib/oscar.Google Scholar
- K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and A. J. Demers. Flexible update propagation for weakly consistent replication. In Proceedings of the 16th ACM Symposium on Operating Systems Principles, SOSP '97, pages 288--301, Saint Malo, France, 1997. Google ScholarDigital Library
- J. M. Pujol, V. Erramilli, G. Siganos, X. Yang, N. Laoutaris, P. Chhabra, and P. Rodriguez. The little engine(s) that could: Scaling online social networks. In Proceedings of the ACM SIGCOMM Conference, SIGCOMM '10, pages 375--386, New Delhi, India, 2010. Google ScholarDigital Library
- F. B. Schneider. Byzantine generals in action: Implementing fail-stop processors. ACM Trans. Comput. Syst., 2(2):145--154, May 1984. Google ScholarDigital Library
- F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv., 22(4):299--319, Dec. 1990. Google ScholarDigital Library
- Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional storage for geo-replicated systems. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles, SOSP '11, pages 385--400, Cascais, Portugal, 2011. Google ScholarDigital Library
- D. B. Terry, A. J. Demers, K. Petersen, M. J. Spreitzer, M. M. Theimer, and B. B. Welch. Session guarantees for weakly consistent replicated data. In Proceedings of 3rd International Conference on Parallel and Distributed Information Systems, PDIS '94, pages 140--149, Sep 1994. Google ScholarCross Ref
- R. Van Renesse and F. B. Schneider. Chain replication for supporting high throughput and availability. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation, OSDI '04, 2004.Google Scholar
- B. Viswanath, A. Mislove, M. Cha, and K. P. Gummadi. On the evolution of user interaction in facebook. In Proceedings of the 2nd ACM Workshop on Online Social Networks, WOSN '09, pages 37--42, Barcelona, Spain, 2009. Google ScholarDigital Library
- M. Zawirski, N. Preguiça, S. Duarte, A. Bieniusa, V. Balegas, and M. Shapiro. Write fast, read in the past: Causal consistency for client-side applications. In Proceedings of the 16th Annual Middleware Conference, Middleware '15, pages 75--87, Vancouver, BC, Canada, 2015. Google ScholarDigital Library
Recommendations
Interactive Saturn flight program simulator
Space vehicle control, guidance, and navigation require onboard computers. Mission safety and success demand high program reliability without preliminary in-flight testing.
Interactive Saturn flight simulation discussed in this paper tests all normal ...
Saturn V launch vehicle digital computer and data adapter
AFIPS '64 (Fall, part I): Proceedings of the October 27-29, 1964, fall joint computer conference, part IThis paper describes the IBM Space Guidance Center's part in the Saturn V Program and the digital computer and data adapter being developed for the Saturn V booster. This work is being performed under contract to NASA under direction of the Marshall ...
Saturn: Range Queries, Load Balancing and Fault Tolerance in DHT Data Systems
In this paper, we present Saturn, an overlay architecture for large-scale data networks maintained over Distributed Hash Tables (DHTs) that efficiently processes range queries and ensures access load balancing and fault-tolerance. Placing consecutive ...
Comments