DOI: 10.1145/1807128.1807152
research-article

Benchmarking cloud serving systems with YCSB

Published: 10 June 2010

ABSTRACT

While the use of MapReduce systems (such as Hadoop) for large-scale data analysis has been widely recognized and studied, we have recently seen an explosion in the number of systems developed for cloud data serving. These newer systems address "cloud OLTP" applications, though they typically do not support ACID transactions. Examples of systems proposed for cloud serving use include BigTable, PNUTS, Cassandra, HBase, Azure, CouchDB, SimpleDB, Voldemort, and many others. Further, they are being applied to a diverse range of applications that differ considerably from traditional (e.g., TPC-C-like) serving workloads. The number of emerging cloud serving systems and the wide range of proposed applications, coupled with a lack of apples-to-apples performance comparisons, make it difficult to understand the tradeoffs between systems and the workloads for which they are suited. We present the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report results for four widely used systems: Cassandra, HBase, Yahoo!'s PNUTS, and a simple sharded MySQL implementation. We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. In this regard, a key feature of the YCSB framework/tool is that it is extensible: it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.

Published in

SoCC '10: Proceedings of the 1st ACM symposium on Cloud computing
June 2010
264 pages
ISBN: 9781450300360
DOI: 10.1145/1807128

Copyright © 2010 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 169 of 722 submissions, 23%
