Can the elephants handle the NoSQL onslaught?

Authors:
Avrilia Floratou (University of Wisconsin-Madison)
Nikhil Teletia (Microsoft Jim Gray Systems Lab)
David J. DeWitt (Microsoft Jim Gray Systems Lab)
Jignesh M. Patel (University of Wisconsin-Madison)
Donghui Zhang (Paradigm4)

Proceedings of the VLDB Endowment, Volume 5, Issue 12, pp 1712–1723, https://doi.org/10.14778/2367502.2367511

Published: 01 August 2012

Abstract

In this new era of "big data", traditional DBMSs are under attack from two sides. At one end of the spectrum, the use of document store NoSQL systems (e.g. MongoDB) threatens to move modern Web 2.0 applications away from traditional RDBMSs. At the other end of the spectrum, big data DSS analytics that used to be the domain of parallel RDBMSs is now under attack by another class of NoSQL data analytics systems, such as Hive on Hadoop. So, are the traditional RDBMSs, aka "big elephants", doomed as they are challenged from both ends of this "big data" spectrum? In this paper, we compare one representative NoSQL system from each end of this spectrum with SQL Server, and analyze the performance and scalability aspects of each of these approaches (NoSQL vs. SQL) on two workloads (decision support analysis and interactive data-serving) that represent the two ends of the application spectrum. We present insights from this evaluation and speculate on potential trends for the future.


Index Terms

Can the elephants handle the NoSQL onslaught?
1. Information systems
  1. Data management systems
    1. Database management system engines

Index terms have been assigned to the content through auto-classification.

Published in

Proceedings of the VLDB Endowment, Volume 5, Issue 12
August 2012
340 pages
ISSN: 2150-8097
Publisher: VLDB Endowment