Amazon Redshift and the Case for Simpler Data Warehouses

Authors:
Anurag Gupta, Deepak Agarwal, Derek Tan, Jakub Kulesza, Rahul Pathak, Stefano Stefani, Vidhya Srinivasan

Amazon Web Services, Seattle, USA

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, May 2015, Pages 1917–1923
https://doi.org/10.1145/2723372.2742795

Published: 27 May 2015

ABSTRACT

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze large volumes of data using existing business intelligence tools. Since launching in February 2013, it has been Amazon Web Services' (AWS) fastest-growing service, with many thousands of customers and many petabytes of data under management. Amazon Redshift's pace of adoption has been a surprise to many participants in the data warehousing community. While Amazon Redshift was priced disruptively at launch, available for as little as $1000/TB/year, there are many open-source data warehousing technologies and many commercial data warehousing engines that provide free editions for development or under some usage limit. While Amazon Redshift provides a modern MPP, columnar, scale-out architecture, so too do many other data warehousing engines. And, while Amazon Redshift is available in the AWS cloud, one can build data warehouses using EC2 instances and the database engine of one's choice with either local or network-attached storage.

In this paper, we discuss an oft-overlooked differentiating characteristic of Amazon Redshift -- simplicity. Our goal with Amazon Redshift was not to compete with other data warehousing engines, but to compete with non-consumption. We believe the vast majority of data is collected but not analyzed. We believe, while most database vendors target larger enterprises, there is little correlation in today's economy between data set size and company size. And, we believe the models used to procure and consume analytics technology need to support experimentation and evaluation. Amazon Redshift was designed to bring data warehousing to a mass market by making it easy to buy, easy to tune and easy to manage while also being fast and cost-effective.

Index Terms

1. Information systems
  1. Data management systems

Published in
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
May 2015
2110 pages
ISBN: 9781450327589
DOI: 10.1145/2723372
General Chair: Timos Sellis (RMIT University, Australia)
Program Chairs: Susan B. Davidson (University of Pennsylvania, USA), Zack Ives (University of Pennsylvania, USA)
Copyright © 2015 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
amazon redshift
columnar database
redshift
data warehousing
mpp
Qualifiers
- research-article
Acceptance Rates
SIGMOD '15 Paper Acceptance Rate: 106 of 415 submissions, 26%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%