Article

QPipe: a simultaneously pipelined relational query engine

Authors:
Stavros Harizopoulos

Carnegie Mellon University, Pittsburgh, PA

Carnegie Mellon University, Pittsburgh, PA
View Profile

,
Vladislav Shkapenyuk

Rutgers University, Piscataway, NJ

Rutgers University, Piscataway, NJ
View Profile

,
Anastassia Ailamaki

Carnegie Mellon University, Pittsburgh, PA

Carnegie Mellon University, Pittsburgh, PA
View Profile

SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of dataJune 2005Pages 383–394https://doi.org/10.1145/1066157.1066201

Published:14 June 2005Publication History

SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data

Pages 383–394

ABSTRACT

Relational DBMS typically execute concurrent queries independently by invoking a set of operator instances for each query. To exploit common data retrievals and computation in concurrent queries, researchers have proposed a wealth of techniques, ranging from buffering disk pages to constructing materialized views and optimizing multiple queries. The ideas proposed, however, are inherently limited by the query-centric philosophy of modern engine designs. Ideally, the query engine should proactively coordinate same-operator execution among concurrent queries, thereby exploiting common accesses to memory and disks as well as common intermediate result computation.This paper introduces on-demand simultaneous pipelining (OSP), a novel query evaluation paradigm for maximizing data and work sharing across concurrent queries at execution time. OSP enables proactive, dynamic operator sharing by pipelining the operator's output simultaneously to multiple parent nodes. This paper also introduces QPipe, a new operator-centric relational engine that effortlessly supports OSP. Each relational operator is encapsulated in a micro-engine serving query tasks from a queue, naturally exploiting all data and work sharing opportunities. Evaluation of QPipe built on top of BerkeleyDB shows that QPipe achieves a 2x speedup over a commercial DBMS when running a workload consisting of TPC-H queries.

References

S. Agrawal, S. Chaudhuri, and V. R. Narasayya. "Automated selection of materialized views and indexes in SQL databases." In Proc. VLDB, 2000. Google ScholarDigital Library
J. A. Blakeley, P. Larson, and F. W. Tompa. "Efficiently updating materialized views." In Proc SIGMOD, 1986. Google ScholarDigital Library
M. Carey et al. "Shoring Up Persistent Applications." In Proc. SIGMOD, 1994. Google ScholarDigital Library
S. Chandrasekaran and M. J. Franklin. "Streaming Queries over Streaming Data." In Proc. VLDB, 2002. Google ScholarDigital Library
J. Chen, D. DeWitt, F. Tian, and Y. Wang. "NiagaraCQ: A scalable continuous query system for internet databases." In Proc. SIGMOD, 2000. Google ScholarDigital Library
H. T. Chou and D. J. DeWitt. "An evaluation of buffer management strategies for relational database systems." In Proc. SIGMOD, 1985.Google Scholar
C. Cook. "Database Architecture: The Storage Engine." Miscrosoft SQL Server 2000 Technical Article, July 2001. Available at: http://msdn.microsoft. com/libraryGoogle Scholar
N. Dalvi, S. K. Sanghai, P. Roy, and S. Sudarshan. "Pipelining in Multi-Query Optimization." In PODS, 2001. Google ScholarDigital Library
S. Dar, M. J. Franklin, B. T. Jonsson, D. Srivastava, and M. Tan. "Semantic Data Caching and Replacement." In Proc. VLDB, 1996. Google ScholarDigital Library
D. J. DeWitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H. Hsiao, and R. Rasmussen. "The Gamma Database Machine Project." In IEEE TKDE, 2(1), pp. 44--63, Mar. 1990. Google ScholarDigital Library
D. J. DeWitt. "The Wisconsin Benchmark: Past, Present, and Future." The Benchmark Handbook, J. Gray, ed., Morgan Kaufmann Pub., San Mateo, CA (1991).Google Scholar
P. M. Fernandez. "Red Brick Warehouse: A Read-Mostly RDBMS for Open SMP Platforms." In SIGMOD, 1994. Google ScholarDigital Library
S. Finkelstein. "Common expression analysis in database applications." In Proc. SIGMOD, 1982. Google ScholarDigital Library
G. Graefe. "Iterators, Schedulers, and Distributed-memory Parallelism." In Software-practice and experience, Vol. 26 (4), pp. 427--452, Apr. 1996. Google ScholarDigital Library
G. Graefe. "Volcano - An Extensible and Parallel Query Evaluation System." In TKDE 6(1): 120--135, 1994. Google ScholarDigital Library
J. Gray. "The Next Database Revolution." Keynote, SIGMOD, 2004. Google ScholarDigital Library
S. Harizopoulos and A. Ailamaki. "A Case for Staged Database Systems." In Proc. CIDR, 2003.Google Scholar
T. Johnson and D. Shasha. "2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm." In Proc. VLDB, 1994. Google ScholarDigital Library
N. Kabra and D. J. DeWitt. "Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans." In Proc. SIGMOD, 1998. Google ScholarDigital Library
S. R. Madden, M. A. Shah, J. M. Hellerstein, and V. Raman. "Continuously Adaptive Continuous Queries over Streams." In Proc. SIGMOD, 2002. Google ScholarDigital Library
N. Megiddo and D. S. Modha. "ARC: A Self-Tuning, Low Overhead Replacement Cache." In Proc. FAST, 2003. Google ScholarDigital Library
E. J. O'Neil, P. E. O'Neil, and G. Weikum. "The LRU-K page replacement algorithm for database disk buffering." In Proc. SIGMOD, 1993. Google ScholarDigital Library
N. Roussopoulos. "View indexing in relational databases." In ACM Trans. on Database Systems 7(2):258--290,1982. Google ScholarDigital Library
P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. "Efficient and Extensible Algorithms for Multi Query Optimization." In Proc. SIGMOD, 2000. Google ScholarDigital Library
G. M. Sacco and M. Schkolnick. "Buffer management in relational database systems." In ACM TODS, 11(4):473--498, Dec. 1986. Google ScholarDigital Library
P. Sarda, J. R. Haritsa. "Green Query Optimization: Taming Query Optimization Overheads through Plan Recycling," In Proc. VLDB, 2004. Google ScholarDigital Library
T. K. Sellis. "Multiple Query Optimization." In ACM TODS, 13(1):23--52, Mar. 1988. Google ScholarDigital Library
P. Seshadri, M. Livny, and R. Ramakrishnan. "The Case for Enhanced Abstract Data Types." In Proc. VLDB, 1997. Google ScholarDigital Library
J. Shim, P. Scheuermann, and R. Vingralek. "Dynamic caching of query results for decision support systems." In Proc. SSDBM, 1999. Google ScholarDigital Library
V. Shkapenyuk, R. Williams, S. Harizopoulos, and A. Ailamaki. "Deadlock Resolution in Pipelined Query Graphs." Carnegie Mellon University Technical Report, CMU-CS-05-122, 2005.Google Scholar
J. Zhou and K. A. Ross. "Buffering Database Operations for Enhanced Instruction Cache Performance." In Proc. SIGMOD, 2004. Google ScholarDigital Library

QPipe: a simultaneously pipelined relational query engine
1. Information systems
  1. Data management systems
    1. Database management system engines
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory

Recommendations

Simultaneous Pipelining in QPipe: Exploiting Work Sharing Opportunities Across Queries
ICDE '06: Proceedings of the 22nd International Conference on Data Engineering

Data warehousing and scientific database applications operate on massive datasets and are characterized by complex queries accessing large portions of the database. Concurrent queries often exhibit high data and computation overlap, e.g., they access ...
Read More
Equivalence and minimization of conjunctive queries under combined semantics
ICDT '12: Proceedings of the 15th International Conference on Database Theory

The problems of query containment, equivalence, and minimization are fundamental problems in the context of query processing and optimization. In their classic work [2] published in 1977, Chandra and Merlin solved the three problems for the language of ...
Read More
Optimization of joins using random record generation method
A2CWiC '10: Proceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India

Joins are statements that retrieve data from more than one table. A Join is characterized by multiple tables in the FROM clause, and the relationship between the tables is defined through the existence of a Join condition in the WHERE clause. In the ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data
June 2005
990 pages
ISBN:1595930604
DOI:10.1145/1066157
Conference Chair:
Fatma Ozcan
IBM Almaden Research Center
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 14 June 2005
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate785of4,003submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 122
  Total Citations
  View Citations
- 1,182
  Total Downloads
- Downloads (Last 12 months)22
- Downloads (Last 6 weeks)1
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

QPipe: a simultaneously pipelined relational query engine

SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data

ABSTRACT

References

Cited By

Recommendations

Simultaneous Pipelining in QPipe: Exploiting Work Sharing Opportunities Across Queries

Equivalence and minimization of conjunctive queries under combined semantics

Optimization of joins using random record generation method

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

QPipe: a simultaneously pipelined relational query engine

SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data

ABSTRACT

References

Cited By

Recommendations

Simultaneous Pipelining in QPipe: Exploiting Work Sharing Opportunities Across Queries

Equivalence and minimization of conjunctive queries under combined semantics

Optimization of joins using random record generation method

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media