skip to main content
10.1145/1559845.1559878acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Results Reproduced / v1.1

Self-organizing tuple reconstruction in column-stores

Published:29 June 2009Publication History

ABSTRACT

Column-stores gained popularity as a promising physical design alternative. Each attribute of a relation is physically stored as a separate column allowing queries to load only the required attributes. The overhead incurred is on-the-fly tuple reconstruction for multi-attribute queries. Each tuple reconstruction is a join of two columns based on tuple IDs, making it a significant cost component. The ultimate physical design is to have multiple presorted copies of each base table such that tuples are already appropriately organized in multiple different orders across the various columns. This requires the ability to predict the workload, idle time to prepare, and infrequent updates.

In this paper, we propose a novel design, partial sideways cracking, that minimizes the tuple reconstruction cost in a self-organizing way. It achieves performance similar to using presorted data, but without requiring the heavy initial presorting step itself. Instead, it handles dynamic, unpredictable workloads with no idle time and frequent updates. Auxiliary dynamic data structures, called cracker maps, provide a direct mapping between pairs of attributes used together in queries for tuple reconstruction. A map is continuously physically reorganized as an integral part of query evaluation, providing faster and reduced data access for future queries. To enable flexible and self-organizing behavior in storage-limited environments, maps are materialized only partially as demanded by the workload. Each map is a collection of separate chunks that are individually reorganized, dropped or recreated as needed. We implemented partial sideways cracking in an open-source column-store. A detailed experimental analysis demonstrates that it brings significant performance benefits for multi-attribute queries.

References

  1. D. Abadi et al. Integrating compression and execution in column-oriented database systems. SIGMOD 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. D. Abadi et al. Materialization Strategies in a Column-Oriented DBMS. ICDE 2007.Google ScholarGoogle ScholarCross RefCross Ref
  3. S. Agrawal et al. Database Tuning Advisor for Microsoft SQL Server. VLDB 2004.Google ScholarGoogle Scholar
  4. P. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. CIDR 2005.Google ScholarGoogle Scholar
  5. N. Bruno and S. Chaudhuri. To Tune or not to Tune? A Lightweight Physical Design Alerter. VLDB 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. S. Harizopoulos et al. Performance Tradeoffs in Read-Optimized Databases. VLDB 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Idreos, M. Kersten, and S. Manegold. Database Cracking. CIDR 2007.Google ScholarGoogle Scholar
  8. S. Idreos, M. Kersten, and S. Manegold. Updating a Cracked Database. SIGMOD 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Kersten and S. Manegold. Cracking the Database Store. CIDR 2005.Google ScholarGoogle Scholar
  10. S. Manegold et al. Cache-Conscious Radix-Decluster Projections. VLDB 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. K. Schnaitter et al. COLT: Continuous On-Line Database Tuning. SIGMOD 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. M. Stonebraker et al. C-Store: A Column Oriented DBMS. VLDB 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. D. C. Zilio et al. DB2 Design Advisor: Integrated Automatic Physical Database Design. VLDB 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. TPC Benchmark H. http://www.tpc.org/tpch/.Google ScholarGoogle Scholar
  15. MonetDB. http://monetdb.cwi.nl/.Google ScholarGoogle Scholar

Index Terms

  1. Self-organizing tuple reconstruction in column-stores

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
        June 2009
        1168 pages
        ISBN:9781605585512
        DOI:10.1145/1559845

        Copyright © 2009 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 29 June 2009

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader