ABSTRACT
A key feature of database systems is to provide transparent access to stored data. In distributed database systems, this includes data allocation and fragmentation. Transparent access introduces data dependencies and increases system complexity and inter-process communication. Therefore, many developers are exchanging transparency for better scalability using sharding and similar techniques. However, explicitly managing data distribution and data flow requires a deep understanding of the distributed system and the data access, and it reduces the possibilities for optimizations.
To address this problem, we present an approach for efficient data allocation that features good scalability while keeping the data distribution transparent. We propose a workload-aware, query-centric, heterogeneity-aware analytical model. We formalize our approach and present an efficient allocation algorithm. The algorithm optimizes the partitioning and data layout for local query execution and balances the workload on homogeneous and heterogeneous systems according to the query history. In the evaluation, we demonstrate that our approach scales well in performance for OLTP- and OLAP-style workloads and reduces storage requirements significantly over replicated systems while guaranteeing configurable availability.
- R. Agrawal, S. Chaudhuri, A. Das, and V. R. Narasayya. Automating Layout of Relational Databases. In ICDE, pages 607--618, 2003.Google Scholar
- I. Ahmad, K. Karlapalem, Y.-K. Kwok, and S.-K. So. Evolutionary Algorithms for Allocating Data in Distributed Database Systems. Distributed and Parallel Databases, 11(1):5--32, 2002. Google ScholarDigital Library
- A. Alba, V. Bhagwan, M. Ching, A. Cozzi, R. Desai, D. Gruhl, K. Haas, L. Kato, J. Kusnitz, B. Langston, F. Nagy, L. Nguyen, J. Pieper, S. Srinivasan, A. Stuart, and R. Tang. A Funny Thing Happened on the Way to a Billion... IEEE Data Engineering Bulletin, 31(4):27--36, 2006.Google Scholar
- P. A. Alsberg and J. D. Day. A Principle for Resilient Sharing of Distributed Resources. In ICSE, pages 562--570, 1976. Google ScholarDigital Library
- G. M. Amdahl. Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities. In AFIPS, pages 483--485, 1967. Google ScholarDigital Library
- R. R. Amossen. Vertical Partitioning of Relational OLTP Databases Using Integer Programming. In ICDEW, pages 93--98, 2010.Google ScholarCross Ref
- C. Amza, A. L. Cox, and W. Zwaenepoel. Distributed Versioning: Consistent Replication for Scaling Back-End Databases of Dynamic Content Web Sites. In Middleware 2003, pages 282--304. Springer Berlin Heidelberg, 2003. Google ScholarDigital Library
- P. M. G. Apers. Data Allocation in Distributed Database Systems. ACM Transactions on Database Systems, 13(3):263--304, 1988. Google ScholarDigital Library
- L. A. Barroso and U. Hölzle. The Case for Engergy-Proportional Computing. IEEE Computer, 40(12):33--37, 2007. Google ScholarDigital Library
- H.-G. Beyer. Theory of Evolution Strategies. Springer Berlin / Heidelberg, 2001.Google ScholarCross Ref
- B. Bhattacharjee, M. Canim, C. A. Lang, G. A. Mihaila, and K. A. Ross. Storage Class Memory Aware Data Management. IEEE Data Engineering Bulletin, 33(4):35--40, 2010.Google Scholar
- A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, 1998. Google ScholarDigital Library
- R. Burkard, M. Dell'Amico, and S. Martello. Assignment Problems. Society for Industrial and Applied Mathematics, 2009. Google ScholarDigital Library
- B. Calder, C. Krintz, S. John, and T. Austin. Cache-Conscious Data Placement. SIGOPS Operation System Review, 32(5):139--149, 1998. Google ScholarDigital Library
- M. Canim, B. Bhattacharjee, G. A. Mihaila, C. A. Lang, and K. A. Ross. An Object Placement Advisor for DB2 Using Solid State Storage. PVLDB, 2(2):1318--1329, 2009. Google ScholarDigital Library
- E. Cecchet. RAIDb: Redundant Array of Inexpensive Databases. In ISPA, pages 115--125, 2004. Google ScholarDigital Library
- E. Cecchet, J. Marguerite, and W. Zwaenepoel. C-JDBC: Flexible Database Clustering Middleware. In USENIX, 2004. Google ScholarDigital Library
- S. A. Ceri, M. Negri, and G. Pelagatti. Horizontal Data Partitioning in Database Design. In SIGMOD, pages 128--136, 1982. Google ScholarDigital Library
- E. G. Coffman, M. R. Garey, and D. S. Johnson. Approximation Algorithms for Bin Packing: A Survey. In D. S. Hochbaum, editor, Approximation Algorithms for NP-Hard Problems, chapter 2. PWS Publishing Company, 1996. Google ScholarDigital Library
- G. Copeland, W. Alexander, E. Boughter, and T. Keller. Data Placement in Bubba. SIGMOD Record, 17(3):99--108, 1988. Google ScholarDigital Library
- C. Curino, E. Jones, Y. Zhang, and S. Madden. Schism: A Workload-Driven Approach to Database Replication and Partitioning. PVLDB, 3(1--2):48--57, 2010. Google ScholarDigital Library
- J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. CACM, 51(1):107--113, 2008. Google ScholarDigital Library
- D. J. DeWitt, R. H. Gerber, G. Graefe, M. L. Heytens, K. B. Kumar, and M. Muralikrishna. GAMMA - A High Performance Dataflow Database Machine. In VLDB, pages 228--237, 1986. Google ScholarDigital Library
- S. Elnikety, S. Dropsho, and F. Pedone. Tashkent: Uniting Durability with Transaction Ordering for High-Performance Scalable Database Replication. In SIGOPS/EuroSys, pages 117--130, 2006. Google ScholarDigital Library
- S. Elnikety, S. Dropsho, and W. Zwaenepoel. Tashkent: Memory-Aware Load Balancing and Update Filtering in Replicated Databases. In EuroSys, pages 399--412, 2007. Google ScholarDigital Library
- S. Englert, J. Gray, T. Kocher, and P. Shah. A Benchmark of NonStop SQL Release 2 Demonstrating Near-Linear Speedup and Scaleup on Large Databases. Technical Report 89.4, Tandem Computers Inc., 1989.Google Scholar
- P. Ganesan, M. Bawa, and H. Garcia-Molina. Online Balancing of Range-Partitioned Data with Applications to Peer-to-Peer Systems. In VLDB, pages 444--455. VLDB, 2004. Google ScholarDigital Library
- J. Gray. Why Do Computers Stop and What Can be Done About It? Technical Report 85.7, Tandem Computers, 1985.Google Scholar
- J. O. Hauglid, N. H. Ryeng, and K. Nørvåg. DYFRAM: Dynamic Fragmentation and Replica Management in Distributed Database Systems. Distributed and Parallel Databases, 28(2):157--185, 2010. Google ScholarDigital Library
- A. A. Helal, A. A. Heddaya, and B. B. Bhargava. Replication Techniques in Distributed Systems. Springer, 1996. Google ScholarDigital Library
- J. M. Hellerstein, M. Stonebraker, and J. Hamilton. Architecture of a Database System. Foundations and Trends in Databases, 1(2):141--259, 2007. Google ScholarDigital Library
- B. Kemme and G. Alonso. Don't Be Lazy, Be Consistent: Postgres-R, A New Way to Implement Database Replication. In VLDB, pages 134--143, 2000. Google ScholarDigital Library
- H. W. Kuhn. The Hungarian Method for the Assignment Problem. Naval Research Logistic, 52(1):7--21, 2005.Google ScholarCross Ref
- R. Ladin, B. Liskov, L. Shrira, and S. Ghemawat. Providing High Availability Using Lazy Replication. ACM Transactions on Computer Systems, 10:360--391, 1992. Google ScholarDigital Library
- A. Lakshman and P. Malik. Cassandra: a decentralized structured storage system. SIGOPS Operating Systems Review, 44(2):35--40, 2010. Google ScholarDigital Library
- J. Leverich and C. Kozyrakis. On the Energy (In)efficiency of Hadoop Clusters. In HotPower, New York, NY, USA, 2009. ACM.Google ScholarDigital Library
- J. Li, J. Naughton, and R. V. Nehme. Resource Bricolage for Parallel Database Systems. PVLDB, 8:25--36, 2014. Google ScholarDigital Library
- J. M. Milan-Franco, R. Jiménez-Peris, M. P. no Martínez, and B. Kemme. Adaptive Middleware for Data Replication. In Middleware, pages 175--194, 2004. Google ScholarDigital Library
- P. Moscato. On Evolution, Search, Optimization, Genetic Algorithms and Martial Arts: Towards Memetic Algorithms. Technical Report Caltech Concurrent Computation Program 158--79, California Institute of Technology, Pasadena, CA, USA, 1989.Google Scholar
- J. Munkres. Algorithms for the Assignment and Transportation Problems. Journal of the Society of Industrial and Applied Mathematics, 5(1):32--38, 1957.Google ScholarCross Ref
- O. Ozmen, K. Salem, J. Schindler, and S. Daniel. Workload-Aware Storage Layout for Database Systems. In SIGMOD, pages 939--950, 2010. Google ScholarDigital Library
- T. Özsu and P. Valduriez. Principles of Distributed Database Systems, Third Edition. Springer New York, Upper Saddle River, NJ, USA, 2011. Google ScholarDigital Library
- S. Papadomanolakis, D. Dash, and A. Ailamaki. Efficient Use of the Query Optimizer for Automated Physical Design. In VLDB, pages 1093--1104. VLDB Endowment, 2007. Google ScholarDigital Library
- M. Patino-Martínez, R. Jimínez-Peris, B. Kemme, and G. Alonso. MIDDLE-R: Consistent Database Replication at the Middleware Level. ACM Transactions on Computer Systems, 23(4):375--423, 2005. Google ScholarDigital Library
- A. Pavlo, C. Curino, and S. Zdonik. Skew-aware Automatic Database Partitioning in Shared-nothing, Parallel OLTP Systems. In SIGMOD, 2012. Google ScholarDigital Library
- C. Plattner, G. Alonso, and M. T. Özsu. Extending DBMSs with Satellite Databases. VLDBJ, 17(4):657--682, 2008. Google ScholarDigital Library
- A. Quamar, K. A. Kumar, and A. Deshpande. SWORD: Scalable Workload-aware Data Placement for Transactional Workloads. In EDBT, 2013. Google ScholarDigital Library
- T. Rabl. Efficiency in Cluster Database Systems. PhD thesis, University of Passau, 2011.Google Scholar
- P. Rubel, M. Gillen, J. Loyall, R. Schantz, A. Gokhale, J. Balasubramanian, A. Paulos, and P. Narasimhan. Fault Tolerant Approaches for Distributed Real-time and Embedded Systems. In MILCOM, pages 1--8. IEEE, 2007.Google ScholarCross Ref
- C. Rusu, A. Ferreira, C. Scordino, and A. Watson. Energy-Efficient Real-Time Heterogeneous Server Clusters. In RTAS, pages 418--428, 2006. Google ScholarDigital Library
- D. Sacca and G. Wiederhold. Database Partitioning in a Cluster of Processors. ACM Transactions on Database Systems, 10(1):29--56, 1985. Google ScholarDigital Library
- P. Scheuermann, G. Weikum, and P. Zabback. Data Partitioning and Load Balancing in Parallel Disk Systems. VLDBJ, 7(1):48--66, 1998. Google ScholarDigital Library
- H.-P. Schwefel and G. Rudolph. Contemporary Evolution Strategies. In Proceedings of the Third European Conference on Advances in Artificial Life, volume 929 of Lecture Notes in Computer Science, pages 893--907, Berlin, Germany, 1995. Springer. Google ScholarDigital Library
- S. Shankland. Google spotlights data center inner workings, May 2008. https://www.cnet.com/news/google-spotlights-data-center-inner -workings/.Google Scholar
- M. Stonebraker, D. Abadi, D. J. DeWitt, S. Madden, E. Paulson, A. Pavlo, and A. Rasin. MapReduce and Parallel DBMSs: Friends or Foes? CACM, 53(1):64--71, 2010. Google ScholarDigital Library
- M. Stonebraker, D. J. Abadi, A. Batkin, X. Chen, M. Cherniack, M. Ferreira, E. Lau, A. Lin, S. Madden, E. J. O'Neil, P. E. O'Neil, A. Rasin, N. Tran, and S. B. Zdonik. C-Store: A Column-oriented DBMS. In VLDB, pages 553--564. VLDB Endowment, 2005. Google ScholarDigital Library
- M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell, C. Staelin, and A. Yu. Mariposa: A Wide-Area Distributed Database System. VLDBJ, 5(1):48--63, 1996. Google ScholarDigital Library
- A. Syropoulos. Mathematics of Multisets. In WMP, volume 2235 of Lecture Notes in Computer Science, pages 347--358, Berlin, Germany, 2001. Springer. Google ScholarDigital Library
- S. Voß. Meta-heuristics: The State of the Art. In ECAI, pages 1--23, 2001. Google ScholarDigital Library
- E. Zamanian, C. Binnig, and A. Salama. Locality-aware partitioning in parallel database systems. In SIGMOD, pages 17--30, 2015. Google ScholarDigital Library
Index Terms
- Query Centric Partitioning and Allocation for Partially Replicated Database Systems
Recommendations
Exploration of Dynamic Query-Based Load Balancing for Partially Replicated Database Systems with Node Failures
CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge ManagementDatabase replication is a mechanism to achieve scalability, for example, by executing queries independently on replica nodes. Partial replication is an approach to minimize the overall memory consumption of a replication cluster while still enabling a ...
Replicated data management in distributed database systems
Replication is the key factor in improving the availability of data in distributed systems. Replicated data is stored at multiple sites so that it can be accessed by the user even when some of the copies are not available due to site failures. A major ...
Resource bricolage and resource selection for parallel database systems
Running parallel database systems in an environment with heterogeneous resources has become increasingly common, due to cluster evolution and increasing interest in moving applications into public clouds. Performance differences among machines in the ...
Comments