ABSTRACT
While shared-nothing parallel infrastructures provide fast processing of explosively growing digital content, managing data efficiently across multiple nodes is important. The value-range partitioning method with parallel B-tree structures in a shared-nothing environment is an efficient approach for handling large amounts of data. To handle large amounts of data, it is also important to provide an efficient concurrency control protocol for the parallel B-tree. Many studies have proposed concurrency control protocols for B-trees, which use latch-coupling. None of these studies has considered that latch-coupling contains a performance bottleneck of sending of messages between processing elements (PEs) in distributed environments because latch-coupling is efficient for a B-tree on a single machine. The only protocol without latch-coupling is the B-link algorithm, but it is difficult to use the B-link algorithm directly on an entire parallel B-tree structure because it is necessary to guarantee the consistency of the side pointers. We propose a new concurrency control protocol named LCFB that requires no latch-coupling in optimistic processes. LCFB reduces the amount of communication between PEs during a B-tree traversal. To detect access path errors in the LCFB protocol caused by removal of latch-coupling, we assign boundary values to each index page. Because a page split may cause page deletion in a Fat-Btree, we also propose an effective method for handling page deletions without latch-coupling. We then combine LCFB with the B-link algorithm within each PE to reduce the cost of Structure Modification Operations (SMOs) in a PE, as a solution to the difficulty of consistency management for the side pointers in a parallel B-tree structure. To compare the performance of the proposed protocol with conventional protocols MARK-OPT, INC-OPT, and ARIES/IM, we implemented them on an autonomous disk system with a Fat-Btree structure. Experimental results in various environments indicate that the system throughput of the proposed protocols is always superior to those of the other protocols, especially in large-scale configurations, and LCFB with the B-link algorithm is effective at higher update ratios.
- R. Bayer and M. Schkolnick. Concurrency of operations on B-trees. Acta Inf., 9(1):1--21, 1977.Google ScholarDigital Library
- P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, 1987. Google ScholarDigital Library
- F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. In Proc. of OSDI2006, pages 205--218, 2006. Google ScholarDigital Library
- Cluster File Systems, Inc. Lustre: A scalable, high-performance file system. http://www.lustre.org/docs/whitepaper.pdf.Google Scholar
- R. Devine. Design and implementation of DDH: A distributed dynamic hashing algorithm. In Proc of FODO '93, pages 101--114, 1993. Google ScholarDigital Library
- D. J. DeWitt and J. Gray. Parallel database systems: The future of high performance database systems. Commun. ACM, 35(6):85--98, 1992. Google ScholarDigital Library
- S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google file system. In Proc. of SOSP 2003, pages 29--43, 2003. Google ScholarDigital Library
- G. Graefe. Write-optimized B-trees. In Proc. of VLDB 2004, pages 672--683, 2004. Google ScholarDigital Library
- J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1992. Google ScholarDigital Library
- I. Jaluta, S. Sippu, and E. Soisalon-Soininen. Concurrency control and recovery for balanced B-link trees. 14(2):257--277, 2005. Google ScholarDigital Library
- T. Johnson and D. Shasha. Utilization of B-trees with inserts, deletes and modifies. In Proc. of PODS'89, pages 235--246, 1989. Google ScholarDigital Library
- H. T. Kung and P. L. Lehman. Concurrent manipulation of binary search trees. ACM Trans. Database Syst., 5(3):354--382, 1980. Google ScholarDigital Library
- V. Lanin and D. Shasha. A symmetric concurrent B-tree algorithm. In Proc. of FJCC'86, pages 380--389, 1986. Google ScholarDigital Library
- M. Lee, Mong-Liand Kitsuregawa, B. C. Ooi, K.-L. Tan, and A. Mondal. Towards self-tuning data placement in parallel database systems. In Proc. of the SIGMOD'00, pages 225--236, 2000. Google ScholarDigital Library
- P. L. Lehman and S. B. Yao. Efficient locking for concurrent operations on B-trees. ACM Trans. Database Syst., 6(4):650--670, 1981. Google ScholarDigital Library
- W. Litwin, M.-A. Neimat, and D. A. Schneider. LH* --- linear hashing for distributed files. In Proc. of the SIGMOD '93, pages 327--336, 1993. Google ScholarDigital Library
- D. Lomet and B. Salzberg. Concurrency and recovery for index trees. VLDB Journal, 6(3):224--240, 1997. Google ScholarDigital Library
- J. MacCormick, N. Murphy, M. Najork, C. A. Thekkath, and L. Zhou. Boxwood: Abstractions as the foundation for storage infrastructure. In Proc. of OSDI 2004, pages 105--120, 2004. Google ScholarDigital Library
- J. Menon, D. A. Pease, R. Rees, L. D. ich, and B. Hillsberg. IBM storage tank --- a heterogeneous scalable SAN file system. IBM Syst. J., 42(2):250--267, 2003. Google ScholarDigital Library
- J. Miyazaki, Y. Abe, and H. Yokota. Availabilities and costs of reliable Fat-Btrees. In Proc. of PRDC2004, pages 163--172, 2004. Google ScholarDigital Library
- J. Miyazaki and H. Yokota. Concurrency control and performance evaluation of parallel B-tree structures. IEICE Transactions on Information and Systems, E85-D(8):1269--1283, 2002.Google Scholar
- C. Mohan and F. Levine. ARIES/IM: An efficient and high concurrency index management method using write-ahead logging. In Proc. of the SIGMOD'92, pages 371--381, 1992. Google ScholarDigital Library
- X. Ouyang, T. Yoshihara, and H. Yokota. An efficient commit protocol exploiting primary-backup placement in a distributed storage system. In Proc. of PRDC2006, pages 238--247, 2006. Google ScholarDigital Library
- B. Phillips. Have storage area networks come of age? IEEE Computer, 31(7):10--12, 1998. Google ScholarDigital Library
- N. Ponnekanti and H. Kodavalla. Online index rebuild. In Proc. of the SIGMOD'00, pages 529--538, 2000. Google ScholarDigital Library
- SAS. SAS scalable performance data server. http://www.sas.com/technologies/dw/storage/spds/index.html.Google Scholar
- B. Seeger and P.-Å. Larson. Multi-disk B-trees. In Proc. of the SIGMOD '91, pages 436--445, 1991. Google ScholarDigital Library
- V. Srinivasan and M. J. Carey. Performance of B-tree concurrency control algorithms. In Proc. of the SIGMOD '91, pages 416--425, 1991. Google ScholarDigital Library
- H. Yokota. Autonomous disks for advanced database applications. In Proc. of DANTE'99, pages 435--442, 1999. Google ScholarDigital Library
- H. Yokota, Y. Kanemasa, and J. Miyazaki. Fat-Btree: An update-conscious parallel directory structure. In Proc. of ICDE'99, pages 448--457, 1999. Google ScholarDigital Library
- T. Yoshihara, D. Kobayashi, and H. Yokota. MARK-OPT: A concurrency control protocol for parallel B-tree structures to reduce the cost of SMOs. IEICE Transactions on Information and Systems, E90-D(8):1213--1224, 2007. Google ScholarDigital Library
Index Terms
- A concurrency control protocol for parallel B-tree structures without latch-coupling for explosively growing digital content
Recommendations
A Concurrency Control Method for Parallel Btree Structures
ICDEW '06: Proceedings of the 22nd International Conference on Data Engineering WorkshopsA new concurrency control protocol for parallel Btree structures, MARK-OPT, is proposed. MARK-OPT marks the lowest structure-modification-operation (SMO) occurrence point during optimistic latch coupling operations, to reduce the cost of SMO compared to ...
T-Tree or B-Tree: Main Memory Database Index Structure Revisited
ADC '00: Proceedings of the Australasian Database ConferenceWhile the B-tree (or the B+-tree) is the most popular index structure in disk-based relational database systems, the T-tree has been widely accepted as a promising index structure for main memory databases where the entire database (or most of them) ...
B-tree concurrency control and recovery in page-server database systems
We develop new algorithms for the management of transactions in a page-shipping client-server database system in which the physical database is organized as a sparse B-tree index. Our starvation-free fine-grained locking protocol combines adaptive ...
Comments