skip to main content
10.1145/3127479.3127483acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Search lookaside buffer: efficient caching for index data structures

Published:24 September 2017Publication History

ABSTRACT

With the ever increasing DRAM capacity in commodity computers, applications tend to store large amount of data in main memory for fast access. Accordingly, efficient traversal of index structures to locate requested data becomes crucial to their performance. The index data structures grow so large that only a fraction of them can be cached in the CPU cache. The CPU cache can leverage access locality to keep the most frequently used part of an index in it for fast access. However, the traversal on the index to a target data during a search for a data item can result in significant false temporal and spatial localities, which make CPU cache space substantially underutilized. In this paper we show that even for highly skewed accesses the index traversal incurs excessive cache misses leading to suboptimal data access performance. To address the issue, we introduce Search Lookaside Buffer (SLB) to selectively cache only the search results, instead of the index itself. SLB can be easily integrated with any index data structure to increase utilization of the limited CPU cache resource and improve throughput of search requests on a large data set. We integrate SLB with various index data structures and applications. Experiments show that SLB can improve throughput of the index data structures by up to an order of magnitude. Experiments with real-world key-value traces also show up to 73% throughput improvement on a hash table.

References

  1. Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau. 2015. Operating Systems: Three Easy Pieces (0.91 ed.). Arpaci-Dusseau Books.Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. 2012. Workload Analysis of a Large-scale Key-value Store. In Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '12). ACM, New York, NY, USA, 53--64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B+-tree 2017. B+-tree. https://en.wikipedia.org/wiki/B%2B_tree. (2017).Google ScholarGoogle Scholar
  4. Masanori Bando, Yi-Li Lin, and H. Jonathan Chao. 2012. FlashTrie: Beyond 100-Gb/s IP Route Lookup Using Hash-based Prefix-compressed Trie. IEEE/ACM Trans. Netw. 20, 4 (Aug. 2012), 1262--1275. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Arkaprava Basu, Jayneel Gandhi, Jichuan Chang, Mark D. Hill, and Michael M. Swift. 2013. Efficient Virtual Memory for Big Memory Servers. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA '13). ACM, New York, NY, USA, 237--248. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. R. Bayer and E. McCreight. 1970. Organization and Maintenance of Large Ordered Indices. In Proceedings of the 1970 ACM SIGFIDET (Now SIGMOD) Workshop on Data Description, Access and Control (SIGFIDET '70). ACM, New York, NY, USA, 107--141. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Jeff Bonwick, Matt Ahrens, Val Henson, Mark Maybee, and Mark Shellenbaum. 2003. The zettabyte file system. In Proc. of the 2nd Usenix Conference on File and Storage Technologies.Google ScholarGoogle Scholar
  8. Chee-Yong Chan and Yannis E. Ioannidis. 1998. Bitmap Index Design and Evaluation. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD '98). ACM, New York, NY, USA, 355--366. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Shimin Chen, Anastassia Ailamaki, Phillip B. Gibbons, and Todd C. Mowry. 2007. Improving Hash Join Performance Through Prefetching. ACM Trans. Database Syst. 32, 3, Article 17 (Aug. 2007). Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Eric S. Chung, John D. Davis, and Jaewon Lee. 2013. LINQits: Big Data on Little Clients. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA '13). ACM, New York, NY, USA, 261--272. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Fernando J Corbato. 1968. A paging experiment with the multics system. Technical Report. DTIC Document.Google ScholarGoogle Scholar
  12. Justin DeBrabant, Andrew Pavlo, Stephen Tu, Michael Stonebraker, and Stan Zdonik. 2013. Anti-caching: A New Approach to Database Management System Architecture. Proc. VLDB Endow. 6, 14 (Sept. 2013), 1942--1953. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, and Sylvia Ratnasamy. 2009. RouteBricks: Exploiting Parallelism to Scale Software Routers. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP '09). ACM, New York, NY, USA, 15--28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Ahmed Eldawy, Justin Levandoski, and Per-Åke Larson. 2014. Trekking Through Siberia: Managing Cold Data in a Memory-optimized Database. Proc. VLDB Endow. 7, 11 (July 2014), 931--942. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Florian Funke, Alfons Kemper, and Thomas Neumann. 2012. Compacting Transactional Data in Hybrid OLTP&OLAP Databases. Proc. VLDB Endow. 5, 11 (July 2012), 1424--1435. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Brian Gold, Anastassia Ailamaki, Larry Huston, and Babak Falsafi. 2005. Accelerating Database Operators Using a Network Processor. In Proceedings of the 1st International Workshop on Data Management on New Hardware (DaMoN '05). ACM, New York, NY, USA, Article 1. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Google Sparse Hash 2017. Google Sparse Hash. http://github.com/sparsehash/sparsehash/. (2017).Google ScholarGoogle Scholar
  18. Peter Hawkins, Alex Aiken, Kathleen Fisher, Martin Rinard, and Mooly Sagiv. 2012. Concurrent Data Representation Synthesis. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '12). ACM, New York, NY, USA, 417--428. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Timothy Hayes, Oscar Palomar, Osman Unsal, Adrian Cristal, and Mateo Valero. 2012. Vector Extensions for Decision Support DBMS Acceleration. In Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-45). IEEE Computer Society, Washington, DC, USA, 166--176. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Maurice Herlihy and Nir Shavit. 2011. The art of multiprocessor programming. Morgan Kaufmann.Google ScholarGoogle Scholar
  21. Maurice Herlihy, Nir Shavit, and Moran Tzafrir. 2008. Hopscotch hashing. In International Symposium on Distributed Computing. Springer, 350--364. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. hugepages 2017. HugeTlbPage. https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt. (2017).Google ScholarGoogle Scholar
  23. Infiniband HDR 2017. 200Gb/s HDR InfiniBand Solutions. https://goo.gl/z6Z1xc. (2017).Google ScholarGoogle Scholar
  24. InnoDB B-tree 2017. Physical Structure of an InnoDB Index. https://goo.gl/mHnpFb. (2017).Google ScholarGoogle Scholar
  25. Intel Xeon E5 2017. Intel(R) Xeon(R) Processor E5-2683 v4. https://goo.gl/4Ls9xr. (2017).Google ScholarGoogle Scholar
  26. Song Jiang, Xiaoning Ding, Feng Chen, Enhua Tan, and Xiaodong Zhang. 2005. DULO: An Effective Buffer Cache Management Scheme to Exploit Both Temporal and Spatial Locality. In Proceedings of the 4th Conference on USENIX Conference on File and Storage Technologies - Volume 4 (FAST'05). USENIX Association, Berkeley, CA, USA, 8--8. http://dl.acm.org/citation.cfm?id=1251028.1251036Google ScholarGoogle Scholar
  27. Song Jiang and Xiaodong Zhang. 2004. ULC: A File Block Placement and Replacement Protocol to Effectively Exploit Hierarchical Locality in Multi-Level Buffer Caches. In Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS'04) (ICDCS '04). IEEE Computer Society, Washington, DC, USA, 168--177. http://dl.acm.org/citation.cfm?id=977400.977997 Google ScholarGoogle ScholarCross RefCross Ref
  28. Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alexander Rasin, Stanley Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. 2008. H-store: A High-performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow. 1, 2 (Aug. 2008), 1496--1499. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Changkyu Kim, Tim Kaldewey, Victor W. Lee, Eric Sedlar, Anthony D. Nguyen, Nadathur Satish, Jatin Chhugani, Andrea Di Blas, and Pradeep Dubey. 2009. Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-core CPUs. Proc. VLDB Endow. 2, 2 (Aug. 2009), 1378--1389. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Onur Kocberber, Boris Grot, Javier Picorel, Babak Falsafi, Kevin Lim, and Parthasarathy Ranganathan. 2013. Meet the Walkers: Accelerating Index Traversals for In-memory Databases. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46). ACM, New York, NY, USA, 468--479. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Per-Åke Larson, Cipri Clinciu, Eric N. Hanson, Artem Oks, Susan L. Price, Srikumar Rangarajan, Aleksandras Surna, and Qingqing Zhou. 2011. SQL Server Column Store Indexes. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). ACM, New York, NY, USA, 1177--1184. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Hyeontaek Lim, Dongsu Han, David G. Andersen, and Michael Kaminsky. 2014. MICA: A Holistic Approach to Fast In-memory Key-value Storage. In Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation (NSDI'14). USENIX Association, Berkeley, CA, USA, 429--444. http://dl.acm.org/citation.cfm?id=2616448.2616488Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. lmdb 2017. Symas Lightning Memory-mapped Database. http://www.lmdb.tech/doc/. (2017).Google ScholarGoogle Scholar
  34. Daniel Lustig, Abhishek Bhattacharjee, and Margaret Martonosi. 2013. TLB Improvements for Chip Multiprocessors: Inter-Core Cooperative Prefetchers and Shared Last-Level TLBs. ACM Trans. Archit. Code Optim. 10, 1, Article 2 (April 2013), 38 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Stefan Manegold, Peter Boncz, and Martin Kersten. 2002. Optimizing Main-Memory Join on Modern Hardware. IEEE Trans. on Knowl. and Data Eng. 14, 4 (July 2002), 709--730. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Yandong Mao, Eddie Kohler, and Robert Tappan Morris. 2012. Cache Craftiness for Fast Multicore Key-value Storage. In Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys '12). ACM, New York, NY, USA, 183--196. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Rich Martin. 1996. A Vectorized Hash-Join. In iRAM technical report. University of California at Berkeley.Google ScholarGoogle Scholar
  38. Memcached 2017. Memcached - a distributed memory object caching system. https://memcached.org/. (2017).Google ScholarGoogle Scholar
  39. MemSQL 2017. MemSQL. http://www.memsql.com/. (2017).Google ScholarGoogle Scholar
  40. MICA source code 2017. MICA. https://github.com/efficient/mica/. (2017).Google ScholarGoogle Scholar
  41. MongoDB 2017. MongoDB for GIANT Ideas. https://mongodb.com/. (2017).Google ScholarGoogle Scholar
  42. Rasmus Pagh and Flemming Friche Rodler. 2001. Cuckoo hashing. In European Symposium on Algorithms. Springer, 121--133. Google ScholarGoogle ScholarCross RefCross Ref
  43. Redis 2017. Redis. http://redis.io/. (2017).Google ScholarGoogle Scholar
  44. Ohad Rodeh, Josef Bacik, and Chris Mason. 2013. BTRFS: The Linux B-Tree Filesystem. Trans. Storage 9, 3, Article 9 (Aug. 2013), 32 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. SQLite 2017. In-Memory Databases - SQLite. https://sqlite.org/inmemorydb.html. (2017).Google ScholarGoogle Scholar
  46. SQLite B-tree 2017. Architecture of SQLite. https://goo.gl/5RaSol. (2017).Google ScholarGoogle Scholar
  47. Radu Stoica and Anastasia Ailamaki. 2013. Enabling Efficient OS Paging for Main-memory OLTP Databases. In Proceedings of the Ninth International Workshop on Data Management on New Hardware (DaMoN '13). ACM, New York, NY, USA, Article 7, 7 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Radu Stoica, Justin J. Levandoski, and Per-Åke Larson. 2013. Identifying Hot and Cold Data in Main-memory Databases. In Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013) (ICDE '13). IEEE Computer Society, Washington, DC, USA, 26--37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Y. Sun, Y. Hua, D. Feng, L. Yang, P. Zuo, and S. Cao. 2015. MinCounter: An efficient cuckoo hashing scheme for cloud storage systems. In 2015 31st Symposium on Mass Storage Systems and Technologies (MSST). 1--7. Google ScholarGoogle ScholarCross RefCross Ref
  50. Translation Lookaside buffer 2017. Translation lookaside buffer. https://goo.gl/yDd2i8. (2017).Google ScholarGoogle Scholar
  51. Yandong Wang, Li Zhang, Jian Tan, Min Li, Yuqing Gao, Xavier Guerin, Xiaoqiao Meng, and Shicong Meng. 2015. HydraDB: A Resilient RDMA-driven Key-value Middleware for In-memory Cluster Computing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '15). ACM, New York, NY, USA, Article 22, 11 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Xingbo Wu, Li Zhang, Yandong Wang, Yufei Ren, Michel Hack, and Song Jiang. 2016. zExpander: A Key-value Cache with Both High Performance and Fewer Misses. In Proceedings of the Eleventh European Conference on Computer Systems (EuroSys '16). ACM, New York, NY, USA, Article 14, 15 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. xxHash 2017. xxHash. http://github.com/Cyan4973/xxHash/. (2017).Google ScholarGoogle Scholar
  54. Yuan Yuan, Rubao Lee, and Xiaodong Zhang. 2013. The Yin and Yang of Processing Data Warehousing Queries on GPU Devices. Proc. VLDB Endow. 6, 10 (Aug. 2013), 817--828. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Huanchen Zhang, David G. Andersen, Andrew Pavlo, Michael Kaminsky, Lin Ma, and Rui Shen. 2016. Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, 1567--1581. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Lee, and Xiaodong Zhang. 2015. Mega-KV: A Case for GPUs to Maximize the Throughput of In-memory Key-value Stores. Proc. VLDB Endow. 8, 11 (July 2015), 1226--1237. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Dong Zhou, Bin Fan, Hyeontaek Lim, Michael Kaminsky, and David G. Andersen. 2013. Scalable, High Performance Ethernet Forwarding with CuckooSwitch. In Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies (CoNEXT '13). ACM, New York, NY, USA, 97--108. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Search lookaside buffer: efficient caching for index data structures

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          SoCC '17: Proceedings of the 2017 Symposium on Cloud Computing
          September 2017
          672 pages
          ISBN:9781450350280
          DOI:10.1145/3127479

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 24 September 2017

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate169of722submissions,23%

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader