skip to main content
research-article

Faster set intersection with SIMD instructions by reducing branch mispredictions

Published:01 November 2014Publication History
Skip Abstract Section

Abstract

Set intersection is one of the most important operations for many applications such as Web search engines or database management systems. This paper describes our new algorithm to efficiently find set intersections with sorted arrays on modern processors with SIMD instructions and high branch misprediction penalties. Our algorithm efficiently exploits SIMD instructions and can drastically reduce branch mispredictions. Our algorithm extends a merge-based algorithm by reading multiple elements, instead of just one element, from each of two input arrays and compares all of the pairs of elements from the two arrays to find the elements with the same values. The key insight for our improvement is that we can reduce the number of costly hard-to-predict conditional branches by advancing a pointer by more than one element at a time. Although this algorithm increases the total number of comparisons, we can execute these comparisons more efficiently using the SIMD instructions and gain the benefits of the reduced branch misprediction overhead. Our algorithm is suitable to replace existing standard library functions, such as std::set_intersection in C++, thus accelerating many applications, because the algorithm is simple and requires no preprocessing to generate additional data structures. We implemented our algorithm on Xeon and POWER7+. The experimental results show our algorithm outperforms the std::set_intersection implementation delivered with gcc by up to 5.2x using SIMD instructions and by up to 2.1x even without using SIMD instructions for 32-bit and 64-bit integer datasets. Our SIMD algorithm also outperformed an existing algorithm that can leverage SIMD instructions.

References

  1. L. A. Barroso, J. Dean, and U. Hölzle. Web Search for a Planet: The Google Cluster Architecture. IEEE Micro 23(2), pp. 22--28. 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. B. Ding and A. C. König. Fast set intersection in memory. In Proceedings of VLDB Endow., 4, pp. 255--266, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. Baeza-Yates and A. Salinger. Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences. In Proceedings of the International Conference on String Processing and Information Retrieval, pp. 13--24, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Baeza-Yates. A Fast Set Intersection Algorithm for Sorted Sequences. In Proceedings of the Annual Symposium on Combinatorial Pattern Matching, pp. 400--408, 2004.Google ScholarGoogle Scholar
  5. E. D. Demaine, A. López-Ortiz, and J. I. Munro. Adaptive set intersections, unions, and differences. In Proceedings of the Annual Symposium on Discrete Algorithms, pp. 743--752, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. P. Bille, A Pagh, and R. Pagh. Fast evaluation of union-intersection expressions. In Proceedings of the International Conference on Algorithms and Computation, pp. 739--750, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. P. Sanders, F. Transier. Intersection in Integer Inverted Indices. In Proceedings of the Workshop on Algorithm Engineering and Experiments, pp. 71--83, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  8. B. Schlegel, T. Willhalm, and W. Lehner. Fast Sorted-Set Intersection using SIMD Instructions. In Proceedings of the Second International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures, 2011.Google ScholarGoogle Scholar
  9. D. Lemire, L. Boytsov and N. Kurz. SIMD Compression and the Intersection of Sorted Integers. arXiv: 1401.6399, 2014Google ScholarGoogle Scholar
  10. J. L. Bentley and A. C. Yao. An almost optimal algorithm for unbounded searching. Information processing letters, 5(3), pp. 82--87, 1976.Google ScholarGoogle Scholar
  11. J. R. Allen, K. Kennedy, C. Porterfield, and J. Warren. Conversion of control dependence to data dependence. In Proceedings of the Symposium on Principles of Programming Languages, 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Zhou and K. A. Ross. Implementing database operations using SIMD instructions. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 145--156, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. P. Sanders and S. Winkel. Super Scalar Sample Sort. In Proceedings of the European Symposium on Algorithms, Volume 3221 of LNCS, pp. 784--796, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  14. H. Inoue, T. Moriyama, H. Komatsu, and T. Nakatani. AA-Sort: A New Parallel Sorting Algorithm for Multi-Core SIMD Processors, In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, pp. 189--198, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Chhugani, A. D. Nguyen, V. W. Lee, W. Macy, M. Hagog, Y. Chen, A. Baransi, S. Kumar, and P. Dubey. Efficient implementation of sorting on multi-core SIMD CPU architecture. In Proceedings. VLDB Endow., 1(2), pp. 1313--1324, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. D. Tsirogiannis, S. Guha, and N. Koudas. Improving the performance of list intersection. In Proceedings of VLDB Endow., 2(1), pp. 838--849, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Tatikonda, F. Junqueira, B. B. Cambazoglu, and V. Plachouras. On efficient posting list intersection with multicore processors. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. N. Ao, F. Zhang, D. Wu, D. S. Stones, G. Wang, X. Liu, J. Liu, and S. Lin. Efficient parallel lists intersection and index compression algorithms using graphics processing units. In Proceedings of VLDB Endow., 4(8), pp. 470--481, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. D. R. Musser. Introspective Sorting and Selection Algorithms. Software Practice and Experience, 27(8), pp. 983--993, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 8, Issue 3
    November 2014
    144 pages

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 November 2014
    Published in pvldb Volume 8, Issue 3

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader