ARC 2014: A Multidimensional FPGA-Based Parallel DBSCAN Architecture

Published: 4 November 2015
Abstract

Clustering large numbers of data points is a computationally demanding task that often needs to be accelerated to be useful in practical applications. This work focuses on the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, one of the state-of-the-art clustering algorithms, and targets its acceleration using an FPGA device. The article presents an optimized, scalable, and parameterizable architecture that takes advantage of the internal memory structure of modern FPGAs to deliver a high-performance clustering system. Post-synthesis simulation results show that the developed system can obtain mean speedups of 31× in real-world tests and 202× in synthetic tests when compared to state-of-the-art software counterparts running on a quad-core 3.4 GHz Intel i7-2600K. The implementation is also capable of clustering data with any number of dimensions without impacting performance.

    • Published in

      ACM Transactions on Reconfigurable Technology and Systems  Volume 9, Issue 1
      Special Section on the 2014 International Symposium on Applied Reconfigurable Computing
      November 2015
      121 pages
      ISSN:1936-7406
      EISSN:1936-7414
      DOI:10.1145/2839314
      • Editor: Steve Wilton

      Copyright © 2015 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 November 2015
      • Accepted: 1 January 2015
      • Revised: 1 November 2014
      • Received: 1 June 2014

      Qualifiers

      • research-article
      • Research
      • Refereed
