Abstract
Graphs that model social networks, numerical simulations, and the structure of the Internet are enormous and cannot be manually inspected. A popular metric used to analyze these networks is Betweenness Centrality (BC), which has applications in community detection, power grid contingency analysis, and the study of the human brain. However, these analyses come with a high computational cost that prevents the examination of large graphs of interest.
Recently, the use of Graphics Processing Units (GPUs) has been promising for efficient processing of unstructured data sets. Prior GPU implementations of BC suffer from large local data structures and inefficient graph traversals that limit scalability and performance. Here we present a hybrid GPU implementation that provides good performance on graphs of arbitrary structure rather than just scale-free graphs as was done previously. Our methods achieve up to 13× speedup on high-diameter graphs and an average of 2.71× speedup overall compared to the best existing GPU algorithm. We also observe near linear speedup when running BC on 192 GPUs.
- Bader, D.A., Meyerhenke, H., Sanders, P., Wagner, D. (eds) Graph Partitioning and Graph Clustering--10th DIMACS Implementation Challenge, volume 588 of Contemporary Mathematics (2013).Google Scholar
- Barabási, A.-L., Albert, R. Emergence of scaling in random networks. Science 286, 5439 (1999), 509--512.Google ScholarCross Ref
- Brandes, U. A faster algorithm for betweenness centrality. J. Math. Sociol. 25 (2001), 163--177.Google ScholarCross Ref
- Brandes, U. On variants of shortest-path betweenness centrality and their generic computation. Social Networks 30, 2 (2008), 136--145.Google ScholarCross Ref
- Bullmore, E., Sporns, O. Complex brain networks: Graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 3 (2009), 186--198.Google Scholar
- Davidson, A.A., Baxter, S., Garland, M., Owens, J.D. Work-efficient parallel GPU methods for single-source shortest paths. In International Parallel and Distributed Processing Symposium 28 (2014). Google ScholarDigital Library
- Davis, T.A., Hu, Y. The university of Florida sparse matrix collection. ACM Trans. Math. Softw. 38, 1 (Dec. 2011), 1:1--1:25. Google ScholarDigital Library
- Freeman, L.C. A set of measures of centrality based on betweenness. Sociometry (1977), 35--41.Google Scholar
- Hoberock, J., Bell, N. Thrust: A Parallel Template Library. Online at: http://thrust.googlecode.com, 42:43 (2010).Google Scholar
- Jia, Y., Lu, V., Hoberock, J., Garland, M., Hart, J.C. Edge v. node parallelism for graph centrality metrics. GPU Computing Gems 2 (2011), 15--30.Google Scholar
- Jin, S., Huang, Z., Chen, Y., Chavarria-Miranda, D., Feo, J., Wong, P.C. A novel application of parallel betweenness centrality to power grid contingency analysis. In IEEE International Symposium on Parallel Distributed Processing (IPDPS) (2010), 1--7.Google ScholarCross Ref
- Leskovec, J., Krevl, A. SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, June 2014.Google Scholar
- Liljeros, F., Edling, C.R., Amaral, L.A., Stanley, H.E., Aberg, Y. The web of human sexual contacts. Nature 411, 6840 (2001), 907--908.Google ScholarCross Ref
- Madduri, K., Ediger, D., Jiang, K., Bader, D., Chavarria-Miranda, D. A faster parallel algorithm and efficient multithreaded implementations for evaluating betweenness centrality on massive datasets. In IEEE International Symposium on Parallel & Distributed Processing (IPDPS) (May 2009), 1--8. Google ScholarDigital Library
- McLaughlin, A., Bader, D. Revisiting edge and node parallelism for dynamic GPU graph analytics. In 2014 IEEE 28th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW) (2014). Google ScholarDigital Library
- McLaughlin, A., Bader, D.A. Scalable and high performance betweenness centrality on the GPU. In Proceedings of the 26th ACM/IEEE International Conference on High Performance Computing, Networking, Storage, and Analysis (SC) (2014). Google ScholarDigital Library
- Mendez-Lojo, M., Burtscher, M., Pingali, K. A GPU implementation of inclusion-based points-to analysis. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12 (New York, NY, USA, 2012). ACM, 107--116. Google ScholarDigital Library
- Merrill, D. CUDA Unbound. 2013. https://nvlabs.github.io/cub/ (accessed April 11, 2014).Google Scholar
- Merrill, D., Garland, M., Grimshaw, A. Scalable GPU graph traversal. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '12 (New York, NY, USA, 2012). ACM, 117--128. Google ScholarDigital Library
- Porta, S., Latora, V., Wang, F., Strano, E., Cardillo, A., Scellato, S., Iacoviello, V., Messora, R. Street centrality and densities of retail and services in Bologna, Italy. Environment and Planning B: Planning and design 36, 3 (2009), 450--465.Google Scholar
- Sariyüce, A.E., Saule, E., Kaya, K., çatalyürek, Ü. Regularizing graph centrality computations. J, Parallel Distrib. Comput. (JPDC) (2014).Google Scholar
- Shi, Z., Zhang, B. Fast network centrality analysis using GPUs. BMC Bioinformatics 12, 1 (2011), 149.Google ScholarCross Ref
- Soman, J., Narang, A. Fast community detection algorithm with GPUs and multicore architectures. In International Parallel & Distributed Processing Symposium (IPDPS) (IEEE, 2011), 568--579. Google ScholarDigital Library
- Vetter, J.S., Glassbrook, R., Dongarra, J., Schwan, K., Loftis, B., McNally, S., Meredith, J., Rogers, J., Roth, P., Spafford, K., Yalamanchili, S. Keeneland: Bringing heterogeneous GPU computing to the computational science community. Comput. Sci. & Engineering 13, 5 (2011), 90--95. Google ScholarDigital Library
- Watts, D.J., Strogatz, S.H. Collective dynamics of 'Small-World' networks. Nature 393, 6684 (1998), 440--442.Google ScholarCross Ref
Index Terms
- Accelerating GPU betweenness centrality
Recommendations
Accelerating PQMRCGSTAB algorithm on GPU
UCHPC-MAW '09: Proceedings of the combined workshops on UnConventional high performance computing workshop plus memory access workshopThe general computations on GPU are becoming more and more popular because of GPU's powerful computing ability. In this paper, how to use GPU to accelerate sparse linear system solver, preconditioned QMRCGSTAB (PQMRCGSTAB for short), is our concern. We ...
Betweenness Centrality in an HSA-enabled System
HPGP '16: Proceedings of the ACM Workshop on High Performance Graph ProcessingThis paper studies different approaches to implementing betweenness centrality in a heterogeneous system. Betweenness centrality is an important algorithm in graph processing. It presents multiple levels of parallelism when processing a graph, and is an ...
Accelerating financial applications on the GPU
GPGPU-6: Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing UnitsThe QuantLib library is a popular library used for many areas of computational finance. In this work, the parallel processing power of the GPU is used to accelerate QuantLib financial applications. Black-Scholes, Monte-Carlo, Bonds, and Repo code paths ...
Comments