Graph Summarization Methods and Applications: A Survey

Authors:
Yike Liu

University of Michigan, Ann Arbor

University of Michigan, Ann Arbor

0000-0003-0117-1656
View Profile

,
Tara Safavi

University of Michigan, Ann Arbor, MI

University of Michigan, Ann Arbor, MI
View Profile

,
Abhilash Dighe

University of Michigan, Ann Arbor, MI

University of Michigan, Ann Arbor, MI
View Profile

,
Danai Koutra

University of Michigan, Ann Arbor, MI

University of Michigan, Ann Arbor, MI
View Profile

Authors Info & Claims

ACM Computing Surveys Volume 51 Issue 3Article No.: 62pp 1–34https://doi.org/10.1145/3186727

Published:22 June 2018Publication History

ACM Computing Surveys

Abstract

While advances in computing resources have made processing enormous amounts of data possible, human ability to identify patterns in such data has not scaled accordingly. Efficient computational methods for condensing and simplifying data are thus becoming vital for extracting actionable insights. In particular, while data summarization techniques have been studied extensively, only recently has summarizing interconnected data, or graphs, become popular. This survey is a structured, comprehensive overview of the state-of-the-art methods for summarizing graph data. We first broach the motivation behind and the challenges of graph summarization. We then categorize summarization approaches by the type of graphs taken as input and further organize each category by core methodology. Finally, we discuss applications of summarization on real-world graphs and conclude by describing some open problems in the field.

References

Bijaya Adhikari, Yao Zhang, Aditya Bharadwaj, and B. Aditya Prakash. 2017. Condensing temporal networks using propagation. In Proceedings of the 2017 SIAM International Conference on Data Mining. 417--425Google Scholar
Charu Aggarwal and Karthik Subbian. 2014. Evolutionary network analysis: A survey. ACM Comput. Surv. 47, 1 (2014), 10:1--10:36. Google ScholarDigital Library
Charu C. Aggarwal. 2015. Data Mining: The Textbook. Springer. Google ScholarDigital Library
Charu C. Aggarwal and S. Yu Philip. 2005. Online analysis of community evolution in data streams. In Proceedings of the SIAM international Conference on Data Mining (SDM’05).Google Scholar
Charu C. Aggarwal and Haixun Wang. 2010. A survey of clustering algorithms for graph data. In Managing and Mining Graph Data. Springer, 275--301.Google ScholarDigital Library
Amr Ahmed, Nino Shervashidze, Shravan Narayanamurthy, Vanja Josifovski, and Alexander J. Smola. 2013. Distributed large-scale natural graph factorization. In Proceedings of the 22nd International Conference on World Wide Web (WWW’13). 37--48. Google ScholarDigital Library
Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. 2012. Graph sketches: Sparsification, spanners, and subgraphs. In Proceedings of the 31st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. 5--14. Google ScholarDigital Library
Sebastian E. Ahnert. 2013. Power graph compression reveals dominant relationships in genetic transcription networks. Molec. BioSyst. 9, 11 (2013), 2681--2685.Google ScholarCross Ref
Alfred V. Aho, M. R. Garey, and Jeffrey D. Ullman. 1972. The transitive reduction of a directed graph. Siam J. Comput. 1, 2 (1972), 131--137.Google ScholarDigital Library
Leman Akoglu, Duen Horng Chau, U. Kang, Danai Koutra, and Christos Faloutsos. 2012. OPAvion: Mining and visualization in large graphs. In Proceedings of the 2012 SIGMOD Conference. ACM, 717--720. Google ScholarDigital Library
Leman Akoglu, Mary McGlohon, and Christos Faloutsos. 2010. OddBall: Spotting anomalies in weighted graphs. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’10). Google ScholarDigital Library
Leman Akoglu, Hanghang Tong, Jilles Vreeken, and Christos Faloutsos. 2012. Fast and reliable anomaly detection in categorical data. In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM’12). ACM. Google ScholarDigital Library
Charles J. Alpert, Andrew B. Kahng, and So-Zen Yao. 1999. Spectral partitioning with multiple eigenvectors. Discr. Appl. Math. 90, 1 (1999), 3--26. Google ScholarDigital Library
Alexandr Andoni and Piotr Indyk. 2008. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51, (2008), 117--122. Google ScholarDigital Library
Alberto Apostolico and Guido Drovandi. 2009. Graph compression by BFS. Algorithms 2, 3 (2009), 1031--1044.Google ScholarCross Ref
Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos E. Papalexakis, and Danai Koutra. 2014. Com2: Fast automatic discovery of temporal (“comet”) communities. In Advances in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science, Vol. 8444. Springer, 271--283.Google Scholar
Lars Backstrom, Dan Huttenlocher, Jon Kleinberg, and Xiangyang Lan. 2006. Group formation in large social networks: Membership, growth, and evolution. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’06). ACM, 44--54. Google ScholarDigital Library
Mathieu Bastian, Sebastien Heymann, and Mathieu Jacomy. 2009. Gephi: An open source software for exploring and manipulating networks. In Proceedings of the International AAAI Conference on Weblogs and Social Media.Google Scholar
Joshua D. Batson, Daniel A. Spielman, Nikhil Srivastava, and Shang-Hua Teng. 2013. Spectral sparsification of graphs: Theory and algorithms. Commun. ACM 56, 8 (2013), 87--94. Google ScholarDigital Library
Enrico Bertini and Giuseppe Santucci. 2004. By chance is not enough: Preserving relative density through non uniform sampling. In Proceedings of the Information Visualisation Conference. Google ScholarDigital Library
Paolo Boldi and Sebastiano Vigna. 2004. The webgraph framework I: Compression techniques. In Proceedings of the International World Wide Web Conference. 595--602. Google ScholarDigital Library
Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. 2011. D3 data-driven documents. IEEE Trans. Vis. Comput. Graph. 17, 12 (2011), 2301--2309. Google ScholarDigital Library
Ivan Brugere, Brian Gallagher, and Tanya Y. Berger-Wolf. 2016. Network structure inference, a survey: Motivations, methods, and applications. ACM Comput. Surv. 51, 2, Article 24. http://arxiv.org/abs/1610.00782. Google ScholarDigital Library
Gregory Buehrer and Kumar Chellapilla. 2008. A scalable pattern mining approach to web graph compression with communities. In Proceedings of the 2008 International Conference on Web Search and Data Mining. ACM, 95--106. Google ScholarDigital Library
Gemma Casas-Garriga. 2005. Summarizing sequential data with closed partial orders. In Proceedings of the SIAM International Conference on Data Mining (SDM’05). 380--391.Google ScholarCross Ref
Šelja Čebirić, François Goasdoué, and Ioana Manolescu. 2015. Query-oriented summarization of RDF graphs. Proc. VLDB Endow. 8, 12 (2015), 2012--2015. Google ScholarDigital Library
Deepayan Chakrabarti, Spiros Papadimitriou, Dharmendra S. Modha, and Christos Faloutsos. 2004. Fully automatic cross-associations. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’04). 79--88. Google ScholarDigital Library
Varun Chandola and Vipin Kumar. 2005. Summarization -- Compressing data into an informative representation. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’05). 98--105. Google ScholarDigital Library
Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos. 2011. Apolo: Making sense of large network data by combining rich user interaction and machine learning. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11).Google ScholarDigital Library
Chen Chen, Cindy X. Lin, Matt Fredrikson, Mihai Christodorescu, Xifeng Yan, and Jiawei Han. 2009. Mining graph patterns efficiently via randomized summaries. Proc. VLDB Endow. 2, 1 (2009), 742--753. Google ScholarDigital Library
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, and Prabhakar Raghavan. 2009. On compressing social networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’09). 219--228. Google ScholarDigital Library
Yongwook Choi and Wojciech Szpankowski. 2012. Compression of graphical structures: Fundamental limits, algorithms, and experiments. IEEE Trans. Inf. Theory 58, 2 (2012), 620--638. Google ScholarDigital Library
Rudi Cilibrasi and Paul Vitányi. 2005. Clustering by compression. IEEE Trans. Inf. Theory 51, 4 (2005), 1523--1545. Google ScholarDigital Library
Diane J. Cook and Lawrence B. Holder. 1994. Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. 1 (1994), 231--255. Google ScholarCross Ref
Graham Cormode and S. Muthukrishnan. 2005a. An improved data stream summary: The count-min sketch and its applications. J. Algor. 55, 1 (2005), 58--75. Google ScholarDigital Library
Graham Cormode and S. Muthukrishnan. 2005b. Summarizing and mining skewed data streams. In Proceedings of the SIAM international Conference on Data Mining (SDM’05).Google Scholar
Corinna Cortes, Daryl Pregibon, and Chris Volinsky. 2001. Communities of interest. In Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis. 105--114. Google ScholarDigital Library
Uros Damnjanovic, Virginia Fernandez Arguedas, Ebroul Izquierdo, and José M. Martínez. 2008. Event detection and clustering for surveillance video summarization. In Proceedingsof the 9th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS’08). IEEE, 63--66. Google ScholarDigital Library
Pedro O. S. Vaz de Melo, Leman Akoglu, Christos Faloutsos, and Antonio Alfredo Ferreira Loureiro. 2010. Surprising patterns for the call duration distribution of mobile phone users. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD’10). 354--369. Google ScholarDigital Library
Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI’04). 10. Google ScholarDigital Library
Pravallika Devineni, Danai Koutra, Michalis Faloutsos, and Christos Faloutsos. 2015. If walls could talk: Patterns and anomalies in Facebook wallposts. In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’15). 367--374. Google ScholarDigital Library
Inderjit Dhillon, Yuqiang Guan, and Brian Kulis. 2005. A fast kernel-based multilevel algorithm for graph clustering. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’05). ACM, 629--634. Google ScholarDigital Library
Laxman Dhulipala, Igor Kabiljo, Brian Karrer, Giuseppe Ottaviano, Sergey Pupyrev, and Alon Shalita. 2016. Compressing graphs and indexes with recursive graph bisection. arXiv:1602.08820. Google ScholarDigital Library
P. Drineas, R. Kannan, and M. W. Mahoney. 2006. Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition. SIAM J. Comput. 36, 1 (2006), 184--206. Google ScholarDigital Library
Cody Dunne and Ben Shneiderman. 2013. Motif simplification: Improving network visualization readability with fan, connector, and clique glyphs. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI’13). ACM, 3247--3256. Google ScholarDigital Library
P. Elias. 2006. Universal codeword sets and representations of the integers. IEEE Trans. Inf. Theor. 21, 2 (2006), 194--203. Google ScholarDigital Library
Wenfei Fan, Jianzhong Li, Xin Wang, and Yinghui Wu. 2012. Query preserving graph compression. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. ACM, 157--168. Google ScholarDigital Library
Wenfei Fan, Xin Wang, and Yinghui Wu. 2013. Diversified top-k graph pattern matching. Proc. VLDB Endow. 6, 13 (2013), 1510--1521. Google ScholarDigital Library
Jing Feng, Xiao He, Nina Hubig, Christian Böhm, and Claudia Plant. 2013. Compression-based graph mining exploiting structure primitives. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’13). IEEE, 181--190.Google ScholarCross Ref
Jure Ferlez, Christos Faloutsos, Jure Leskovec, Dunja Mladenic, and Marko Grobelnik. 2008. Monitoring network evolution using MDL. In Proceedings of the 24th International Conference on Data Engineering (ICDE’08). 1328--1330. Google ScholarDigital Library
Wenjie Fu, Le Song, and Eric P. Xing. 2009. Dynamic mixed membership blockmodel for evolving networks. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML’09). ACM, 329--336. Google ScholarDigital Library
Johannes Gehrke, Edward Lui, and Rafael Pass. 2003. Towards privacy for social networks: A zero-knowledge based definition of privacy. In Proceedings of the 8th Conference on Theory of Cryptography. 432--449. Google ScholarDigital Library
Mina Ghashami, Edo Liberty, and Jeff M. Phillips. 2016. Efficient frequent directions algorithm for sparse matrices. arXiv:1602.00412.Google Scholar
Sean Gilpin, Tina Eliassi-Rad, and Ian Davidson. 2013. Guided learning for role discovery (gLRD): Framework, algorithms, and applications. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’13). ACM, 113--121. Google ScholarDigital Library
Oshini Goonetilleke, Danai Koutra, Timos Sellis, and Kewen Liao. 2017. Edge labeling schemes for graph data. In Proceedings of the Scientific and Statistical Database Management Conference (SSDBM’17). ACM, Article 12, 12 pages. Google ScholarDigital Library
Robert Görke, Pascal Maillard, Christian Staudt, and Dorothea Wagner. 2010. Modularity-driven clustering of dynamic graphs. In Proceedings of the 9th International Symposium on Experimental Algorithms (SEA’10). 436--448. Google ScholarDigital Library
Szymon Grabowski and Wojciech Bieniecki. 2014. Tight and simple web graph compression for forward and reverse neighbor queries. Discr. Appl. Math. 163 (2014), 298--306. Google ScholarDigital Library
Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’16). ACM. Google ScholarDigital Library
Carlos Guestrin, Daphne Koller, Chris Gearhart, and Neal Kanodia. 2003. Generalizing plans to new environments in relational MDPs. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI’03). Google ScholarDigital Library
Mohammad Al Hasan, Nesreen K. Ahmed, and Jennifer Neville. 2013. Network Sampling: Methods and Applications. Retrieved from https://www.cs.purdue.edu/homes/neville/courses/NetworkSampling-KDD13-final.pdf.Google ScholarDigital Library
Nasrin Hassanlou, Maryam Shoaran, and Alex Thomo. 2013. Probabilistic graph summarization. In Web-Age Information Management. Springer, 545--556. Google ScholarDigital Library
Mark Heimann, Haoming Shen, and Danai Koutra. 2018. Node representation learning for multiple networks: The case of graph alignment. arXiv:1802.06257.Google Scholar
Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. RolX: Structural role extraction 8 mining in large graphs. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’12). ACM, 1231--1239. Google ScholarDigital Library
Keith Henderson, Brian Gallagher, Lei Li, Leman Akoglu, Tina Eliassi-Rad, Hanghang Tong, and Christos Faloutsos. 2011. It’s who you know: Graph mining using recursive structural features. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). ACM, 663--671. Google ScholarDigital Library
Wilko Henecka and Matthew Roughan. 2015. Lossy compression of dynamic, weighted graphs. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud’15). 427--434. Google ScholarDigital Library
Shawndra Hill, Deepak Agarwal, Robert Bell, and Chris Volinsky. 2006. Building an effective representation for dynamic networks. J. Comput. Graph. Stat. 15 (2006), 1--25.Google ScholarCross Ref
Christian Hübler, Hans-Peter Kriegel, Karsten Borgwardt, and Zoubin Ghahramani. 2008. Metropolis algorithms for representative subgraph sampling. In Proceedings of the 2008 8th IEEE International Conference on Data Mining (ICDM’08). IEEE, 283--292. Google ScholarDigital Library
Di Jin and Danai Koutra. 2017a. Exploratory analysis of graph data by leveraging domain knowledge. In Proceedings of the 2017 IEEE International Conference on Data Mining. 187--196.Google ScholarCross Ref
Di Jin, Aristotelis Leventidis, Haoming Shen, Ruowang Zhang, Junyue Wu, and Danai Koutra. 2017. PERSEUS-HUB: Interactive and collective exploration of large-scale graphs. Informatics 4, 3 (2017), 22.Google ScholarCross Ref
Lisa Jin and Danai Koutra. 2017b. ECOviz: Comparative visualization of time-evolving network summaries. In Proceedings of the ACM SIGKDD 2017 Workshop on Interactive Data Exploration and Analytics.Google Scholar
U. Kang, Jay-Yoon Lee, Danai Koutra, and Christos Faloutsos. 2014. Net-ray: Visualizing and mining web-scale graphs. In Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’14).Google Scholar
U. Kang, Hanghang Tong, Jimeng Sun, Ching-Yung Lin, and Christos Faloutsos. 2011. Gbase: A scalable and general graph management system. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). ACM, 1091--1099. Google ScholarDigital Library
U. Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. 2009. PEGASUS: A peta-scale graph mining system—implementation and observations. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’09). Google ScholarDigital Library
George Karypis and Vipin Kumar. 1999. Multilevel k-way hypergraph partitioning. In Proceedings of the 36th Annual ACM/IEEE Design Automation Conference. 343--348. Google ScholarDigital Library
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the Conference of the ACM Special Interest Group on Knowledge Discovery and Data Mining. ACM. Google ScholarDigital Library
Kifayat Ullah Khan, Waqas Nawaz, and Young-Koo Lee. 2014. Set-based unified approach for attributed graph summarization. In Proceedings of the IEEE 4th International Conference on Big Data and Cloud Computing (BdCloud’14). IEEE, 378--385. Google ScholarDigital Library
Mikko Kivel, Alex Arenas, Marc Barthelemy, James P. Gleeson, Yamir Moreno, and Mason A. Porter. 2014. Multilayer networks. J. Complex Netw. 2, 3 (2014), 203--271.Google ScholarCross Ref
Arne Koopman and Arno Siebes. 2008. Discovering relational items sets efficiently. In Proceedings of the SIAM International Conference on Data Mining (SDM’08). 108--119.Google ScholarCross Ref
Arne Koopman and Arno Siebes. 2009. Characteristic relational patterns. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’09). 437--446. Google ScholarDigital Library
Danai Koutra, Abhilash Dighe, Smriti Bhagat, Udi Weinsberg, Stratis Ioannidis, Christos Faloutsos, and Jean Bolot. 2017. PNP: Fast path ensemble method for movie design. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’17). Google ScholarDigital Library
Danai Koutra and Christos Faloutsos. 2017. Individual and Collective Graph Mining: Principles, Algorithms, and Applications. Synthesis Lectures on Data Mining and Knowledge Discovery. Morgan 8 Claypool. Google ScholarDigital Library
Danai Koutra, Di Jin, Yuanchi Ning, and Christos Faloutsos. 2015. Perseus: An interactive large-scale graph mining and visualization tool. Proc. VLDB Endow. 8, 12, 1924--1927. Google ScholarDigital Library
Danai Koutra, U. Kang, Jilles Vreeken, and Christos Faloutsos. 2014b. VoG: Summarizing and understanding large graphs. In Proceedings of the SIAM international Conference on Data Mining (SDM’14). 91--99.Google ScholarCross Ref
Danai Koutra, U. Kang, Jilles Vreeken, and Christos Faloutsos. 2014a. VoG: Summarizing and understanding large graphs. In Proceedings of the SIAM international Conference on Data Mining (SDM’14). SIAM.Google ScholarCross Ref
Danai Koutra, Vasileios Koutras, B. Aditya Prakash, and Christos Faloutsos. 2013. Patterns amongst competing task frequencies: Super-linearities, and the almond-DG model. In Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13). 201--212.Google ScholarCross Ref
Danai Koutra, Evangelos E. Papalexakis, and Christos Faloutsos. 2012. TensorSplat: Spotting latent anomalies in time. In Proceedings of the 2012 16th Panhellenic Conference on Informatics (PCI’12). IEEE, 144--149. Google ScholarDigital Library
Danai Koutra, Neil Shah, Joshua Vogelstein, Brian Gallagher, and Christos Faloutsos. 2015. DeltaCon: A principled massive-graph similarity function with attribution. ACM Trans. Knowl. Discov. Data 10, 3, Article 28. Google ScholarDigital Library
Danai Koutra, Joshua Vogelstein, and Christos Faloutsos. 2013. DeltaCon: A principled massive-graph similarity function. In Proceedings of the SIAM International Conference on Data Mining (SDM’13). 162--170.Google ScholarCross Ref
Luigi Laura, Stefano Leonardi, Guido Caldarelli, and Paolo De Los Rios. 2002. A multi-layer model for the web graph. In On-Line Proceedings of the 2nd International Workshop on Web Dynamics.Google Scholar
Matthijs Leeuwen van Leeuwen, Jilles Vreeken, and Arno Siebes. 2006. Compression picks the item sets that matter. In Proceedings of the Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’06). 585--592.Google ScholarCross Ref
Kristen LeFevre and Evimaria Terzi. 2010. GraSS: Graph structure summarization. In Proceedings of the SIAM International Conference on Data Mining (SDM’10). 454--465.Google ScholarCross Ref
Jure Leskovec, Lars Backstrom, Ravi Kumar, and Andrew Tomkins. 2008. Microscopic evolution of social networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’08). 462--470. Google ScholarDigital Library
Jure Leskovec and Christos Faloutsos. 2007. Scalable modeling of real graphs using Kronecker multiplication. In Proceedings of the 24th International Conference on Machine Learning (ICML’07). 497--504. Google ScholarDigital Library
Jure Leskovec, Jon M. Kleinberg, and Christos Faloutsos. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 177--187. Google ScholarDigital Library
Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. 2014. Mining of Massive Datasets. Cambridge University Press. Google ScholarDigital Library
Cheng-Te Li and Shou-De Lin. 2009. Egocentric information abstraction for heterogeneous social networks. In Proceedings of the International Conference on Advances in Social Network Analysis and Mining (ASONAM’09). IEEE, 255--260. Google ScholarDigital Library
Edo Liberty. 2013. Simple and deterministic matrix sketching. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’13). ACM, 581--588. Google ScholarDigital Library
Yongsub Lim, U. Kang, and Christos Faloutsos. 2014. SlashBurn: Graph compression and mining beyond caveman communities. IEEE Trans. Knowl. Data Eng. 26, 12 (2014), 3077--3089.Google ScholarCross Ref
Shou-De Lin, Mi-Yen Yeh, and Cheng-Te Li. 2013. Sampling and summarization for social networks. In Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13).Google Scholar
Xuemin Lin, Qing Liu, Yidong Yuan, and Xiaofang Zhou. 2003. Multiscale histograms: Summarizing topological relations in large spatial datasets. In Proceedings of the International Conference on Very Large Databases (VLDB’03). 814--825. Google ScholarDigital Library
Yu-Ru Lin, Hari Sundaram, and Aisling Kelliher. 2008. Summarization of social activity over time: People, actions and concepts in dynamic networks. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM’08). 1379--1380. Google ScholarDigital Library
Bing Liu, Wynne Hsu, and Yiming Ma. 1999. Pruning and summarizing the discovered associations. In Proceedings of the 5th ACM SIGKDD International Conference Knowledge Discovery and Data Mining (KDD’99). 145--154. Google ScholarDigital Library
Chunyang Liu and Ling Chen. 2016. Summarizing uncertain transaction databases by probabilistic tiles. In Proceedings of the International Joint Conference on Neural Networks (IJCNN’16). IEEE, 4375--4382.Google ScholarCross Ref
Wei Liu, Andrey Kan, Jeffrey Chan, James Bailey, Christopher Leckie, Jian Pei, and Ramamohanarao Kotagiri. 2012. On compressing weighted time-evolving graphs. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM’12). ACM, 2319--2322. Google ScholarDigital Library
Yike Liu, Neil Shah, and Danai Koutra. 2015. An empirical comparison of the summarization power of graph clustering methods. arXiv:1511.06820.Google Scholar
Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed graphlab: A framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5, 8 (2012), 716--727. Google ScholarDigital Library
Antonio Maccioni and Daniel J. Abadi. 2016. Scalable pattern matching over compressed graphs via dedensification. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’16). ACM, 1755--1764. Google ScholarDigital Library
Arun S. Maiya and Tanya Y. Berger-Wolf. 2010. Sampling community structure. In Proceedings of the 25th International Conference Conference on the World Wide Web (WWW’10). ACM, 701--710. Google ScholarDigital Library
M. Mampaey, J. Vreeken, and N. Tatti. 2011. Summarizing Data with Itemsets Using Maximum Entropy Models. Technical Report 2011/02. University of Antwerp.Google Scholar
Sebastian Maneth and Fabian Peternek. 2016. Compressing graphs by grammars. In Proceedings of the 2016 IEEE 32nd International Conference on Data Engineering (ICDE’16). IEEE, 109--120.Google ScholarCross Ref
Koji Maruhashi, Fan Guo, and Christos Faloutsos. 2011. Multiaspectforensics: Pattern mining on large-scale heterogeneous networks with tensor analysis. In Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining. 203--210. Google ScholarDigital Library
Hossein Maserrat and Jian Pei. 2010. Neighbor query friendly compression of social networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’10). Google ScholarDigital Library
Hossein Maserrat and Jian Pei. 2012. Community preserving lossy compression of social networks. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’12). IEEE, 509--518. Google ScholarDigital Library
Michael Mathioudakis, Francesco Bonchi, Carlos Castillo, Aristides Gionis, and Antti Ukkonen. 2011. Sparsification of influence networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). 529--537. Google ScholarDigital Library
Yasir Mehmood, Nicola Barbieri, Francesco Bonchi, and Antti Ukkonen. 2013. Csi: Community-level social influence analysis. In Machine Learning and Knowledge Discovery in Databases. Springer, 48--63.Google Scholar
Pauli Miettinen and Jilles Vreeken. 2011. Model order selection for boolean matrix factorization. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). 51--59. Google ScholarDigital Library
Pauli Miettinen and Jilles Vreeken. 2014. mdl4bmf: Minimum description length for Boolean matrix factorization. ACM Trans. Knowl. Discov. Data 8, 4 (2014), 1--30. Google ScholarDigital Library
Saket Navlakha, Rajeev Rastogi, and Nisheeth Shrivastava. 2008. Graph summarization with bounded error. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’08). 419--432. Google ScholarDigital Library
Mark E. J. Newman and Michelle Girvan. 2004. Finding and evaluating community structure in networks. Phys. Rev. E 69, 2 (2004), 026113+.Google Scholar
Carlos Ordonez, Norberto Ezquerra, and Cesar A. Santana. 2006. Constraining and summarizing association rules in medical data. Knowl. Inf. Syst. 9, 3 (2006), 259--283. Google ScholarDigital Library
Themis Palpanas, Michail Vlachos, Eamonn J. Keogh, and Dimitrios Gunopulos. 2008. Streaming time series summarization using user-defined amnesic functions. IEEE Trans. Knowl. Data Eng. 20, 7 (2008), 992--1006. Google ScholarDigital Library
Jia-Yu Pan, Hyung-Jeong Yang, and Christos Faloutsos. 2004. MMSS: Multi-modal story-oriented video summarization. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’04). 491--494. Google ScholarDigital Library
Jian Pei, Daxin Jiang, and Aidong Zhang. 2005. On mining cross-graph quasi-cliques. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’05). 228--238. Google ScholarDigital Library
David Peleg and Alejandro A. Schäffer. 1989. Graph spanners. J. Graph Theory 13, 1 (1989), 99--116.Google ScholarCross Ref
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, 701--710. Google ScholarDigital Library
Robert Pienta, Acar Tamersoy, Hanghang Tong, and Duen Horng Chau. 2014. MAGE: Matching approximate patterns in richly-attributed graphs. In Proceedings of the 2014 IEEE International Conference on Big Data. 585--590.Google ScholarCross Ref
B. Aditya Prakash, Jilles Vreeken, and Christos Faloutsos. 2012. Spotting culprits in epidemics: How many and which ones? In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’12). IEEE. Google ScholarDigital Library
M. Purohit, B. A. Prakash, C. Kang, Y. Zhang, and V. S. Subrahmanian. 2014. Fast influence-based coarsening for large networks. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, 1296--1305. Google ScholarDigital Library
Qiang Qu, Siyuan Liu, Christian S. Jensen, Feida Zhu, and Christos Faloutsos. 2014. Interestingness-driven diffusion process summarization in dynamic networks. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD’14). 597--613. Google ScholarDigital Library
Davood Rafiei and Stephen Curial. 2005. Effectively visualizing large networks through sampling. In Proceedings of the 16th IEEE Visualization Conference (VIS’05). 48.Google Scholar
Sriram Raghavan and Hector Garcia-Molina. 2003. Representing web graphs. In Proceedings of the 2003 IEEE International Conference on Data Engineering (ICDE’03). IEEE, 405--416.Google Scholar
Xuguang Ren and Junhu Wang. 2015. Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. Proc. VLDB Endow. 8, 5 (2015), 617--628. Google ScholarDigital Library
Leonardo F. R. Ribeiro, Pedro H. P. Saverese, and Daniel R. Figueiredo. 2017. struc2vec: Learning node representations from structural identity. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’17). ACM, 385--394. Google ScholarDigital Library
Matteo Riondato, David García-Soriano, and Francesco Bonchi. 2014. Graph summarization with quality guarantees. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’14). IEEE, 947--952. Google ScholarDigital Library
Ryan Rossi, Brian Gallagher, Jennifer Neville, and Keith Henderson. 2012. Role-dynamics: Fast mining of large dynamic networks. InProceedings of the 25th International Conference Companion on the World Wide Web (WWW’12 Companion). ACM, 997--1006. Google ScholarDigital Library
T. Safavi, C. Sripada, and D. Koutra. 2017. Scalable hashing-based network discovery. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’17). 405--414.Google Scholar
Barna Saha and Pabitra Mitra. 2007. Dynamic algorithm for graph clustering using minimum cut tree. In Proceedings of the SIAM International Conference on Data Mining (SDM’07). 581--586. Google ScholarDigital Library
Neil Shah, Danai Koutra, Lisa Jin, Tianmin Zou, Brian Gallagher, and Christos Faloutsos. 2017. On summarizing large-scale dynamic graphs. IEEE Data Eng. Bull. 40, 3 (2017), 75--88.Google Scholar
Neil Shah, Danai Koutra, Tianmin Zou, Brian Gallagher, and Christos Faloutsos. 2015. TimeCrunch: Interpretable dynamic graph summarization. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’15). Google ScholarDigital Library
P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker. 2003. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 11 (2003), 2498.Google ScholarCross Ref
Umang Sharan and Jennifer Neville. 2008. Temporal-relational classifiers for prediction in evolving domains. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’08). 540--549. Google ScholarDigital Library
Z. Shen, K.-L. Ma, and T. Eliassi-Rad. 2006. Visual analysis of large heterogeneous social networks by semantic and structural abstraction. IEEE Trans. Vis. Comput. Graph. 12, 6 (2006), 1427--1439. Google ScholarDigital Library
Lei Shi, Hanghang Tong, Jie Tang, and Chuang Lin. 2015. VEGAS: Visual influence graph summarization on citation networks. IEEE Trans. Knowl. Data Eng. 27, 12 (2015), 3417--3431. Google ScholarDigital Library
Ben Shneiderman. 2008. Extreme visualization: Squeezing a billion records into a million pixels. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’08). Google ScholarDigital Library
Mahsa Shoaran, Alex Thomo, and Jens H. Weber-Jahnke. 2013. Zero-knowledge private graph summarization. In Proceedings of the IEEE International Conference on Big Data. IEEE, 597--605.Google Scholar
Koen Smets and Jilles Vreeken. 2011. The odd one out: Identifying and characterising anomalies. In Proceedings of the SIAM International Conference on Data Mining (SDM’11). 804--815.Google ScholarCross Ref
Qi Song, Yinhui Wu, and Xin Luna Dong. 2016. Mining summaries for knowledge graph search. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’16). 1215--1220.Google ScholarCross Ref
Sucheta Soundarajan, Acar Tamersoy, Elias B. Khalil, Tina Eliassi-Rad, Duen Horng Chau, Brian Gallagher, and Kevin Roundy. 2016. Generating graph snapshots from streaming edge data. In Proceedings of the 25th International Conference Companion on the World Wide Web. 109--110. Google ScholarDigital Library
Daniel A. Spielman and Nikhil Srivastava. 2011. Graph sparsification by effective resistances. SIAM J. Comput. 40, 6 (2011), 1913--1926. Google ScholarDigital Library
Olaf Sporns. 2010. Networks of the Brain. MIT Press, Cambridge, MA. Google ScholarDigital Library
Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, and Philip S. Yu. 2007. GraphScope: Parameter-free mining of large time-evolving graphs. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’07). ACM, 687--696. Google ScholarDigital Library
Jimeng Sun, Charalampos E. Tsourakakis, Evan Hoke, Christos Faloutsos, and Tina Eliassi-Rad. 2008. Two heads better than one: Pattern discovery in time-evolving multi-aspect data. Data Min. Knowl. Discov. 17, 1 (2008), 111--128. Google ScholarDigital Library
Jimeng Sun, Yinglian Xie, Hui Zhang, and Christos Faloutsos. 2007. Less is more: Compact matrix decomposition for large sparse graphs. In Proceedings of the SIAM International Conference on Data Mining (SDM’07).Google ScholarCross Ref
Yizhou Sun and Jiawei Han. 2012. Mining Heterogeneous Information Networks: Principles and Methodologies. Morgan 8 Claypool. Google ScholarDigital Library
Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. 2015. LINE: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web (WWW’15). 1067--1077. Google ScholarDigital Library
Nan Tang, Qing Chen, and Prasenjit Mitra. 2016. Graph stream summarization: From big bang to big crunch. In Proceedings of the 2016 International Conference on Management of Data. ACM, 1481--1496. Google ScholarDigital Library
Chayant Tantipathananandh and Tanya Berger-Wolf. 2011. Finding communities in dynamic social networks. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’11). IEEE, 1236--1241. Google ScholarDigital Library
Lini T. Thomas, Satyanarayana R. Valluri, and Kamalakar Karlapalem. 2010. MARGIN: Maximal frequent subgraph mining. ACM Trans. Knowl. Discov. Data 4, 3 (2010), 10:1--10:42. Google ScholarDigital Library
Yuanyuan Tian, Richard A. Hankins, and Jignesh M. Patel. 2008. Efficient aggregation for graph summarization. In Proceedings of the ACM Special Interest Group on Management of Data (SIGMOD’08). ACM, 567--580. Google ScholarDigital Library
Yuanyuan Tian and Jignesh M. Patel. 2008. TALE: A tool for approximate large graph matching. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering. 963--972. Google ScholarDigital Library
Hannu Toivonen, Fang Zhou, Aleksi Hartikainen, and Atte Hinkka. 2011. Compression of weighted graphs. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’11). 965--973. Google ScholarDigital Library
Hanghang Tong, Christos Faloutsos, Brian Gallagher, and Tina Eliassi-Rad. 2007. Fast best-effort pattern matching in large attributed graphs. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 737--746. Google ScholarDigital Library
Jilles Vreeken, Matthijs van Leeuwen, and Arno Siebes. 2011. Krimp: Mining itemsets that compress. Data Min. Knowl. Disc. 23, 1 (2011), 169--214. Google ScholarDigital Library
Bianca Wackersreuther, Peter Wackersreuther, Annahita Oswald, Christian Böhm, and Karsten M. Borgwardt. 2010. Frequent subgraph discovery in dynamic networks. In Proceedings of the 8th Workshop on Mining and Learning with Graphs. ACM, 155--162. Google ScholarDigital Library
Daixin Wang, Peng Cui, and Wenwu Zhu. 2016. Structural deep network embedding. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’16). Google ScholarDigital Library
Jianyong Wang and George Karypis. 2004. SUMMARY: Efficiently summarizing transactions for clustering. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’04). 241--248. Google ScholarDigital Library
S. Wasserman and J. Galaskiewicz. 1994. Advances in Social Network Analysis: Research in the Social and Behavioral Sciences. SAGE Publications.Google Scholar
Ye Wu, Zhinong Zhong, Wei Xiong, and Ning Jing. 2014. Graph summarization for attributed graphs. In Proceedings of the International Conference on Information Science, Electronics, and Electrical Engineering (ISEEE’14). IEEE, 503--507.Google ScholarCross Ref
Yang Xiang, Ruoming Jin, David Fuhry, and Feodor Dragan. 2010. Summarizing transactional databases with overlapped hyperrectangles. Data Min. Knowl. Disc. 23, 2, 215--251. Google ScholarDigital Library
Kevin S. Xu, Mark Kliger, and Alfred O. Hero III. 2011. Tracking communities in dynamic social networks. In Proceedings of the 4th International Conference on Social Computing, Behavioral-Cultural Modeling, 8 Prediction (SBP’11). 219--226. Google ScholarDigital Library
Zhiqiang Xu, Yiping Ke, Yi Wang, Hong Cheng, and James Cheng. 2012. A model-based approach to attributed graph clustering. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD’12). ACM, 505--516. Google ScholarDigital Library
Xifeng Yan, Hong Cheng, Jiawei Han, and Dong Xin. 2005. Summarizing itemset patterns: A profile-based approach. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD’05). 314--323. Google ScholarDigital Library
Jaewon Yang and Jure Leskovec. 2013. Overlapping community detection at scale: A nonnegative matrix factorization approach. In Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM’13). ACM, 587--596. Google ScholarDigital Library
Jaewon Yang, Julian McAuley, and Jure Leskovec. 2013. Community detection in networks with node attributes. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM’13). IEEE, 1151--1156.Google ScholarCross Ref
Liu Yang, Susan T. Dumais, Paul N. Bennett, and Ahmed Hassan Awadallah. 2017. Characterizing and predicting enterprise email reply behavior. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’17). ACM, 235--244. Google ScholarDigital Library
Jinguo You, Qiuping Pan, Wei Shi, Zhipeng Zhang, and Jianhua Hu. 2013. Towards graph summary and aggregation: A survey. In Social Media Retrieval and Mining. Springer, 3--12.Google Scholar
Ning Zhang, Yuanyuan Tian, and Jignesh M. Patel. 2010. Discovery-driven graph summarization. In Proceedings of the 2003 IEEE International Conference on Data Engineering (ICDE’10). 880--891.Google Scholar
Peixiang Zhao, Charu C. Aggarwal, and Min Wang. 2011. gSketch: On query estimation in graph streams. Proc. VLDB Endow. 5, 3 (2011), 193--204. Google ScholarDigital Library
Yang Zhou, Hong Cheng, and Jeffrey Xu Yu. 2009. Graph clustering based on structural/attribute similarities. Proc. VLDB Endow. 2, 1 (2009), 718--729. Google ScholarDigital Library
Linhong Zhu, Majid Ghasemi-Gol, Pedro Szekely, Aram Galstyan, and Craig A. Knoblock. 2016. Unsupervised entity resolution on multi-type graphs. In Proceedings of the International Semantic Web Conference. 649--667.Google Scholar

Index Terms

Recommendations

Incremental Lossless Graph Summarization
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Given a fully dynamic graph, represented as a stream of edge insertions and deletions, how can we obtain and incrementally update a lossless summary of its current snapshot? As large-scale graphs are prevalent, concisely representing them is inevitable ...
Read More
Graph Interpretation, Summarization and Visualization Techniques: A Review and Open Research Issues
Abstract
Graphs has been a ubiquitous way of representing heterogeneous data. There are many studies focused on graph learning highlighting the approaches for graph data extraction, interpretation and graph summarization. Graph data summarization is ...
Read More
Opinion summarization methods

Research about some aspect-based opinion summarization methods.A new content selection strategy to produce extractive summaries is proposed.A novel NLG template-based system to generate abstractive summaries is proposed.Extractive and abstractive ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
ACM Computing Surveys Volume 51, Issue 3
May 2019
796 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3212709
Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering / University of Florida / Gainesville, FL 32611
Issue’s Table of Contents
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 June 2018
- Accepted: 1 February 2018
- Revised: 1 January 2018
- Received: 1 December 2016
Published in csur Volume 51, Issue 3

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Graph mining
graph summarization
Qualifiers
- survey
- Research
- Refereed
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 180
  Total Citations
  View Citations
- 7,680
  Total Downloads
- Downloads (Last 12 months)1,689
- Downloads (Last 6 weeks)232
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Graph Summarization Methods and Applications: A Survey

ACM Computing Surveys

Abstract

References

Cited By

Index Terms

Recommendations

Incremental Lossless Graph Summarization

Graph Interpretation, Summarization and Visualization Techniques: A Review and Open Research Issues

Opinion summarization methods