Managing and Mining Graph Data is a comprehensive survey book on graph data analytics. It contains extensive surveys on important graph topics such as graph languages, indexing, clustering, data generation, pattern mining, classification, keyword search, pattern matching, and privacy. It also studies a number of domain-specific scenarios such as stream mining, web graphs, social networks, and chemical and biological data. The chapters are written by leading researchers and provide a broad perspective of the area. This is the first comprehensive survey book on the emerging topic of graph data processing. Managing and Mining Graph Data is designed for a varied audience composed of professors, researchers and practitioners in industry. This volume is also suitable as a reference book for advanced-level database students in computer science.

About the Editors:

Charu C. Aggarwal obtained his B.Tech in Computer Science from IIT Kanpur in 1993 and Ph.D. from MIT in 1996. He has worked as a researcher at IBM since then, and has published over 130 papers in major data mining conferences and journals. He has applied for or been granted over 70 US and International patents, and has thrice been designated a Master Inventor at IBM. He has received an IBM Corporate award for his work on data stream analytics, and an IBM Outstanding Innovation Award for his work on privacy technology. He has served on the executive committees of most major data mining conferences. He has served as an associate editor of the IEEE TKDE, as an associate editor of the ACM SIGKDD Explorations, and as an action editor of the DMKD Journal. He is a fellow of the IEEE, and a life-member of the ACM.

Haixun Wang is currently a researcher at Microsoft Research Asia. He received B.S. and M.S. degrees, both in computer science, from Shanghai Jiao Tong University in 1994 and 1996. He received the Ph.D. degree in computer science from the University of California, Los Angeles in 2000. He subsequently worked as a researcher at IBM until 2009. His main research interests are database language and systems, data mining, and information retrieval. He has published more than 100 research papers in refereed international journals and conference proceedings. He serves as an associate editor of the IEEE TKDE, and has served as a reviewer and program committee member of leading database conferences and journals.
Cited By
- He Y, Wang K, Zhang W, Lin X and Zhang Y (2023). Scaling Up k-Clique Densest Subgraph Detection, Proceedings of the ACM on Management of Data, 1:1, (1-26), Online publication date: 26-May-2023.
- Almasri M, Hajj I, Nagi R, Xiong J and Hwu W Parallel K-clique counting on GPUs Proceedings of the 36th ACM International Conference on Supercomputing, (1-14)
- Gao S, Xu J, Li X, Fu F, Zhang W, Ouyang W, Tao Y and Cui B K-core decomposition on super large graphs with limited resources Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, (413-422)
- Pursalim M and Keong K (2020). An Efficient Multiresolution Clustering for Motif Discovery in Complex Networks, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 19:1, (284-294), Online publication date: 1-Jan-2022.
- Besta M, Kanakagiri R, Kwasniewski G, Ausavarungnirun R, Beránek J, Kanellopoulos K, Janda K, Vonarburg-Shmaria Z, Gianinazzi L, Stefan I, Luna J, Golinowski J, Copik M, Kapp-Schwoerer L, Di Girolamo S, Blach N, Konieczny M, Mutlu O and Hoefler T SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, (282-297)
- Veldt N, Benson A and Kleinberg J The Generalized Mean Densest Subgraph Problem Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (1604-1614)
- Shi J, Dhulipala L, Eisenstat D, Łącki J and Mirrokni V (2021). Scalable community detection via parallel correlation clustering, Proceedings of the VLDB Endowment, 14:11, (2305-2313), Online publication date: 1-Jul-2021.
- Besta M, Vonarburg-Shmaria Z, Schaffner Y, Schwarz L, Kwasniewski G, Gianinazzi L, Beranek J, Janda K, Holenstein T, Leisinger S, Tatkowski P, Ozdemir E, Balla A, Copik M, Lindenberger P, Konieczny M, Mutlu O and Hoefler T (2021). GraphMineSuite, Proceedings of the VLDB Endowment, 14:11, (1922-1935), Online publication date: 1-Jul-2021.
- Chen X, Dathathri R, Gill G, Hoang L and Pingali K Sandslash Proceedings of the ACM International Conference on Supercomputing, (378-391)
- Jin W, Li Y, Xu H, Wang Y, Ji S, Aggarwal C and Tang J (2021). Adversarial Attacks and Defenses on Graphs, ACM SIGKDD Explorations Newsletter, 22:2, (19-34), Online publication date: 17-Jan-2021.
- Paudel R and Eberle W (2020). An Approach For Concept Drift Detection in a Graph Stream Using Discriminative Subgraphs, ACM Transactions on Knowledge Discovery from Data, 14:6, (1-25), Online publication date: 31-Dec-2021.
- Blanuša J, Stoica R, Ienne P and Atasu K (2020). Manycore clique enumeration with fast set intersections, Proceedings of the VLDB Endowment, 13:12, (2676-2690), Online publication date: 1-Aug-2020.
- Agarwal S, Dutta S and Bhattacharya A (2021). ChiSeL, Proceedings of the VLDB Endowment, 13:10, (1654-1668), Online publication date: 1-Jun-2020.
- Sun B, Danisch M, Chan T and Sozio M (2021). KClist++, Proceedings of the VLDB Endowment, 13:10, (1628-1640), Online publication date: 1-Jun-2020.
- Chen X, Dathathri R, Gill G and Pingali K (2020). Pangolin, Proceedings of the VLDB Endowment, 13:8, (1190-1205), Online publication date: 1-Apr-2020.
- Saisubramanian S, Galhotra S and Zilberstein S Balancing the Tradeoff Between Clustering Value and Interpretability Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, (351-357)
- Malliaros F, Giatsidis C, Papadopoulos A and Vazirgiannis M (2019). The core decomposition of networks: theory, algorithms and applications, The VLDB Journal — The International Journal on Very Large Data Bases, 29:1, (61-92), Online publication date: 1-Jan-2020.
- Lee J, Rossi R, Kim S, Ahmed N and Koh E (2019). Attention Models in Graphs, ACM Transactions on Knowledge Discovery from Data, 13:6, (1-25), Online publication date: 17-Dec-2019.
- Álvarez-García S, Freire B, Ladra S and Pedreira Ó (2019). Compact and efficient representation of general graph databases, Knowledge and Information Systems, 60:3, (1479-1510), Online publication date: 1-Sep-2019.
- Li P, Huang L, Wang C and Lai J EdMot Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (479-487)
- Liu Y, Safavi T, Dighe A and Koutra D (2018). Graph Summarization Methods and Applications, ACM Computing Surveys, 51:3, (1-34), Online publication date: 31-May-2019.
- Bourhim S, Benhiba L and Idrissi M Investigating algorithmic variations of an RS Graph-based collaborative filtering approach Proceedings of the ArabWIC 6th Annual International Conference Research Track, (1-6)
- Dallachiesa M, Aggarwal C and Palpanas T (2019). Improving Classification Quality in Uncertain Graphs, Journal of Data and Information Quality, 11:1, (1-20), Online publication date: 18-Jan-2019.
- Iyer A, Liu Z, Jin X, Venkataraman S, Braverman V and Stoica I ASAP Proceedings of the 13th USENIX conference on Operating Systems Design and Implementation, (745-761)
- Pileggi S (2018). Looking deeper into academic citations through network analysis, Universal Access in the Information Society, 17:3, (541-548), Online publication date: 1-Aug-2018.
- Atastina I, Sitohang B, Saptawati G and Moertini V An implementation of graph mining to find the group evolution in communication data record Proceedings of the 2018 International Conference on Data Science and Information Technology, (79-84)
- Akgün A and Ayvaz S An Approach for Information Discovery Using Ontology In Semantic Web Content Proceedings of the 1st International Conference on Information Science and Systems, (250-255)
- Fu S, Wang Y, Yang Y, Bi Q, Guo F and Qu H (2018). VisForum, ACM Transactions on Interactive Intelligent Systems, 8:1, (1-21), Online publication date: 13-Mar-2018.
- Rehman S, Asghar S and Fong S An Efficient Ranking Scheme for Frequent Subgraph Patterns Proceedings of the 2018 10th International Conference on Machine Learning and Computing, (257-262)
- Silva F, Werneck R, Goldenstein S, Tabbone S and Torres R (2018). Graph-based bag-of-words for classification, Pattern Recognition, 74:C, (266-285), Online publication date: 1-Feb-2018.
- Mohanty M and Ramanath M Klustree Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, (265-272)
- Onoue Y and Koyamada K Optimal tree reordering for group-in-a-box graph layouts SIGGRAPH Asia 2017 Symposium on Visualization, (1-9)
- Abdolazimi R, Naderi H and Sagharichian M (2017). Connected components of big graphs in fixed MapReduce rounds, Cluster Computing, 20:3, (2563-2574), Online publication date: 1-Sep-2017.
- Zhang Y and Zhou Y Bibliometrics Analysis of Complex Networks Research Proceedings of the International Conference on Business and Information Management, (24-28)
- Lemay M, Ul Hassan W, Moyer T, Schear N and Smith W Automated provenance analytics Proceedings of the 9th USENIX Conference on Theory and Practice of Provenance, (12-12)
- Boden B, Günnemann S, Hoffmann H and Seidl T (2017). MiMAG, Knowledge and Information Systems, 50:2, (417-446), Online publication date: 1-Feb-2017.
- Zhu Y, Yan E and Song I (2017). The use of a graph-based system to improve bibliographic information retrieval, Journal of the Association for Information Science and Technology, 68:2, (480-490), Online publication date: 1-Feb-2017.
- Le T and Ling T (2016). Survey on Keyword Search over XML Documents, ACM SIGMOD Record, 45:3, (17-28), Online publication date: 6-Dec-2016.
- Dolgorsuren B, Xu W, Khan K, Jeong B and Lee Y SP2 Proceedings of the Sixth International Conference on Emerging Databases: Technologies, Applications, and Theory, (43-50)
- Gu Y, Gao C, Wang L and Yu G (2016). Subgraph similarity maximal all-matching over a large uncertain graph, World Wide Web, 19:5, (755-782), Online publication date: 1-Sep-2016.
- Jiang M, Cui P, Beutel A, Faloutsos C and Yang S (2016). Inferring lockstep behavior from connectivity pattern in large graphs, Knowledge and Information Systems, 48:2, (399-428), Online publication date: 1-Aug-2016.
- Wu Y, Zhu X, Li L, Fan W, Jin R and Zhang X (2016). Mining Dual Networks, ACM Transactions on Knowledge Discovery from Data, 10:4, (1-37), Online publication date: 27-Jul-2016.
- Nirmala P, Lekshmi R and Nadarajan R (2016). Vertex cover-based binary tree algorithm to detect all maximum common induced subgraphs in large communication networks, Knowledge and Information Systems, 48:1, (229-252), Online publication date: 1-Jul-2016.
- Salas J and Torra V (2016). Improving the characterization of P-stability for applications in network privacy, Discrete Applied Mathematics, 206:C, (109-114), Online publication date: 19-Jun-2016.
- Ma S, Li J, Hu C, Lin X and Huai J (2016). Big graph search, Frontiers of Computer Science: Selected Publications from Chinese Universities, 10:3, (387-398), Online publication date: 1-Jun-2016.
- Hassani M, Cuzzocrea A, Spaus P and Seidl T I-HASTREAM Proceedings of the 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (656-665)
- Bhanuse S, Kamble S and Kakde S (2016). Text Mining Using Metadata for Generation of Side Information, Procedia Computer Science, 78:C, (807-814), Online publication date: 1-Mar-2016.
- Sagharichian M, Naderi H and Haghjoo M (2015). ExPregel, Concurrency and Computation: Practice & Experience, 27:17, (4954-4969), Online publication date: 10-Dec-2015.
- Wardani D and Küng J Property Hypergraphs as an Attributed Predicate RDF Proceedings of the Confederated International Conferences on On the Move to Meaningful Internet Systems: OTM 2015 Conferences - Volume 9415, (329-336)
- Zhao P gSparsify Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, (373-382)
- Teixeira C, Fonseca A, Serafini M, Siganos G, Zaki M and Aboulnaga A Arabesque Proceedings of the 25th Symposium on Operating Systems Principles, (425-440)
- Islam M, Liu C and Li J (2015). Efficient Answering of Why-Not Questions in Similar Graph Matching, IEEE Transactions on Knowledge and Data Engineering, 27:10, (2672-2686), Online publication date: 1-Oct-2015.
- Hassani M, Spaus P, Cuzzocrea A and Seidl T Adaptive stream clustering using incremental graph maintenance Proceedings of the 4th International Conference on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications - Volume 41, (49-64)
- Botezatu M, Bogojeska J, Giurgiu I, Voelzer H and Wiesmann D Multi-View Incident Ticket Clustering for Optimal Ticket Dispatching Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (1711-1720)
- Mottin D, Bonchi F and Gullo F Graph Query Reformulation with Diversity Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (825-834)
- Miao Y, Han W, Li K, Wu M, Yang F, Zhou L, Prabhakaran V, Chen E and Chen W (2015). ImmortalGraph, ACM Transactions on Storage, 11:3, (1-34), Online publication date: 29-Jul-2015.
- Chen P and Plale B Big data provenance analysis and visualization Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (797-800)
- Yuan Y, Wang G, Chen L and Wang H (2015). Graph similarity search on large uncertain graph databases, The VLDB Journal — The International Journal on Very Large Data Bases, 24:2, (271-296), Online publication date: 1-Apr-2015.
- Liu J, Aggarwal C and Han J On Integrating Network and Community Discovery Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, (117-126)
- Song C, Ge T, Chen C and Wang J (2014). Event pattern matching over graph streams, Proceedings of the VLDB Endowment, 8:4, (413-424), Online publication date: 1-Dec-2014.
- Shang Z and Yu J (2014). Auto-approximation of graph computing, Proceedings of the VLDB Endowment, 7:14, (1833-1844), Online publication date: 1-Oct-2014.
- Zhu Y, Yu J and Qin L (2014). Leveraging graph dimensions in online graph search, Proceedings of the VLDB Endowment, 8:1, (85-96), Online publication date: 1-Sep-2014.
- Aggarwal C and Subbian K (2014). Evolutionary Network Analysis, ACM Computing Surveys, 47:1, (1-36), Online publication date: 1-Jul-2014.
- Dallachiesa M, Aggarwal C and Palpanas T Node classification in uncertain graphs Proceedings of the 26th International Conference on Scientific and Statistical Database Management, (1-4)
- Han W, Miao Y, Li K, Wu M, Yang F, Zhou L, Prabhakaran V, Chen W and Chen E Chronos Proceedings of the Ninth European Conference on Computer Systems, (1-14)
- Heer J and Perer A (2014). Orion, Information Visualization, 13:2, (111-133), Online publication date: 1-Apr-2014.
- Tsai M, Aggarwal C and Huang T Ranking in heterogeneous social media Proceedings of the 7th ACM international conference on Web search and data mining, (613-622)
- Jin R, Lee V and Li L (2014). Scalable and axiomatic ranking of network role similarity, ACM Transactions on Knowledge Discovery from Data, 8:1, (1-37), Online publication date: 1-Feb-2014.
- Aggarwal C, Xie Y and Yu P (2014). A framework for dynamic link prediction in heterogeneous networks, Statistical Analysis and Data Mining, 7:1, (14-33), Online publication date: 1-Feb-2014.
- Ma S, Cao Y, Fan W, Huai J and Wo T (2014). Strong simulation, ACM Transactions on Database Systems, 39:1, (1-46), Online publication date: 1-Jan-2014.
- Gossen T, Kotzyba M and Nürnberger A (2014). Graph clusterings with overlaps, Neurocomputing, 123, (13-22), Online publication date: 1-Jan-2014.
- Guo T, Chi L and Zhu X Graph hashing and factorization for fast graph stream classification Proceedings of the 22nd ACM international conference on Information & Knowledge Management, (1607-1612)
- Pan C and Zymbler M Very Large Graph Partitioning by Means of Parallel DBMS Proceedings of the 17th East European Conference on Advances in Databases and Information Systems - Volume 8133, (388-399)
- Nguyen J, Hu B, Günnemann S and Ester M Finding contexts of social influence in online social networks Proceedings of the 7th Workshop on Social Network Mining and Analysis, (1-9)
- Boden B, Günnemann S, Hoffmann H and Seidl T RMiCS Proceedings of the 25th International Conference on Scientific and Statistical Database Management, (1-12)
- Aggarwal C and Zhao P (2013). Towards graphical models for text processing, Knowledge and Information Systems, 36:1, (1-21), Online publication date: 1-Jul-2013.
- Mueller-Wickop N and Schultz M ERP event log preprocessing Proceedings of the 8th international conference on Design Science at the Intersection of Physical and Virtual Design, (105-119)
- Livi L and Rizzi A (2013). Graph ambiguity, Fuzzy Sets and Systems, 221, (24-47), Online publication date: 1-Jun-2013.
- Qi G, Aggarwal C and Huang T Online community detection in social sensing Proceedings of the sixth ACM international conference on Web search and data mining, (617-626)
- Nguyen K, Cerf L, Plantevit M and Boulicaut J (2013). Discovering descriptive rules in relational dynamic graphs, Intelligent Data Analysis, 17:1, (49-69), Online publication date: 1-Jan-2013.
- Shelokar P, Quirin A and Cordón Ó (2013). MOSubdue, Knowledge and Information Systems, 34:1, (75-108), Online publication date: 1-Jan-2013.
- Beheshti S, Benatallah B, Motahari-Nezhad H and Allahbakhsh M A framework and a language for on-line analytical processing on graphs Proceedings of the 13th international conference on Web Information Systems Engineering, (213-227)
- Lin W, Xiao X, Cheng J and Bhowmick S Efficient algorithms for generalized subgraph query processing Proceedings of the 21st ACM international conference on Information and knowledge management, (325-334)
- Nguyen Q, Eades P and Hong S StreamEB Proceedings of the 20th international conference on Graph Drawing, (400-413)
- Ahmed N, Neville J and Kompella R Space-efficient sampling from social activity streams Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications, (53-60)
- Boden B, Günnemann S, Hoffmann H and Seidl T Mining coherent subgraphs in multi-layer graphs with edge labels Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, (1258-1266)
- Günnemann S, Boden B and Seidl T Substructure clustering Proceedings of the 24th international conference on Scientific and Statistical Database Management, (280-297)
- Valari E, Kontaki M and Papadopoulos A Discovery of top-k dense subgraphs in dynamic graph collections Proceedings of the 24th international conference on Scientific and Statistical Database Management, (213-230)
- Shao B, Wang H and Xiao Y Managing and mining large graphs Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, (589-592)
- Yuan Y, Wang G, Chen L and Wang H (2012). Efficient subgraph similarity search on large probabilistic graph databases, Proceedings of the VLDB Endowment, 5:9, (800-811), Online publication date: 1-May-2012.
- Sun Z, Wang H, Wang H, Shao B and Li J (2012). Efficient subgraph matching on billion node graphs, Proceedings of the VLDB Endowment, 5:9, (788-799), Online publication date: 1-May-2012.
- Ma S, Cao Y, Huai J and Wo T Distributed graph pattern matching Proceedings of the 21st international conference on World Wide Web, (949-958)
- Cuzzocrea A and Serafino P Probabilistic pattern queries over complex probabilistic graphs Proceedings of the 2012 Joint EDBT/ICDT Workshops, (131-135)
- Fan W Graph pattern matching revised for social network analysis Proceedings of the 15th International Conference on Database Theory, (8-21)
- Bahmani B, Kumar R and Vassilvitskii S (2012). Densest subgraph in streaming and MapReduce, Proceedings of the VLDB Endowment, 5:5, (454-465), Online publication date: 1-Jan-2012.
- Ma S, Cao Y, Fan W, Huai J and Wo T (2011). Capturing topology in graph pattern matching, Proceedings of the VLDB Endowment, 5:4, (310-321), Online publication date: 1-Dec-2011.
- Lavrač N, Vavpetič A, Soldatova L, Trajkovski I and Novak P Using ontologies in semantic data mining with SEGS and g-SEGS Proceedings of the 14th international conference on Discovery science, (165-178)
- Cuzzocrea A and Serafino P A family of graph-theory-driven algorithms for managing complex probabilistic graph data efficiently Proceedings of the 15th Symposium on International Database Engineering & Applications, (240-242)
- Günnemann S, Boden B and Seidl T DB-CSC Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I, (565-580)
- Beheshti S, Benatallah B, Motahari-Nezhad H and Sakr S A query language for analyzing business processes execution Proceedings of the 9th international conference on Business process management, (281-297)
- Lam D, Liu A and Martin C Graph-based data warehousing using the core-facets model Proceedings of the 11th international conference on Advances in data mining: applications and theoretical aspects, (240-254)
- Bifet A, Holmes G, Pfahringer B and Gavaldà R Mining frequent closed graphs on evolving data streams Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, (591-599)
- Mathew G and Obradovic Z Constraint graphs as security filters for privacy assurance in medical transactions Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine, (502-504)
- Getoor L and Mihalkova L Learning statistical models from relational data Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, (1195-1198)
- Zhao P, Li X, Xin D and Han J Graph cube Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, (853-864)
- Lyritsis A, Papadopoulos A and Manolopoulos Y TAGs Proceedings of the 14th International Conference on Extending Database Technology, (295-306)
- Aggarwal C, Li Y, Yu P and Jin R (2010). On dense pattern mining in graph streams, Proceedings of the VLDB Endowment, 3:1-2, (975-984), Online publication date: 1-Sep-2010.
- Zou Z, Gao H and Li J Discovering frequent subgraphs over uncertain graph databases under probabilistic semantics Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, (633-642)
- Álvarez S, Brisaboa N, Ladra S and Pedreira Ó A compact representation of graph databases Proceedings of the Eighth Workshop on Mining and Learning with Graphs, (18-25)