Abstract
The need for mining structured data has increased in the past few years. Graphs are among the best-studied data structures in computer science and discrete mathematics, so it is no surprise that graph-based data mining has become popular. This article introduces the theoretical basis of graph-based data mining and surveys its state of the art. Brief descriptions of some representative approaches are provided as well.
Index Terms
- State of the art of graph-based data mining