skip to main content
10.1145/1774088.1774422acmconferencesArticle/Chapter ViewAbstractPublication PagessacConference Proceedingsconference-collections
research-article

g-tries: an efficient data structure for discovering network motifs

Published:22 March 2010Publication History

ABSTRACT

In this paper we propose a novel specialized data structure that we call g-trie, designed to deal with collections of subgraphs. The main conceptual idea is akin to a prefix tree in the sense that we take advantage of common topology by constructing a multiway tree where the descendants of a node share a common substructure. We give algorithms to construct a g-trie, to list all stored subgraphs, and to find occurrences on another graph of the subgraphs stored in the g-trie. We evaluate the implementation of this structure and its associated algorithms on a set of representative benchmark biological networks in order to find network motifs. To assess the efficiency of our algorithms we compare their performance with other known network motif algorithms also implemented in the same common platform. Our results show that indeed, g-tries are a feasible, adequate and very efficient data structure for network motifs discovery, clearly outperforming previous algorithms and data structures.

References

  1. R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pages 487--499, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. I. Albert and R. Albert. Conserved network motifs allow protein-protein interaction prediction. Bioinformatics, 20(18):3346--3352, December 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. V. Batagelj and A. Mrvar. Pajek datasets, 2006. http://vlado.fmf.uni-lj.si/pub/networks/data/.Google ScholarGoogle Scholar
  4. D. Bu, Y. Zhao, L. Cai, H. Xue, X. Zhu, H. Lu, J. Zhang, S. Sun, L. Ling, N. Zhang, G. Li, and R. Chen. Topological structure analysis of the protein-protein interaction network in budding yeast. Nucl. Acids Res., 31(9):2443--2450, May 2003.Google ScholarGoogle ScholarCross RefCross Ref
  5. D. Chakrabarti and C. Faloutsos. Graph mining: Laws, generators, and algorithms. ACM Computing Surveys, 38:1, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. L. da F. Costa, F. A. Rodrigues, G. Travieso, and P. R. V. Boas. Characterization of complex networks: A survey of measurements. Advances In Physics, 56:167, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  7. R. Dobrin, Q. K. Beg, A. Barabasi, and Z. Oltvai. Aggregation of topological motifs in the escherichia coli transcriptional regulatory network. BMC Bioinformatics, 5:10, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  8. E. Fredkin. Trie memory. Commun. ACM, 3(9):490--499, September 1960. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Grochow and M. Kellis. Network motif discovery using subgraph enumeration and symmetry-breaking. pages 92--106. 2007.Google ScholarGoogle Scholar
  10. J. Han and M. Kamber. Data Mining, Second Edition: Concepts and Techniques. Morgan Kaufmann, September 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. S. Itzkovitz, R. Levitt, N. Kashtan, R. Milo, M. Itzkovitz, and U. Alon. Coarse-graining and self-dissimilarity of complex networks. Phys Rev E Stat Nonlin Soft Matter Phys, 71(1 Pt 2), January 2005.Google ScholarGoogle Scholar
  12. N. Kashtan, S. Itzkovitz, R. Milo, and U. Alon. Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics, 20(11):1746--1758, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Kondoh. Building trophic modules into a persistent food web. Proceedings of the National Academy of Sciences, 105(43):16631--16635, 2008.Google ScholarGoogle ScholarCross RefCross Ref
  14. M. Kuramochi and G. Karypis. Frequent subgraph discovery. Data Mining, IEEE International Conference on, 0:313, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. Lancichinetti, S. Fortunato, and F. Radicchi. Benchmark graphs for testing community detection algorithms. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), 78(4), 2008.Google ScholarGoogle Scholar
  16. D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson. The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. can geographic isolation explain this unique trait? Behavioral Ecology and Sociobiology, 54(4):396--405, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  17. B. McKay. Practical graph isomorphism. Cong. Numerantium, 30:45--87, 1981.Google ScholarGoogle Scholar
  18. R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network motifs: simple building blocks of complex networks. Science, 298(5594):824--827, October 2002.Google ScholarGoogle ScholarCross RefCross Ref
  19. M. Newman. Network data, september, 2009. http://www-personal.umich.edu/~mejn/netdata/.Google ScholarGoogle Scholar
  20. M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45, 2003.Google ScholarGoogle Scholar
  21. O. Sporns and R. Kotter. Motifs in brain networks. PLoS Biology, 2, 2004.Google ScholarGoogle Scholar
  22. S. Wasserman, K. Faust, and D. Iacobucci. Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press, November 1994.Google ScholarGoogle Scholar
  23. D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393(6684):440--442, June 1998.Google ScholarGoogle ScholarCross RefCross Ref
  24. S. Wernicke. Efficient detection of network motifs. IEEE/ACM Trans. Comput. Biol. Bioinformatics, 3(4):347--359, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. S. Wernicke and F. Rasche. Fanmod: a tool for fast network motif detection. Bioinformatics, 22(9):1152--1153, May 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. g-tries: an efficient data structure for discovering network motifs

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in
              • Published in

                cover image ACM Conferences
                SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing
                March 2010
                2712 pages
                ISBN:9781605586397
                DOI:10.1145/1774088

                Copyright © 2010 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 22 March 2010

                Permissions

                Request permissions about this article.

                Request Permissions

                Check for updates

                Qualifiers

                • research-article

                Acceptance Rates

                SAC '10 Paper Acceptance Rate364of1,353submissions,27%Overall Acceptance Rate1,650of6,669submissions,25%

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader