research-article

g-tries: an efficient data structure for discovering network motifs

Authors:
Pedro Ribeiro

Universidade do Porto, R. Campo Alegre, Porto, Portugal

Universidade do Porto, R. Campo Alegre, Porto, Portugal
View Profile

,
Fernando Silva

Universidade do Porto, R. Campo Alegre, Porto, Portugal

Universidade do Porto, R. Campo Alegre, Porto, Portugal
View Profile

SAC '10: Proceedings of the 2010 ACM Symposium on Applied ComputingMarch 2010Pages 1559–1566https://doi.org/10.1145/1774088.1774422

Published:22 March 2010Publication History

SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing

Pages 1559–1566

ABSTRACT

In this paper we propose a novel specialized data structure that we call g-trie, designed to deal with collections of subgraphs. The main conceptual idea is akin to a prefix tree in the sense that we take advantage of common topology by constructing a multiway tree where the descendants of a node share a common substructure. We give algorithms to construct a g-trie, to list all stored subgraphs, and to find occurrences on another graph of the subgraphs stored in the g-trie. We evaluate the implementation of this structure and its associated algorithms on a set of representative benchmark biological networks in order to find network motifs. To assess the efficiency of our algorithms we compare their performance with other known network motif algorithms also implemented in the same common platform. Our results show that indeed, g-tries are a feasible, adequate and very efficient data structure for network motifs discovery, clearly outperforming previous algorithms and data structures.

References

R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pages 487--499, 1994. Google ScholarDigital Library
I. Albert and R. Albert. Conserved network motifs allow protein-protein interaction prediction. Bioinformatics, 20(18):3346--3352, December 2004. Google ScholarDigital Library
V. Batagelj and A. Mrvar. Pajek datasets, 2006. http://vlado.fmf.uni-lj.si/pub/networks/data/.Google Scholar
D. Bu, Y. Zhao, L. Cai, H. Xue, X. Zhu, H. Lu, J. Zhang, S. Sun, L. Ling, N. Zhang, G. Li, and R. Chen. Topological structure analysis of the protein-protein interaction network in budding yeast. Nucl. Acids Res., 31(9):2443--2450, May 2003.Google ScholarCross Ref
D. Chakrabarti and C. Faloutsos. Graph mining: Laws, generators, and algorithms. ACM Computing Surveys, 38:1, 2006. Google ScholarDigital Library
L. da F. Costa, F. A. Rodrigues, G. Travieso, and P. R. V. Boas. Characterization of complex networks: A survey of measurements. Advances In Physics, 56:167, 2007.Google ScholarCross Ref
R. Dobrin, Q. K. Beg, A. Barabasi, and Z. Oltvai. Aggregation of topological motifs in the escherichia coli transcriptional regulatory network. BMC Bioinformatics, 5:10, 2004.Google ScholarCross Ref
E. Fredkin. Trie memory. Commun. ACM, 3(9):490--499, September 1960. Google ScholarDigital Library
J. Grochow and M. Kellis. Network motif discovery using subgraph enumeration and symmetry-breaking. pages 92--106. 2007.Google Scholar
J. Han and M. Kamber. Data Mining, Second Edition: Concepts and Techniques. Morgan Kaufmann, September 2006. Google ScholarDigital Library
S. Itzkovitz, R. Levitt, N. Kashtan, R. Milo, M. Itzkovitz, and U. Alon. Coarse-graining and self-dissimilarity of complex networks. Phys Rev E Stat Nonlin Soft Matter Phys, 71(1 Pt 2), January 2005.Google Scholar
N. Kashtan, S. Itzkovitz, R. Milo, and U. Alon. Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics, 20(11):1746--1758, 2004. Google ScholarDigital Library
M. Kondoh. Building trophic modules into a persistent food web. Proceedings of the National Academy of Sciences, 105(43):16631--16635, 2008.Google ScholarCross Ref
M. Kuramochi and G. Karypis. Frequent subgraph discovery. Data Mining, IEEE International Conference on, 0:313, 2001. Google ScholarDigital Library
A. Lancichinetti, S. Fortunato, and F. Radicchi. Benchmark graphs for testing community detection algorithms. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), 78(4), 2008.Google Scholar
D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson. The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. can geographic isolation explain this unique trait? Behavioral Ecology and Sociobiology, 54(4):396--405, 2003.Google ScholarCross Ref
B. McKay. Practical graph isomorphism. Cong. Numerantium, 30:45--87, 1981.Google Scholar
R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network motifs: simple building blocks of complex networks. Science, 298(5594):824--827, October 2002.Google ScholarCross Ref
M. Newman. Network data, september, 2009. http://www-personal.umich.edu/~mejn/netdata/.Google Scholar
M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45, 2003.Google Scholar
O. Sporns and R. Kotter. Motifs in brain networks. PLoS Biology, 2, 2004.Google Scholar
S. Wasserman, K. Faust, and D. Iacobucci. Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences). Cambridge University Press, November 1994.Google Scholar
D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393(6684):440--442, June 1998.Google ScholarCross Ref
S. Wernicke. Efficient detection of network motifs. IEEE/ACM Trans. Comput. Biol. Bioinformatics, 3(4):347--359, 2006. Google ScholarDigital Library
S. Wernicke and F. Rasche. Fanmod: a tool for fast network motif detection. Bioinformatics, 22(9):1152--1153, May 2006. Google ScholarDigital Library

Index Terms

Recommendations

Towards a faster network-centric subgraph census
ASONAM '13: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Determining the frequency of small subgraphs is an important computational task lying at the core of several graph mining methodologies, such as network motifs discovery or graphlet based measurements. In this paper we try to improve a class of ...
Read More
G-Tries: a data structure for storing and finding subgraphs

The ability to find and count subgraphs of a given network is an important non trivial task with multidisciplinary applicability. Discovering network motifs or computing graphlet signatures are two examples of methodologies that at their core rely ...
Read More
Querying subgraph sets with g-tries
DBSocial '12: Proceedings of the 2nd ACM SIGMOD Workshop on Databases and Social Networks

In this paper we present an universal methodology for finding all the occurrences of a given set of subgraphs in one single larger graph. Past approaches would either enumerate all possible subgraphs of a certain size or query a single subgraph. We use ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing
March 2010
2712 pages
ISBN:9781605586397
DOI:10.1145/1774088
Conference Chairs:
Sung Y. Shin
South Dakota State University
,
Sascha Ossowski
University Rey Juan Carlos, Spain
,
Michael Schumacher
University of Applied Sciences Western Switzerland, Switzerland
,
Program Chairs:
Mathew J. Palakal
Indiana University Purdue University
,
Chih-Cheng Hung
Southern Polytechnic State University
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 March 2010
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
algorithms
biological networks
complex networks
data structures
graph mining
network motifs
tries
Qualifiers
- research-article
Conference

Acceptance Rates
SAC '10 Paper Acceptance Rate364of1,353submissions,27%Overall Acceptance Rate1,650of6,669submissions,25%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 72
  Total Citations
  View Citations
- 734
  Total Downloads
- Downloads (Last 12 months)59
- Downloads (Last 6 weeks)8
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

g-tries: an efficient data structure for discovering network motifs

SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing

ABSTRACT

References

Cited By

Index Terms

Recommendations

Towards a faster network-centric subgraph census

G-Tries: a data structure for storing and finding subgraphs

Querying subgraph sets with g-tries