Abstract
The optimal transformation of one tree into another by means of elementary edit operations is an important algorithmic problem that has several interesting applications to computational biology. Here we introduce a constrained form of this problem in which a partial mapping of a set of nodes (the "seeds") in one tree to a corresponding set of nodes in the other tree is given, and present efficient algorithms for both ordered and unordered trees. Whereas ordered tree matching based on seeded nodes has applications in pattern matching of RNA structures, unordered tree matching based on seeded nodes has applications in co-speciation and phylogeny reconciliation. The latter involves the solution of the planar tanglegram layout problem, for which a polynomial-time algorithm is given here.
- M.J. Chung, "O(n2.5) Time Algorithms for the Subgraph Homeomorphism Problem on Trees," J. Algorithms, vol. 8, pp. 106-112, 1987. Google ScholarDigital Library
- B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp, and L. Zhang, "On Distances between Phylogenetic Trees," Proc. Eighth Ann. ACM-SIAM Symp. Discrete Algorithms (SODA '97), pp. 427-436, 1997. Google ScholarDigital Library
- J.-F. Dufayard, L. Duret, S. Penel, M. Gouy, F. Rechenmann, and G. Perrière, "Tree Pattern Matching in Phylogenetic Trees: Automatic Search for Orthologs or Paralogs in Homologous Gene Sequence Databases," Bioinformatics, vol. 21, no. 11, pp. 2596-2603, 2005. Google ScholarDigital Library
- J. Felsenstein, "Phylip--Phylogeny Inference Package (Version 3.2)," Cladistics, vol. 5, no. 1, pp. 164-166, 1989.Google Scholar
- U. Fößmeier and M. Kaufmann, "Nice Drawings for Planar Bipartite Graphs," Proc. Third Italian Conf. Algorithms and Complexity (CIAC '97), vol. 1203, pp. 122-134, 1997. Google ScholarDigital Library
- J.A. Gallian, "A Dynamic Survey of Graph Labeling," Electronic J. Combinatorics, no. DS7, http://www.combinatorics.org/Surveys/, 2007.Google Scholar
- K.J. Gardiner, T.L. Marsh, and N.R. Pace, "Ion Dependence of the Bacillus Subtilis RNase P Reaction," J. Biological Chemistry, vol. 260, no. 9, pp. 5415-5419, 1985.Google Scholar
- E.S. Haas and J.W. Brown, "Evolutionary Variation in Bacterial RNase P RNAs," Nucleic Acids Research, vol. 26, no. 18, pp. 4093-4099, 1998.Google ScholarCross Ref
- J.K. Harris, E.S. Haas, D. Williams, D.N. Frank, and J.W. Brown, "New Insight into RNase P RNA Structure from Comparative Analysis of the Archaeal RNA," RNA, vol. 7, no. 2, pp. 220-232, 2001.Google ScholarCross Ref
- P. Hugenholtz, "Exploring Prokaryotic Diversity in the Genomic Era," Genome Biology, vol. 3, no. 2, pp. reviews0003.1- reviews0003.8, 2002.Google Scholar
- B.D. James, G.J. Olsen, J. Liu, and N.R. Pace, "The Secondary Structure of Ribonuclease P RNA, the Catalytic Element of a Ribonucleoprotein Enzyme," Cell, vol. 52, no. 1, pp. 19-26, 1988.Google ScholarCross Ref
- J. Jansson, N.T. Hieu, and W.-K. Sung, "Local Gapped Subforest Alignment and Its Application in Finding RNA Structural Motifs," J. Computational Biology, vol. 13, no. 3, pp. 702-718, 2006.Google ScholarCross Ref
- S.-Y. Le, R. Nussinov, and J.V. Maizel, "Tree Graphs of RNA Secondary Structures and Their Comparisons," Computers in Biomedical Research, vol. 22, no. 5, pp. 461-473, 1989. Google ScholarDigital Library
- A. Lozano, R. Pinter, O. Rokhlenko, G. Valiente, and M. Ziv-Ukelson, "Seeded Tree Alignment and Planar Tanglegram Layout," Proc. Seventh Workshop Algorithms in Bioinformatics (WABI '07), vol. 4645, pp. 98-110, 2007. Google ScholarDigital Library
- B.L. Maidak, J.R. Cole, T.G. Lilburn, C.T. Parker, P.R. Saxman, J.M. Stredwick, G.M. Garrity, B. Li, G.J. Olsen, S. Pramanik, T.M. Schmidt, and J.M. Tiedje, "The RDP (Ribosomal Database Project) Continues," Nucleic Acids Research, vol. 28, pp. 173-174, 2000.Google ScholarCross Ref
- D.W. Matula, "An Algorithm for Subtree Identification," SIAM Rev., vol. 10, pp. 273-274, 1968.Google Scholar
- D. Matula, "Subtree Isomorphism in O(n5/2)," Annals Discrete Math., vol. 2, pp. 91-106, 1978.Google ScholarCross Ref
- T.M. Nye, P. Lio, and W.R. Gilks, "A Novel Algorithm and Web-Based Tool for Comparing Two Alternative Phylogenetic Trees," Bioinformatics, vol. 22, no. 1, pp. 117-119, 2006. Google ScholarDigital Library
- N.R. Pace and J.W. Brown, "Evolutionary Perspective on the Structure and Function of Ribonuclease P, A Ribozyme," J. Bacteriology, vol. 177, no. 8, pp. 1919-1928, 1995.Google ScholarCross Ref
- R.D.M. Page, ed., Tangled Trees: Phylogeny, Cospeciation, and Coevolution. The Univ. of Chicago Press, 2002.Google Scholar
- R.D.M. Page and G. Valiente, "An Edit Script for Taxonomic Classifications," BMC Bioinformatics, vol. 6, p. 208, 2005.Google ScholarCross Ref
- R.Y. Pinter, O. Rokhlenko, D. Tsur, and M. Ziv-Ukelson, "Approximate Labelled Subtree Homeomorphism," Proc. 15th Ann. Symp. Combinatorial Pattern Matching (CPM '04), vol. 3109, pp. 59-73, 2004.Google ScholarCross Ref
- S.W. Reyner, "An Analysis of a Good Algorithm for the Subtree Problem," SIAM J. Computing, vol. 6, no. 4, pp. 730-732, 1977.Google ScholarCross Ref
- R. Shamir and D. Tsur, "Faster Subtree Isomorphism," J. Algorithms, vol. 33, no. 2, pp. 267-280, 1999. Google ScholarDigital Library
- H. Shan, K.G. Herbert, W.H. Piel, D. Shasha, and J.T.L. Wang, "A Structure-Based Search Engine for Phylogenetic Databases," Proc. 14th Int'l Conf. Scientific and Statistical Database Management (SSDBM '02), pp. 7-10, 2002. Google ScholarDigital Library
- B.A. Shapiro and K. Zhang, "Comparing Multiple RNA Secondary Structures Using Tree Comparisons," Computer Applications in the Biosciences, vol. 6, no. 4, pp. 309-318, 1990.Google Scholar
- D. Shasha, J.T.-L. Wang, and R. Giugno, "Algorithmics and Applications of Tree and Graph Searching," Proc. 21st ACM SIGACT-SIGMOD-SIGART Symp. Principles of Database Systems (PODS '02), pp. 39-52, 2002. Google ScholarDigital Library
- G. Valiente, Algorithms on Trees and Graphs. Springer, 2002. Google ScholarDigital Library
- G. Valiente, "Constrained Tree Inclusion," J. Discrete Algorithms, vol. 3, nos. 2-4, pp. 431-447, 2005.Google ScholarCross Ref
- G. Valiente, "A Fast Algorithmic Technique for Comparing Large Phylogenetic Trees," Proc. 12th Int'l Symp. String Processing and Information Retrieval (SPIRE '05), vol. 3772, pp. 371-376, 2005. Google ScholarDigital Library
- C.R. Woese and N.R. Pace, "Probing RNA Structure, Function, and History by Comparative Analysis," The RNA World, R.F. Gesteland and J.F. Atkins, eds., pp. 91-117, Cold Spring Harbor Laboratory Press, 1993.Google Scholar
- W.N.W. Zainon and P. Calder, "Visualizing Phylogenetic Trees," Proc. Seventh Australasian User Interface Conf. (AUIC '06), pp. 145-152, 2006. Google ScholarDigital Library
- K. Zhang, L. Wang, and B. Ma, "Computing Similarity between RNA Structures," Proc. 10th Ann. Symp. Combinatorial Pattern Matching (CPM '99), vol. 1645, pp. 281-293, 1999. Google ScholarDigital Library
Index Terms
- Seeded Tree Alignment
Recommendations
Constructing a minimum height elimination tree of a tree in linear time
Given a graph, finding an optimal vertex ranking and constructing a minimum height elimination tree are two related problems. However, an optimal vertex ranking does not by itself provide enough information to construct an elimination tree of minimum ...
A Fast Algorithm for Computing Geodesic Distances in Tree Space
Comparing and computing distances between phylogenetic trees are important biological problems, especially for models where edge lengths play an important role. The geodesic distance measure between two phylogenetic trees with edge lengths is the length ...
Seeded tree alignment and planar tanglegram layout
WABI'07: Proceedings of the 7th international conference on Algorithms in BioinformaticsThe optimal transformation of one tree into another by means of elementary edit operations is an important algorithmic problem that has several interesting applications to computational biology. We introduce a constrained form of this problem in which a ...
Comments