Abstract
Tracing the first four decades in the life of suffix trees, their many incarnations, and their applications.