
40 years of suffix trees

Published: 23 March 2016

Abstract

Tracing the first four decades in the life of suffix trees, their many incarnations, and their applications.




          Published in

            Communications of the ACM, Volume 59, Issue 4
            April 2016
            87 pages
            ISSN: 0001-0782
            EISSN: 1557-7317
            DOI: 10.1145/2907055
            Editor: Moshe Y. Vardi

            Copyright © 2016 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Qualifiers

            • review-article
            • Popular
            • Refereed
