skip to main content
article
Free Access

Algorithms for string searching

Published:01 April 1989Publication History
Skip Abstract Section

Abstract

We survey several algorithms for searching a string in a piece of text. We include theoretical and empirical results, as well as the actual code of each algorithm. An extensive bibliography is also included.

References

  1. {AG86} A. Apostolico and R. Giancarlo. The Boyer-Moore-Galil string searching strategies revisited. SIAM J on Computing, 15:98--105, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. {Aho80} A. V. Aho. Pattern matching in strings. In R. Book, editor, Formal Language Theory: Perspectives and Open Problems, pages 325--347. Academic Press, London, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  3. {AYS85} J. Aoe, Y. Yamamoto, and R. Shimada. An efficient implementation of static string pattern matching machines. In IEEE Int. Conf. on Supercomputing Systems, volume 1, pages 491--498, St. Petersburg, Fla, 1985.Google ScholarGoogle Scholar
  4. {Bar81} G. Barth. An alternative for the implementation of Knuth-Morris-Pratt algorithm. Inf. Proc. Letters, 13:134--137, 1981.Google ScholarGoogle ScholarCross RefCross Ref
  5. {Bar84} G. Barth. An analytical comparison of two string searching algorithms. Inf. Proc. Letters, 18:249--256, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. {BBE+87} A. Blumer, J. Blumer, A. Ehrenfeucht, D. Haussler, and R. McConnell. Completed inverted files for efficient text retrieval and analysis. J.ACM, 34:578--595, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. {BBG+89} O. Berkman, D. Breslauer, Z. Galil, B. Schieber, and U. Vishkin. Highly parellelizable problems. In Proc. 20th ACM Symp. on Theory of Computing, pages 309--319, Seattle, Washington, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. {BBH+85} A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, and J. Seiferas. The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40:31--55, 1985.Google ScholarGoogle ScholarCross RefCross Ref
  9. {BD80} T. A. Bailey and R. G. Dromey. Fast string searching by finding subkeys in subtext. Inf. Proc. Letters, 11:130--133, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  10. {BM77} R. Boyer and S. Moore. A fast string searching algorithm. C.ACM, 20:762--772, 1977. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. {BY89a} R. Baeza-Yates. Improved string searching. Software-Practice and Experience, 19(3):257--271, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. {BY89b} R. A. Baeza-Yates. Efficient Text Searching. PhD thesis, Dept. of Computer Science, University of Waterloo, May 1989. Also as Research Report CS-89-17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. {BY89c} R. A. Baeza-Yates. String searching algorithms revisited. In Workshop in Algorithms and Data Structures, Ottawa, Canada, August 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. {BYG89} R. Baeza-Yates and G. H. Gonnet. A new approach to text searching. In Proc. of 12th SIGIR, Cambridge, Mass., June 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. {CF87} H. Cheng and K. Fu. Vlsi architectures for string matching and pattern matching. Pattern Recognition, 20:125--141, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. {CP88} M. Crochemore and D. Perrin. Pattern matching in strings. In Di Gesu, editor, 4th Conference on Image Analysis and Processing, pages 67--79. Springer Verlag, 1988.Google ScholarGoogle ScholarCross RefCross Ref
  17. {CP89} M. Crochemore and D. Perrin. Two way pattern matching. Technical Report 98-8, L.I.T.P., Univ. Paris 7, Feb 1989. (submitted for publication).Google ScholarGoogle Scholar
  18. {Cro88} M. Crochemore. String matching with constraints. In Mathematical Foundations of Computer Science, Carlsbad, Czechoslovakia, Aug/Sept 1988. Lecture Notes in Computer Science 324. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. {CW79} B. Commentz-Walter. A string matching algorithm fast on the average. In ICALP, volume 6 of Lecture Notes in Computer Science, pages 118--132. Springer-Verlag, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. {DB86} G. Davies and S. Bowsher. Algorithms for pattern matching. Software - Practice and Experience, 16:575--601, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. {Fal85} C. Faloutsos. Access methods for text. ACM C. Surveys, 17:49--74, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. {Gal79} Z. Galil. On improving the worst case running time of the Boyer-Moore string matching algorithm. C. ACM, 22:505--508, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. {Gal81} Z. Galil. String matching in real time. J.ACM, 28:134--149, 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. {Gal85} Z. Galil. Optimal parallel algorithms for string matching. Information and Control, 67:144--157, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. {GBY89} G. H. Gonnet and R. A. Baeza-Yates. An analysis of the Karp-Rabin string matching algorithm. (submitted for publication), April 1989.Google ScholarGoogle Scholar
  26. {GO80} L. Guibas and A. Odlyzko. A new proof of the linearity of the Boyer-Moore string searching algorithm. SIAM J on Computing, 9:672--682, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  27. {Gon83} G. H. Gonnet. Unstructured data bases or very efficient text searching. In ACM PODS, volume 2, pages 117--124, Atlanta, GA, Mar 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. {Gon88} G. H. Gonnet. Private communication, 1988.Google ScholarGoogle Scholar
  29. {GS80} Z. Galil and J. Seiferas. Saving space in fast string-matching. SIAM J on Computing, 9:417--438, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  30. {GS83} Z. Galil and J. Seiferas. Time-space-optimal string matching. JCSS, 26:280--294, 1983.Google ScholarGoogle ScholarCross RefCross Ref
  31. {Har71} M. C. Harrison. Implementation of the substring test by hashing. C.ACM, 14:777--779, 1971. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. {Has80} R. Haskin. Hardware for searching very large text databases. In Workshop Computer Architecture for Non-Numeric Processing, volume 5, pages 49--56, California, 1980. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. {Hol79} L. A. Hollaar. Text retrieval computers. IEEE Computer, 12:40--50, 1979.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. {Hor80} N. Horspool. Practical fast searching in strings. Software - Practice and Experience, 10:501--506, 1980.Google ScholarGoogle Scholar
  35. {IA80} S. Iyengar and V. Alia. A string search algorithm. Appl. Math. Comput., 6:123--131, 1980.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. {KBG87} M. Kemp, R. Bayer, and U. Guntzer. Time optimal left to right construction of position trees. Acta Informatica, 24:461--474, 1987.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. {KLP89} Z. Kedem, G. Landau, and K. Palem. Optimal parallel suffix-prefix matching algorithm and applications. In SPAA '89, Santa Fe, New Mexico, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. {KMP77} D. E. Knuth, J. Morris, and V. Pratt. Fast pattern matching in strings. SIAM J on Computing, 6:323--350, 1977.Google ScholarGoogle ScholarCross RefCross Ref
  39. {KR78} B. Kernighan and D. Ritchie. The C Programming Language. Prentice-Hall, Englewood Cliffs, NJ, 1978. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. {KR87} R. Karp and M. Rabin. Efficient randomized pattern-matching algorithms. IBM J Res. Development, 31:249--260, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. {Li84} M. Li. Lower bounds on string-matching. Technical Report TR-84-636, Dept. of Computer Science, Cornell University, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. {LY86} M. Li and Y. Yesha. String matching cannot be done by a two-head one way deterministic finite automaton. Inf. Proc. Letters, 22:231--235, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. {McC76} E. McCreight. A space-economical suffix tree construction algorithm. J.ACM, 23:262--272, 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. {Mey85} B. Meyer. Incremental string matching. Inf. Proc. Letters, 21:219--227, 1985.Google ScholarGoogle ScholarCross RefCross Ref
  45. {MNS84} P. Moller-Nielsen and J. Staunstrup. Experiments with a fast string searching algorithm. Inf. Proc. Letters, 18:129--135, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. {Mor68} D. Morrison. PATRICIA-Practical algorithm to retrieve information coded in alphanumeric. JACM, 15:514--534, 1968. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. {MP70} J. H. Morris and V. R. Pratt. A linear pattern matching algorithm. Technical Report 40, Computing Center, University of California, Berkeley, 1970.Google ScholarGoogle Scholar
  48. {MR80} M. Majster and A. Reiser. Efficient on-line construction and correction of position trees. SIAM J on Computing, 9:785--807, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  49. {Reg88} M. Regnier. Knuth-Morris-Pratt algorithm: An analysis. INRIA, Rocquencourt, France (unpublished), 1988.Google ScholarGoogle Scholar
  50. {Riv77} R. Rivest. On the worst-case behavior of string-searching algorithms. SIAM J on Computing, 6:669--674, 1977.Google ScholarGoogle ScholarCross RefCross Ref
  51. {Ryt80} W. Rytter. A correct preprocessing algorithm for Boyer-Moore string-searching. SIAM J on Computing, 9:509--512, 1980.Google ScholarGoogle ScholarCross RefCross Ref
  52. {Sch88} R. Schaback. On the expected sublinearity of the Boyer-Moore algorithm. SIAM J on Computing, 17:548--658, 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. {Sed83} R. Sedgewick. Algorithms. Addison-Wesley, Reading, Mass., 1983.Google ScholarGoogle Scholar
  54. {Sli80} A. Slisenko. Determination in real time of all the periodicities in a word. Sov. Math. Dokl., 21:392--395, 1980.Google ScholarGoogle Scholar
  55. {Sli83} A. Slisenko. Detection of periodicities and string-matching in real time. Journal of Soviet Mathematics, 22:1316--1386, 1983.Google ScholarGoogle ScholarCross RefCross Ref
  56. {Smi82} G. V. Smit. A comparison of three string matching algorithms. Software - Practice and Experience, 12:57--66, 1982.Google ScholarGoogle Scholar
  57. {Tak86} T. Takaoka. An on-line pattern matching algorithm. Inf. Proc. Letters, 22:329--330, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. {Vis85} U. Vishkin. Optimal parallel pattern matching in strings. Information and Control, 67:91--113, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. {Wei73} P. Weiner. Linear pattern matching algorithm. In FOCS, volume 14, pages 1--11, 1973.Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. {WKY85} S. Wakabayashi, T. Kikuno, and N. Yoshida. Design of hardware algorithms by recurrence relations. Systems and Computers in Japan, 8:10--17, 1985.Google ScholarGoogle ScholarCross RefCross Ref
  61. {Yao79} A. C. Yao. The complexity of pattern matching for a random string. SIAM J on Computing, 8:368--387, 1979.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Algorithms for string searching

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM SIGIR Forum
          ACM SIGIR Forum  Volume 23, Issue 3-4
          Spring 1989
          131 pages
          ISSN:0163-5840
          DOI:10.1145/74697
          Issue’s Table of Contents

          Copyright © 1989 Author

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 April 1989

          Check for updates

          Qualifiers

          • article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader