Abstract
We survey several algorithms for searching a string in a piece of text. We include theoretical and empirical results, as well as the actual code of each algorithm. An extensive bibliography is also included.
- {AG86} A. Apostolico and R. Giancarlo. The Boyer-Moore-Galil string searching strategies revisited. SIAM J on Computing, 15:98--105, 1986. Google ScholarDigital Library
- {Aho80} A. V. Aho. Pattern matching in strings. In R. Book, editor, Formal Language Theory: Perspectives and Open Problems, pages 325--347. Academic Press, London, 1980.Google ScholarCross Ref
- {AYS85} J. Aoe, Y. Yamamoto, and R. Shimada. An efficient implementation of static string pattern matching machines. In IEEE Int. Conf. on Supercomputing Systems, volume 1, pages 491--498, St. Petersburg, Fla, 1985.Google Scholar
- {Bar81} G. Barth. An alternative for the implementation of Knuth-Morris-Pratt algorithm. Inf. Proc. Letters, 13:134--137, 1981.Google ScholarCross Ref
- {Bar84} G. Barth. An analytical comparison of two string searching algorithms. Inf. Proc. Letters, 18:249--256, 1984. Google ScholarDigital Library
- {BBE+87} A. Blumer, J. Blumer, A. Ehrenfeucht, D. Haussler, and R. McConnell. Completed inverted files for efficient text retrieval and analysis. J.ACM, 34:578--595, 1987. Google ScholarDigital Library
- {BBG+89} O. Berkman, D. Breslauer, Z. Galil, B. Schieber, and U. Vishkin. Highly parellelizable problems. In Proc. 20th ACM Symp. on Theory of Computing, pages 309--319, Seattle, Washington, 1989. Google ScholarDigital Library
- {BBH+85} A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, and J. Seiferas. The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40:31--55, 1985.Google ScholarCross Ref
- {BD80} T. A. Bailey and R. G. Dromey. Fast string searching by finding subkeys in subtext. Inf. Proc. Letters, 11:130--133, 1980.Google ScholarCross Ref
- {BM77} R. Boyer and S. Moore. A fast string searching algorithm. C.ACM, 20:762--772, 1977. Google ScholarDigital Library
- {BY89a} R. Baeza-Yates. Improved string searching. Software-Practice and Experience, 19(3):257--271, 1989. Google ScholarDigital Library
- {BY89b} R. A. Baeza-Yates. Efficient Text Searching. PhD thesis, Dept. of Computer Science, University of Waterloo, May 1989. Also as Research Report CS-89-17. Google ScholarDigital Library
- {BY89c} R. A. Baeza-Yates. String searching algorithms revisited. In Workshop in Algorithms and Data Structures, Ottawa, Canada, August 1989. Google ScholarDigital Library
- {BYG89} R. Baeza-Yates and G. H. Gonnet. A new approach to text searching. In Proc. of 12th SIGIR, Cambridge, Mass., June 1989. Google ScholarDigital Library
- {CF87} H. Cheng and K. Fu. Vlsi architectures for string matching and pattern matching. Pattern Recognition, 20:125--141, 1987. Google ScholarDigital Library
- {CP88} M. Crochemore and D. Perrin. Pattern matching in strings. In Di Gesu, editor, 4th Conference on Image Analysis and Processing, pages 67--79. Springer Verlag, 1988.Google ScholarCross Ref
- {CP89} M. Crochemore and D. Perrin. Two way pattern matching. Technical Report 98-8, L.I.T.P., Univ. Paris 7, Feb 1989. (submitted for publication).Google Scholar
- {Cro88} M. Crochemore. String matching with constraints. In Mathematical Foundations of Computer Science, Carlsbad, Czechoslovakia, Aug/Sept 1988. Lecture Notes in Computer Science 324. Google ScholarDigital Library
- {CW79} B. Commentz-Walter. A string matching algorithm fast on the average. In ICALP, volume 6 of Lecture Notes in Computer Science, pages 118--132. Springer-Verlag, 1979. Google ScholarDigital Library
- {DB86} G. Davies and S. Bowsher. Algorithms for pattern matching. Software - Practice and Experience, 16:575--601, 1986. Google ScholarDigital Library
- {Fal85} C. Faloutsos. Access methods for text. ACM C. Surveys, 17:49--74, 1985. Google ScholarDigital Library
- {Gal79} Z. Galil. On improving the worst case running time of the Boyer-Moore string matching algorithm. C. ACM, 22:505--508, 1979. Google ScholarDigital Library
- {Gal81} Z. Galil. String matching in real time. J.ACM, 28:134--149, 1981. Google ScholarDigital Library
- {Gal85} Z. Galil. Optimal parallel algorithms for string matching. Information and Control, 67:144--157, 1985. Google ScholarDigital Library
- {GBY89} G. H. Gonnet and R. A. Baeza-Yates. An analysis of the Karp-Rabin string matching algorithm. (submitted for publication), April 1989.Google Scholar
- {GO80} L. Guibas and A. Odlyzko. A new proof of the linearity of the Boyer-Moore string searching algorithm. SIAM J on Computing, 9:672--682, 1980.Google ScholarCross Ref
- {Gon83} G. H. Gonnet. Unstructured data bases or very efficient text searching. In ACM PODS, volume 2, pages 117--124, Atlanta, GA, Mar 1983. Google ScholarDigital Library
- {Gon88} G. H. Gonnet. Private communication, 1988.Google Scholar
- {GS80} Z. Galil and J. Seiferas. Saving space in fast string-matching. SIAM J on Computing, 9:417--438, 1980.Google ScholarCross Ref
- {GS83} Z. Galil and J. Seiferas. Time-space-optimal string matching. JCSS, 26:280--294, 1983.Google ScholarCross Ref
- {Har71} M. C. Harrison. Implementation of the substring test by hashing. C.ACM, 14:777--779, 1971. Google ScholarDigital Library
- {Has80} R. Haskin. Hardware for searching very large text databases. In Workshop Computer Architecture for Non-Numeric Processing, volume 5, pages 49--56, California, 1980. Google ScholarDigital Library
- {Hol79} L. A. Hollaar. Text retrieval computers. IEEE Computer, 12:40--50, 1979.Google ScholarDigital Library
- {Hor80} N. Horspool. Practical fast searching in strings. Software - Practice and Experience, 10:501--506, 1980.Google Scholar
- {IA80} S. Iyengar and V. Alia. A string search algorithm. Appl. Math. Comput., 6:123--131, 1980.Google ScholarDigital Library
- {KBG87} M. Kemp, R. Bayer, and U. Guntzer. Time optimal left to right construction of position trees. Acta Informatica, 24:461--474, 1987.Google ScholarDigital Library
- {KLP89} Z. Kedem, G. Landau, and K. Palem. Optimal parallel suffix-prefix matching algorithm and applications. In SPAA '89, Santa Fe, New Mexico, 1989. Google ScholarDigital Library
- {KMP77} D. E. Knuth, J. Morris, and V. Pratt. Fast pattern matching in strings. SIAM J on Computing, 6:323--350, 1977.Google ScholarCross Ref
- {KR78} B. Kernighan and D. Ritchie. The C Programming Language. Prentice-Hall, Englewood Cliffs, NJ, 1978. Google ScholarDigital Library
- {KR87} R. Karp and M. Rabin. Efficient randomized pattern-matching algorithms. IBM J Res. Development, 31:249--260, 1987. Google ScholarDigital Library
- {Li84} M. Li. Lower bounds on string-matching. Technical Report TR-84-636, Dept. of Computer Science, Cornell University, 1984. Google ScholarDigital Library
- {LY86} M. Li and Y. Yesha. String matching cannot be done by a two-head one way deterministic finite automaton. Inf. Proc. Letters, 22:231--235, 1986. Google ScholarDigital Library
- {McC76} E. McCreight. A space-economical suffix tree construction algorithm. J.ACM, 23:262--272, 1976. Google ScholarDigital Library
- {Mey85} B. Meyer. Incremental string matching. Inf. Proc. Letters, 21:219--227, 1985.Google ScholarCross Ref
- {MNS84} P. Moller-Nielsen and J. Staunstrup. Experiments with a fast string searching algorithm. Inf. Proc. Letters, 18:129--135, 1984. Google ScholarDigital Library
- {Mor68} D. Morrison. PATRICIA-Practical algorithm to retrieve information coded in alphanumeric. JACM, 15:514--534, 1968. Google ScholarDigital Library
- {MP70} J. H. Morris and V. R. Pratt. A linear pattern matching algorithm. Technical Report 40, Computing Center, University of California, Berkeley, 1970.Google Scholar
- {MR80} M. Majster and A. Reiser. Efficient on-line construction and correction of position trees. SIAM J on Computing, 9:785--807, 1980.Google ScholarCross Ref
- {Reg88} M. Regnier. Knuth-Morris-Pratt algorithm: An analysis. INRIA, Rocquencourt, France (unpublished), 1988.Google Scholar
- {Riv77} R. Rivest. On the worst-case behavior of string-searching algorithms. SIAM J on Computing, 6:669--674, 1977.Google ScholarCross Ref
- {Ryt80} W. Rytter. A correct preprocessing algorithm for Boyer-Moore string-searching. SIAM J on Computing, 9:509--512, 1980.Google ScholarCross Ref
- {Sch88} R. Schaback. On the expected sublinearity of the Boyer-Moore algorithm. SIAM J on Computing, 17:548--658, 1988. Google ScholarDigital Library
- {Sed83} R. Sedgewick. Algorithms. Addison-Wesley, Reading, Mass., 1983.Google Scholar
- {Sli80} A. Slisenko. Determination in real time of all the periodicities in a word. Sov. Math. Dokl., 21:392--395, 1980.Google Scholar
- {Sli83} A. Slisenko. Detection of periodicities and string-matching in real time. Journal of Soviet Mathematics, 22:1316--1386, 1983.Google ScholarCross Ref
- {Smi82} G. V. Smit. A comparison of three string matching algorithms. Software - Practice and Experience, 12:57--66, 1982.Google Scholar
- {Tak86} T. Takaoka. An on-line pattern matching algorithm. Inf. Proc. Letters, 22:329--330, 1986. Google ScholarDigital Library
- {Vis85} U. Vishkin. Optimal parallel pattern matching in strings. Information and Control, 67:91--113, 1985. Google ScholarDigital Library
- {Wei73} P. Weiner. Linear pattern matching algorithm. In FOCS, volume 14, pages 1--11, 1973.Google ScholarDigital Library
- {WKY85} S. Wakabayashi, T. Kikuno, and N. Yoshida. Design of hardware algorithms by recurrence relations. Systems and Computers in Japan, 8:10--17, 1985.Google ScholarCross Ref
- {Yao79} A. C. Yao. The complexity of pattern matching for a random string. SIAM J on Computing, 8:368--387, 1979.Google ScholarCross Ref
Index Terms
- Algorithms for string searching
Recommendations
A fast string searching algorithm
An algorithm is presented that searches for the location, “il” of the first occurrence of a character string, “pat,” in another string, “string.” During the search operation, the characters of pat are matched starting with the last character of pat. The ...
Efficient String Matching Algorithm for Searching Large DNA and Binary Texts
The exact string matching is essential in application areas such as Bioinformatics and Intrusion Detection Systems. Speeding-up the string matching algorithm will therefore result in accelerating the searching process in DNA and binary data. Previously, ...
Comments