DOI: 10.3115/1072228.1072282

Efficient support vector classifiers for named entity recognition

Published: 24 August 2002

ABSTRACT

Named Entity (NE) recognition is a task in which proper nouns and numerical information are extracted from documents and are classified into categories such as person, organization, and date. It is a key technology of Information Extraction and Open-Domain Question Answering. First, we show that an NE recognizer based on Support Vector Machines (SVMs) gives better scores than conventional systems. However, off-the-shelf SVM classifiers are too inefficient for this task. Therefore, we present a method that makes the system substantially faster. This approach can also be applied to other similar tasks such as chunking and part-of-speech tagging. We also present an SVM-based feature selection method and an efficient training method.
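
The abstract does not say how the speed-up is obtained. As a rough illustration of why an off-the-shelf kernel SVM can be slow at classification time, and of one well-known way to avoid the cost, the sketch below compares a naive decision function (one kernel evaluation per support vector) with a precomputed expansion of the quadratic kernel (1 + x·z)^2 over binary features. This is not presented as the authors' method; the function names, the feature strings, and the coefficients are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def kernel_decision(support_vectors, coeffs, bias, x):
    """Naive evaluation: one kernel computation per support vector.

    Each example is a set of active binary features, so the dot
    product of two examples is the size of their intersection.
    """
    total = bias
    for sv, a in zip(support_vectors, coeffs):
        dot = len(sv & x)
        total += a * (1 + dot) ** 2
    return total


def expand_quadratic(support_vectors, coeffs, bias):
    """Precompute constant, per-feature and per-pair weights.

    For binary features, (1 + d)^2 = 1 + 2d + d^2, and d^2 splits into
    a diagonal part (one term per shared feature) and twice the sum
    over unordered pairs of shared features.
    """
    const = bias + sum(coeffs)
    linear = defaultdict(float)
    pair = defaultdict(float)
    for sv, a in zip(support_vectors, coeffs):
        feats = sorted(sv)
        for i, j in enumerate(feats):
            linear[j] += 3 * a          # 2a from 2d, plus a from the diagonal of d^2
            for k in feats[i + 1:]:
                pair[(j, k)] += 2 * a   # covers both (j, k) and (k, j)
    return const, dict(linear), dict(pair)


def expanded_decision(const, linear, pair, x):
    """Evaluate using only the active features of x, not the support vectors."""
    feats = sorted(x)
    total = const
    for i, j in enumerate(feats):
        total += linear.get(j, 0.0)
        for k in feats[i + 1:]:
            total += pair.get((j, k), 0.0)
    return total


# Hypothetical toy model: three support vectors with made-up coefficients.
svs = [{"w=Tokyo", "pos=NNP"}, {"w=in", "pos=IN"}, {"w=Tokyo", "cap"}]
coeffs = [0.5, -0.3, 0.8]   # stands in for alpha_i * y_i
bias = -0.1
x = {"w=Tokyo", "cap", "pos=NNP"}

slow = kernel_decision(svs, coeffs, bias, x)
fast = expanded_decision(*expand_quadratic(svs, coeffs, bias), x)
```

Both functions return the same score; the expanded form costs time proportional to the square of the number of active features in the input, independent of the number of support vectors, at the price of storing the pairwise weights.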

Published in: COLING '02: Proceedings of the 19th International Conference on Computational Linguistics - Volume 1, August 2002, 1184 pages
Publisher: Association for Computational Linguistics, United States
