ABSTRACT
Named Entity (NE) recognition is a task in which proper nouns and numerical information are extracted from documents and classified into categories such as person, organization, and date. It is a key technology for Information Extraction and Open-Domain Question Answering. First, we show that an NE recognizer based on Support Vector Machines (SVMs) gives better scores than conventional systems. However, off-the-shelf SVM classifiers are too inefficient for this task. Therefore, we present a method that makes the system substantially faster. This approach can also be applied to similar tasks such as chunking and part-of-speech tagging. We also present an SVM-based feature selection method and an efficient training method.
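The abstract does not say where the inefficiency of off-the-shelf SVM classifiers comes from. As general background, and not as the method of this paper, a kernel SVM scores each input by evaluating the kernel against every support vector, so classification cost grows with the number of support vectors. The sketch below illustrates this under assumptions: the function names, the quadratic polynomial kernel, and the toy vectors are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def poly_kernel(x: Sequence[float], z: Sequence[float], degree: int = 2) -> float:
    """Polynomial kernel (1 + x.z)^degree; the degree is an assumed example value."""
    dot = sum(a * b for a, b in zip(x, z))
    return (1.0 + dot) ** degree


def decision_value(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    coeffs: Sequence[float],
    bias: float,
) -> float:
    """Standard kernel SVM decision function: sum_i coeff_i * K(sv_i, x) + bias.

    Each coeff_i stands for alpha_i * y_i. The loop performs one kernel
    evaluation per support vector, so scoring a single input costs
    O(number of support vectors * feature dimension).
    """
    total = bias
    for sv, c in zip(support_vectors, coeffs):
        total += c * poly_kernel(sv, x)
    return total


# Toy model with two support vectors (hypothetical values).
svs = [(1.0, 0.0), (0.0, 1.0)]
coeffs = [1.0, -1.0]
score = decision_value((1.0, 0.0), svs, coeffs, bias=0.0)
label = 1 if score >= 0 else -1
```

In a tagging task, this scoring is repeated for every token and, in a multi-class setup, for every binary classifier, which is one plausible reason per-token cost matters here.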
- Efficient support vector classifiers for named entity recognition