DOI: 10.3115/1220175.1220316
Article — Free Access

An effective two-stage model for exploiting non-local dependencies in named entity recognition

Published: 17 July 2006

ABSTRACT

This paper shows that a simple two-stage approach to handling non-local dependencies in Named Entity Recognition (NER) can outperform existing approaches that handle non-local dependencies, while being far more computationally efficient. NER systems typically use sequence models for tractable inference, which leaves them unable to capture the long-distance structure present in text. We first use a Conditional Random Field (CRF) based NER system with local features to make predictions, and then train a second CRF that uses both local information and features extracted from the output of the first CRF. Using features that capture non-local dependencies within the same document, our approach yields a 12.6% relative error reduction in F1 score over state-of-the-art NER systems that use local information alone, compared to the 9.3% relative error reduction offered by the best existing systems that exploit non-local information. Our approach also makes it easy to incorporate non-local information from other documents in the test corpus, which gives a 13.3% error reduction over NER systems using local information alone. Additionally, our inference time is just that of two sequential CRFs, which is much less than that of approaches that directly model the non-local dependencies and perform approximate inference.
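The second stage described above can be illustrated with a small sketch. This is not the authors' code: the helper names are hypothetical, and the document-level token-majority feature shown is one simplified example of the kind of aggregate feature the abstract describes (features extracted from the first CRF's output, combined with local features for the second CRF).

```python
from collections import Counter

def token_majority_labels(tokens, first_stage_labels):
    """For each token type, find the label the first-stage CRF assigned
    to it most often anywhere in the document (a non-local feature)."""
    votes = {}
    for tok, lab in zip(tokens, first_stage_labels):
        votes.setdefault(tok, Counter())[lab] += 1
    return {tok: counts.most_common(1)[0][0] for tok, counts in votes.items()}

def second_stage_features(tokens, first_stage_labels):
    """Combine local features with aggregates of the first CRF's output;
    the second CRF would be trained on these richer feature vectors."""
    majority = token_majority_labels(tokens, first_stage_labels)
    return [
        {
            "word": tok,                           # local feature
            "first_stage": first_stage_labels[i],  # first CRF's prediction here
            "doc_majority": majority[tok],         # non-local document consensus
        }
        for i, tok in enumerate(tokens)
    ]

# The first CRF labels "Rabat" inconsistently across mentions; the
# doc_majority feature lets the second CRF restore label consistency.
tokens = ["Rabat", "said", "Rabat", "beat", "Rabat"]
labels = ["LOC", "O", "ORG", "O", "LOC"]
feats = second_stage_features(tokens, labels)
```

In the full system, each feature dictionary would feed a sequential CRF trained as usual; only the aggregation step, which is what makes the second stage non-local, is shown here.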


Published in
ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics
July 2006, 1214 pages

Publisher: Association for Computational Linguistics, United States

Overall acceptance rate: 85 of 443 submissions, 19%
