DOI: 10.5555/1870658.1870706

A multi-pass sieve for coreference resolution

Published: 09 October 2010

ABSTRACT

Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier's entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
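
The architecture described in the abstract can be illustrated with a minimal sketch. It assumes mentions are already detected and applies a list of deterministic passes in order, from highest to lowest precision, with each pass working on the clusters left by the previous one and with gender and number pooled per cluster. The two passes shown (`exact_match` and `pronoun_match`), the `Mention` and `Clusters` types, and the nearest-antecedent search order are hypothetical choices made for illustration; the abstract does not list the actual tiers or data structures.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Mention:
    text: str
    gender: Optional[str] = None   # None means "unknown"
    number: Optional[str] = None
    is_pronoun: bool = False


class Clusters:
    """Union-find over mention indices, with attributes pooled per cluster."""

    def __init__(self, mentions: List[Mention]):
        self.mentions = mentions
        self.parent = list(range(len(mentions)))
        # Each cluster root holds the set of known values for each attribute.
        self.attrs = [
            {"gender": {m.gender} - {None}, "number": {m.number} - {None}}
            for m in mentions
        ]

    def find(self, i: int) -> int:
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def merge(self, i: int, j: int) -> None:
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return
        self.parent[rj] = ri
        for key in self.attrs[ri]:
            # Attribute sharing: the merged cluster knows everything both knew.
            self.attrs[ri][key] |= self.attrs[rj][key]

    def compatible(self, i: int, j: int) -> bool:
        a, b = self.attrs[self.find(i)], self.attrs[self.find(j)]
        # An attribute is compatible if either side is unknown or they overlap.
        return all(not a[k] or not b[k] or bool(a[k] & b[k]) for k in a)

    def groups(self) -> List[List[int]]:
        by_root = {}
        for i in range(len(self.mentions)):
            by_root.setdefault(self.find(i), []).append(i)
        return sorted(by_root.values())


Pass = Callable[[Clusters, int, int], bool]


def exact_match(c: Clusters, i: int, j: int) -> bool:
    """Hypothetical high-precision pass: identical non-pronoun strings."""
    mi, mj = c.mentions[i], c.mentions[j]
    return not mi.is_pronoun and mi.text.lower() == mj.text.lower()


def pronoun_match(c: Clusters, i: int, j: int) -> bool:
    """Hypothetical low-precision pass: pronoun agrees with cluster attributes."""
    return c.mentions[i].is_pronoun and c.compatible(i, j)


def run_sieve(mentions: List[Mention], passes: List[Pass]) -> Clusters:
    c = Clusters(mentions)
    for p in passes:                       # highest-precision pass first
        for i in range(len(mentions)):
            for j in range(i - 1, -1, -1):  # nearest antecedent first
                if p(c, i, j):
                    c.merge(i, j)
                    break
    return c


mentions = [
    Mention("John Smith", gender="male", number="singular"),
    Mention("Mary", gender="female", number="singular"),
    Mention("John Smith"),  # attributes unknown for this mention on its own
    Mention("she", gender="female", number="singular", is_pronoun=True),
]
clusters = run_sieve(mentions, [exact_match, pronoun_match])
print(clusters.groups())  # [[0, 2], [1, 3]]
```

In this toy input the second "John Smith" has no attributes of its own, so a pairwise check would accept it as the nearest antecedent of "she". Because the earlier exact-match pass has already merged it with the first "John Smith", the cluster carries the gender "male" and the pronoun pass skips it in favor of "Mary".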

Published in

EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
October 2010
1332 pages

Publisher: Association for Computational Linguistics, United States

Qualifiers: research-article

Overall acceptance rate: 73 of 234 submissions, 31%
