A multi-pass sieve for coreference resolution

Authors:
Karthik Raghunathan (Stanford University, Stanford, CA)
Heeyoung Lee (Stanford University, Stanford, CA)
Sudarshan Rangarajan (Stanford University, Stanford, CA)
Nathanael Chambers (Stanford University, Stanford, CA)
Mihai Surdeanu (Stanford University, Stanford, CA)
Dan Jurafsky (Stanford University, Stanford, CA)
Christopher Manning (Stanford University, Stanford, CA)
EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, October 2010, pages 492–501

Published: 9 October 2010

ABSTRACT

Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier's entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
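
The architecture described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `Mention`, `Cluster`, `run_sieve`, `exact_match`, and `pronoun_agreement`, the two toy passes, and the example mentions are all hypothetical. The sketch only shows the three ideas the abstract names: passes applied one at a time in a fixed order, each pass working on the clusters left by the previous one, and attributes such as gender and number shared across all mentions in a cluster.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Mention:
    idx: int                       # position in document order
    text: str
    is_pronoun: bool = False
    gender: Optional[str] = None   # None means "unknown"
    number: Optional[str] = None


@dataclass
class Cluster:
    mentions: List[Mention] = field(default_factory=list)

    def attrs(self, name: str) -> Set[str]:
        # Cluster-level view of an attribute: union of all known values.
        return {getattr(m, name) for m in self.mentions
                if getattr(m, name) is not None}


def compatible(a: Cluster, b: Cluster) -> bool:
    # Unknown attributes never block a merge; known ones must overlap.
    for name in ("gender", "number"):
        va, vb = a.attrs(name), b.attrs(name)
        if va and vb and not (va & vb):
            return False
    return True


# A pass decides whether mention m (in cluster cur) joins antecedent cluster ant.
Pass = Callable[[Mention, Cluster, Cluster], bool]


def exact_match(m: Mention, cur: Cluster, ant: Cluster) -> bool:
    # Hypothetical high-precision pass: identical non-pronominal strings.
    return (not m.is_pronoun) and any(
        (not a.is_pronoun) and a.text.lower() == m.text.lower()
        for a in ant.mentions)


def pronoun_agreement(m: Mention, cur: Cluster, ant: Cluster) -> bool:
    # Hypothetical lower-precision pass: pronoun agrees with the whole cluster.
    return m.is_pronoun and compatible(cur, ant)


def run_sieve(mentions: List[Mention], passes: List[Pass]) -> List[List[str]]:
    cluster_of = {m.idx: Cluster([m]) for m in mentions}  # start as singletons
    for sieve_pass in passes:                 # one tier at a time, in order
        for i, m in enumerate(mentions):
            for ant in reversed(mentions[:i]):            # nearest antecedent first
                ca, cm = cluster_of[ant.idx], cluster_of[m.idx]
                if ca is cm:
                    continue
                if sieve_pass(m, cm, ca):
                    ca.mentions.extend(cm.mentions)       # merge clusters
                    for x in cm.mentions:
                        cluster_of[x.idx] = ca
                    break
    unique = {id(c): c for c in cluster_of.values()}
    return [[m.text for m in c.mentions] for c in unique.values()]


def demo_mentions() -> List[Mention]:
    return [
        Mention(0, "Smith", gender="male", number="singular"),
        Mention(1, "Mary", gender="female", number="singular"),
        Mention(2, "Smith", number="singular"),           # gender unknown
        Mention(3, "she", is_pronoun=True, gender="female", number="singular"),
    ]


if __name__ == "__main__":
    print(run_sieve(demo_mentions(), [exact_match, pronoun_agreement]))
    print(run_sieve(demo_mentions(), [pronoun_agreement, exact_match]))
```

In this toy example, running the string-match pass first merges the two "Smith" mentions, so the second one inherits the gender of the first and "she" is linked to "Mary" instead. Reversing the order lets the pronoun pass link "she" to the nearest, gender-unknown "Smith" before that information is available, which is the kind of error the precision ordering is meant to avoid.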




Published in
EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
October 2010, 1332 pages
Program Chairs: Hang Li (Microsoft Research Asia), Lluís Màrquez (Technical University of Catalonia)
Publisher: Association for Computational Linguistics, United States
Qualifiers: research-article
Overall Acceptance Rate: 73 of 234 submissions, 31%

