Adding predicate argument structure to the Penn TreeBank

Authors: Paul Kingsbury, Martha Palmer, Mitch Marcus
University of Pennsylvania, Philadelphia, PA

HLT '02: Proceedings of the second international conference on Human Language Technology Research, March 2002, Pages 252–256

Published: 24 March 2002

ABSTRACT

This paper presents our basic approach to creating Proposition Bank, which involves adding a layer of semantic annotation to the Penn English TreeBank. Without attempting to confirm or disconfirm any particular semantic theory, our goal is to provide consistent argument labeling that will facilitate the automatic extraction of relational data. An argument such as "the window" in "John broke the window" and in "The window broke" would receive the same label in both sentences. In order to ensure reliable human annotation, we provide our annotators with explicit guidelines for labeling all of the syntactic and semantic frames of each particular verb. We give several examples of these guidelines and discuss the inter-annotator agreement figures. We also discuss our current experiments on the automatic expansion of our verb guidelines based on verb class membership. Our current rate of progress and our consistency of annotation demonstrate the feasibility of the task.
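The idea of consistent argument labeling can be illustrated with a minimal Python sketch. It is not the authors' annotation format: the `Proposition` class, the `fillers` helper, and the label names `Arg0` and `Arg1` are assumptions introduced here only to show how a shared label makes relational extraction uniform across the transitive and intransitive uses of "break".

```python
from dataclasses import dataclass, field


@dataclass
class Proposition:
    """A verb instance plus its labeled arguments (label -> text span)."""
    sentence: str
    predicate: str
    arguments: dict = field(default_factory=dict)


# Hypothetical annotations; the Arg0/Arg1 label names are assumed for illustration.
ANNOTATIONS = [
    Proposition("John broke the window", "break",
                {"Arg0": "John", "Arg1": "the window"}),
    Proposition("The window broke", "break",
                {"Arg1": "The window"}),
]


def fillers(props, predicate, label):
    """Collect, in lowercase, every span that fills `label` for `predicate`."""
    return [p.arguments[label].lower() for p in props
            if p.predicate == predicate and label in p.arguments]


# The broken thing carries the same label whether it is the object or the subject.
broken_things = fillers(ANNOTATIONS, "break", "Arg1")
print(broken_things)  # ['the window', 'the window']
```

Because the broken entity carries one label in both sentences, a single lookup retrieves it regardless of its syntactic position.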


Published in
HLT '02: Proceedings of the second international conference on Human Language Technology Research
March 2002, 436 pages
Conference Chair: Mitchell Marcus, University of Pennsylvania
Publisher: Morgan Kaufmann Publishers Inc., San Francisco, CA, United States
Author Tags: predicate argument structure, semantic annotation, verb classes