Adding predicate argument structure to the Penn TreeBank

Published: 24 March 2002

ABSTRACT

This paper presents our basic approach to creating Proposition Bank, which involves adding a layer of semantic annotation to the Penn English TreeBank. Without attempting to confirm or disconfirm any particular semantic theory, our goal is to provide consistent argument labeling that will facilitate the automatic extraction of relational data. An argument such as "the window" in "John broke the window" and in "The window broke" would receive the same label in both sentences. In order to ensure reliable human annotation, we provide our annotators with explicit guidelines for labeling all of the syntactic and semantic frames of each particular verb. We give several examples of these guidelines and discuss the inter-annotator agreement figures. We also discuss our current experiments on the automatic expansion of our verb guidelines based on verb class membership. Our current rate of progress and our consistency of annotation demonstrate the feasibility of the task.
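
The consistent-labeling goal described above can be illustrated with a minimal sketch. The `Proposition` class, the `label_of` helper, and the label names `Arg0` and `Arg1` are assumptions chosen for illustration; the abstract does not specify a label inventory or a data format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Proposition:
    """One annotated predicate with its labeled arguments (hypothetical format)."""
    predicate: str
    args: Dict[str, str] = field(default_factory=dict)  # label -> text span


def label_of(prop: Proposition, span: str) -> Optional[str]:
    """Return the label assigned to `span`, ignoring case, or None if absent."""
    for label, text in prop.args.items():
        if text.lower() == span.lower():
            return label
    return None


# "John broke the window": the window is the thing broken (object position).
transitive = Proposition("break", {"Arg0": "John", "Arg1": "the window"})
# "The window broke": the same participant, now in subject position.
intransitive = Proposition("break", {"Arg1": "The window"})

# The label follows the participant's role, not its syntactic position.
same_label = label_of(transitive, "the window") == label_of(intransitive, "the window")
print(same_label)
```

Keying arguments by role label rather than by syntactic position is what would let a relational-extraction step treat both sentences as describing the same event participant.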

  • Published in
    HLT '02: Proceedings of the second international conference on Human Language Technology Research
    March 2002
    436 pages

    Publisher

    Morgan Kaufmann Publishers Inc.

    San Francisco, CA, United States

    Acceptance Rates

    Overall Acceptance Rate: 240 of 768 submissions, 31%