ABSTRACT
We propose a novel co-training method for statistical parsing. The algorithm takes as input a small corpus (9695 sentences) annotated with parse trees, a dictionary of possible lexicalized structures for each word in the training set, and a large pool of unlabeled text. The algorithm iteratively labels the entire data set with parse trees. Using empirical results based on parsing the Wall Street Journal corpus, we show that training a statistical parser on the combined labeled and unlabeled data strongly outperforms training only on the labeled data.
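The iterative labeling loop the abstract describes follows the general co-training scheme of Blum and Mitchell (1998): two models, each seeing a different "view" of the data, are trained on a small labeled seed and then take turns labeling unlabeled examples they are confident about, growing the shared training set. The sketch below illustrates only that loop; the toy frequency-count learners and the feature/confidence representation are stand-ins for illustration, not the lexicalized statistical parsers used in the paper.

```python
# Minimal co-training sketch (Blum & Mitchell 1998 style). The "learners" are
# toy per-view frequency models, NOT the paper's statistical parsers.

def train(view, labeled):
    """Count label frequencies per feature value in one view of the data."""
    counts = {}
    for features, label in labeled:
        dist = counts.setdefault(features[view], {})
        dist[label] = dist.get(label, 0) + 1
    return counts

def predict(model, view, features):
    """Return (label, confidence) for one example, or (None, 0.0) if unseen."""
    dist = model.get(features[view])
    if not dist:
        return None, 0.0
    label = max(dist, key=dist.get)
    return label, dist[label] / sum(dist.values())

def co_train(labeled, unlabeled, rounds=5, threshold=0.6):
    """Each round, each view's model labels pool examples it is confident
    about; those newly labeled examples join the shared labeled set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            model = train(view, labeled)
            kept = []
            for features in pool:
                label, conf = predict(model, view, features)
                if label is not None and conf >= threshold:
                    labeled.append((features, label))  # self-labeled example
                else:
                    kept.append(features)  # still unlabeled, retry next round
            pool = kept
        if not pool:
            break
    return labeled
```

For example, seeding with `[(("a", "x"), 1), (("b", "y"), 0)]` and a pool `[("a", "y"), ("b", "x")]`, the view-0 model confidently labels both pool items in the first round, so the labeled set grows from two examples to four. The confidence threshold plays the role of the paper's selection of "most confident" parses for retraining.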
- E. Black, S. Abney, D. Flickinger, C. Gdaniec, R. Grishman, P. Harrison, D. Hindle, R. Ingria, F. Jelinek, J. Klavans, M. Liberman, M. Marcus, S. Roukos, B. Santorini, and T. Strzalkowski. 1991. A procedure for quantitatively comparing the syntactic coverage of English grammars. In Proc. DARPA Speech and Natural Language Workshop, pages 306--311. Morgan Kaufmann.
- A. Blum and T. Mitchell. 1998. Combining Labeled and Unlabeled Data with Co-Training. In Proc. of 11th Annual Conf. on Comp. Learning Theory (COLT), pages 92--100.
- E. Brill. 1997. Unsupervised learning of disambiguation rules for part of speech tagging. In Natural Language Processing Using Very Large Corpora. Kluwer Academic Press.
- G. Carroll and M. Rooth. 1998. Valence Induction with a Head-Lexicalized PCFG. http://xxx.lanl.gov/abs/cmp-lg/9805001, May.
- C. Chelba and F. Jelinek. 1998. Exploiting syntactic structure for language modeling. In Proc. of COLING-ACL '98, pages 225--231, Montreal.
- M. Collins and Y. Singer. 1999. Unsupervised Models for Named Entity Classification. In Proc. of WVLC/EMNLP-99, pages 100--110.
- D. Cutting, J. Kupiec, J. Pedersen, and P. Sibun. 1992. A practical part-of-speech tagger. In Proc. of 3rd ANLP Conf., Trento, Italy. ACL.
- D. Elworthy. 1994. Does Baum-Welch re-estimation help taggers? In Proc. of 4th ANLP Conf., pages 53--58, Stuttgart, October 13-15.
- E. W. Fong and D. Wu. 1996. Learning restricted probabilistic link grammars. In S. Wermter, E. Riloff, and G. Scheler, editors, Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing, pages 173--187. Springer-Verlag.
- S. Goldman and Y. Zhou. 2000. Enhancing supervised learning with unlabeled data. In Proc. of ICML-2000, Stanford University, June 29--July 2.
- R. Hwa. 2000. Sample selection for statistical grammar induction. In Proc. of EMNLP/VLC-2000, pages 45--52.
- A. K. Joshi and Y. Schabes. 1992. Tree-adjoining grammar and lexicalized grammars. In M. Nivat and A. Podelski, editors, Tree Automata and Languages, pages 409--431. Elsevier Science.
- A. K. Joshi, L. Levy, and M. Takahashi. 1975. Tree Adjunct Grammars. Journal of Computer and System Sciences.
- A. K. Joshi. 1985. Tree Adjoining Grammars: How much context sensitivity is required to provide a reasonable structural description. In D. Dowty, L. Karttunen, and A. Zwicky, editors, Natural Language Parsing, pages 206--250. Cambridge University Press, Cambridge, U.K.
- J. Lafferty, D. Sleator, and D. Temperley. 1992. Grammatical trigrams: A probabilistic model of link grammar. In Proc. of the AAAI Conf. on Probabilistic Approaches to Natural Language.
- K. Lari and S. J. Young. 1990. The estimation of stochastic context-free grammars using the Inside-Outside algorithm. Computer Speech and Language, 4:35--56.
- C. de Marcken. 1995. Lexical heads, phrase structure and the induction of grammar. In D. Yarowsky and K. Church, editors, Proc. of 3rd WVLC, pages 14--26, MIT, Cambridge, MA.
- M. Marcus, B. Santorini, and M. Marcinkiewiecz. 1993. Building a large annotated corpus of English. Computational Linguistics, 19(2):313--330.
- B. Merialdo. 1994. Tagging English text with a probabilistic model. Computational Linguistics, 20(2):155--172.
- K. Nigam and R. Ghani. 2000. Analyzing the effectiveness and applicability of co-training. In Proc. of 9th International Conference on Information and Knowledge Management (CIKM-2000).
- K. Nigam, A. McCallum, S. Thrun, and T. Mitchell. 1999. Text Classification from Labeled and Unlabeled Documents using EM. Machine Learning, 1(34).
- S. Della Pietra, V. Della Pietra, J. Gillett, J. Lafferty, H. Printz, and L. Ureš. 1994. Inference and estimation of a long-range trigram model. In R. Carrasco and J. Oncina, editors, Proc. of ICGI-94. Springer-Verlag.
- A. Ratnaparkhi. 1996. A Maximum Entropy Part-Of-Speech Tagger. In Proc. of EMNLP-96, University of Pennsylvania.
- P. Resnik. 1992. Probabilistic tree-adjoining grammars as a framework for statistical natural language processing. In Proc. of COLING '92, volume 2, pages 418--424, Nantes, France.
- Y. Schabes. 1992. Stochastic lexicalized tree-adjoining grammars. In Proc. of COLING '92, volume 2, pages 426--432, Nantes, France.
- B. Srinivas. 1997. Complexity of Lexical Descriptions and its Relevance to Partial Parsing. Ph.D. thesis, Department of Computer and Information Sciences, University of Pennsylvania.
- F. Xia, M. Palmer, and A. Joshi. 2000. A Uniform Method of Grammar Extraction and its Applications. In Proc. of EMNLP/VLC-2000.
- D. Yarowsky. 1995. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. In Proc. of 33rd Meeting of the ACL, pages 189--196, Cambridge, MA.
Applying co-training methods to statistical parsing