Inferring parts of speech for lexical mappings via the Cyc KB

Authors:
Tom O'Hara, New Mexico State University, Las Cruces, NM
Stefano Bertolo, Cycorp, Inc., Austin, TX
Michael Witbrock, Cycorp, Inc., Austin, TX
Bjørn Aldag, Cycorp, Inc., Austin, TX
Jon Curtis, Cycorp, Inc., Austin, TX
Kathy Panton, Cycorp, Inc., Austin, TX
Dave Schneider, Cycorp, Inc., Austin, TX
Nancy Salay, Cycorp, Inc., Austin, TX

COLING '04: Proceedings of the 20th international conference on Computational Linguistics, August 2004, Pages 1305–es
https://doi.org/10.3115/1220355.1220546

Published: 23 August 2004

ABSTRACT

We present an automatic approach to learning criteria for classifying the parts of speech used in lexical mappings. This will further automate our knowledge acquisition system for non-technical users. The criteria for the speech parts are based on the types of the denoted terms along with morphological and corpus-based clues. Associations among these clues and the parts of speech are learned using the lexical mappings contained in the Cyc knowledge base as training data. With over 30 speech parts to choose from, the classifier achieves good results (77.8% correct). Accurate results (93.0%) are achieved in the special case of the mass-count distinction for nouns. Comparable results are also obtained using OpenCyc (73.1% general and 88.4% mass-count).
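
The general idea of learning associations between clues and speech parts can be illustrated with a minimal Python sketch. It is not the authors' system: the learner here is a simple count-based voting scheme chosen for brevity, the feature set is reduced to one ontological clue (the type of the denoted term) and one morphological clue (a three-letter suffix), and every name, type label, and training example (`TOY_MAPPINGS`, `Substance`, `MassNoun`, and so on) is a hypothetical stand-in rather than data from the Cyc knowledge base.

```python
from collections import Counter, defaultdict


def extract_features(word, denoted_type):
    """Pair an ontological clue (type of the denoted term) with a morphological clue (suffix)."""
    return {("type", denoted_type), ("suffix", word[-3:].lower())}


def train(examples):
    """Count how often each feature co-occurs with each speech-part label."""
    counts = defaultdict(Counter)
    for word, denoted_type, label in examples:
        for feature in extract_features(word, denoted_type):
            counts[feature][label] += 1
    return counts


def predict(counts, word, denoted_type):
    """Let every known feature vote for the labels it was seen with; return the top label."""
    votes = Counter()
    for feature in extract_features(word, denoted_type):
        votes.update(counts.get(feature, Counter()))
    if not votes:
        return None  # no feature of this word was seen during training
    return votes.most_common(1)[0][0]


# Hypothetical toy lexical mappings: (word, type of denoted term, speech part).
TOY_MAPPINGS = [
    ("water", "Substance", "MassNoun"),
    ("sand", "Substance", "MassNoun"),
    ("dog", "Animal", "CountNoun"),
    ("table", "Artifact", "CountNoun"),
    ("running", "Event", "Verb"),
    ("jumping", "Event", "Verb"),
]

model = train(TOY_MAPPINGS)
print(predict(model, "milk", "Substance"))
print(predict(model, "swimming", "Event"))
```

In this toy setup an unseen word such as "milk" is labeled from its denoted type alone, while "swimming" gets support from both its type and its suffix. A realistic version would add corpus-based clues and a stronger learner.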

Published in
COLING '04: Proceedings of the 20th international conference on Computational Linguistics, August 2004, 1411 pages
Publisher: Association for Computational Linguistics, United States