DOI: 10.3115/1220355.1220546

Inferring parts of speech for lexical mappings via the Cyc KB

Published: 23 August 2004

ABSTRACT

We present an automatic approach to learning criteria for classifying the parts-of-speech used in lexical mappings. This will further automate our knowledge acquisition system for non-technical users. The criteria for the speech parts are based on the types of the denoted terms along with morphological and corpus-based clues. Associations among these and the parts-of-speech are learned using the lexical mappings contained in the Cyc knowledge base as training data. With over 30 speech parts to choose from, the classifier achieves good results (77.8% correct). Accurate results (93.0%) are achieved in the special case of the mass-count distinction for nouns. Comparable results are also obtained using OpenCyc (73.1% general and 88.4% mass-count).
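The learning setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the feature-voting classifier stands in for whatever learner the paper actually uses, and the function names, type names (`Liquid`, `Animal`, and so on), suffix clue, and training rows are all hypothetical, invented only for illustration. The sketch assumes that each lexical mapping supplies a word, the semantic types of the term it denotes, and a speech-part label.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (word, types of the denoted term, speech part).
# These rows are invented for illustration and are not taken from the Cyc KB.
TOY_MAPPINGS = [
    ("water", ["Liquid"], "MassNoun"),
    ("juice", ["Liquid"], "MassNoun"),
    ("sand", ["Substance"], "MassNoun"),
    ("dog", ["Animal"], "CountNoun"),
    ("horse", ["Animal"], "CountNoun"),
    ("table", ["Artifact"], "CountNoun"),
]


def extract_features(word, term_types):
    """Combine the denoted term's types with a crude morphological clue."""
    features = {f"type={t}" for t in term_types}
    features.add(f"suffix={word[-3:]}")  # assumed stand-in for morphology
    return features


def train(mappings):
    """Count how often each feature co-occurs with each speech part."""
    counts = defaultdict(Counter)
    for word, term_types, speech_part in mappings:
        for feature in extract_features(word, term_types):
            counts[feature][speech_part] += 1
    return counts


def predict(counts, word, term_types):
    """Pick the speech part with the most votes from the observed features."""
    votes = Counter()
    for feature in extract_features(word, term_types):
        votes.update(counts.get(feature, Counter()))
    return votes.most_common(1)[0][0] if votes else None


model = train(TOY_MAPPINGS)
print(predict(model, "milk", ["Liquid"]))
print(predict(model, "cat", ["Animal"]))
```

With real data, the corpus-based clues mentioned in the abstract would enter as further features alongside the type and suffix features shown here.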

Published in

COLING '04: Proceedings of the 20th international conference on Computational Linguistics
August 2004
1411 pages

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

COLING '04 Paper Acceptance Rate: 1,411 of 1,411 submissions, 100%
Overall Acceptance Rate: 1,537 of 1,537 submissions, 100%
