Representing text chunks

Authors:
Erik F. Tjong Kim Sang, University of Antwerp, Wilrijk, Belgium
Jorn Veenstra, Tilburg University, Tilburg, The Netherlands

EACL '99: Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics, June 1999, Pages 173–179, https://doi.org/10.3115/977035.977059

Published: 08 June 1999

ABSTRACT

Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we examine seven different data representations for the problem of recognizing noun phrase chunks. We show that the choice of data representation has a minor influence on chunking performance. However, equipped with the most suitable data representation, our memory-based learning chunker was able to improve on the best published chunking results for a standard data set.
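To make the idea of chunking as a tagging task concrete, the following Python sketch encodes noun phrase chunks as one tag per word in two closely related schemes, commonly called IOB2 (every chunk starts with B) and IOB1 (B is used only when a chunk directly follows another chunk, which is the convention of Ramshaw and Marcus). The function names, the span format, and the example sentence are illustrative assumptions and do not come from the paper; the abstract does not list the seven representations, so only these two are shown.

```python
from typing import List, Tuple

# Assumed span format: (start, end) token indices, end exclusive.
Span = Tuple[int, int]


def spans_to_iob2(n_tokens: int, spans: List[Span]) -> List[str]:
    """IOB2: first word of every chunk is B, other chunk words are I, the rest O."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def iob2_to_iob1(tags: List[str]) -> List[str]:
    """IOB1: keep B only where the previous word also belongs to a chunk."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "B" and (i == 0 or tags[i - 1] == "O"):
            out.append("I")  # chunk start is already unambiguous here
        else:
            out.append(tag)
    return out


def tags_to_spans(tags: List[str]) -> List[Span]:
    """Recover chunk spans; works for both IOB1 and IOB2 tag sequences."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        # An O or a B always closes the chunk that is currently open.
        if start is not None and tag in ("O", "B"):
            spans.append((start, i))
            start = None
        # A B or an I with no open chunk starts a new one.
        if start is None and tag in ("B", "I"):
            start = i
    if start is not None:
        spans.append((start, len(tags)))
    return spans


if __name__ == "__main__":
    # Hypothetical example: [He] gave [the boy] [a book]
    tokens = ["He", "gave", "the", "boy", "a", "book"]
    np_spans = [(0, 1), (2, 4), (4, 6)]
    iob2 = spans_to_iob2(len(tokens), np_spans)
    iob1 = iob2_to_iob1(iob2)
    print(list(zip(tokens, iob2)))  # B O B I B I
    print(list(zip(tokens, iob1)))  # I O I I B I
```

Both tag sequences decode to the same chunk spans, so the schemes carry the same information; they differ only in which tag a learner has to predict at chunk boundaries.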

References

Steven Abney. 1991. Parsing by chunks. In Principle-Based Parsing. Kluwer Academic Publishers.
Shlomo Argamon, Ido Dagan, and Yuval Krymolowski. 1998. A memory-based approach to learning shallow natural language patterns. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL '98).
Claire Cardie and David Pierce. 1998. Error-driven pruning of treebank grammars for base noun phrase identification. In Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL '98).
Walter Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den Bosch. 1998. TiMBL: Tilburg Memory Based Learner - version 1.0 - Reference Guide. ILK, Tilburg University, The Netherlands. http://ilk.kub.nl/~ilk/papers/ilk9803.ps.gz.
Walter Daelemans, Antal van den Bosch, and Jakub Zavrel. 1999. Forgetting exceptions is harmful in language learning. Machine Learning, 11.
Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. In Proceedings of the Third ACL Workshop on Very Large Corpora.
Adwait Ratnaparkhi. 1998. Maximum Entropy Models for Natural Language Ambiguity Resolution. PhD thesis, Computer and Information Science, University of Pennsylvania.
Jorn Veenstra. 1998. Fast NP chunking using memory-based learning techniques. In BENELEARN-98: Proceedings of the Eighth Belgian-Dutch Conference on Machine Learning. ATO-DLO, Wageningen, report 352.


Published in
EACL '99: Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics, June 1999, 310 pages
Program Chairs: Henry S. Thompson (University of Edinburgh), Alex Lascarides (University of Edinburgh)
Publisher: Association for Computational Linguistics, United States
Acceptance Rates
Overall Acceptance Rate: 100 of 360 submissions, 28%