DOI: 10.3115/1220575.1220699

Flexible text segmentation with structured multilabel classification

Published: 6 October 2005

ABSTRACT

Many language processing tasks can be reduced to breaking the text into segments with prescribed properties. Such tasks include sentence splitting, tokenization, named-entity extraction, and chunking. We present a new model of text segmentation based on ideas from multilabel classification. Using this model, we can naturally represent segmentation problems involving overlapping and non-contiguous segments. We evaluate the model on entity extraction and noun-phrase chunking and show that it is more accurate for overlapping and non-contiguous segments, but it still performs well on simpler data sets for which sequential tagging has been the best method.
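
The limitation that motivates the abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's model: it assumes a segment is a label plus a set of token positions, and it contrasts a one-tag-per-token BIO encoding with a per-token label-set (multilabel) view. The names `Segment`, `to_bio`, and `to_label_sets`, the label `ANAT`, and the example phrase are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical segment: a label plus the token indices it covers."""
    label: str
    positions: frozenset  # indices need not be contiguous


def to_bio(n_tokens, segments):
    """Encode segments with one BIO tag per token; raise if that is impossible."""
    tags = ["O"] * n_tokens
    for seg in segments:
        idx = sorted(seg.positions)
        # A BIO tag sequence can only describe an unbroken run of tokens.
        if idx != list(range(idx[0], idx[-1] + 1)):
            raise ValueError("non-contiguous segment cannot be BIO-encoded")
        for i in idx:
            # One tag per token means no token can belong to two segments.
            if tags[i] != "O":
                raise ValueError("overlapping segments cannot be BIO-encoded")
            tags[i] = ("B-" if i == idx[0] else "I-") + seg.label
    return tags


def to_label_sets(n_tokens, segments):
    """Multilabel view: each token carries the set of segments that cover it."""
    label_sets = [set() for _ in range(n_tokens)]
    for seg_id, seg in enumerate(segments):
        for i in seg.positions:
            label_sets[i].add((seg_id, seg.label))
    return label_sets


# Illustrative phrase: "left and right ventricle" mentions two entities,
# "left ... ventricle" (non-contiguous) and "right ventricle" (overlapping).
tokens = ["left", "and", "right", "ventricle"]
segments = [
    Segment("ANAT", frozenset({0, 3})),
    Segment("ANAT", frozenset({2, 3})),
]

label_sets = to_label_sets(len(tokens), segments)
for token, labels in zip(tokens, label_sets):
    print(token, sorted(labels))
```

Calling `to_bio(len(tokens), segments)` on the same input raises `ValueError`, whereas the label-set view represents both segments. How the paper actually defines its label sets and learns to predict them is described in the full text, not in this sketch.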

Published in

HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
October 2005
1054 pages

Publisher: Association for Computational Linguistics, United States

Acceptance Rates

HLT '05 paper acceptance rate: 127 of 402 submissions (32%)
Overall acceptance rate: 240 of 768 submissions (31%)
