DOI: 10.5555/1698381.1698406

Towards a methodology for named entities annotation

Published: 6 August 2009

ABSTRACT

Named entity recognition is now considered a fundamental task, but annotating named entities raises specific difficulties. These difficulties led us to ask the basic question of what annotators should annotate and, more importantly, for what purpose. We therefore identify the applications that use named entity recognition and, based on the actual needs of those applications, propose a semantic definition of the elements to annotate. Finally, we put forward a number of methodological recommendations to ensure a coherent and reliable annotation scheme.


Published in

ACL-IJCNLP '09: Proceedings of the Third Linguistic Annotation Workshop
August 2009
203 pages
ISBN: 9781932432527

Publisher: Association for Computational Linguistics, United States
