ABSTRACT
We address the problem of distinguishing between two sources of disagreement in annotations: genuine subjectivity and slips of attention. The latter are especially likely when the classification task has a default class, as in tasks where annotators must find instances of a phenomenon of interest, such as the metaphor detection task discussed here. We apply and extend a data analysis technique proposed by Beigman Klebanov and Shamir (2006), first to distill reliably deliberate (non-chance) annotations, and then to estimate the proportion of attention slips vs. genuine disagreement within those reliably deliberate annotations.
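The distillation step described above can be illustrated with a minimal sketch. Assuming (as a simplification, not the authors' exact procedure) that chance marking by an annotator is modeled as a Binomial process with some per-item marking probability, an item marked by k of n annotators can be kept as "reliably deliberate" when that vote count is unlikely under chance alone; the function names and the threshold `alpha` here are illustrative:

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): probability that k or more
    of n annotators mark an item purely by chance."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def reliably_deliberate(vote_counts, n_annotators, p_chance, alpha=0.05):
    """Keep vote counts whose occurrence is unlikely under chance-only
    marking (one-sided binomial tail test at level alpha)."""
    return [k for k in vote_counts
            if binom_tail(k, n_annotators, p_chance) < alpha]

# With 10 annotators and a 20% chance-marking rate, an item marked by
# 9 annotators survives the filter, while one marked by a single
# annotator does not.
print(reliably_deliberate([1, 9], n_annotators=10, p_chance=0.2))
```

The remaining disagreement among the retained items can then be apportioned between attention slips and genuine subjectivity, which is the estimation step the abstract refers to.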
REFERENCES
- Beigman Klebanov, Beata and Eli Shamir. 2006. Reader-based exploration of lexical cohesion. Language Resources and Evaluation, 40(2):109--126.
- Carletta, Jean. 1996. Assessing agreement on classification tasks: the kappa statistic. Computational Linguistics, 22(2):249--254.
- Krippendorff, Klaus. 1980. Content Analysis. Sage Publications.
- Musolff, Andreas. 2000. Mirror images of Europe: Metaphors in the public debate about Europe in Britain and Germany. München: Iudicium.
- Santorini, Beatrice. 1990. Part-of-speech tagging guidelines for the Penn Treebank project (3rd revision, 2nd printing). ftp://ftp.cis.upenn.edu/pub/treebank/doc/tagguide.ps.gz
- Siegel, Sidney and John Castellan. 1988. Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill Book Company.
- Spiro, Neta. 2007. What contributes to the perception of musical phrases in Western classical music? Ph.D. thesis, University of Amsterdam, The Netherlands.