IR between science and engineering, and the role of experimentation
Evaluation has always played a major role in IR research, as a means of judging the quality of competing models. Lately, however, we have seen an over-emphasis on experimental results, thus favoring engineering approaches aimed at tuning ...
Retrieval evaluation in practice
Nowadays, most research on retrieval evaluation is about comparing different systems to determine which is the best one, using a standard document collection and a set of queries with relevance judgements, such as TREC. Retrieval quality baselines are ...
A dictionary- and corpus-independent statistical lemmatizer for information retrieval in low resource languages
We present a dictionary- and corpus-independent statistical lemmatizer StaLe that deals with the out-of-vocabulary (OOV) problem of dictionary-based lemmatization by generating candidate lemmas for any inflected word forms. StaLe can be applied with ...
A new approach for cross-language plagiarism analysis
This paper presents a new method for Cross-Language Plagiarism Analysis. Our task is to detect the plagiarized passages in the suspicious documents and their corresponding fragments in the source documents. We propose a plagiarism detection method ...
Creating a Persian-English comparable corpus
Multilingual corpora are valuable resources for cross-language information retrieval and are available in many language pairs. However, the Persian language does not have rich multilingual resources due to some of its special features and difficulties in ...
Validating query simulators: an experiment using commercial searches and purchases
We design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether ...
Using parallel corpora for multilingual (multi-document) summarisation evaluation
We present a method for the evaluation of multilingual multi-document summarisation that saves precious annotation time and makes the evaluation results directly comparable across languages. The approach is based on the manual ...
MapReduce for information retrieval evaluation: "let's quickly test this on 12 TB of data"
We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use a cluster of 15 low cost machines to search a web crawl of 0.5 billion ...
Which log for which information? gathering multilingual data from different log file types
This paper presents a comparative analysis of different log file types and their potential for gathering information about user behavior in a multilingual information system. It starts with a discussion of potential questions to be answered in ...
Examining the robustness of evaluation metrics for patent retrieval with incomplete relevance judgements
Recent years have seen a growing interest in research into patent retrieval. One of the key issues in conducting information retrieval (IR) research is meaningful evaluation of the effectiveness of the retrieval techniques applied to the task under ...
On the evaluation of entity profiles
Entity profiling is the task of identifying and ranking descriptions of a given entity. The task may be viewed as one where the descriptions being sought are terms that need to be selected from a knowledge source (such as an ontology or thesaurus). In ...
Evaluating information extraction
The issue of how to experimentally evaluate information extraction (IE) systems has received hardly any satisfactory solution in the literature. In this paper we propose a novel evaluation model for IE and argue that, among others, it allows (i) a ...
Tie-breaking bias: effect of an uncontrolled parameter on information retrieval evaluation
We consider Information Retrieval evaluation, especially at TREC with the trec_eval program. It appears that systems obtain scores based not only on the relevance of the retrieved documents, but also on document names in the case of ties (i.e., when ...
Automated component-level evaluation: present and future
Automated component-level evaluation of information retrieval (IR) is the main focus of this paper. We present a review of the current state of web-based and component-level evaluation. Based on these systems, we make propositions for a comprehensive ...
A PROMISE for experimental evaluation
- Martin Braschler,
- Khalid Choukri,
- Nicola Ferro,
- Allan Hanbury,
- Jussi Karlgren,
- Henning Müller,
- Vivien Petras,
- Emanuele Pianta,
- Maarten de Rijke,
- Giuseppe Santucci
Participative Research laboratory for Multimedia and Multilingual Information Systems Evaluation (PROMISE) is a Network of Excellence, starting in conjunction with this first independent CLEF 2010 conference, and designed to support and develop the ...
Cited By
- Ferro N (2014). CLEF 15th Birthday, ACM SIGIR Forum, 48:2, (31-55), Online publication date: 23-Dec-2014.
- Clough P, Ferro N, Forner P, Gonzalo J, Huurnink B, Kekäläinen J, Lalmas M, Petras V and de Rijke M (2012). CLEF 2011, ACM SIGIR Forum, 45:2, (32-37), Online publication date: 9-Jan-2012.
- Agosti M, Berendsen R, Bogers T, Braschler M, Buitelaar P, Choukri K, Maria Di Nunzio G, Ferro N, Forner P, Hanbury A, Heppin K, Hansen P, Järvelin A, Larsen B, Lupu M, Masiero I, Müller H, Peruzzo S, Petras V, Piroi F, de Rijke M, Santucci G, Silvello G, Toms E, Berendsen R, Hanbury A, Lupu M, Petras V and Silvello G (2012). PROMISE retreat report prospects and opportunities for information access evaluation, ACM SIGIR Forum, 46:2, (60-84), Online publication date: 21-Dec-2012.
- Agosti M, Braschler M, Choukri K, Ferro N, Harman D, Peters C, Pianta E, de Rijke M and Smeaton A (2011). CLEF 2010 conference on multilingual and multimodal information access evaluation, ACM SIGIR Forum, 44:2, (8-12), Online publication date: 3-Jan-2011.