ABSTRACT
Even though information retrieval (IR) systems have been successfully deployed for over 45 years, the field continues to evolve in performance, functionality, and accuracy. Hundreds of products are available, each with different indexing and retrieval characteristics. How does one choose the appropriate system for a given application? The first step is the creation of a framework for comparing IR products, together with an infrastructure that supports automated execution of tests and analysis of their results. The next step is providing an environment for subjective measurement using human evaluators. In this paper we briefly introduce the concepts used in IR system evaluation and report on our initial implementation of a framework for evaluating indexing performance. We also report on a test case that provides a comparative analysis of the indexing characteristics of three IR system implementations over a common collection of documents.
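The paper reports measurements rather than code, but the automated-execution idea it describes can be illustrated with a minimal sketch: wrap each system under test behind a common interface, then time index construction over a shared document collection. Everything below (the Indexer protocol, the InvertedIndexStub stand-in, and the corpus/ path) is an illustrative assumption, not an API of any of the systems the paper evaluates.

```python
# Minimal sketch of an automated indexing benchmark (hypothetical, not
# from the paper): each IR system is adapted to a common Indexer
# interface, and the harness times index construction over one corpus.

import os
import time
from collections import defaultdict
from typing import Dict, List, Protocol


class Indexer(Protocol):
    """Interface each IR system adapter is assumed to implement."""
    name: str

    def build(self, docs: Dict[str, str]) -> None: ...
    def term_count(self) -> int: ...


class InvertedIndexStub:
    """Toy stand-in for a real IR system: whitespace tokens -> doc ids."""
    name = "stub-inverted-index"

    def __init__(self) -> None:
        self._postings: Dict[str, set] = defaultdict(set)

    def build(self, docs: Dict[str, str]) -> None:
        for doc_id, text in docs.items():
            for token in text.lower().split():
                self._postings[token].add(doc_id)

    def term_count(self) -> int:
        return len(self._postings)


def load_collection(root: str) -> Dict[str, str]:
    """Read every .txt file under root into a {doc_id: text} map."""
    docs = {}
    for dirpath, _, filenames in os.walk(root):
        for fn in filenames:
            if fn.endswith(".txt"):
                path = os.path.join(dirpath, fn)
                with open(path, encoding="utf-8", errors="replace") as f:
                    docs[path] = f.read()
    return docs


def benchmark(indexers: List[Indexer], root: str) -> None:
    """Time index construction for each system over the same corpus."""
    docs = load_collection(root)
    for idx in indexers:
        start = time.perf_counter()
        idx.build(docs)
        elapsed = time.perf_counter() - start
        rate = len(docs) / elapsed if elapsed > 0 else float("inf")
        print(f"{idx.name}: {len(docs)} docs in {elapsed:.2f}s "
              f"({rate:.1f} docs/s), {idx.term_count()} distinct terms")


if __name__ == "__main__":
    # e.g. a local directory of Project Gutenberg texts, the kind of
    # common document collection the paper's test case uses
    benchmark([InvertedIndexStub()], "corpus/")
```

In a real comparison, each of the three systems would get its own adapter implementing the same interface, so that corpus loading, timing, and result reporting stay identical across systems.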