ABSTRACT
We describe an initial study of how to identify important and useful information units within documents that an information retrieval system returns for a user query expressing an underlying information need. This study is part of a larger investigation into exploiting useful and important units from retrieved documents to generate rich document surrogates that improve the user search experience. We report three user studies run on a crowdsourcing platform, in which participants first read an information need and the contents of a relevant document and then performed a task that depended on the study: i) write important information units (WIIU), ii) highlight important information units (HIIU), and iii) assess the importance of already highlighted information units (AIHIU). Further, we discuss a novel mechanism for measuring similarities between content annotations. We find majority agreement of about 0.489 and pairwise agreement of 0.340 among users' annotations in the AIHIU study, and average cosine similarity of 0.50 and 0.57 between participant annotations and documents in the WIIU and HIIU studies, respectively.
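The abstract reports cosine similarity and two agreement figures without defining them. As a rough illustration only, the sketch below shows one common reading of each measure. Everything in it is an assumption rather than the authors' method: the bag-of-words term-count vectors, the whitespace tokenisation, the per-item definitions of agreement, and the function names `cosine_similarity`, `pairwise_agreement` and `majority_agreement` are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-count vectors.

    Assumption: lowercased whitespace tokens; the paper may use a
    different representation (for example, word embeddings).
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pairwise_agreement(labels: list[str]) -> float:
    """Fraction of annotator pairs that gave the same label to one item."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def majority_agreement(labels: list[str]) -> float:
    """Fraction of annotators who chose the most common label for one item."""
    if not labels:
        return 0.0
    return Counter(labels).most_common(1)[0][1] / len(labels)
```

Under these assumed definitions, per-item scores would be averaged over all items to obtain a study-level figure. The values reported in the abstract may well rest on different definitions, so this sketch should not be used to reproduce them.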
Index Terms
- Identifying Useful and Important Information within Retrieved Documents