ABSTRACT
People's beliefs, and the unconscious biases that arise from those beliefs, influence their judgment, decision making, and actions, as is commonly accepted among psychologists. Biases can be observed in information retrieval in situations where searchers seek or are presented with information that significantly deviates from the truth. There is little understanding of the impact of such biases in search. In this paper we study search-related biases via multiple probes: an exploratory retrospective survey, human labeling of the captions and results returned by a Web search engine, and a large-scale log analysis of search behavior on that engine. Targeting yes-no questions in the critical domain of health search, we show that Web searchers exhibit their own biases and are also subject to bias from the search engine. We observe that searchers favor positive information over negative, and do so more than would be expected given base rates derived from consensus answers from physicians. We also show that search engines strongly favor a particular, usually positive, perspective, irrespective of the truth. Importantly, we show that these biases can be counterproductive and affect search outcomes; in our study, around half of the answers that searchers settled on were actually incorrect. Our findings have implications for search engine design, including the development of ranking algorithms that consider both the desire to satisfy searchers (by validating their beliefs) and the need to provide accurate answers and properly account for base rates. Incorporating likelihood information into search is particularly important for consequential tasks, such as those with a medical focus.