ABSTRACT
A well-known challenge in learning from click data is its inherent bias, most notably position bias. Traditional click models aim to extract ‹query, document› relevance, and the estimated bias is usually discarded once relevance has been extracted. In contrast, the most recent work on unbiased learning-to-rank can effectively leverage the bias and thus focuses on estimating bias rather than relevance [20, 31]. Existing approaches estimate position bias by randomizing search results over a small percentage of production traffic. This is undesirable because result randomization can negatively impact users' search experience. In this paper, we compare two schemes for result randomization (RandTopN and RandPair) and show their negative effect in personal search. We then study how to infer such bias from regular click data without relying on randomization. We propose a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and can handle the highly sparse clicks of personal search. We evaluate our EM algorithm and the extracted bias in the learning-to-rank setting. Our results show that extracting position bias from regular clicks without result randomization is promising, and that the extracted bias can significantly improve learning-to-rank algorithms. In addition, we compare pointwise and pairwise learning-to-rank models; our results show that pairwise models are more effective in leveraging the estimated bias.
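To make the idea of EM over a position bias click model concrete, the following is a minimal sketch of the textbook EM procedure for the position-based model, in which a click requires that a result be both examined (probability theta at its position) and relevant (probability gamma for the item). It is not the regression-based variant proposed in the paper: here relevance is a plain per-item table, which presumes each item is shown many times and therefore does not address sparse clicks. All names (`em_position_bias`, `simulate_logs`, the toy parameters) are hypothetical, and the toy data places items at random positions so that the bias is identifiable, which real logs do not guarantee.

```python
import math
import random
from collections import defaultdict

EPS = 1e-6  # keeps probabilities strictly inside (0, 1)


def clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def simulate_logs(true_theta, true_gamma, num_impressions, seed=0):
    """Toy click log: click = examined AND relevant (assumed model)."""
    rng = random.Random(seed)
    items = list(true_gamma)
    logs = []
    for _ in range(num_impressions):
        item = rng.choice(items)
        k = rng.randrange(len(true_theta))
        examined = rng.random() < true_theta[k]
        relevant = rng.random() < true_gamma[item]
        logs.append((item, k, int(examined and relevant)))
    return logs


def em_position_bias(logs, num_positions, iterations=50):
    """Textbook EM for the position-based click model.

    logs: iterable of (item_id, position, click), position in [0, num_positions).
    Returns (theta, gamma, history), where history holds the log-likelihood
    of the parameters at the start of each iteration.
    """
    theta = [0.5] * num_positions      # examination probability per position
    gamma = defaultdict(lambda: 0.5)   # relevance probability per item
    history = []

    for _ in range(iterations):
        theta_sum = [0.0] * num_positions
        theta_cnt = [0] * num_positions
        gamma_sum = defaultdict(float)
        gamma_cnt = defaultdict(int)
        log_lik = 0.0

        for item, k, click in logs:
            t, g = theta[k], gamma[item]
            p_click = t * g
            if click:
                # A click implies the result was examined and relevant.
                p_exam, p_rel = 1.0, 1.0
                log_lik += math.log(p_click)
            else:
                # E-step: posteriors of the hidden variables given no click.
                p_exam = t * (1.0 - g) / (1.0 - p_click)
                p_rel = (1.0 - t) * g / (1.0 - p_click)
                log_lik += math.log(1.0 - p_click)
            theta_sum[k] += p_exam
            theta_cnt[k] += 1
            gamma_sum[item] += p_rel
            gamma_cnt[item] += 1

        history.append(log_lik)

        # M-step: each parameter becomes the mean of its posteriors.
        theta = [
            clip(theta_sum[k] / theta_cnt[k]) if theta_cnt[k] else theta[k]
            for k in range(num_positions)
        ]
        for item, count in gamma_cnt.items():
            gamma[item] = clip(gamma_sum[item] / count)

    return theta, dict(gamma), history


# Toy usage with made-up parameters.
rng = random.Random(1)
true_gamma = {f"doc{i}": rng.uniform(0.1, 0.9) for i in range(30)}
logs = simulate_logs([0.9, 0.6, 0.3], true_gamma, 20000)
theta, gamma, history = em_position_bias(logs, num_positions=3)
print([round(t / theta[0], 2) for t in theta])  # bias relative to the top position
```

Because scaling every theta up and every gamma down by the same factor leaves the click probabilities unchanged, only the ratios between positions are meaningful, which is why the usage line reports bias relative to the top position.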
- Aman Agarwal, Soumya Basu, Tobias Schnabel, and Thorsten Joachims. 2017. Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers. In Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 687--696.
- Qingyao Ai, Susan T. Dumais, Nick Craswell, and Dan Liebling. 2017. Characterizing Email Search Using Large-scale Behavioral Logs and Surveys. In Proc. of the 26th International Conference on World Wide Web (WWW). 1511--1520.
- Michael Bendersky, Xuanhui Wang, Donald Metzler, and Marc Najork. 2017. Learning from User Interactions in Personal Search via Attribute Parameterization. In Proc. of the 10th ACM International Conference on Web Search and Data Mining (WSDM). 791--799.
- Christopher J.C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Technical Report MSR-TR-2010-82. Microsoft Research. http://research.microsoft.com/apps/pubs/default.aspx?id=132652
- David Carmel, Guy Halawi, Liane Lewin-Eytan, Yoelle Maarek, and Ariel Raviv. 2015. Rank by Time or by Relevance?: Revisiting Email Search. In Proc. of the 24th ACM International Conference on Information and Knowledge Management (CIKM). 283--292.
- David Carmel, Liane Lewin-Eytan, Alex Libov, Yoelle Maarek, and Ariel Raviv. 2017. Promoting Relevant Results in Time-Ranked Mail Search. In Proc. of the 26th International Conference on World Wide Web (WWW). 1551--1559.
- Olivier Chapelle and Ya Zhang. 2009. A dynamic Bayesian network click model for web search ranking. In Proc. of the 18th International Conference on World Wide Web (WWW). 1--10.
- Weizhu Chen, Zhanglong Ji, Si Shen, and Qiang Yang. 2011. A Whole Page Click Model to Better Interpret Search Engine Click Data. In Proc. of the 25th AAAI Conference on Artificial Intelligence (AAAI).
- Aleksandr Chuklin, Ilya Markov, and Maarten de Rijke. 2015. Click Models for Web Search. Morgan & Claypool.
- Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. 2008. An Experimental Comparison of Click Position-bias Models. In Proc. of the 1st International Conference on Web Search and Data Mining (WSDM). 87--94.
- Arthur P. Dempster, Nan M. Laird, and Donald B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological) Vol. 39 (1977), 1--38. Issue 1.
- Miroslav Dudík, John Langford, and Lihong Li. 2011. Doubly Robust Policy Evaluation and Learning. In Proc. of the 28th International Conference on Machine Learning (ICML). 1097--1104.
- Susan Dumais, Edward Cutrell, JJ Cadiz, Gavin Jancke, Raman Sarin, and Daniel C. Robbins. 2003. Stuff I've Seen: A System for Personal Information Retrieval and Re-Use. In Proc. of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 72--79.
- Georges E. Dupret and Benjamin Piwowarski. 2008. A User Browsing Model to Predict Search Engine Click Data from Past Observations. In Proc. of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 331--338.
- David Elsweiler, Morgan Harvey, and Martin Hacker. 2011. Understanding re-finding behavior in naturalistic email interaction logs. In Proc. of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 35--44.
- Jerome H. Friedman. 2000. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics Vol. 29 (2000), 1189--1232.
- Fan Guo, Chao Liu, Anitha Kannan, Tom Minka, Michael Taylor, Yi-Min Wang, and Christos Faloutsos. 2009. Click chain model in web search. In Proc. of the 18th International Conference on World Wide Web (WWW). 11--20.
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2001. The Elements of Statistical Learning. Springer.
- Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately Interpreting Clickthrough Data As Implicit Feedback. In Proc. of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 154--161.
- Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proc. of the 10th ACM International Conference on Web Search and Data Mining (WSDM). 781--789.
- Maryam Kamvar, Melanie Kellar, Rajan Patel, and Ya Xu. 2009. Computers and iPhones and Mobile Phones, Oh My!: A Logs-based Comparison of Search Users on Different Devices. In Proc. of the 18th International Conference on World Wide Web (WWW). 801--810.
- Jin Young Kim, Nick Craswell, Susan Dumais, Filip Radlinski, and Fang Liu. 2017. Understanding and Modeling Success in Email Search. In Proc. of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 265--274.
- Lihong Li, Shunbao Chen, Jim Kleban, and Ankur Gupta. 2015. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study. In Proc. of the 24th International Conference on World Wide Web (WWW) Companion. 929--934.
- Lihong Li, Wei Chu, John Langford, and Xuanhui Wang. 2011. Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms. In Proc. of the 4th International Conference on Web Search and Web Data Mining (WSDM). 297--306.
- Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, Vol. 3, 3 (2009), 225--331.
- Maeve O'Brien and Mark T. Keane. 2006. Modeling result--list searching in the World Wide Web: The role of relevance topologies and trust bias. In Proc. of the 28th Annual Conference of the Cognitive Science Society (CogSci). 1--881.
- Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting Clicks: Estimating the Click-Through Rate for New Ads. In Proc. of the 16th International Conference on World Wide Web (WWW). 521--530.
- Paul R. Rosenbaum and Donald B. Rubin. 1983. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, Vol. 70, 1 (1983), 41--55.
- Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. 2016. Recommendations As Treatments: Debiasing Learning and Evaluation. In Proc. of the 33rd International Conference on Machine Learning (ICML). 1670--1679.
- Adith Swaminathan and Thorsten Joachims. 2015. Batch Learning from Logged Bandit Feedback Through Counterfactual Risk Minimization. Journal of Machine Learning Research Vol. 16 (2015), 1731--1755. Issue 1.
- Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proc. of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 115--124.
- Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond Position Bias: Examining Result Attractiveness As a Source of Presentation Bias in Clickthrough Data. In Proc. of the 19th International Conference on World Wide Web (WWW). 1011--1018.
- Hamed Zamani, Michael Bendersky, Xuanhui Wang, and Mingyang Zhang. 2017. Situational Context for Ranking in Personal Search. In Proc. of the 26th International Conference on World Wide Web (WWW). 1531--1540.
- Zeyuan Allen Zhu, Weizhu Chen, Tom Minka, Chenguang Zhu, and Zheng Chen. 2010. A novel click model and its applications to online advertising. In Proc. of the 3rd ACM International Conference on Web Search and Data Mining (WSDM). 321--330.
Index Terms
- Position Bias Estimation for Unbiased Learning to Rank in Personal Search