ABSTRACT
Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a Propensity-Weighted Ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. In contrast to most conventional approaches to de-biasing the data using click models, this allows training of ranking functions even in settings where queries do not repeat. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently. We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.
- A. Borisov, I. Markov, M. de Rijke, and P. Serdyukov. A neural click model for web search. In Proceedings of the 25th International Conference on World Wide Web, pages 531--541, 2016. Google ScholarDigital Library
- O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems (TOIS), 30(1):6:1--6:41, 2012.Google Scholar
- O. Chapelle and Y. Zhang. A dynamic bayesian network click model for web search ranking. In International Conference on World Wide Web (WWW), pages 1--10. ACM, 2009. Google ScholarDigital Library
- A. Chuklin, I. Markov, and M. de Rijke. Click Models for Web Search. Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan & Claypool Publishers, 2015.Google Scholar
- N. Craswell, O. Zoeter, M. Taylor, and B. Ramsey. An experimental comparison of click position-bias models. In International Conference on Web Search and Data Mining (WSDM), pages 87--94. ACM, 2008. Google ScholarDigital Library
- K. Hofmann, A. Schuth, S. Whiteson, and M. de Rijke. Reusing historical interaction data for faster online learning to rank for ir. In International Conference on Web Search and Data Mining (WSDM), pages 183--192, 2013. Google ScholarDigital Library
- D. G. Horvitz and D. J. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47(260):663--685, 1952. Google Scholar
- G. Imbens and D. Rubin. Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press, 2015. Google ScholarCross Ref
- T. Joachims. Optimizing search engines using clickthrough data. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 133--142, 2002. Google ScholarDigital Library
- T. Joachims. Training linear SVMs in linear time. In ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (KDD), pages 217--226, 2006. Google ScholarDigital Library
- T. Joachims, L. Granka, B. Pan, H. Hembrooke, F. Radlinski, and G. Gay. Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Transactions on Information Systems (TOIS), 25(2), April 2007. Google ScholarDigital Library
- J. Langford, A. Strehl, and J. Wortman. Exploration scavenging. In Proceedings of the 25th International Conference on Machine Learning, pages 528--535, 2008. Google ScholarDigital Library
- L. Li, W. Chu, J. Langford, and X. Wang. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms. In International Conference on Web Search and Data Mining (WSDM), pages 297--306, 2011. Google ScholarDigital Library
- R. J. A. Little and D. B. Rubin. Statistical Analysis with Missing Data. John Wiley, 2002. Google ScholarCross Ref
- T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225--331, Mar. 2009. Google ScholarDigital Library
- K. Raman and T. Joachims. Learning socially optimal information systems from egoistic users. In European Conference on Machine Learning (ECML), pages 128--144, 2013. Google ScholarDigital Library
- K. Raman, T. Joachims, P. Shivaswamy, and T. Schnabel. Stable coactive learning via perturbation. In International Conference on Machine Learning (ICML), pages 837--845, 2013.Google ScholarDigital Library
- M. Richardson, E. Dominowska, and R. Ragno. Predicting clicks: Estimating the click-through rate for new ads. In International Conference on World Wide Web (WWW), pages 521--530. ACM, 2007. Google ScholarDigital Library
- P. R. Rosenbaum and D. B. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41--55, 1983. Google ScholarCross Ref
- T. Schnabel, A. Swaminathan, P. Frazier, and T. Joachims. Unbiased comparative evaluation of ranking functions. In ACM International Conference on the Theory of Information Retrieval (ICTIR), 2016. Google ScholarDigital Library
- T. Schnabel, A. Swaminathan, A. Singh, N. Chandak, and T. Joachims. Recommendations as treatments: Debiasing learning and evaluation. In International Conference on Machine Learning (ICML), 2016.Google Scholar
- A. Schuth, H. Oosterhuis, S. Whiteson, and M. de Rijke. Multileave gradient descent for fast online learning to rank. In International Conference on Web Search and Data Mining (WSDM), pages 457--466, 2016. Google ScholarDigital Library
- K. Sparck-Jones and C. J. V. Rijsbergen. Report on the need for and provision of an 'ideal' information retrieval test collection. Technical report, University of Cambridge, 1975.Google Scholar
- A. L. Strehl, J. Langford, L. Li, and S. Kakade. Learning from logged implicit exploration data. In Conference on Neural Information Processing Systems (NIPS), pages 2217--2225, 2010.Google Scholar
- A. Swaminathan and T. Joachims. Batch learning from logged bandit feedback through counterfactual risk minimization. Journal of Machine Learning Research (JMLR), 16:1731--1755, Sep 2015.Google Scholar
- V. Vapnik. Statistical Learning Theory. Wiley, Chichester, GB, 1998.Google ScholarDigital Library
- L. Wang, J. J. Lin, and D. Metzler. A cascade ranking model for efficient ranked retrieval. In ACM Conference on Research and Development in Information Retrieval (SIGIR), pages 105--114, 2011. Google ScholarDigital Library
- X. Wang, M. Bendersky, D. Metzler, and M. Najork. Learning to rank with selection bias in personal search. In ACM Conference on Research and Development in Information Retrieval (SIGIR). ACM, 2016. Google ScholarDigital Library
- Y. Wang, D. Yin, L. Jie, P. Wang, M. Yamada, Y. Chang, and Q. Mei. Beyond ranking: Optimizing whole-page presentation. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, WSDM '16, pages 103--112, 2016. Google ScholarDigital Library
- Y. Yue and T. Joachims. Interactively optimizing information retrieval systems as a dueling bandits problem. In International Conference on Machine Learning (ICML), pages 151--159, 2009. Google ScholarDigital Library
- Y. Yue, R. Patel, and H. Roehrig. Beyond position bias: examining result attractiveness as a source of presentation bias in clickthrough data. In International Conference on World Wide Web (WWW), pages 1011--1018. ACM, 2010. Google ScholarDigital Library
Index Terms
- Unbiased Learning-to-Rank with Biased Feedback
Recommendations
Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm
WWW '19: The World Wide Web ConferenceRecently a number of algorithms under the theme of 'unbiased learning-to-rank' have been proposed, which can reduce position bias, the major type of bias in click data, and train a high-performance ranker with click data. Most of the existing algorithms,...
Unbiased Learning to Rank with Unbiased Propensity Estimation
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information RetrievalLearning to rank with biased click data is a well-known challenge. A variety of methods has been explored to debias click data for learning to rank such as click models, result interleaving and, more recently, the unbiased learning-to-rank framework ...
Learning to Rank with Ensemble Ranking SVM
In this paper, we propose a novel learning to rank method using Ensemble Ranking SVM. Ensemble Ranking SVM is based on Ranking SVM which has been commonly used for learning to rank. The basic idea of Ranking SVM is to formulate the problem of learning ...
Comments