Position Bias Estimation for Unbiased Learning to Rank in Personal Search

Authors:
Xuanhui Wang

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Nadav Golbandi

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Michael Bendersky

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Donald Metzler

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Marc Najork

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data MiningFebruary 2018Pages 610–618https://doi.org/10.1145/3159652.3159732

Published:02 February 2018Publication History

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining

Pages 610–618

ABSTRACT

A well-known challenge in learning from click data is its inherent bias and most notably position bias. Traditional click models aim to extract the ‹query, document› relevance and the estimated bias is usually discarded after relevance is extracted. In contrast, the most recent work on unbiased learning-to-rank can effectively leverage the bias and thus focuses on estimating bias rather than relevance [20, 31]. Existing approaches use search result randomization over a small percentage of production traffic to estimate the position bias. This is not desired because result randomization can negatively impact users' search experience. In this paper, we compare different schemes for result randomization (i.e., RandTopN and RandPair) and show their negative effect in personal search. Then we study how to infer such bias from regular click data without relying on randomization. We propose a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and that can handle highly sparse clicks in personal search. We evaluate our EM algorithm and the extracted bias in the learning-to-rank setting. Our results show that it is promising to extract position bias from regular clicks without result randomization. The extracted bias can improve the learning-to-rank algorithms significantly. In addition, we compare the pointwise and pairwise learning-to-rank models. Our results show that pairwise models are more effective in leveraging the estimated bias.

References

Aman Agarwal, Soumya Basu, Tobias Schnabel, and Thorsten Joachims. 2017. Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 687--696. Google ScholarDigital Library
Qingyao Ai, Susan T. Dumais, Nick Craswell, and Dan Liebling. 2017. Characterizing Email Search Using Large-scale Behavioral Logs and Surveys Proc. of the 26th International Conference on World Wide Web (WWW). 1511--1520. Google ScholarDigital Library
Michael Bendersky, Xuanhui Wang, Donald Metzler, and Marc Najork. 2017. Learning from User Interactions in Personal Search via Attribute Parameterization Proc. of the 10th ACM International Conference on Web Search and Data Mining (WSDM). 791--799. Google ScholarDigital Library
Christopher J.C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Technical Report MSR-TR-2010-82. Microsoft Research. http://research.microsoft.com/apps/pubs/default.aspx?id=132652Google Scholar
David Carmel, Guy Halawi, Liane Lewin-Eytan, Yoelle Maarek, and Ariel Raviv. 2015. Rank by Time or by Relevance?: Revisiting Email Search Proc. of the 24th ACM International Conference on Information and Knowledge Management (CIKM). 283--292. Google ScholarDigital Library
David Carmel, Liane Lewin-Eytan, Alex Libov, Yoelle Maarek, and Ariel Raviv. 2017. Promoting Relevant Results in Time-Ranked Mail Search Proc. of the 26th International Conference on World Wide Web (WWW). 1551--1559. Google ScholarDigital Library
Olivier Chapelle and Ya Zhang. 2009. A dynamic bayesian network click model for web search ranking Proc. of the 18th International Conference on World Wide Web (WWW). 1--10. Google ScholarDigital Library
Weizhu Chen, Zhanglong Ji, Si Shen, and Qiang Yang. 2011. A Whole Page Click Model to Better Interpret Search Engine Click Data. Proc. of the 25th AAAI Conference on Artificial Intelligence (AAAI). Google ScholarDigital Library
Aleksandr Chuklin, Ilya Markov, and Maarten de Rijke. 2015. Click Models for Web Search. Morgan & Claypool.Google Scholar
Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. 2008. An Experimental Comparison of Click Position-bias Models Proc. of the 1st International Conference on Web Search and Data Mining (WSDM). 87--94. Google ScholarDigital Library
Arthur P. Dempster, Nan M. Laird, and Donald B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological) Vol. 39 (1977), 1--38. Issue 1.Google ScholarCross Ref
Miroslav Dudík, John Langford, and Lihong Li. 2011. Doubly Robust Policy Evaluation and Learning. In Proc. of the 28th International Conference on Machine Learning (ICML). 1097--1104. Google ScholarDigital Library
Susan Dumais, Edward Cutrell, JJ Cadiz, Gavin Jancke, Raman Sarin, and Daniel C. Robbins. 2003. Stuff I've Seen: A System for Personal Information Retrieval and Re-Use Proc. of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 72--79. Google ScholarDigital Library
Georges E. Dupret and Benjamin Piwowarski. 2008. A User Browsing Model to Predict Search Engine Click Data from Past Observations Proc. of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 331--338. Google ScholarDigital Library
David Elsweiler, Morgan Harvey, and Martin Hacker. 2011. Understanding re-finding behavior in naturalistic email interaction logs Proc. of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 35--44. Google ScholarDigital Library
Jerome H. Friedman. 2000. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics Vol. 29 (2000), 1189--1232.Google ScholarCross Ref
Fan Guo, Chao Liu, Anitha Kannan, Tom Minka, Michael Taylor, Yi-Min Wang, and Christos Faloutsos. 2009. Click chain model in web search. In Proc. of the 18th International Conference on World Wide Web (WWW). 11--20. Google ScholarDigital Library
Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2001. The Elements of Statistical Learning. Springer.Google Scholar
Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately Interpreting Clickthrough Data As Implicit Feedback Proc. of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 154--161. Google ScholarDigital Library
Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proc. of the 10th ACM International Conference on Web Search and Data Mining (WSDM). 781--789. Google ScholarDigital Library
Maryam Kamvar, Melanie Kellar, Rajan Patel, and Ya Xu. 2009. Computers and iPhones and Mobile Phones, Oh My!: A Logs-based Comparison of Search Users on Different Devices. In Proc. of the 18th International Conference on World Wide Web (WWW). 801--810. Google ScholarDigital Library
Jin Young Kim, Nick Craswell, Susan Dumais, Filip Radlinski, and Fang Liu. 2017. Understanding and Modeling Success in Email Search Proc. of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 265--274. Google ScholarDigital Library
Lihong Li, Shunbao Chen, Jim Kleban, and Ankur Gupta. 2015. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study Proc. of the 24th International Conference on World Wide Web (WWW) Companion. 929--934. Google ScholarDigital Library
Lihong Li, Wei Chu, John Langford, and Xuanhui Wang. 2011. Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms. In Proc. of the 4th International Conference on Web Search and Web Data Mining (WSDM). 297--306. Google ScholarDigital Library
Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, Vol. 3, 3 (2009), 225--331. Google ScholarDigital Library
Maeve O'Brien and Mark T. Keane. 2006. Modeling result--list searching in the World Wide Web: The role of relevance topologies and trust bias. In Proc. of the 28th Annual Conference of the Cognitive Science Society (CogSci). 1--881.Google Scholar
Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting Clicks: Estimating the Click-Through Rate for New Ads Proc. of the 16th International Conference on World Wide Web (WWW). 521--530. Google ScholarDigital Library
Paul R. Rosenbaum and Donald B. Rubin. 1983. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, Vol. 70, 1 (1983), 41--55.Google ScholarCross Ref
Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. 2016. Recommendations As Treatments: Debiasing Learning and Evaluation Proc. of the 33rd International Conference on International Conference on Machine Learning (ICML). 1670--1679. Google ScholarDigital Library
Adith Swaminathan and Thorsten Joachims. 2015. Batch Learning from Logged Bandit Feedback Through Counterfactual Risk Minimization. Journal of Machine Learning Research Vol. 16 (2015), 1731--1755. Issue 1. Google ScholarDigital Library
Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search Proc. of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 115--124. Google ScholarDigital Library
Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond Position Bias: Examining Result Attractiveness As a Source of Presentation Bias in Clickthrough Data. In Proc. of the 19th International Conference on World Wide Web (WWW). 1011--1018. Google ScholarDigital Library
Hamed Zamani, Michael Bendersky, Xuanhui Wang, and Mingyang Zhang. 2017. Situational Context for Ranking in Personal Search Proc. of the 26th International Conference on World Wide Web (WWW). 1531--1540. Google ScholarDigital Library
Zeyuan Allen Zhu, Weizhu Chen, Tom Minka, Chenguang Zhu, and Zheng Chen. 2010. A novel click model and its applications to online advertising Proc. of the 3rd ACM International Conference on Web Search and Data Mining (WSDM). 321--330. Google ScholarDigital Library

Index Terms

Position Bias Estimation for Unbiased Learning to Rank in Personal Search
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Learning to rank

Recommendations

Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm
WWW '19: The World Wide Web Conference

Recently a number of algorithms under the theme of 'unbiased learning-to-rank' have been proposed, which can reduce position bias, the major type of bias in click data, and train a high-performance ranker with click data. Most of the existing algorithms,...
Read More
Unbiased Learning to Rank with Unbiased Propensity Estimation
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Learning to rank with biased click data is a well-known challenge. A variety of methods has been explored to debias click data for learning to rank such as click models, result interleaving and, more recently, the unbiased learning-to-rank framework ...
Read More
Learning to Rank with Selection Bias in Personal Search
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval

Click-through data has proven to be a critical resource for improving search ranking quality. Though a large amount of click data can be easily collected by search engines, various biases make it difficult to fully leverage this type of data. In the ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
February 2018
821 pages
ISBN:9781450355810
DOI:10.1145/3159652
General Chairs:
Yi Chang
Jilin University, Huawei Inc.
,
Chengxiang Zhai
University of Illinois Urbana-Champaign
,
Program Chairs:
Yan Liu
University of Southern California
,
Yoelle Maarek
Amazon
Copyright © 2018 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 February 2018
Check for updates
Author Tags
expectation-maximization
inverse propensity weighting
position bias estimation
Qualifiers
- research-article
Conference

Acceptance Rates
WSDM '18 Paper Acceptance Rate81of514submissions,16%Overall Acceptance Rate498of2,863submissions,17%
More
Upcoming Conference
WSDM '25

Sponsor:

sigir

sigir

sigir

sigir

The Eighteenth ACM International Conference on Web Search and Data Mining

April 7 - 11, 2025

Hannover , Germany
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 143
  Total Citations
  View Citations
- 5,590
  Total Downloads
- Downloads (Last 12 months)1,131
- Downloads (Last 6 weeks)159
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Position Bias Estimation for Unbiased Learning to Rank in Personal Search

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining

ABSTRACT

References

Cited By

Index Terms

Recommendations

Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm

Unbiased Learning to Rank with Unbiased Propensity Estimation

Learning to Rank with Selection Bias in Personal Search