DOI: 10.1145/3159652.3159732
research-article
Open Access

Position Bias Estimation for Unbiased Learning to Rank in Personal Search

Published: 02 February 2018

ABSTRACT

A well-known challenge in learning from click data is its inherent bias, most notably position bias. Traditional click models aim to extract ‹query, document› relevance, and the estimated bias is usually discarded once relevance has been extracted. In contrast, recent work on unbiased learning-to-rank can leverage the bias effectively and thus focuses on estimating bias rather than relevance [20, 31]. Existing approaches estimate position bias by randomizing search results over a small percentage of production traffic. This is undesirable because result randomization can negatively impact users' search experience. In this paper, we compare different schemes for result randomization (i.e., RandTopN and RandPair) and show their negative effect in personal search. We then study how to infer such bias from regular click data without relying on randomization. We propose a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and can handle the highly sparse clicks of personal search. We evaluate our EM algorithm and the extracted bias in the learning-to-rank setting. Our results show that extracting position bias from regular clicks without result randomization is promising, and that the extracted bias can significantly improve learning-to-rank algorithms. In addition, we compare pointwise and pairwise learning-to-rank models and find that pairwise models are more effective in leveraging the estimated bias.
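The abstract does not spell out the click model or the EM updates, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes the standard position-based click model, in which a click requires both examination (probability `theta[k]` at position `k`) and relevance (probability `gamma[d]` for item `d`), and it runs plain EM with a per-item relevance table. The regression-based variant named in the abstract would presumably replace that table with a function learned from features, which is not shown here. All function names and the synthetic data are hypothetical; the simulator places items at random positions only so that this toy problem is identifiable.

```python
import random
from collections import Counter


def simulate_clicks(theta, gamma, n_sessions, seed=0):
    """Generate (item, position, click) tuples under the position-based model."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n_sessions):
        item = rng.randrange(len(gamma))
        pos = rng.randrange(len(theta))
        examined = rng.random() < theta[pos]
        relevant = rng.random() < gamma[item]
        logs.append((item, pos, int(examined and relevant)))
    return logs


def em_position_bias(logs, n_positions, n_items, n_iters=1000):
    """Plain EM for the position-based model (assumed form, per-item table)."""
    counts = Counter(logs)  # (item, pos, click) -> number of occurrences
    pos_total = [0] * n_positions
    item_total = [0] * n_items
    for (item, pos, _click), n in counts.items():
        pos_total[pos] += n
        item_total[item] += n

    theta = [0.5] * n_positions  # examination probability per position
    gamma = [0.5] * n_items      # relevance probability per item
    for _ in range(n_iters):
        exam_sum = [0.0] * n_positions
        rel_sum = [0.0] * n_items
        # E-step: posterior of the hidden examination and relevance events.
        for (item, pos, click), n in counts.items():
            t, g = theta[pos], gamma[item]
            if click:
                p_exam = p_rel = 1.0  # a click implies both events occurred
            else:
                denom = max(1.0 - t * g, 1e-12)
                p_exam = t * (1.0 - g) / denom
                p_rel = (1.0 - t) * g / denom
            exam_sum[pos] += n * p_exam
            rel_sum[item] += n * p_rel
        # M-step: each parameter becomes the average of its posteriors.
        theta = [s / c if c else t for s, c, t in zip(exam_sum, pos_total, theta)]
        gamma = [s / c if c else g for s, c, g in zip(rel_sum, item_total, gamma)]
    return theta, gamma


true_theta = [1.0, 0.6, 0.3]
true_gamma = [0.8, 0.6, 0.5, 0.4, 0.2]
logs = simulate_clicks(true_theta, true_gamma, n_sessions=30000)
theta, gamma = em_position_bias(logs, len(true_theta), len(true_gamma))
# Only the product theta * gamma is identified, so report bias relative to position 1.
print([round(t / theta[0], 3) for t in theta])
```

Because the model only constrains the product of the two probabilities, the estimated bias is meaningful up to a constant factor, which is why the sketch normalizes by the first position. In an inverse-propensity-weighted learning-to-rank loss, such relative values would typically be used to weight each click by the reciprocal of the bias at its position.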

References

  1. Aman Agarwal, Soumya Basu, Tobias Schnabel, and Thorsten Joachims. 2017. Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers. In Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). 687–696.
  2. Qingyao Ai, Susan T. Dumais, Nick Craswell, and Dan Liebling. 2017. Characterizing Email Search Using Large-scale Behavioral Logs and Surveys. In Proc. of the 26th International Conference on World Wide Web (WWW). 1511–1520.
  3. Michael Bendersky, Xuanhui Wang, Donald Metzler, and Marc Najork. 2017. Learning from User Interactions in Personal Search via Attribute Parameterization. In Proc. of the 10th ACM International Conference on Web Search and Data Mining (WSDM). 791–799.
  4. Christopher J.C. Burges. 2010. From RankNet to LambdaRank to LambdaMART: An Overview. Technical Report MSR-TR-2010-82. Microsoft Research. http://research.microsoft.com/apps/pubs/default.aspx?id=132652
  5. David Carmel, Guy Halawi, Liane Lewin-Eytan, Yoelle Maarek, and Ariel Raviv. 2015. Rank by Time or by Relevance?: Revisiting Email Search. In Proc. of the 24th ACM International Conference on Information and Knowledge Management (CIKM). 283–292.
  6. David Carmel, Liane Lewin-Eytan, Alex Libov, Yoelle Maarek, and Ariel Raviv. 2017. Promoting Relevant Results in Time-Ranked Mail Search. In Proc. of the 26th International Conference on World Wide Web (WWW). 1551–1559.
  7. Olivier Chapelle and Ya Zhang. 2009. A dynamic bayesian network click model for web search ranking. In Proc. of the 18th International Conference on World Wide Web (WWW). 1–10.
  8. Weizhu Chen, Zhanglong Ji, Si Shen, and Qiang Yang. 2011. A Whole Page Click Model to Better Interpret Search Engine Click Data. In Proc. of the 25th AAAI Conference on Artificial Intelligence (AAAI).
  9. Aleksandr Chuklin, Ilya Markov, and Maarten de Rijke. 2015. Click Models for Web Search. Morgan & Claypool.
  10. Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. 2008. An Experimental Comparison of Click Position-bias Models. In Proc. of the 1st International Conference on Web Search and Data Mining (WSDM). 87–94.
  11. Arthur P. Dempster, Nan M. Laird, and Donald B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, 1 (1977), 1–38.
  12. Miroslav Dudík, John Langford, and Lihong Li. 2011. Doubly Robust Policy Evaluation and Learning. In Proc. of the 28th International Conference on Machine Learning (ICML). 1097–1104.
  13. Susan Dumais, Edward Cutrell, JJ Cadiz, Gavin Jancke, Raman Sarin, and Daniel C. Robbins. 2003. Stuff I've Seen: A System for Personal Information Retrieval and Re-Use. In Proc. of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 72–79.
  14. Georges E. Dupret and Benjamin Piwowarski. 2008. A User Browsing Model to Predict Search Engine Click Data from Past Observations. In Proc. of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 331–338.
  15. David Elsweiler, Morgan Harvey, and Martin Hacker. 2011. Understanding re-finding behavior in naturalistic email interaction logs. In Proc. of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 35–44.
  16. Jerome H. Friedman. 2000. Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics, Vol. 29 (2000), 1189–1232.
  17. Fan Guo, Chao Liu, Anitha Kannan, Tom Minka, Michael Taylor, Yi-Min Wang, and Christos Faloutsos. 2009. Click chain model in web search. In Proc. of the 18th International Conference on World Wide Web (WWW). 11–20.
  18. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. 2001. The Elements of Statistical Learning. Springer.
  19. Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately Interpreting Clickthrough Data As Implicit Feedback. In Proc. of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 154–161.
  20. Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proc. of the 10th ACM International Conference on Web Search and Data Mining (WSDM). 781–789.
  21. Maryam Kamvar, Melanie Kellar, Rajan Patel, and Ya Xu. 2009. Computers and iPhones and Mobile Phones, Oh My!: A Logs-based Comparison of Search Users on Different Devices. In Proc. of the 18th International Conference on World Wide Web (WWW). 801–810.
  22. Jin Young Kim, Nick Craswell, Susan Dumais, Filip Radlinski, and Fang Liu. 2017. Understanding and Modeling Success in Email Search. In Proc. of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 265–274.
  23. Lihong Li, Shunbao Chen, Jim Kleban, and Ankur Gupta. 2015. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study. In Proc. of the 24th International Conference on World Wide Web (WWW) Companion. 929–934.
  24. Lihong Li, Wei Chu, John Langford, and Xuanhui Wang. 2011. Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms. In Proc. of the 4th International Conference on Web Search and Web Data Mining (WSDM). 297–306.
  25. Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, Vol. 3, 3 (2009), 225–331.
  26. Maeve O'Brien and Mark T. Keane. 2006. Modeling result–list searching in the World Wide Web: The role of relevance topologies and trust bias. In Proc. of the 28th Annual Conference of the Cognitive Science Society (CogSci). 1–881.
  27. Matthew Richardson, Ewa Dominowska, and Robert Ragno. 2007. Predicting Clicks: Estimating the Click-Through Rate for New Ads. In Proc. of the 16th International Conference on World Wide Web (WWW). 521–530.
  28. Paul R. Rosenbaum and Donald B. Rubin. 1983. The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, Vol. 70, 1 (1983), 41–55.
  29. Tobias Schnabel, Adith Swaminathan, Ashudeep Singh, Navin Chandak, and Thorsten Joachims. 2016. Recommendations As Treatments: Debiasing Learning and Evaluation. In Proc. of the 33rd International Conference on Machine Learning (ICML). 1670–1679.
  30. Adith Swaminathan and Thorsten Joachims. 2015. Batch Learning from Logged Bandit Feedback Through Counterfactual Risk Minimization. Journal of Machine Learning Research, Vol. 16, 1 (2015), 1731–1755.
  31. Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proc. of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 115–124.
  32. Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond Position Bias: Examining Result Attractiveness As a Source of Presentation Bias in Clickthrough Data. In Proc. of the 19th International Conference on World Wide Web (WWW). 1011–1018.
  33. Hamed Zamani, Michael Bendersky, Xuanhui Wang, and Mingyang Zhang. 2017. Situational Context for Ranking in Personal Search. In Proc. of the 26th International Conference on World Wide Web (WWW). 1531–1540.
  34. Zeyuan Allen Zhu, Weizhu Chen, Tom Minka, Chenguang Zhu, and Zheng Chen. 2010. A novel click model and its applications to online advertising. In Proc. of the 3rd ACM International Conference on Web Search and Data Mining (WSDM). 321–330.

    • Published in

      WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
      February 2018
      821 pages
      ISBN:9781450355810
      DOI:10.1145/3159652

      Copyright © 2018 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WSDM '18 Paper Acceptance Rate: 81 of 514 submissions, 16%
      Overall Acceptance Rate: 498 of 2,863 submissions, 17%
