DOI: 10.1145/1935826.1935922
poster

On composition of a federated web search result page: using online users to provide pairwise preference for heterogeneous verticals

Published: 09 February 2011

ABSTRACT

Modern web search engines are federated: a user query is sent to numerous specialized search engines, called verticals, such as Web (text documents), News, Image, and Video, and the results returned by these engines are then aggregated and composed into a search result page (SERP) and presented to the user. For a specific query, multiple verticals could be relevant, which makes the placement of these vertical results within blocks of textual web results challenging: how do we represent, assess, and compare the relevance of these heterogeneous entities?

In this paper we present a machine-learning framework for SERP composition in the presence of multiple relevant verticals. First, instead of using the traditional label generation method of human judgment guidelines and trained judges, we use a randomized online auditioning system that allows us to evaluate triples of the form <query, web block, vertical>. We use a pairwise click preference to evaluate whether the web block or the vertical block attracted better user engagement. Next, we use a hinged feature vector that contains features from the web block to create a common reference frame and augment it with features representing the specific vertical judged by the user. A gradient boosted decision tree is then learned from the training data. For the final composition of the SERP, we place a vertical result at a slot if the score is higher than a computed threshold. The thresholds are algorithmically determined to guarantee specific coverage for verticals at each slot.
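
The steps in this paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the layout of the hinged vector (shared web-block features followed by one zero-padded slot per vertical), and the quantile-style threshold rule are all assumptions made for illustration, and the gradient boosted decision tree itself is omitted.

```python
from typing import List, Sequence


def click_preference(web_clicks: int, vertical_clicks: int) -> int:
    """Hypothetical pairwise label: +1 if the vertical block drew more
    clicks, -1 if the web block did, 0 on a tie."""
    return (vertical_clicks > web_clicks) - (vertical_clicks < web_clicks)


def hinged_features(web: Sequence[float], vertical: Sequence[float],
                    vertical_index: int, n_verticals: int) -> List[float]:
    """Assumed layout: web-block features first (the common reference
    frame), then one fixed-width slot per vertical. Only the slot of the
    judged vertical is filled; all other slots stay at zero."""
    if not 0 <= vertical_index < n_verticals:
        raise ValueError("vertical_index out of range")
    width = len(vertical)
    out = list(web) + [0.0] * (width * n_verticals)
    start = len(web) + vertical_index * width
    out[start:start + width] = vertical
    return out


def coverage_threshold(scores: Sequence[float], coverage: float) -> float:
    """Assumed rule: return the k-th highest observed score, where k is
    about coverage * len(scores), so that placing a vertical whenever
    score >= threshold fires on roughly that share of queries."""
    if not scores or not 0.0 < coverage <= 1.0:
        raise ValueError("need scores and a coverage in (0, 1]")
    ranked = sorted(scores, reverse=True)
    k = max(1, round(coverage * len(ranked)))
    return ranked[k - 1]


# Illustrative values only, not data from the paper.
label = click_preference(web_clicks=3, vertical_clicks=5)
features = hinged_features([0.1, 0.2], [0.9], vertical_index=1, n_verticals=3)
model_scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.2, 0.9, 0.5, 0.3, 0.7]
threshold = coverage_threshold(model_scores, coverage=0.3)
placed = [s for s in model_scores if s >= threshold]
```

In this sketch one threshold would be computed per slot and per vertical from the model scores on a held-out sample; the comparison uses `>=` so that exactly the top-scoring share is placed, which is a choice of this example.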

We use correlation of clicks as our offline metric and show that the click-preference target has a better correlation than models based on human judgments. Furthermore, in online tests for News and Image verticals we show higher user engagement for both head and tail queries.
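
The abstract does not define the correlation metric precisely. As one plausible reading, the sketch below computes the Pearson correlation between model scores and observed click-preference labels; the function name and the sample values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Illustrative values only: model scores against click-preference labels
# (+1 when the vertical was preferred, -1 when the web block was).
scores = [0.9, 0.2, 0.7, 0.1]
labels = [1, -1, 1, -1]
agreement = pearson(scores, labels)
```

Under this reading, a model whose scores track the click labels more closely yields a coefficient nearer to 1.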


    • Published in

      WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining
      February 2011
      870 pages
      ISBN: 9781450304931
      DOI: 10.1145/1935826

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WSDM '11 Paper Acceptance Rate: 83 of 372 submissions, 22%
      Overall Acceptance Rate: 498 of 2,863 submissions, 17%