Active Learning for Web Search Ranking via Noise Injection

Published: 23 January 2015

Abstract

Learning to rank has become increasingly important for many information retrieval applications. To reduce the labeling cost of preparing training data, many active sampling algorithms have been proposed. In this article, we propose a novel active learning-for-ranking strategy called ranking-based sensitivity sampling (RSS), tailored for Gradient Boosting Decision Tree (GBDT), a machine-learned ranking method widely used in practice by major commercial search engines. We leverage the property of GBDT that samples close to the decision boundary tend to be sensitive to perturbations, and we design the active learning strategy accordingly. We further analyze the proposed strategy theoretically by exploring the connection between the sensitivity used for sample selection and model regularization, which offers a potential theoretical guarantee with respect to generalization capability. Because ranking performance metrics weight top-ranked items more heavily, item rank is incorporated into the selection function. In addition, we generalize the proposed technique to several other base learners to show its potential applicability in a wide variety of applications. Extensive experimental results on both a benchmark dataset and a real-world dataset demonstrate that our proposed active learning strategy is highly effective at selecting the most informative examples.
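As a rough illustration of the sampling idea summarized above (the article's precise RSS selection function is defined in the full text), the sketch below scores unlabeled documents by how much their GBDT-predicted relevance changes under small random input perturbations, and weights that sensitivity by the document's current rank so that top-ranked documents are favored. This is a minimal sketch under stated assumptions: scikit-learn's GradientBoostingRegressor stands in for a production GBDT ranker, the data is synthetic, and the Gaussian noise model, the logarithmic rank discount, and names such as rank_weighted_sensitivity, noise_scale, and n_perturbations are illustrative choices, not the authors' formulation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def rank_weighted_sensitivity(model, X_pool, n_perturbations=20,
                              noise_scale=0.05, rng=None):
    """Score each unlabeled document by how much its predicted relevance
    changes under small Gaussian feature perturbations, weighted so that
    documents the current model ranks near the top count more."""
    rng = np.random.default_rng(rng)
    base_scores = model.predict(X_pool)

    # Sensitivity: mean squared change in the predicted score under noise.
    sensitivity = np.zeros(len(X_pool))
    for _ in range(n_perturbations):
        noise = rng.normal(scale=noise_scale, size=X_pool.shape)
        sensitivity += (model.predict(X_pool + noise) - base_scores) ** 2
    sensitivity /= n_perturbations

    # Rank weighting: a DCG-style logarithmic discount that emphasizes
    # documents currently ranked near the top (rank 1 = highest score).
    ranks = np.argsort(np.argsort(-base_scores)) + 1
    return sensitivity / np.log2(ranks + 1)

# Usage sketch: train a GBDT on a small labeled seed set, then send the
# k most sensitive documents from the unlabeled pool out for labeling.
rng = np.random.default_rng(0)
X_seed, y_seed = rng.normal(size=(200, 10)), rng.normal(size=200)
X_pool = rng.normal(size=(1000, 10))

gbdt = GradientBoostingRegressor(random_state=0).fit(X_seed, y_seed)
scores = rank_weighted_sensitivity(gbdt, X_pool, rng=rng)
to_label = np.argsort(-scores)[:50]  # indices of documents to annotate next
```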



    • Published in

      ACM Transactions on the Web, Volume 9, Issue 1 (January 2015), 178 pages
      ISSN: 1559-1131
      EISSN: 1559-114X
      DOI: 10.1145/2726021

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Publication History

      • Received: 1 January 2014
      • Revised: 1 October 2014
      • Accepted: 1 October 2014
      • Published: 23 January 2015

      Published in TWEB Volume 9, Issue 1

      Qualifiers

      • research-article
      • Research
      • Refereed
