DOI: 10.1145/3298689.3346997
research-article

Recommending what video to watch next: a multitask ranking system

Published: 10 September 2019

ABSTRACT

In this paper, we introduce a large-scale multi-objective ranking system for recommending what video to watch next on an industrial video sharing platform. The system faces many real-world challenges, including the presence of multiple competing ranking objectives, as well as implicit selection biases in user feedback. To tackle these challenges, we explored a variety of soft-parameter sharing techniques such as Multi-gate Mixture-of-Experts so as to efficiently optimize for multiple ranking objectives. Additionally, we mitigated the selection biases by adopting a Wide & Deep framework. We demonstrated that our proposed techniques can lead to substantial improvements in recommendation quality on one of the world's largest video sharing platforms.
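To make the soft-parameter sharing idea concrete, the following is a minimal sketch of a Multi-gate Mixture-of-Experts forward pass for a single input vector. It illustrates the general technique only and is not the system described in this paper: the function names (`mmoe_forward`, `relu_layer`), the single-layer ReLU experts, the layer sizes, and the random weights are all assumptions chosen for illustration. The key point it shows is that every task shares the same experts but mixes them with its own softmax gate.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights):
    """Multiply input vector x by a weight matrix given as one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu_layer(x, weights):
    """A one-layer ReLU expert (hypothetical choice; real experts may be deeper)."""
    return [max(0.0, h) for h in linear(x, weights)]


def mmoe_forward(x, expert_weights, gate_weights):
    """Return one mixed representation per task for a single input vector x.

    expert_weights: one weight matrix per shared expert.
    gate_weights:   one weight matrix per task, producing one score per expert.
    """
    # Every expert sees the same input; its output is shared by all tasks.
    expert_outputs = [relu_layer(x, w) for w in expert_weights]
    hidden = len(expert_outputs[0])
    task_inputs = []
    for gate in gate_weights:
        # Task-specific mixture weights over the experts (they sum to 1).
        mix = softmax(linear(x, gate))
        mixed = [sum(g * out[j] for g, out in zip(mix, expert_outputs))
                 for j in range(hidden)]
        task_inputs.append(mixed)
    return task_inputs


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters (illustration only)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
D_IN, D_HIDDEN, N_EXPERTS, N_TASKS = 8, 5, 3, 2  # assumed sizes
x = [random.gauss(0.0, 1.0) for _ in range(D_IN)]
expert_weights = [rand_matrix(D_HIDDEN, D_IN) for _ in range(N_EXPERTS)]
gate_weights = [rand_matrix(N_EXPERTS, D_IN) for _ in range(N_TASKS)]

# One vector per task; each would feed that task's own prediction head.
task_inputs = mmoe_forward(x, expert_weights, gate_weights)
```

In a multi-objective ranker, each entry of `task_inputs` would feed a separate head for one objective, so related objectives can lean on the same experts while conflicting objectives can weight them differently.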


Published in

RecSys '19: Proceedings of the 13th ACM Conference on Recommender Systems
September 2019
635 pages
ISBN: 9781450362436
DOI: 10.1145/3298689

Copyright © 2019 Owner/Author

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives International 4.0 License.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

RecSys '19 paper acceptance rate: 36 of 189 submissions, 19%
Overall acceptance rate: 254 of 1,295 submissions, 20%
