DOI: 10.1145/3097983.3098184
Research Article · Open Access

An Efficient Bandit Algorithm for Realtime Multivariate Optimization

Published: 13 August 2017

ABSTRACT

Optimization is commonly employed to determine the content of web pages, such as to maximize conversions on landing pages or click-through rates on search engine result pages. Often the layout of these pages can be decoupled into several separate decisions. For example, the composition of a landing page may involve deciding which image to show, which wording to use, what color background to display, etc. Such optimization is a combinatorial problem over an exponentially large decision space. Randomized experiments do not scale well to this setting, and therefore, in practice, one is typically limited to optimizing a single aspect of a web page at a time. This represents a missed opportunity in both the speed of experimentation and the exploitation of possible interactions between layout decisions.

Here we focus on multivariate optimization of interactive web pages. We formulate an approach where the possible interactions between different components of the page are modeled explicitly. We apply bandit methodology to explore the layout space efficiently and use hill-climbing to select optimal content in realtime. Our algorithm also extends to contextualization and personalization of layout selection. Simulation results show the suitability of our approach to large decision spaces with strong interactions between content. We further apply our algorithm to optimize a message that promotes adoption of an Amazon service. After only a single week of online optimization, we saw a 21% conversion increase compared to the median layout. Our technique is currently being deployed to optimize content across several locations at Amazon.com.

