Research Article · Open Access
DOI: 10.1145/2487575.2488200

Ad click prediction: a view from the trenches

Published: 11 August 2013

ABSTRACT

Predicting ad click-through rates (CTR) is a massive-scale learning problem that is central to the multi-billion dollar online advertising industry. We present a selection of case studies and topics drawn from recent experiments in the setting of a deployed CTR prediction system. These include improvements in the context of traditional supervised learning based on an FTRL-Proximal online learning algorithm (which has excellent sparsity and convergence properties) and the use of per-coordinate learning rates.

We also explore some of the challenges that arise in a real-world system that may appear at first to be outside the domain of traditional machine learning research. These include useful tricks for memory savings, methods for assessing and visualizing performance, practical methods for providing confidence estimates for predicted probabilities, calibration methods, and methods for automated management of features. Finally, we also detail several directions that did not turn out to be beneficial for us, despite promising results elsewhere in the literature. The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system.
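To make the core update concrete, the per-coordinate FTRL-Proximal learner described above admits a closed-form, lazily evaluated weight for L1- and L2-regularized logistic regression, with a per-coordinate learning rate driven by the accumulated squared gradients. The sketch below is a minimal, illustrative Python implementation of that update; the class name, hyperparameter names (alpha, beta, l1, l2), and dict-based sparse storage are illustrative choices for this sketch, not taken from the published pseudocode.

import math


class FTRLProximal:
    # Sketch of per-coordinate FTRL-Proximal logistic regression (sparse, dict-based).

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-coordinate FTRL state z_i
        self.n = {}  # per-coordinate sum of squared gradients n_i

    def _weight(self, i):
        # Lazy closed-form weight: exactly zero whenever |z_i| <= l1 (source of sparsity).
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        n = self.n.get(i, 0.0)
        return -(z - math.copysign(self.l1, z)) / (
            (self.beta + math.sqrt(n)) / self.alpha + self.l2)

    def predict(self, x):
        # x: dict of feature index -> value; returns the predicted click probability.
        margin = sum(self._weight(i) * v for i, v in x.items())
        margin = max(min(margin, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-margin))

    def update(self, x, y):
        # One online step on example (x, y), with label y in {0, 1}.
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss w.r.t. coordinate i
            n = self.n.get(i, 0.0)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha  # per-coordinate rate change
            self.z[i] = self.z.get(i, 0.0) + g - sigma * self._weight(i)
            self.n[i] = n + g * g
        return p

Used online, such a model would score each incoming example at serving time and apply one update as soon as the click/no-click label is joined back, e.g. model.update({"site=example": 1.0, "ad=123": 1.0}, y=1), where the feature strings are hypothetical.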


Published in

KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2013, 1534 pages
ISBN: 9781450321747
DOI: 10.1145/2487575

Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


Acceptance Rates

KDD '13 paper acceptance rate: 125 of 726 submissions, 17%
Overall acceptance rate: 1,133 of 8,635 submissions, 13%
