research-article

Recommending ephemeral items at web scale

Authors:
Ye Chen, Microsoft Corporation, Mountain View, CA, USA
John F. Canny, University of California, Berkeley, Berkeley, CA, USA

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011, Pages 1013–1022, https://doi.org/10.1145/2009916.2010051

Published: 24 July 2011

ABSTRACT

We describe an innovative and scalable recommendation system successfully deployed at eBay. Building recommenders for long-tail marketplaces requires projecting volatile items into a persistent space of latent products. We first present a generative clustering model for collections of unstructured, heterogeneous, and ephemeral item data, under the assumption that items are generated from latent products. An item is represented as a vector of independently and distinctly distributed variables, while a latent product is characterized as a vector of the corresponding probability distributions. The probability distributions are chosen as natural stochastic models for the different types of data. The learning objective is to maximize the total intra-cluster coherence, measured by the sum of log likelihoods of items under this generative process. In the space of latent products, robust recommendations can then be derived from historical transactional data, using naive Bayes for ranking. Item-based recommendations are achieved by inferring latent products from unseen items. In particular, we develop a probabilistic scoring function for recommended items that takes into account item-product membership, product purchase probability, and the important auction-end-time factor. With this holistic probabilistic measure of a prospective item purchase, one can further maximize the expected revenue as well as the more subjective user satisfaction. We evaluated the latent product clustering and recommendation ranking models on real-world e-commerce data from eBay, through both offline simulation and online A/B testing. In the recent production launch, our system yielded a 3- to 5-fold improvement over the existing production system in click-through, purchase-through, and gross merchandising value, and it now drives 100% of related-recommendation traffic, covering billions of items at eBay. We believe that this work provides a practical yet principled framework for recommendation in domains with abundant user self-input data.
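To make the pipeline in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes two item features (a title modeled with a multinomial over words and a price modeled with a Gaussian), a uniform prior over latent products, and a multiplicative score with an exponential urgency weight for the auction end time. The abstract does not state the actual distributions or the form of the scoring function, so `PRODUCTS`, `SMOOTH`, `tau`, and all function names here are hypothetical.

```python
import math

# Hypothetical latent products: one distribution per feature type.
# Title words ~ multinomial, price ~ Gaussian (both are assumptions).
PRODUCTS = {
    "phone": {"words": {"phone": 0.5, "case": 0.2, "new": 0.3},
              "price_mean": 300.0, "price_std": 50.0},
    "case": {"words": {"phone": 0.3, "case": 0.6, "new": 0.1},
             "price_mean": 15.0, "price_std": 5.0},
}
SMOOTH = 1e-6  # assumed floor probability for words unseen under a product


def log_likelihood(item, product):
    """Sum of per-feature log likelihoods (features treated as independent)."""
    ll = sum(math.log(product["words"].get(w, SMOOTH))
             for w in item["title"].split())
    z = (item["price"] - product["price_mean"]) / product["price_std"]
    ll += -0.5 * z * z - math.log(product["price_std"] * math.sqrt(2 * math.pi))
    return ll


def assign(item):
    """Hard-assign an unseen item to its most likely latent product."""
    return max(PRODUCTS, key=lambda k: log_likelihood(item, PRODUCTS[k]))


def membership(item):
    """Soft item-product membership, assuming a uniform prior over products."""
    lls = {k: log_likelihood(item, p) for k, p in PRODUCTS.items()}
    top = max(lls.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - top) for k, v in lls.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def score(item, purchase_prob, hours_left, tau=24.0):
    """Assumed score: membership x purchase probability x end-time urgency."""
    urgency = math.exp(-hours_left / tau)  # auctions ending sooner rank higher
    expected_buy = sum(p * purchase_prob.get(k, 0.0)
                       for k, p in membership(item).items())
    return urgency * expected_buy
```

In this sketch, `purchase_prob` stands in for the product-level purchase probabilities that the paper derives from historical transactions; here it would simply be passed in as a dictionary keyed by latent product. Summing `log_likelihood` over all items under their assigned products gives the kind of coherence objective the abstract describes.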

Index Terms

Recommending ephemeral items at web scale
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
  2. Information systems applications
    1. Data mining
      1. Clustering

Published in
SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN: 9781450307574
DOI: 10.1145/2009916
General Chairs: Wei-Ying Ma (Microsoft Research Asia, China), Jian-Yun Nie (University of Montreal, Canada)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Research, Spain), Tat-Seng Chua (National University of Singapore), W. Bruce Croft (University of Massachusetts, Amherst, USA)
Copyright © 2011 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bayesian methods
clustering
collaborative filtering
evaluation
generative models
recommender systems
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%
Article Metrics
- Total Citations: 21
- Total Downloads: 1,048
- Downloads (Last 12 months): 11
- Downloads (Last 6 weeks): 0