research-article

Anomaly Detection for an E-commerce Pricing System

Authors:
Jagdish Ramakrishnan, Walmart Labs, San Bruno, CA, USA
Elham Shaabani, Walmart Labs, San Bruno, CA, USA
Chao Li, Walmart Labs, San Bruno, CA, USA
Matyas A. Sustik, Walmart Labs, San Bruno, CA, USA

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, July 2019, Pages 1917–1926. https://doi.org/10.1145/3292500.3330748

Published: 25 July 2019

ABSTRACT

Online retailers execute a very large number of price updates compared with brick-and-mortar stores. Even a few mispriced items can have a significant business impact and result in a loss of customer trust, so early, automated, real-time detection of anomalies is an important part of such a pricing system. In this paper, we describe the unsupervised and supervised anomaly detection approaches we developed and deployed for a large-scale online pricing system at Walmart. Our system detects anomalies in both batch and real-time streaming settings, and flagged items are reviewed and acted on according to priority and business impact. We found that the right architecture design was critical to model performance at scale, and that business impact and speed were important factors influencing model selection, parameter choice, and prioritization in a large-scale production environment. We analyzed the performance of various approaches on a test set of real-world retail data and fully deployed our approach into production. We found that our approach detected the most important anomalies with high precision.

Index Terms

Anomaly Detection for an E-commerce Pricing System
1. Applied computing
  1. Electronic commerce
    1. E-commerce infrastructure
2. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Anomaly detection

Published in
KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019
3305 pages
ISBN: 9781450362016
DOI: 10.1145/3292500
General Chairs: Ankur Teredesai (KenSci), Vipin Kumar (University of Minnesota)
Program Chairs: Ying Li (EV Analysis Corporation), Rómer Rosales (LinkedIn), Evimaria Terzi (Boston University), George Karypis (University of Minnesota)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anomaly detection
e-commerce
pricing
Acceptance Rates
KDD '19 paper acceptance rate: 110 of 1,200 submissions, 9%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.