Research Article
Open Access

End-to-End Continual Rare-Class Recognition with Emerging Novel Subclasses

Published: 05 August 2020

Abstract

Given a labeled dataset that contains a rare (or minority) class of of-interest instances, as well as a large class of instances that are not of interest, how can we learn to recognize future of-interest instances over a continuous stream? This setting differs from traditional classification in that instances from novel minority subclasses may continually emerge over time, and hence it is often referred to as continual, life-long, or open-world classification. We introduce RaRecognize, which (i) estimates a general decision boundary between the rare class and the majority class, (ii) learns to recognize the individual rare subclasses that exist within the training data, and (iii) flags instances from previously unseen rare subclasses as newly emerging (i.e., novel). The learner in (i) is general in the sense that, by construction, it is dissimilar to the specialized learners in (ii); it thus distinguishes the minority from the majority without overly tuning to what is seen only in the training data. Thanks to this generality, RaRecognize ignores all future instances that it labels as majority and recognizes only the recurring as well as emerging rare subclasses. This saves effort at test time and ensures that the model size grows moderately over time, since only specialized minority learners are maintained. Overall, we build an end-to-end system that consists of (1) a representation learning component that transforms data instances into suitable vector inputs; (2) a continual classifier that labels incoming instances as majority (not of interest), rare recurrent, or rare emerging; and (3) a clustering component that groups the rare emerging instances into novel subclasses for expert vetting and model re-training. Through extensive experiments, we show that RaRecognize outperforms state-of-the-art baselines on three real-world datasets that contain documents related to corporate risk and (natural and man-made) disasters as rare classes.
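As a concrete illustration of the decision flow in (1)-(3), below is a minimal Python sketch, not the authors' implementation: a general rare-vs-majority scorer first filters out majority instances, specialized per-subclass scorers then identify recurrent rare instances, and anything with low affinity to every known subclass is buffered as emerging for later clustering. All component names, scoring functions, and thresholds here are illustrative assumptions; the representation-learning and clustering components of the full system are omitted.

```python
import numpy as np

class RareClassRecognizer:
    """Sketch of the three-way decision: majority / rare-recurrent / rare-emerging."""

    def __init__(self, general_clf, subclass_clfs, novelty_threshold):
        self.general_clf = general_clf          # (i) general rare-vs-majority boundary
        self.subclass_clfs = subclass_clfs      # (ii) one specialized learner per known rare subclass
        self.novelty_threshold = novelty_threshold
        self.emerging_buffer = []               # (iii) emerging instances awaiting clustering / expert vetting

    def process(self, x):
        # (i) The general learner filters majority instances first, so the
        #     specialized learners are never evaluated on them.
        if self.general_clf(x) < 0.5:
            return "majority"
        # (ii) Score the instance against every known rare subclass.
        scores = np.array([clf(x) for clf in self.subclass_clfs])
        if scores.max() >= self.novelty_threshold:
            return f"rare-recurrent (subclass {int(scores.argmax())})"
        # (iii) Low affinity to all known subclasses: flag as emerging; the
        #       buffer would later be clustered into candidate novel subclasses.
        self.emerging_buffer.append(x)
        return "rare-emerging"

# Toy usage on 2-D points with stand-in scoring functions (illustrative only).
recognizer = RareClassRecognizer(
    general_clf=lambda x: float(x[0] > 0.0),   # treat x[0] > 0 as "rare"
    subclass_clfs=[lambda x, c=np.array([1.0, 1.0]): float(np.exp(-np.sum((x - c) ** 2)))],
    novelty_threshold=0.3,
)
print(recognizer.process(np.array([-1.0, 0.0])))   # -> majority
print(recognizer.process(np.array([1.1, 0.9])))    # -> rare-recurrent (subclass 0)
print(recognizer.process(np.array([3.0, -3.0])))   # -> rare-emerging
```

The design choice mirrored here is that the general learner acts as an early filter, so the per-subclass learners only ever see instances already deemed rare, which is what keeps test-time cost and model growth bounded as the stream progresses.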



Published in

ACM Transactions on Knowledge Discovery from Data, Volume 14, Issue 5
Special Issue on KDD 2018, Regular Papers and Survey Paper
October 2020, 376 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3407672

      Copyright © 2020 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 5 August 2020
      • Accepted: 1 May 2020
      • Revised: 1 January 2020
      • Received: 1 July 2019

