research-article

Multi-View Low-Rank Analysis with Applications to Outlier Detection

Authors:
Sheng Li, Adobe Research, San Jose, CA, USA (ORCID: 0000-0003-1205-8632)
Ming Shao, University of Massachusetts Dartmouth, Dartmouth, MA, USA
Yun Fu, Northeastern University, MA, USA

ACM Transactions on Knowledge Discovery from Data, Volume 12, Issue 3, Article No. 32, pp. 1–22. https://doi.org/10.1145/3168363

Published: 23 March 2018

Abstract

Detecting outliers or anomalies is a fundamental problem in various machine learning and data mining applications. Conventional outlier detection algorithms are mainly designed for single-view data. Nowadays, data can be easily collected from multiple views, and many learning tasks such as clustering and classification have benefited from multi-view data. However, outlier detection from multi-view data is still a very challenging problem, as the data in multiple views usually have more complicated distributions and exhibit inconsistent behaviors. To address this problem, we propose a multi-view low-rank analysis (MLRA) framework for outlier detection in this article. MLRA pursues outliers from a new perspective: robust data representation. It contains two major components. First, cross-view low-rank coding is performed to reveal the intrinsic structures of data. In particular, we formulate a regularized rank-minimization problem, which is solved by an efficient optimization algorithm. Second, the outliers are identified through an outlier score estimation procedure. Unlike existing multi-view outlier detection methods, MLRA is able to detect two different types of outliers from multiple views simultaneously. To this end, we design a criterion to estimate the outlier scores by analyzing the obtained representation coefficients. Moreover, we extend MLRA to tackle the multi-view group outlier detection problem. Extensive evaluations on seven UCI datasets and on the MovieLens, USPS-MNIST, and WebKB datasets demonstrate that our approach outperforms several state-of-the-art outlier detection methods.
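
The abstract does not state the rank-minimization objective or the solver, so the following is only a generic illustration of one building block that nuclear-norm-regularized problems commonly rely on: singular value thresholding, which shrinks the singular values of a matrix toward zero. The function name `singular_value_threshold`, the parameter `tau`, and the toy matrix are assumptions made for this sketch and are not taken from the article.

```python
import numpy as np


def singular_value_threshold(matrix, tau):
    """Shrink every singular value of `matrix` by `tau`, clipping at zero.

    This is the proximal operator of tau * (nuclear norm), a common inner
    step in nuclear-norm-regularized solvers. It is an illustrative helper,
    not the MLRA optimization procedure.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of u by the shrunken singular values, then recombine.
    return (u * s_shrunk) @ vt


# Toy input with singular values 3 and 1 (hypothetical data).
toy = np.diag([3.0, 1.0])
shrunk = singular_value_threshold(toy, tau=2.0)
# With tau = 2 the singular values become 1 and 0, so the rank drops to 1.
print(np.linalg.svd(shrunk, compute_uv=False))
print(np.linalg.matrix_rank(shrunk))
```

A larger `tau` removes more of the small singular values and therefore yields a lower-rank result. How such a step would be combined with cross-view coding and outlier scoring is specific to the article's full text and is not reproduced here.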

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms
      1. Regularization
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Data cleaning

Published in
ACM Transactions on Knowledge Discovery from Data Volume 12, Issue 3
June 2018
360 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3178546
Editors: Charu Aggarwal (IBM T. J. Watson Research, USA), Xindong Wu (University of Louisiana at Lafayette, USA)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 23 March 2018
- Accepted: 1 November 2017
- Revised: 1 April 2017
- Received: 1 September 2016

Author Tags
Multi-view learning
low-rank matrix recovery
outlier detection