skip to main content
research-article
Public Access

Robust Spectral Ensemble Clustering via Rank Minimization

Published:09 January 2019Publication History
Skip Abstract Section

Abstract

Ensemble Clustering (EC) is an important topic for data cluster analysis. It targets to integrate multiple Basic Partitions (BPs) of a particular dataset into a consensus partition. Among previous works, one promising and effective way is to transform EC as a graph partitioning problem on the co-association matrix, which is a pair-wise similarity matrix summarized by all the BPs in essence. However, most existing EC methods directly utilize the co-association matrix, yet without considering various noises (e.g., the disagreement between different BPs and the outliers) that may exist in it. These noises can impair the cluster structure of a co-association matrix, and thus mislead the final graph partitioning process. To address this challenge, we propose a novel Robust Spectral Ensemble Clustering (RSEC) algorithm in this article. Specifically, we learn low-rank representation (LRR) for the co-association matrix to uncover its cluster structure and handle the noises, and meanwhile, we perform spectral clustering with the learned representation to seek for a consensus partition. These two steps are jointly proceeded within a unified optimization framework. In particular, during the optimizing process, we leverage consensus partition to iteratively enhance the block-diagonal structure of LRR, in order to assist the graph partitioning. To solve RSEC, we first formulate it by using nuclear norm as a convex proxy to the rank function. Then, motivated by the recent advances in non-convex rank minimization, we further develop a non-convex model for RSEC and provide it a solution by the majorization--minimization Augmented Lagrange Multiplier algorithm. Experiments on 18 real-world datasets demonstrate the effectiveness of our algorithm compared with state-of-the-art methods. Moreover, several impact factors on the clustering performance of our approach are also explored extensively.

References

  1. H. Ayad and M. Kamel. 2008. Cumulative voting consensus method for partitions with variable number of clusters. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 1 (2008), 160--173. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. D. Cai, X. He, J. Han, and T. S. Huang. 2011. Graph regularized nonnegative matrix factorization for data representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 8 (2011), 1548--1560. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. J. Cai, E. J. Candès, and Z. Shen. 2010. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization 20, 4 (2010), 1956--1982.Google ScholarGoogle ScholarCross RefCross Ref
  4. E. J. Candès, X. Li, Y. Ma, and J. Wright. 2011. Robust principal component analysis?Journal of the ACM 58, 3 (2011), 11:1--11:37.Google ScholarGoogle Scholar
  5. X. Cao, Z. Tao, B. Zhang, H. Fu, and W. Feng. 2014. Self-adaptively weighted co-saliency detection via rank constraint. IEEE Transactions on Image Processing 23, 9 (2014), 4175--4186.Google ScholarGoogle Scholar
  6. B. Cheng, G. Liu, J. Wang, Z. Huang, and S. Yan. 2011. Multi-task low-rank affinity pursuit for image segmentation. In Proceedings of the IEEE International Conference on Computer Vision. 2439--2446. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. I. S. Dhillon, Y. Guan, and B. Kulis. 2004. Kernel K-means: Spectral clustering and normalized cuts. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 551--556. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Z. Ding and Y. Fu. 2014. Low-rank common subspace for multi-view learning. In Proceedings of the IEEE International Conference on Data Mining. 110--119. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Z. Ding, M. Shao, and Y. Fu. 2015. Missing modality transfer learning via latent low-rank constraint. IEEE Transactions on Image Processing 24, 11 (2015), 4322--4334.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Domeniconi and M. Al-Razgan. 2009. Weighted cluster ensembles: Methods and analysis. ACM Transactions on Knowledge Discovery from Data 2, 4 (2009), 17:1--17:40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. X. Z. Fern and C. E. Brodley. 2004. Solving cluster ensemble problems by bipartite graph partitioning. In Proceedings of International Conference on Machine Learning. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. L. N. Fred and A. K. Jain. 2005. Combining multiple clusterings using evidence accumulation. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 6 (2005), 835--850. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. Gionis, H. Mannila, and P. Tsaparas. 2007. Clustering aggregation. ACM Transactions on Knowledge Discovery from Data 1, 1 (2007), Article 4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. N. Iam-on, T. Boongoen, S. M. Garrett, and C. J. Price. 2011. A link-based approach to the cluster ensemble problem. IEEE Transactions on Pattern Analysis and Machine Intelligence 33, 12 (2011), 2396--2409. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. K. Jain. 2010. Data clustering: 50 years beyond K-means. Pattern Recognition Letter 31, 8 (2010), 651--666. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. L. I. Kuncheva and D. Vetrov. 2006. Evaluation of stability of k-means cluster ensembles with respect to random initialization. IEEE Transactions on Pattern Analysis and Machine Intelligence 28, 11 (2006), 1798--1808. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (1998), 2278--2324.Google ScholarGoogle ScholarCross RefCross Ref
  18. J. Li, Y. Kong, H. Zhao, J. Yang, and Y. Fu. 2016b. Learning fast low-rank projection for image classification. IEEE Transaction on Image Processing 25, 10 (2016), 4803--4814.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. J. Li, H. Liu, H. Zhao, and Y. Fu. 2017b. Projective low-rank subspace clustering via learning deep encoder. In Proceedings of the 26th International Joint Conference on Artificial Intelligence. 2145--2151. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Li, H. Zhao, Z. Tao, and Y. Fu. 2017c. Large-scale subspace clustering by fast regression coding. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence. 2138--2144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. S. Li and Y. Fu. 2014. Robust subspace discovery through supervised low-rank constraints. In Proceedings of SIAM International Conference on Data Mining. 163--171.Google ScholarGoogle Scholar
  22. S. Li and Y. Fu. 2015. Learning balanced and unbalanced graphs via low-rank coding. IEEE Transactions on Knowledge and Data Engineering 27, 5 (2015), 1274--1287.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. S. Li and Y. Fu. 2016. Learning robust and discriminative subspace with low-rank constraints. IEEE Transaction Neural Networks Learning System 27, 11 (2016), 2160--2173.Google ScholarGoogle ScholarCross RefCross Ref
  24. S. Li, K. Li, and Y. Fu. 2018a. Self-taught low-rank coding for visual learning. IEEE Transactions on Neural Networks and Learning Systems 29, 3 (2018), 645--656.Google ScholarGoogle ScholarCross RefCross Ref
  25. S. Li, H. Liu, Z. Tao, and Y. Fu. 2017a. Multi-view graph learning with adaptive label propagation. In Proceedings of the IEEE International Conference on Big Data (Big Data’17). IEEE, 110--115.Google ScholarGoogle Scholar
  26. S. Li, M. Shao, and Y. Fu. 2018b. Multi-view low-rank analysis with applications to outlier detection. ACM Transactions on Knowledge Discovery from Data 12, 3 (2018), Article 32. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. T. Li, B. Cheng, B. Ni, G. Liu, and S. Yan. 2016a. Multitask low-rank affinity graph for image segmentation and image annotation. ACM Transactions on Intelligent Systems and Technology 7, 4 (2016), Article 65. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. T. Li, C. Ding, and M. I. Jordan. 2007. Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization. In Proceedings of the IEEE International Conference on Data Mining. 577--582. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Z. Lin, M. Chen, and Y. Ma. 2010. The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv:1009.5005.Google ScholarGoogle Scholar
  30. Z Lin, R. Liu, and Z. Su. 2011. Linearized alternating direction method with adaptive penalty for low-rank representation. In Proceedings of 24th International Conference on Neural Information Processing Systems. 612--620. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma. 2013. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 1 (2013), 171--184. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. G. Liu, Z. Lin, and Y. Yu. 2010. Robust subspace segmentation by low-rank representation. In Proceedings of the 27th International Conference on Machine Learning. 663--670. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. G. Liu and S. Yan. 2011. Latent low-rank representation for subspace segmentation and feature extraction. In Proceedings of the IEEE International Conference on Computer Vision. 1615--1622. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. H. Liu, T. Liu, J. Wu, D. Tao, and Y. Fu. 2015a. Spectral ensemble clustering. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 715--724. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. H. Liu, M. Shao, S. Li, and Y. Fu. 2016. Infinite ensemble for image clustering. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1745--1754. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. H. Liu, M. Shao, S. Li, and Y. Fu. 2018. Infinite ensemble clustering. Data Mining and Knowledge Discovery 32, 2 (2018), 385--416. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. H. Liu, J. Wu, T. Liu, D. Tao, and Y. Fu. 2017. Spectral ensemble clustering via weighted K-means: Theoretical and practical evidence. IEEE Transactions on Knowledge and Data Engineering 29, 5 (2017), 1129--1143. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. H. Liu, J. Wu, D. Tao, Y. Zhang, and Y. Fu. 2015b. Dias: A disassemble-assemble framework for highly sparse text clustering. In Proceedings of SIAM International Conference on Data Mining. 766--774.Google ScholarGoogle Scholar
  39. C. Lu, J. Tang, S. Yan, and Z. Lin. 2014. Generalized nonconvex nonsmooth low-rank minimization. In IEEE Conference on Computer Vision and Pattern Recognition. 4130--4137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Z. Lu, Y. Peng, and J. Xiao. 2008. From comparing clusterings to combining clusterings. In Proceedings of AAAI Conference on Artificial Intelligence. 665--670. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. U. Luxburg. 2007. A tutorial on spectral clustering. Statistics and Computing 17, 4 (2007), 395--416. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. B. Minaei-Bidgoli, A. P. Topchy, and W. F. Punch. 2004. A comparison of resampling methods for clustering ensembles. In Proceedings of the International Conference on Machine Learning; Models, Technologies and Applications. 939--945.Google ScholarGoogle Scholar
  43. B. Mirkin. 2001. Reinterpreting the category utility function. Machine Learning 45, 2 (2001), 219--228. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. S. A. Nene, S. K. Nayar, and H. Murase. 1996. Columbia Object Image Library (COIL-20). Technical Report CUCS-005-96.Google ScholarGoogle Scholar
  45. A. Y. Ng, M. I. Jordan, and Y. Weiss. 2001. On spectral clustering: Analysis and an algorithm. In Proceedings of Advances in Neural Information Processing Systems. 849--856. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. T. Hyun Oh, Y. Matsushita, Y. Tai, and I. Kweon. 2015. Fast randomized singular value thresholding for nuclear norm minimization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4484--4493.Google ScholarGoogle Scholar
  47. C. Peng, Z. Kang, H. Li, and Q. Cheng. 2015. Subspace clustering using log-determinant rank approximation. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 925--934. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. M. Shao, D. Kit, and Y. Fu. 2014. Generalized transfer subspace learning through low-rank constraint. International Journal of Computer Vision 109, 1--2 (2014), 74--93. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. M. Shao, S. Li, Z. Ding, and Y. Fu. 2015. Deep linear coding for fast graph clustering. In Proceedings of International Joint Conference on Artificial Intelligence. 3798--3804. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. J. Shi and J. Malik. 2000. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22, 8 (2000), 888--905. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. A. Strehl and J. Ghosh. 2003. Cluster ensembles — A knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research 3 (2003), 583--617. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Z. Tao, H. Liu, and Y. Fu. 2017a. Simultaneous clustering and ensemble. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. 1546--1552. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Z. Tao, H. Liu, S. Li, Z. Ding, and Y. Fu. 2017b. From ensemble clustering to multi-view clustering. In Proceedings of the 26th International Joint Conference on Artificial Intelligence. 2843--2849. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Z. Tao, H. Liu, S. Li, and Y. Fu. 2016. Robust spectral ensemble clustering. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. 367--376. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. A. Topchy, A. K. Jain, and W. Punch. 2003. Combining multiple weak clusterings. In Proceedings of IEEE International Conference on Data Mining. 331--338. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. A. Topchy, A Jain, and W. Punch. 2004. A mixture model for clustering ensembles. In Proceedings of SIAM International Conference on Data Mining. 379--390.Google ScholarGoogle Scholar
  57. A. Topchy, A Jain, and W. Punch. 2005. Clustering ensembles: Models of consensus and weak partitions. IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 12 (2005), 1866--1881. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. S. Vega-Pons, J. Correa-Morris, and J. Ruiz-Shulcloper. 2010. Weighted partition consensus via kernels. Pattern Recognition 43, 8 (2010), 2712--2724. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. S. Wang, D. Liu, and Z. Zhang. 2013. Nonconvex relaxation approaches to robust matrix recovery. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence. 1764--1770. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma. 2009. Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In Proceedings of Advances in Neural Information Processing Systems. 2080--2088. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. J. Wu, H. Liu, H. Xiong, J. Cao, and J. Chen. 2015. K-means-based consensus clustering: A unified view. IEEE Transactions on Knowledge and Data Engineering 27, 1 (2015), 155--169.Google ScholarGoogle ScholarCross RefCross Ref
  62. J. Wu, H. Xiong, and J. Chen. 2009. Adapting the right measures for K-means clustering. In Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 877--886. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. S. Xiao, W. Li, D. Xu, and D. Tao. 2015. FaLRR: A fast low rank representation solver. In IEEE Conference on Computer Vision and Pattern Recognition. 4612--4620.Google ScholarGoogle Scholar
  64. D. Yan, L. Huang, and M. I. Jordan. 2009. Fast approximate spectral clustering.. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 907--916. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. J. Yi, T. Yang, R. Jin, A. K. Jain, and M. Mahdavi. 2012. Robust ensemble clustering by matrix completion. In Proceedings of IEEE International Conference on Data Mining. 1176--1181. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. M. Yin, J. Gao, and Z. Lin. 2016. Laplacian regularized low-rank representation and its application. IEEE Transactions on Pattern Analysis and Machine Intelligence 38, 3 (2016), 504--517. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. H. Yoon, S. Ahn, S. Lee, S. Cho, and J. Kim. 2006. Heterogeneous clustering ensemble method for combining different cluster results. Data Mining for Biomedical Applications 3916 (2006), 82--92. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. C. Zhang. 2010. Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics 38, 2 (2010), 894--942.Google ScholarGoogle ScholarCross RefCross Ref
  69. L. Zheng, T. Li, and C. Ding. 2014. A framework for hierarchical ensemble clustering. ACM Transactions on Knowledge Discovery from Data 9, 2 (2014), 9:1--9:23. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. P. Zhou, L. Du, H. Wang, L. Shi, and Y. Shen. 2015. Learning a robust consensus matrix for clustering ensemble via Kullback--Leibler divergence minimization. In Proceedings of International Joint Conference on Artificial Intelligence. 4112--4118. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Robust Spectral Ensemble Clustering via Rank Minimization

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM Transactions on Knowledge Discovery from Data
          ACM Transactions on Knowledge Discovery from Data  Volume 13, Issue 1
          February 2019
          340 pages
          ISSN:1556-4681
          EISSN:1556-472X
          DOI:10.1145/3301280
          Issue’s Table of Contents

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 9 January 2019
          • Accepted: 1 September 2018
          • Revised: 1 February 2018
          • Received: 1 December 2016
          Published in tkdd Volume 13, Issue 1

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        HTML Format

        View this article in HTML Format .

        View HTML Format