Constrained Clustering: Advances in Algorithms, Theory, and Applications
August 2008
Publisher: Chapman & Hall/CRC
ISBN: 978-1-58488-996-0
Published: 12 August 2008
Pages: 472
Abstract

Since the initial work on constrained clustering, there have been numerous advances in methods, applications, and our understanding of the theoretical properties of constraints and constrained clustering algorithms. Bringing these developments together, Constrained Clustering: Advances in Algorithms, Theory, and Applications presents an extensive collection of the latest innovations in clustering data analysis methods that use background knowledge encoded as constraints.

Algorithms: The first five chapters of this volume investigate advances in the use of instance-level, pairwise constraints for partitional and hierarchical clustering. The book then explores other types of constraints for clustering, including cluster size balancing, minimum cluster size, and cluster-level relational constraints.

Theory: It also describes variations of the traditional clustering under constraints problem as well as approximation algorithms with helpful performance guarantees.

Applications: The book ends by applying clustering with constraints to relational data, privacy-preserving data publishing, and video surveillance data. It discusses an interactive visual clustering approach, a distance metric learning approach, existential constraints, and automatically generated constraints.

With contributions from industrial researchers and leading academic experts who pioneered the field, this volume delivers thorough coverage of the capabilities and limitations of constrained clustering methods and introduces new types of constraints and clustering algorithms.

Cited By

  1. Lappas T (2020). Mining Career Paths from Large Resume Databases, ACM Transactions on Knowledge Discovery from Data, 14:3, (1-38), Online publication date: 30-Jun-2020.
  2. Hünemörder M, Kazempour D, Kröger P and Seidl T SIDEKICK: Linear Correlation Clustering with Supervised Background Knowledge Similarity Search and Applications, (221-230)
  3. Zhang H, Basu S and Davidson I A Framework for Deep Constrained Clustering - Algorithms and Advances Machine Learning and Knowledge Discovery in Databases, (57-72)
  4. Mercado P, Bosch J and Stoll M Node Classification for Signed Social Networks Using Diffuse Interface Methods Machine Learning and Knowledge Discovery in Databases, (524-540)
  5. Ahmadian S, Epasto A, Kumar R and Mahdian M Clustering without Over-Representation Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (267-275)
  6. Liu Y, Du F, Sun J, Silva T, Jiang Y and Zhu T (2019). Identifying social roles using heterogeneous features in online social networks, Journal of the Association for Information Science and Technology, 70:7, (660-674), Online publication date: 5-Jun-2019.
  7. Arendt D, Saldanha E, Wesslen R, Volkova S and Dou W Towards rapid interactive machine learning Proceedings of the 24th International Conference on Intelligent User Interfaces, (591-602)
  8. Davidson I, Gourru A and Ravi S The cluster description problem - complexity results, formulations and approximations Proceedings of the 32nd International Conference on Neural Information Processing Systems, (6193-6203)
  9. Ngoc M and Park D (2018). Centroid Neural Network with Pairwise Constraints for Semi-supervised Learning, Neural Processing Letters, 48:3, (1721-1747), Online publication date: 1-Dec-2018.
  10. Wei D, Natesan Ramamurthy K and Varshney K (2018). Distribution‐preserving k‐anonymity, Statistical Analysis and Data Mining, 11:6, (253-270), Online publication date: 16-Nov-2018.
  11. Aydin O, Janikas M, Assunção R and Lee T SKATER-CON Proceedings of the 2nd ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery, (33-42)
  12. Delias P and Lakiotaki K (2018). Discovering Process Horizontal Boundaries to Facilitate Process Comprehension, International Journal of Operations Research and Information Systems, 9:2, (1-31), Online publication date: 1-Apr-2018.
  13. Vu V and Do H Graph-based Clustering with Background Knowledge Proceedings of the 8th International Symposium on Information and Communication Technology, (167-172)
  14. Śmieja M and Geiger B (2017). Semi-supervised cross-entropy clustering with information bottleneck constraint, Information Sciences: an International Journal, 421:C, (254-271), Online publication date: 1-Dec-2017.
  15. Śmieja M, Struski Ł and Tabor J (2017). Semi-supervised model-based clustering with controlled clusters leakage, Expert Systems with Applications: An International Journal, 85:C, (146-157), Online publication date: 1-Nov-2017.
  16. Li T, De la Prieta Pintado F, Corchado J and Bajo J (2017). Multi-source homogeneous data clustering for multi-target detection from cluttered background with misdetection, Applied Soft Computing, 60:C, (436-446), Online publication date: 1-Nov-2017.
  17. Karpatne A, Atluri G, Faghmous J, Steinbach M, Banerjee A, Ganguly A, Shekhar S, Samatova N and Kumar V (2017). Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data, IEEE Transactions on Knowledge and Data Engineering, 29:10, (2318-2331), Online publication date: 1-Oct-2017.
  18. Chang Y, Chen J, Cho M, Castaldi P, Silverman E and Dy J Multiple clustering views from multiple uncertain experts Proceedings of the 34th International Conference on Machine Learning - Volume 70, (674-683)
  19. Greco G and Guzzo A (2017). Constrained coalition formation on valuation structures, Artificial Intelligence, 249:C, (19-46), Online publication date: 1-Aug-2017.
  20. Diez-Olivan A, Pagan J, Sanz R and Sierra B (2017). Data-driven prognostics using a combination of constrained K-means clustering, fuzzy modeling and LOF-based score, Neurocomputing, 241:C, (97-107), Online publication date: 7-Jun-2017.
  21. Kuo C, Ravi S, Dao T, Vrain C and Davidson I A framework for minimal clustering modification via constraint programming Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, (1389-1395)
  22. Shin S and Moon I (2017). Guided HTM, IEEE Transactions on Knowledge and Data Engineering, 29:2, (330-343), Online publication date: 1-Feb-2017.
  23. Gollub T, Busse M, Stein B and Hagen M Keyqueries for Clustering and Labeling Information Retrieval Technology, (42-55)
  24. Liu W, Luo X, Gong Z, Xuan J, Kou N and Xu Z (2016). Discovering the core semantics of event from social media, Future Generation Computer Systems, 64:C, (175-185), Online publication date: 1-Nov-2016.
  25. Trabelsi A and Zaïane O (2016). Mining contentious documents, Knowledge and Information Systems, 48:3, (537-560), Online publication date: 1-Sep-2016.
  26. Dao T, Vrain C, Duong K and Davidson I A framework for actionable clustering using constraint programming Proceedings of the Twenty-second European Conference on Artificial Intelligence, (453-461)
  27. Borgwardt S and Onn S (2016). Efficient solutions for weight-balanced partitioning problems, Discrete Optimization, 21:C, (71-84), Online publication date: 1-Aug-2016.
  28. Zhang X, Gao L and Yu H Constraint Based Subspace Clustering for High Dimensional Uncertain Data Proceedings, Part II, of the 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining - Volume 9652, (271-282)
  29. Bakharia A, Bruza P, Watters J, Narayan B and Sitbon L Interactive Topic Modeling for aiding Qualitative Content Analysis Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval, (213-222)
  30. Trabelsi A and Zaïane O (2015). Extraction and clustering of arguing expressions in contentious text, Data & Knowledge Engineering, 100:PB, (226-239), Online publication date: 1-Nov-2015.
  31. Aidos H, Lourenço A, Batista D, Bulò S and Fred A Semi-supervised consensus clustering for ECG pathology classification Proceedings of the 2015th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part III, (150-164)
  32. Campello R, Moulavi D, Zimek A and Sander J (2015). Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection, ACM Transactions on Knowledge Discovery from Data, 10:1, (1-51), Online publication date: 27-Jul-2015.
  33. Wang C, Song Y, Roth D, Wang C, Han J, Ji H and Zhang M Constrained information-theoretic tripartite graph clustering to identify semantically similar relations Proceedings of the 24th International Conference on Artificial Intelligence, (3882-3889)
  34. Ashtiani H and Ben-David S Representation learning for clustering Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, (82-91)
  35. Silva W, Barioni M, de Amo S and Razente H Semi-supervised clustering using multi-assistant-prototypes to represent each cluster Proceedings of the 30th Annual ACM Symposium on Applied Computing, (831-836)
  36. Easterling D, Watson L and Ramakrishnan N An improved probability-one homotopy map for tracking constrained clustering solutions Proceedings of the Symposium on High Performance Computing, (233-240)
  37. Gress A and Davidson I A Flexible Framework for Projecting Heterogeneous Data Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, (1169-1178)
  38. Günnemann S, Färber I, Rüdiger M and Seidl T SMVC Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, (253-262)
  39. Yoshida T (2014). A GRAPH-BASED APPROACH FOR SEMISUPERVISED CLUSTERING, Computational Intelligence, 30:2, (263-284), Online publication date: 1-May-2014.
  40. Abin A and Beigy H (2014). Active selection of clustering constraints, Pattern Recognition, 47:3, (1443-1458), Online publication date: 1-Mar-2014.
  41. Rajasekaran S and Saha S A Novel Deterministic Sampling Technique to Speedup Clustering Algorithms Part II of the Proceedings of the 9th International Conference on Advanced Data Mining and Applications - Volume 8347, (34-46)
  42. Gilpin S, Qian B and Davidson I Efficient hierarchical clustering of large high dimensional datasets Proceedings of the 22nd ACM international conference on Information & Knowledge Management, (1371-1380)
  43. Bair E (2013). Semi-supervised clustering methods, WIREs Computational Statistics, 5:5, (349-361), Online publication date: 1-Sep-2013.
  44. Gilpin S, Eliassi-Rad T and Davidson I Guided learning for role discovery (GLRD) Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, (113-121)
  45. Zeng H, Song A and Cheung Y (2013). Improving clustering with pairwise constraints, Knowledge and Information Systems, 36:2, (489-515), Online publication date: 1-Aug-2013.
  46. Covões T, Hruschka E and Ghosh J (2013). A study of K-Means-based algorithms for constrained clustering, Intelligent Data Analysis, 17:3, (485-505), Online publication date: 1-May-2013.
  47. Easterling D, Hossain M, Watson L and Ramakrishnan N Probability-one homotopy maps for tracking constrained clustering solutions Proceedings of the High Performance Computing Symposium, (1-8)
  48. Yoshida T Influence of erroneous pairwise constraints in semi-supervised clustering Proceedings of the 8th international conference on Active Media Technology, (43-52)
  49. Li F, He T, Tu X and Hu X Incorporating word correlation into tag-topic model for semantic knowledge acquisition Proceedings of the 21st ACM international conference on Information and knowledge management, (1622-1626)
  50. Wang X, Qian B and Davidson I Improving document clustering using automated machine translation Proceedings of the 21st ACM international conference on Information and knowledge management, (645-653)
  51. Métivier J, Boizumault P, Crémilleux B, Khiari M and Loudni S Constrained clustering using SAT Proceedings of the 11th international conference on Advances in Intelligent Data Analysis, (207-218)
  52. Hoffman J, Kulis B, Darrell T and Saenko K Discovering Latent Domains for Multisource Domain Adaptation Proceedings, Part II, of the 12th European Conference on Computer Vision --- ECCV 2012 - Volume 7573, (702-715)
  53. Chhabra S and Resnick P CubeThat Proceedings of the sixth ACM conference on Recommender systems, (295-296)
  54. Silva A and Antunes C Semi-supervised clustering Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition, (252-263)
  55. Ebrahimi J and Saniee Abadeh M Semi supervised clustering Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition, (237-251)
  56. Ares M, Parapar J and Barreiro Á (2012). An experimental study of constrained clustering effectiveness in presence of erroneous constraints, Information Processing and Management: an International Journal, 48:3, (537-551), Online publication date: 1-May-2012.
  57. Jagarlamudi J, Daumé H and Udupa R Incorporating lexical priors into topic models Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, (204-213)
  58. Vu V, Labroche N and Bouchon-Meunier B (2012). Improving constrained clustering with active query selection, Pattern Recognition, 45:4, (1749-1758), Online publication date: 1-Apr-2012.
  59. Parapar J and Barreiro Á Language modelling of constraints for text clustering Proceedings of the 34th European conference on Advances in Information Retrieval, (352-363)
  60. Métivier J, Boizumault P, Crémilleux B, Khiari M and Loudni S A constraint language for declarative pattern discovery Proceedings of the 27th Annual ACM Symposium on Applied Computing, (119-125)
  61. Bravo C and Weber R Semi-supervised constrained clustering with cluster outlier filtering Proceedings of the 16th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, (347-354)
  62. Miyamoto S Two classes of algorithms for data clustering Proceedings of the 2011 international conference on Integrated uncertainty in knowledge modelling and decision making, (19-30)
  63. Masiero A, Leite M, Filgueiras L and Aquino P Multidirectional knowledge extraction process for creating behavioral personas Proceedings of the 10th Brazilian Symposium on Human Factors in Computing Systems and the 5th Latin American Conference on Human-Computer Interaction, (91-99)
  64. Shakarian P, Subrahmanian V and Sapino M (2011). GAPs, ACM Transactions on Intelligent Systems and Technology, 3:1, (1-27), Online publication date: 1-Oct-2011.
  65. Anand R and Reddy C Constrained logistic regression for discriminative pattern mining Proceedings of the 2011th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, (92-107)
  66. Benabdeslem K and Hindawi M Constrained laplacian score for semi-supervised feature selection Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I, (204-218)
  67. Anand R and Reddy C Constrained logistic regression for discriminative pattern mining Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I, (92-107)
  68. Allab K and Benabdeslem K Constraint selection for semi-supervised topological clustering Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I, (28-43)
  69. Benabdeslem K and Hindawi M Constrained Laplacian score for semi-supervised feature selection Proceedings of the 2011th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, (204-218)
  70. Allab K and Benabdeslem K Constraint selection for semi-supervised topological clustering Proceedings of the 2011th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, (28-43)
  71. Gilpin S and Davidson I Incorporating SAT solvers into hierarchical clustering algorithms Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, (1136-1144)
  72. Liu E, Zhang Z and Wang W Clustering with relative constraints Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, (947-955)
  73. Anand R and Reddy C Graph-based clustering with constraints Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II, (51-62)
  74. Zhai Z, Liu B, Xu H and Jia P Constrained LDA for grouping product features in opinion mining Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I, (448-459)
  75. Martínez-Torres M, Toral S and Barrero F (2011). Identification of the design variables of eLearning tools, Interacting with Computers, 23:3, (279-288), Online publication date: 1-May-2011.
  76. Mueller M and Kramer S Integer linear programming models for constrained clustering Proceedings of the 13th international conference on Discovery science, (159-173)
  77. Dubey A, Bhattacharya I and Godbole S A cluster-level semi-supervision model for interactive clustering Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I, (409-424)
  78. Dubey A, Bhattacharya I and Godbole S A cluster-level semi-supervision model for interactive clustering Proceedings of the 2010th European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I, (409-424)
  79. Ares M, Parapar J and Barreiro Á Improving alternative text clustering quality in the avoiding bias task with spectral and flat partition algorithms Proceedings of the 21st international conference on Database and expert systems applications: Part II, (407-421)
  80. Cai J and Strube M End-to-end coreference resolution via hypergraph partitioning Proceedings of the 23rd International Conference on Computational Linguistics, (143-151)
  81. Jiang X and Abdala D Exploring the performance limit of cluster ensemble techniques Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition, (405-414)
  82. Vu V, Labroche N and Bouchon-Meunier B Boosting Clustering by Active Constraint Selection Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence, (297-302)
  83. Preston D, Brodley C, Khardon R, Sulla-Menashe D and Friedl M Redefining class definitions using constraint-based clustering Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, (823-832)
  84. Wang X and Davidson I Flexible constrained spectral clustering Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, (563-572)
  85. Ye Y, Li T, Chen Y and Jiang Q Automatic malware categorization using cluster ensemble Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, (95-104)
  86. Hartung S and Niedermeier R Incremental list coloring of graphs, parameterized by conservation Proceedings of the 7th annual conference on Theory and Applications of Models of Computation, (258-270)
  87. Zhang C, Cai Q and Song Y (2010). Boosting with pairwise constraints, Neurocomputing, 73:4-6, (908-919), Online publication date: 1-Jan-2010.
  88. Nielsen F (2009). Technical opinion: Steering self-learning distance algorithms, Communications of the ACM, 52:11, (150-152), Online publication date: 1-Nov-2009.
  89. Ares M, Parapar J and Barreiro Á Avoiding Bias in Text Clustering Using Constrained K-means and May-Not-Links Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory, (322-329)
  90. Wang F, Wang X and Li T Generalized cluster aggregation Proceedings of the 21st International Joint Conference on Artificial Intelligence, (1279-1284)
  91. Andrzejewski D, Zhu X and Craven M Incorporating domain knowledge into topic modeling via Dirichlet Forest priors Proceedings of the 26th Annual International Conference on Machine Learning, (25-32)
  92. Andrzejewski D and Zhu X Latent Dirichlet Allocation with topic-in-set knowledge Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing, (43-48)
  93. Davidson I and Ravi S (2009). Using instance-level constraints in agglomerative hierarchical clustering, Data Mining and Knowledge Discovery, 18:2, (257-282), Online publication date: 1-Apr-2009.
  94. Bonchi F, Castillo C, Donato D and Gionis A Topical query decomposition Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, (52-60)
Contributors
  • Google LLC
  • University of California, Davis
  • Jet Propulsion Laboratory
