- {BF98} Bradley, P., and Fayyad, U. M., "Refining Initial Points for KM Clustering," Microsoft Technical Report 98-36, May 1998.
- {BFR98} Bradley, P., Fayyad, U. M., and Reina, C. A., "Scaling EM Clustering to Large Databases," Microsoft Technical Report, 1998.
- {BFR98a} Bradley, P., Fayyad, U. M., and Reina, C. A., "Scaling Clustering to Large Databases," KDD98, 1998.
- {DLR77} Dempster, A. P., Laird, N. M., and Rubin, D. B., "Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the Royal Statistical Society, Series B, 39(1):1-38, 1977.
- {DM99} Dhillon, I. S. and Modha, D. S., "A data clustering algorithm on distributed memory machines," ACM SIGKDD Workshop on Large-Scale Parallel KDD Systems (with KDD99), August 1999.
- {GG92} Gersho & Gray, "Vector Quantization and Signal Compression," KAP, 1992.
- {JD77} Anil K. Jain, Richard C. Dubes, "Algorithms for Clustering Data (Prentice Hall Advanced Reference Series : Computer Science)," Prentice Hall, 1977.
- {KC99} Kantabutra, S. and Couch, A. L., "Parallel K-Means Clustering Algorithm on NOWs," NECTEC Technical Journal, Vol. 1, No. 1, March 1999.
- {KR90} Kaufman, L. and Rousseeuw, P. J., "Finding Groups in Data : An Introduction to Cluster Analysis," John Wiley & Sons, 1990.
- {M67} MacQueen, J., "Some methods for classification and analysis of multivariate observations," pp. 281-297 in: L. M. Le Cam & J. Neyman (eds.) Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Vol. 1. University of California Press, Berkeley. xvii + 666 p. 1967.
- {MK97} McLachlan, G. J. and Krishnan, T., "The EM Algorithm and Extensions," John Wiley & Sons, Inc., 1997.
- {NetPerception} A commercial recommender system, http://www.netperceptions.com
- {RF97} Ruocco A. and Frieder O., "Clustering and Classification of Large Document Bases in a Parallel Environment," Journal of the American Society of Information Science, 48(10), October 1997.
- {S99} Snyder, L., "A Programmer's Guide to ZPL," Scientific and Engineering Computation Series, MIT Press; ISBN: 0262692171, 1999. See also: http://www.cs.washington.edu/research/zpl
- {ZHD00a} Zhang, B., Hsu, M. and Dayal, U. (2000). "K-Harmonic Means: A Spatial Clustering Algorithm with Boosting." In Proc. International Workshop on Temporal, Spatial and Spatio-Temporal Data Mining, TSDM2000, Lyon, France, Lecture Notes in Artificial Intelligence, 2007. Roddick, J. F. and Hornsby, K., Eds., Springer.
- {Z00b} Zhang, B., "Generalized K-Harmonic Means - Boosting in Unsupervised Learning," Hewlett-Packard Laboratories Technical Report: http://www.hpl.hp.com/techreports/2000/HPL-2000-137.html.
- {ZHF00} Zhang, B., Hsu, M., and Forman, G., "Accurate Recasting of Parameter Estimation Algorithms using Sufficient Statistics for Efficient Parallel Speed-up: Demonstrated for Center-Based Data Clustering Algorithms," 4th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), September 13-16, 2000. Also available as Hewlett-Packard Labs Technical Report HPL-2000-6.
- {ZRL96} Zhang, T., Ramakrishnan, R., and Livny, M., "BIRCH: an efficient data clustering method for very large databases," ACM SIGMOD Record, Vol. 25, No. 2, pages 103-114, June 1996.