survey

A Comprehensive Survey on Cloud Data Mining (CDM) Frameworks and Algorithms

Published: 13 September 2019

Abstract

Data mining extracts meaningful information from large volumes of data. With the advent of the Big Data concept, data mining has gained considerable prominence. Discovering knowledge efficiently from a very large volume of data is a major concern because resources are limited, and cloud computing plays a major role in addressing it. Cloud data mining combines classical data mining with the promises of cloud computing, which allows it to perform knowledge discovery over huge volumes of data efficiently. This article presents the existing frameworks, services, platforms, and algorithms for cloud data mining. The frameworks and platforms are compared with one another on the basis of similarity, data mining task support, parallelism, distribution, streaming data processing support, fault tolerance, security, memory types, storage systems, and other criteria. Similarly, the algorithms are grouped by parallelism type, scalability, streaming data mining support, and the types of data managed. We also provide taxonomies based on data mining techniques such as clustering, classification, and association rule mining, and we identify taxonomies for cloud data mining frameworks, platforms, and algorithms. In addition, we discuss and identify the major applications of cloud data mining. This article aims to provide better insight into the current state of research and to direct future research toward efficient cloud data mining in future cloud systems.

References

  1. Ronald C. Taylor. 2010. An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics. BMC Bioinform. 11, 12 (2010), S1.Google ScholarGoogle ScholarCross RefCross Ref
  2. Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: Simplified data processing on large clusters. Commun. ACM 51, 1 (2008), 107--113. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. X. Geng and Z. Yang. 2013. Data mining in cloud computing. In Proceedings of the International Conference on Information Science and Computer Applications (ISCA’13). 1--7.Google ScholarGoogle Scholar
  4. M. Zaharia, A. Konwinski, A. D. Joseph, R. H. Katz, and I. Stoica. 2008. Improving MapReduce performance in heterogeneous environments. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation. 7. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. A. X. Tan, V. L. Liu, M. Kantarcioglu, and B. Thuraisingham. 2010. A comparison of approaches for large-scale data mining. Technical Report UTDCS-24-10.Google ScholarGoogle Scholar
  6. Yunhong Gu and Robert L. Grossman. 2009. Sector and sphere: The design and implementation of a high-performance data cloud. Philos. Trans. Roy. Soc. London A: Math. Phys. Eng. Sci. 367.1897 (2009), 2429--2445.Google ScholarGoogle ScholarCross RefCross Ref
  7. Uzma Ali and Punam Khandar. 2013. Data mining for data cloud and compute cloud. International Journal of Innovative Research in Computer and Communication Engineering 1, 5 (July 2013), 1137--1141.Google ScholarGoogle Scholar
  8. Yunhong Gu, Li Lu, Robert Grossman, and Andy Yoo. 2010. Processing massive sized graphs using Sector/Sphere. In Proceedings of the IEEE Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS’10). IEEE, 1--10.Google ScholarGoogle ScholarCross RefCross Ref
  9. Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and S. I. Spark. 2010. Cluster computing with working sets. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing. USENIX Association Berkeley, CA, 10--10. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Fabrizio Marozzo, Domenico Talia, and Paolo Trunfio. 2011. A cloud framework for parameter sweeping data mining applications. In Proceedings of the IEEE 3rd International Conference on Cloud Computing Technology and Science (CloudCom’11). IEEE, 367--374. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. 2012. Distributed GraphLab: A framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5, 8 (2012), 716--27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Aapo Kyrola, Guy E. Blelloch, and Carlos Guestrin. 2012. GraphChi: Large-scale graph computation on just a PC. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Amy Xuyang Tan, Valerie Li Liu, Murat Kantarcioglu, and Bhavani Thuraisingham. 2010. A comparison of approaches for large-scale data mining. Technical Report UTDCS-24-10.Google ScholarGoogle Scholar
  14. A. Mahendiran, N. Saravanan, N. Venkata Subramanian, and N. Sairam. 2012. Implementation of K-means clustering in cloud computing environment. Res. J. Appl. Sci. Eng. Technol. 4, 10 (2012), 1391--1394.Google ScholarGoogle Scholar
  15. K. Srivastava, R. Shah, D. Valia, and H. Swaminarayan. 2013. Data mining using hierarchical agglomerative clustering algorithm in distributed cloud computing environment. Int. J. Comput. Theory Eng. 5, 3 (2013), 520.Google ScholarGoogle ScholarCross RefCross Ref
  16. Tugdual Sarazin, Mustapha Lebbah, and Hanane Azzag. 2014. Biclustering using Spark-MapReduce. In Proceedings of the IEEE International Conference on Big Data (BigData’14). IEEE, 58--60.Google ScholarGoogle ScholarCross RefCross Ref
  17. Wei Liu and Ling Chen. 2008. A parallel algorithm for gene expressing data biclustering. J. Comput. Phys. 3, 10 (2008), 71--77.Google ScholarGoogle Scholar
  18. Spiros Papadimitriou and Jimeng Sun. 2008. Disco: Distributed co-clustering with MapReduce: A case study towards petabyte-scale end-to-end mining. In Proceedings of the 8th IEEE International Conference on Data Mining (ICDM’08). IEEE, 512--521. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Esha Sarkar and C. H. Sekhar. 2014. Organizing data in cloud using clustering approach. Int. J. Sci. Eng. Res. 5, 5 (2014).Google ScholarGoogle Scholar
  20. Madhuri H. Parekh. {n.d.}. Enhancement clustering of cloud datasets using improved agglomerative technique. Int. J. Adv. Netw. Appl. 128--131.Google ScholarGoogle Scholar
  21. Renu Ansari. 2015. A distributed k-mean clustering algorithm for cloud data mining. Int. J. Eng. Trends Technol. 30, 7 (2015).Google ScholarGoogle Scholar
  22. Xianfeng Yang and Pengfei Liu. 2013. A new algorithm of the data mining model in cloud computing based on web fuzzy clustering analysis. J. Theor. Appl. Info. Technol. 49, 1 (2013).Google ScholarGoogle Scholar
  23. S. Guha, R. Rastogi, and K. Shim. 1998. June. CURE: An efficient clustering algorithm for large databases. In ACM SIGMOD Record, Vol. 27, No. 2. ACM, 73--84. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Madhuri H. Parekh and Ishan K. Rajani. 2015. Improve performance of clustering on cloud datasets using improved agglomerative CURE hierarchical algorithm. Int. J. Sci. Eng. Technol. Res. 4, 6 (2015).Google ScholarGoogle Scholar
  25. Kun Qin, Min Xu, Yi Du, and Shuying Yue. 2008. Cloud model and hierarchical clustering-based spatial data mining method and application. Int. Arch. Photogram. Remote Sens. Spatial Info. Sci. 37, B2 (2008), 241--245.Google ScholarGoogle Scholar
  26. Ran Jin, Chunhai Kou, Ruijuan Liu, and Yefeng Li. 2013. Efficient parallel spectral clustering algorithm design for large data sets under cloud computing environment. J. Cloud Comput.: Adv. Syst. Appl. 2, 1 (2013), 18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Nivranshu Hans, Sana Mahajan, and S. Omkar. 2015. Big data clustering using genetic algorithm on Hadoop MapReduce. Int. J. Sci. Technol. Res. 4 (2015).Google ScholarGoogle Scholar
  28. M. Shindler, A. Wong, and A. W. Meyerson. 2011. Fast and accurate k-means for large datasets. In Advances in Neural Information Processing Systems. MIT Press, 2375--2383. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Bhupendra Panchal and R. K. Kapoor. 2013. Performance enhancement of cloud computing with clustering. Int. J. Eng. Adv. Technol. 2, 5 (2013).Google ScholarGoogle Scholar
  30. Pooja Bisht and Kulvinder Singh. 2016. Big data mining: Analysis of genetic K- means algorithm for big data clustering. Int. J. Adv. Res. Comput. Sci. Software Eng. 6, 7 (2016).Google ScholarGoogle Scholar
  31. Alessandro Lulli, Matteo Dell’Amico, Pietro Michiardi, and Laura Ricci. 2016. NG-DBSCAN: Scalable density-based clustering for arbitrary data. Proc. VLDB Endow. 10, 3 (2016), 157--168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Yaobin He, Haoyu Tan, Wuman Luo, Huajian Mao, Di Ma, Shengzhong Feng, and Jianping Fan. 2011. Mr-dbscan: An efficient parallel density-based clustering algorithm using MapReduce. In Proceedings of the IEEE 17th International Conference on Parallel and Distributed Systems (ICPADS’11). IEEE, 473--480. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Dianwei Han, Ankit Agrawal, Wei-Keng Liao, and Alok Choudhary. 2016. A novel scalable DBSCAN algorithm with Spark. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops. IEEE, 1393--1402.Google ScholarGoogle ScholarCross RefCross Ref
  34. F. Ozgur Catak and M. Erdal Balaban. 2012. CloudSVM: Training an SVM classifier in cloud computing systems. In Proceedings of the Joint International Conference on Pervasive Computing and the Networked World. Springer, Berlin, 57--68. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Lijuan Zhang and Shuguang Zhao. 2013. The strategy of classification mining based on cloud computing. In Proceedings of the International Workshop on Cloud Computing and Information Security (CCIS’13).Google ScholarGoogle Scholar
  36. Lijuan Zhou, Hui Wang, and Wenbo Wang. 2012. Parallel implementation of classification algorithms based on cloud computing environment. TELKOMNIKA Indones. J. Electr. Eng. 10, 5 (2012), 1087--1092.Google ScholarGoogle Scholar
  37. Jing Ding and Shanlin Yang. 2012. Classification rules mining model with genetic algorithm in cloud computing. Int. J. Comput. Appl. 48, 18 (2012), 24--32.Google ScholarGoogle Scholar
  38. Jian Wang. 2012. A novel K-NN classification algorithm for privacy preserving in cloud computing. Res. J. Appl. Sci. Eng. Technol. 22, 4 (2012), 4865--4870.Google ScholarGoogle Scholar
  39. Pooja Bajare, Monika Bhoyate, Yogita Bhujbal, Erandole Monika, and Vaishali Shinde. {n.d.}. k-nearest neighbor classification over encrypted cloud data. IOSR Journal of Computer Engineering (IOSR-JCE). 45--48.Google ScholarGoogle Scholar
  40. Apexa B. Kamdar and Jay M. Jagani. 2014. A survey: Classification of huge cloud datasets with efficient map-reduce policy. International Journal of Engineering Trends and Technology (IJETT) 18, 2 (2014), 103--107.Google ScholarGoogle ScholarCross RefCross Ref
  41. Kun Liu and Jan Boehm. 2015. Classification of big point cloud data using cloud computing. Int. Arch. Photogram. Remote Sens. Spatial Info. Sci. 40, 3 (2015), 553.Google ScholarGoogle ScholarCross RefCross Ref
  42. Zhang Danping, Yu Haoran, and Zheng Linyu. 2014. Apriori algorithm research based on MapReduce in cloud computing environments. Open Autom. Control Syst. J. 6 (2014), 368--373.Google ScholarGoogle ScholarCross RefCross Ref
  43. Juan Li, Pallavi Roy, Samee U. Khan, Lizhe Wang, and Yan Bai. 2012. Data mining using clouds: An experimental implementation of Apriori over MapReduce. In Proceedings of the 12th International Conference on Scalable Computing and Communications (ScalCom’13). 1--8.Google ScholarGoogle Scholar
  44. Kuldeep Mishra, Ravi Rai Chaudhary, and Dheresh Soni. 2013. A premeditated CDM algorithm in cloud computing environment for FPM. Int. J. Comput. Eng. Technol. 4, 4 (2013), 213--223.Google ScholarGoogle Scholar
  45. Dheresh Soni, Atish Mishra, and Hitesh Gupta. 2016. An efficient cloud data mining (CDM) algorithm for frequent pattern mining in cloud computing environment. Lecture Notes Software Eng. 4, 3 (2016).Google ScholarGoogle Scholar
  46. Dheresh Soni, Atish Mishra, Satyendra Singh Thakur, and Nishant Chaurasia. 2011. Applying frequent pattern mining in cloud computing environment. Int. J. Adv. Comput. Res. 1 (2011), 84--87.Google ScholarGoogle Scholar
  47. N. Khurana and R. K. Datta. 2013. Pruning large data sets for finding association rule in cloud: CBPA (Count-based Pruning Algorithm). Int. J. Softw. Web Sci. (2013), 118--122.Google ScholarGoogle Scholar
  48. Lijuan Zhou and Xiang Wang. 2014. Research of the FP-growth algorithm based on cloud environments. J. Software 9, 3 (2014), 676--683.Google ScholarGoogle ScholarCross RefCross Ref
  49. Lingjuan Li and Min Zhang. 2011. The strategy of mining association rule based on cloud computing. In Proceedings of the International Conference on Business Computing and Global Informatization (BCGIN’11). IEEE, 475--478. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Pooja Godse, Tejal Zete, Mohit Bhanushali, and Shubhangi Kale. 2019. The strategy of mining association rule based on cloud computing. Technical Report. Retrieved 2019 from http://kddlab.zjgsu.edu.cn:7200/research/DistributedMining.Google ScholarGoogle Scholar
  51. Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Silvia Chiusano, and Luigi Grimaudo. 2013. SeaRum: A cloud-based service for association rule mining. In Proceedings of the 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications. 1283--1290. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. K. Mangayarkkarasi and M. Chidambaram. 2017. An intelligent service recommendation model for service usage pattern discovery in secure cloud computing environment. J. Theor. Appl. Info. Technol. 95, 15 (2017).Google ScholarGoogle Scholar
  53. Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, Pietro Michiardi, and Fabio Pulvirenti. 2015. PaMPa-HD: A parallel MapReduce-based frequent Pattern miner for high-dimensional data. In Proceedings of the IEEE International Conference on Data Mining Workshop (ICDMW’15). IEEE, 839--846. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Arkan Al-Hamodi, Songfeng Lu, and Yahya Al-Salhi. 2016. An enhanced frequent pattern growth based on MapReduce for mining association rules. Int. J. Data Min. Knowl. Manage. Process 6, 2 (2016), 19--28.Google ScholarGoogle ScholarCross RefCross Ref
  55. Bo He. 2012. Fast mining algorithm of association rules base on cloud computing. In Proceedings of the 2nd International Conference on Electronic 8 Mechanical Engineering and Information Technology. Atlantis Press.Google ScholarGoogle ScholarCross RefCross Ref
  56. Wenzheng Zhu and Changhoon Lee. 2014. A new approach to web data mining based on cloud computing. J. Comput. Sci. Eng. 8, 4 (2014), 181--186.Google ScholarGoogle ScholarCross RefCross Ref
  57. R. Farivar et al. 2009. Mithra: Multiple data independent tasks on heterogeneous resource architecture. In Proceedings of the IEEE International Conference on Cluster Computing and Workshops. 1--10.Google ScholarGoogle ScholarCross RefCross Ref
  58. Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, and Bongki Moon. 2012. Parallel data processing with MapReduce: A survey. ACM SIGMOD Rec. 40, 4 (2012), 11--20. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. Indrajit Roy, Srinath T. V. Setty, Ann Kilzer, Vitaly Shmatikov, and Emmett Witchel. 2010. Airavat: Security and privacy for MapReduce. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation. 297--312. Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. C. Dwork. 2006. Differential privacy. In Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP’06).Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. C. Dwork. 2007. An ad omnia approach to defining and achieving pri-vate data analysis. In Proceedings of the ACM SIGKDD International Workshop on Privacy, Security, and Trust in Knowledge, Discovery, and Data Mining (PinKDD’07).Google ScholarGoogle Scholar
  62. C. Dwork. 2007. Ask a better question, get a better answer: A new approach to private data analysis. In Proceedings of the International Conference on Database Theory (ICDT’07). Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. C. Dwork. 2008. Differential privacy: A survey of results. In Proceedings of the International Conference on Theory and Applications of Models of Computation (TAMC’08). Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Hanna M. Said, Ibrahim El Emary, Bader A. Alyoubi, and Adel A. Alyoubi. {n.d.}. Application of intelligent data mining approach in securing the cloud computing. Int. J. Adv. Comput. Sci. Appl. 1, 7, 151--159.Google ScholarGoogle Scholar
  65. Eric A. Brewer. 2000. Towards robust distributed systems. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC’00), Vol. 7. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Werner Vogels. 2008. Eventually consistent. Queue 6, 6 (2008), 14--19. Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. Daniel Abadi. 2012. Consistency tradeoffs in modern distributed database system design: CAP is only part of the story. Computer 45, 2 (2012), 37--42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  68. Domenico Talia. 2013. Toward cloud-based big-data analytics. IEEE Comput. Sci. (2013), 98--101. Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. Robert Grossman and Yunhong Gu. 2008. Data mining using high performance data clouds: Experimental studies using sector and sphere. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 920--927. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. Robert L. Grossman, Yunhong Gu, Michael Sabala, and Wanzhi Zhang. 2009. Compute and storage clouds using wide area high performance networks. Future Gen. Comput. Syst. 25, 2 (2009), 179--183. Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski, and C. Kozyrakis. 2007. Evaluating MapReduce for multi-core and multiprocessor systems. In Proceedings of the IEEE 13th International Symposium on High Performance Computer Architecture. 13--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. Zhenhua Guo, Geoffrey Fox, and Mo Zhou. 2012. Investigation of data locality in MapReduce. In Proceedings of the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid’12). IEEE, 419--426. Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. Domenico Talia and Paolo Trunfio. 2010. How distributed data mining tasks can thrive as knowledge services. Commun. ACM 53, 7 (2010), 132--137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. Shivnath Babu. 2010. Towards automatic optimization of MapReduce programs. In Proceedings of the 1st ACM Symposium on Cloud Computing. ACM, 137--142. Google ScholarGoogle ScholarDigital LibraryDigital Library
  75. Eaman Jahani, Michael J. Cafarella, and Christopher R. 2011. Automatic optimization for MapReduce programs. Proc. VLDB Endow. 4, 6 (2011), 385--396. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. Praveen Kumar Lakkimsetti. 2011. A framework for automatic optimization of MapReduce programs based on job parameter configurations. PhD dissertation, Kansas State University (2011).Google ScholarGoogle Scholar
  77. Nezih Yigitbasi, Theodore L. Willke, Guangdeng Liao, and Dick Epema. 2013. Towards machine-learning-based auto-tuning of MapReduce. In Proceedings of the IEEE 21st International Symposium on Modeling, Analysis 8 Simulation of Computer and Telecommunication Systems (MASCOTS’13). IEEE, 11--20. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. Herodotos Herodotou, Harold Lim, Gang Luo, Nedyalko Borisov, Liang Dong, Fatma Bilgen Cetin, and Shivnath Babu. 2011. Starfish: A self-tuning system for big data analytics. In Proceedings of the Conference on Innovative Data Systems Research (CIDR’11) 11, 2011 (2011), 261--272.Google ScholarGoogle Scholar
  79. Vasiliki Kalavri and Vladimir Vlassov. 2013. MapReduce: Limitations, optimizations and open issues. In Proceedings of the12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom’13). IEEE, 1031--1038. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. Robert Grossman and Yunhong Gu. 2008. Data mining using high performance data clouds: experimental studies using sector and sphere. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 920--927. Google ScholarGoogle ScholarDigital LibraryDigital Library
  81. F. Ferrucci, P. Salza, M. Kechadi, and F. Sarro. 2015. A parallel genetic algorithms framework based on Hadoop MapReduce. In Proceedings of the 30th Annual ACM Symposium on Applied Computing. ACM, 1664--1667. Google ScholarGoogle ScholarDigital LibraryDigital Library
  82. M. Ester, H. P. Kriegel, J. Sander, and X. Xu. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the International Conference on Knowledge Discovery in Databases and Data Mining (KDD’96). 226--231. Google ScholarGoogle ScholarDigital LibraryDigital Library
  83. A. Maithili, R. V. Kumari, and S. Rajamanickam. 2012. Neural networks cum cloud computing approach in diagnosis of cancer. Int. J. Eng. Res. Appl. 2, 2 (2012), 428--35.Google ScholarGoogle Scholar
  84. I. Kaur. 2019. Security of cloud from data mining-based attacks. Technical Report. Retrieved 2019 from https://studyres.com/doc/572585/security-of-cloud-from-data-mining-based-attacks-inderjit.Google ScholarGoogle Scholar
  85. S. Sharma. 2014. Improving cloud security using data mining. IOSR J. Comput. Eng. 1, 16 (2014), 66--69.Google ScholarGoogle ScholarCross RefCross Ref
  86. Sakshi Aggarwal and Ritu Sindhu. 2014. A survey on cloud mining with privacy protection. Int. J. Adv. Res. Comput. Sci. Software Eng. 4, 10 (2014).Google ScholarGoogle Scholar
  87. Chintada. Srinivasa Rao and Chinta. Chandra Sekhar. 2014. Dynamic massive data storage security challenges in cloud computing environments. Int. J. Innovat. Res. Comput. Commun. Eng. 2, 3 (Mar. 2014), ISSN(Online): 2320-9801.Google ScholarGoogle Scholar
  88. W. Lian, X. Zhu, J. Zhang, and S. Li. 2015. Cloud computing environments parallel data mining policy research. Int. J. Grid Distrib. Comput. 8, 4 (2015), 135--144.Google ScholarGoogle ScholarCross RefCross Ref
  89. Jiong Xie, Shu Yin, and Zhiyang Ding. 2010. Improving MapReduc performance through data placement in heterogeneous clusters. Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS’10).Google ScholarGoogle Scholar
  90. A. S. Saabith, E. Sundararajan, and A. A. Bakar. 2016. Parallel implementation of Apriori algorithms on the Hadoop-MapReduce platform—An evaluation of literature. J. Theor. Appl. Info. Technol. 85, 3 (2016), 321.Google ScholarGoogle Scholar
  91. A. A. Pandagale and A. R. Surve. 2016. Hadoop-HBase for finding association rules using Apriori MapReduce algorithm. In Proceedings of the IEEE International Conference on Recent Trends in Electronics, Information 8 Communication Technology (RTEICT’16). IEEE, 795--798.Google ScholarGoogle Scholar
  92. K. Chandy and L. Lamport. 1985. Distributed snapshots: Determining global states of distributed systems. ACM Trans. Comput. Syst. 3, 1 (1985), 63--75. Google ScholarGoogle ScholarDigital LibraryDigital Library
  93. K. Chandy and J. Misra. 1981. Asynchronous distributed simulation via a sequence of parallel computations. Commun. ACM 24, 2 (1981), 198--205. Google ScholarGoogle ScholarDigital LibraryDigital Library
  94. L. Ismail, M. M. Masud, and L. Khan. 2014. FSBD: A framework for scheduling of big data mining in cloud computing. In Proceedings of the IEEE International Congress on Big Data (BigData’14). IEEE, 514--521. Google ScholarGoogle ScholarDigital LibraryDigital Library
  95. U. Kang, C. E. Tsourakakis, and C. Faloutsos. 2009. Pegasus: A peta-scale graph mining system implementation and observations. In Proceedings of the 9th IEEE International Conference on Data Mining (ICDM’09). IEEE, 229--238. Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. 2010. Pregel: A system for large-scale graph processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 135--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. Apache giraph. 2019. Retrieved from http://giraph.apache.org.Google ScholarGoogle Scholar
  98. Giraph. 2019. Retrieved from jira. https://issues.apache.org/jira/browse/GIRAPH.Google ScholarGoogle Scholar
  99. Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, and Sambavi Muthukrishnan. 2015. One trillion edges: Graph processing at facebook-scale. Proc. VLDB Endow. 8, 12 (2015). Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. R. S. Xin, J. E. Gonzalez, M. J. Franklin, and I. Stoica. 2013. Graphx: A resilient distributed graph system on spark. In Proceedings of the 1st International Workshop on Graph Data Management Experiences and Systems. ACM, 2. Google ScholarGoogle ScholarDigital LibraryDigital Library
  101. J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. 2014. GraphX: Graph processing in a distributed dataflow framework. In Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI’14). Vol. 14, 599--613. Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. R. S. Xin, D. Crankshaw, A. Dave, J. E. Gonzalez, M. J. Franklin, and I. Stoica. 2014. Graphx: Unifying data-parallel and graph-parallel analytics. arXiv preprint arXiv:1402.2394.Google ScholarGoogle Scholar
  103. S. Mishra, Y. C. Lee, and A. Nayak. 2016. Distributed genetic algorithm on GraphX. In Proceedings of the Australasian Joint Conference on Artificial Intelligence. Springer, 548--554.Google ScholarGoogle Scholar
  104. E. Y. Chang, H. Bai, and K. Zhu. 2009. Parallel algorithms for mining large-scale rich-media data. In Proceedings of the 17th ACM International Conference on Multimedia. ACM, 917--918. Google ScholarGoogle ScholarDigital LibraryDigital Library
  105. L. Zhou, Z. Zhong, J. Chang, J. Li, J. Z. Huang, and S. Feng. 2010. Balanced parallel fp-growth with MapReduce. In Proceedings of the IEEE Youth Conference on Information Computing and Telecommunications (YC-ICT’10). IEEE, 243--246.Google ScholarGoogle Scholar
  106. W. Zhang, H. Liao, and N. Zhao. 2008. Research on the FP growth algorithm about association rule mining. In Proceedings of the International Seminar on Business and Information Management (ISBIM’08). IEEE (Vol. 1, pp. 315--318). Google ScholarGoogle ScholarDigital LibraryDigital Library
  107. I. Pramudiono and M. Kitsuregawa. 2003. Parallel FP-growth on PC cluster. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Berlin, 467--473. Google ScholarGoogle ScholarDigital LibraryDigital Library
  108. R. Mishra and A. Choubey. 2012. Discovery of frequent patterns from web log data by using FP-growth algorithm for web usage mining. Int. J. Adv. Res. Comput. Sci. Software Eng. 2, 9 (2012).Google ScholarGoogle Scholar
  109. B. S. Kumar and K. V. Rukmani. 2010. Implementation of web usage mining using Apriori and FP growth algorithms. Int. J. Adv. Netw. Appl. 1, 06 (2010), 400--404.Google ScholarGoogle Scholar
  110. Y. Qiu, Y. J. Lan, and Q. S. Xie. 2004. An improved algorithm of mining from FP-tree. In Proceedings of the International Conference on Machine Learning and Cybernetics. IEEE, Vol. 3, 1665--1670.Google ScholarGoogle Scholar
  111. J. Han, J. Pei, and Y. Yin. 2000. Mining frequent patterns without candidate generation. In ACM SIGMOD Record. ACM, Vol. 29, No. 2, 1--12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  112. M. N. Vora. 2011. Hadoop-HBase for large-scale data. In Proceedings of the International Conference on Computer Science and Network Technology (ICCSNT’11). IEEE, (Vol. 1, pp. 601--605).Google ScholarGoogle Scholar
  113. D. Carstoiu, E. Lepadatu, and M. Gaspar. 2010. Hbase-non SQL database, performances evaluation. International Journal of Advancements in Computing Technology 2, 5 (Dec. 2010).Google ScholarGoogle Scholar
  114. S. Nishimura, S. Das, D. Agrawal, and A. El Abbadi. 2011. MD-HBase: A scalable multi-dimensional data infrastructure for location aware services. In Proceedings of the 12th IEEE International Conference on Mobile Data Management (MDM’11). IEEE, Vol. 1, 7--16. Google ScholarGoogle ScholarDigital LibraryDigital Library
  115. T. Harter, D. Borthakur, S. Dong, A. S. Aiyer, L. Tang, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. 2014. Analysis of HDFS under HBase: A Facebook messages case study. In Proceedings of the USENIX Conference on File and Storage Technologies (FAST’14), Vol. 14, 12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. W. Zhao, H. Ma, and Q. He. 2009. Parallel k-means clustering based on MapReduce. In Proceedings of the IEEE International Conference on Cloud Computing. Springer, Berlin, 674--679. Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. R. M. Esteves, R. Pais, and C. Rong. 2011. K-means clustering in the cloud—A Mahout test. In Proceedings of the IEEE Workshops of International Conference on Advanced Information Networking and Applications (WAINA’11). IEEE, 514--519. Google ScholarGoogle ScholarDigital LibraryDigital Library
  118. X. Cui, P. Zhu, X. Yang, K. Li, and C. Ji. 2014. Optimized big data K-means clustering using MapReduce. J. Supercomput. 70, 3 (2014), 1249--1259. Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. S. Liu and Y. Cheng. 2012. Research on k-means algorithm based on cloud computing. In Proceedings of the International Conference on Computer Science 8 Service System (CSSS’12). IEEE, 1762--1765. Google ScholarGoogle ScholarDigital LibraryDigital Library
  120. T. Sajana, C. S. Rani, and K. V. Narayana. 2016. A survey on clustering techniques for big data mining. Indian J. Sci. Technol. 9, 3 (2016).Google ScholarGoogle ScholarCross RefCross Ref
  121. M. M. Najafabadi, F. Villanustre, T. M. Khoshgoftaar, N. Seliya, R. Wald, and E. Muharemagic. 2015. Deep learning applications and challenges in big data analytics. J. Big Data 2, 1 (2015), 1.
  122. D. Agrawal, S. Das, and A. El Abbadi. 2011. Big data and cloud computing: Current state and future opportunities. In Proceedings of the 14th International Conference on Extending Database Technology. ACM, 530--533.
  123. X. Wu, X. Zhu, G. Q. Wu, and W. Ding. 2014. Data mining with big data. IEEE Trans. Knowl. Data Eng. 26, 1 (2014), 97--107.
  124. Y. Simmhan, S. Aman, A. Kumbhare, R. Liu, S. Stevens, Q. Zhou, and V. Prasanna. 2013. Cloud-based software platform for big data analytics in smart grids. Comput. Sci. Eng. 15, 4 (2013), 38--47.
  125. L. Wei, H. Zhu, Z. Cao, X. Dong, W. Jia, Y. Chen, and A. V. Vasilakos. 2014. Security and privacy for storage and computation in cloud computing. Info. Sci. 258 (2014), 371--386.
  126. B. McCarty. 2004. SELinux: NSA’s open source security enhanced Linux. O’Reilly Media.
  127. J. Da Silva, C. Giannella, R. Bhargava, H. Kargupta, and M. Klusch. 2005. Distributed data mining and agents. Int. J. Eng. App. Artific. Intell. 18, 4 (2005), 791--807. Elsevier Science.
  128. H. Kargupta, W. Huang, K. Sivakumar, and E. Johnson. 2001. Distributed clustering using collective principal component analysis. Knowl. Info. Syst. J. 3, 4 (2001), 422--448.
  129. L. Ismail and L. Khan. 2014. Implementation and Performance Evaluation of a Scheduling Algorithm for Divisible Load Parallel Applications in a Cloud Computing Environment. Software: Practice and Experience. Wiley.
  130. M. Shee, S. Bhavsar, and M. Parashar. 1999. Characterizing the performance of dynamic distribution and load-balancing techniques for adaptive grid hierarchies. In Proceedings of the IASTED International Conference of Parallel and Distributed Computing and Systems, Vol. 4.
  131. Apache Mahout. 2019. Retrieved from http://mahout.apache.org.
  132. S. Schelter and S. Owen. 2012. Collaborative filtering with apache mahout. In Proceedings of the ACM RecSys Challenge.
  133. R. Nair. 2015. Big data needs approximate computing: Technical perspective. Commun. ACM 58, 1 (2015), 104--104.
  134. S. Mitra, S. K. Pal, and P. Mitra. 2002. Data mining in soft computing framework: A survey. IEEE Trans. Neural Netw. 13, 1 (2002), 3--14.
  135. Foto N. Afrati. 2006. On approximation algorithms for data mining applications. In Efficient Approximation and Online Algorithms. Springer, 1--29.
  136. InfoQ. 2019. Approximate Methods for Scalable Data Mining. Retrieved from https://www.infoq.com/presentations/scalability-data-mining.
  137. G. Kollios, D. Gunopulos, N. Koudas, and S. Berchtold. 2001. An efficient approximation scheme for data mining tasks. In Proceedings of the 17th International Conference on Data Engineering. IEEE, 453--462.
  138. P. Gupta, S. Agnihotri, and S. Saha. 2013. Approximate data mining using sketches for massive data. Procedia Technol. 10 (2013), 781--787.
  139. B. Welton, E. Samanas, and B. P. Miller. 2013. Mr. scan: Extreme scale density-based clustering using a tree-based network of GPGPU nodes. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. ACM, 84.
  140. J. Han and M. Kamber. 2004. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers.
  141. L. Qian, Z. Luo, Y. Du, and L. Guo. 2009. Cloud computing: An overview. In IEEE International Conference on Cloud Computing. Springer, 626--631.
  142. R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic. 2009. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Gen. Comput. Syst. 25, 6 (2009), 599--616.
  143. T. B. Winans and J. S. Brown. 2009. Cloud computing: A collection of working papers. Deloitte LLC.
  144. S. Mittal. 2016. A survey of techniques for approximate computing. ACM Comput. Surveys 48, 4 (2016), 62.
  145. J. Gruska. 1999. Quantum Computing, Vol. 2005. McGraw-Hill, London.
  146. P. Wittek. 2014. Quantum Machine Learning: What Quantum Computing Means to Data Mining. Academic Press.
  147. M. Ykhlef. 2011. A quantum swarm evolutionary algorithm for mining association rules in large databases. J. King Saud Univ.-Comput. Info. Sci. 23, 1 (2011), 1--6.
  148. S. Wang and G. Long. 2015. Big data and quantum computation. Chinese Sci. Bull. 60, 5--6 (2015), 499--508.
  149. P. Rebentrost, M. Mohseni, and S. Lloyd. 2014. Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 13 (2014), 130503.
  150. H. K. Lo, T. Spiller, and S. Popescu. 1998. Introduction to Quantum Computation and Information. World Scientific, Singapore.
  151. C. H. Yu, F. Gao, Q. L. Wang, and Q. Y. Wen. 2016. Quantum algorithm for association rules mining. Phys. Rev. A 94, 4 (2016), 042311.
  152. D. A. Reed and J. Dongarra. 2015. Exascale computing and big data. Commun. ACM 58, 7 (2015), 56--68.
  153. M. Weinstein. 2010. Strange bedfellows: Quantum mechanics and data mining. Nuclear Phys. B-Proc. Suppl. 199, 1 (2010), 74--84.
  154. Nature. 2019. IBM's Quantum Cloud Computer Goes Commercial. Retrieved from http://www.nature.com/news/ibm-s-quantum-cloud-computer-goes-commercial-1.21585.
  155. Livemint. 2019. Google's Quantum Computing Push Opens New Front in Cloud Battle. Retrieved from http://www.livemint.com/Technology/FtFrwgaQFFa07m0BenyGIK/Googles-quantum-computing-push-opens-new-front-in-cloud-bat.html.
  156. Engadget. 2019. Google Wants to Sell Quantum Computing in the Cloud. Retrieved from https://www.engadget.com/2017/07/17/google-puts-quantum-computers-to-work-in-cloud/.
  157. Theregister. 2019. Google Tests its Own Quantum Computer -- Both Qubits of it. Retrieved from https://www.theregister.co.uk/2016/07/21/google_tests_a_quantum_computer_its_own_both_qubits_of_it/.
  158. Quantum computing -- Wikipedia. 2019. Retrieved from https://en.wikipedia.org/wiki/Quantum_computing.
  159. E. Rieffel and W. Polak. 2000. An introduction to quantum computing for non-physicists. ACM Comput. Surveys 32, 3 (2000), 300--335.
  160. V. S. Denchev and G. Pandurangan. 2008. Distributed quantum computing: A new frontier in distributed systems or science fiction? ACM SIGACT News 39, 3 (2008), 77--95.
  161. I. A. T. Hashem, I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani, and S. U. Khan. 2015. The rise of big data on cloud computing: Review and open research issues. Info. Syst. 47 (2015), 98--115.
  162. T. Mastelic, A. Oleksiak, H. Claussen, I. Brandic, J. M. Pierson, and A. V. Vasilakos. 2015. Cloud computing: Survey on energy efficiency. ACM Comput. Surveys 47, 2 (2015), 33.
  163. D. Chakrabarti and C. Faloutsos. 2006. Graph mining: Laws, generators, and algorithms. ACM Comput. Surveys 38, 1 (2006), 2.
  164. S. Venugopal, R. Buyya, and K. Ramamohanarao. 2006. A taxonomy of data grids for distributed data sharing, management, and processing. ACM Comput. Surveys 38, 1 (2006), 3.
  165. I. Goiri, R. Bianchini, S. Nagarakatte, and T. D. Nguyen. 2015. Approxhadoop: Bringing approximations to MapReduce frameworks. In ACM SIGARCH Computer Architecture News. ACM, Vol. 43, No. 1, 383--397.
  166. O. Agmon Ben-Yehuda, M. Ben-Yehuda, A. Schuster, and D. Tsafrir. 2014. The rise of RaaS: The resource-as-a-service cloud. Commun. ACM 57, 7 (2014), 76--84.
  167. F. Pan, G. Cong, A. K. Tung, J. Yang, and M. J. Zaki. 2003. Carpenter: Finding closed patterns in long biological datasets. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 637--642.
  168. K. A. Shakil and M. Alam. 2016. Recent developments in cloud-based systems: State of art. Int. J. Comput. Sci. Info. Secur. 14, 12 (2016), 242.
  169. V. Nekvapil. 2015. Cloud computing in data mining-A survey. J. Syst. Integr. 6, 1 (2015), 12.
  170. M. Marjani, F. Nasaruddin, A. Gani, A. Karim, I. A. T. Hashem, A. Siddiqa, and I. Yaqoob. 2017. Big IoT data analytics: Architecture, opportunities, and open research challenges. IEEE Access 5 (2017), 5247--5261.
  171. T. Hu, H. Chen, L. Huang, and X. Zhu. 2012. A survey of mass data mining based on cloud-computing. In Proceedings of the International Conference on Anti-Counterfeiting, Security and Identification (ASID’12). IEEE, 1--4.
  172. C. W. Tsai, C. F. Lai, H. C. Chao, and A. V. Vasilakos. 2015. Big data analytics: A survey. J. Big Data 2, 1 (2015), 21.
  173. A. S. Shirkhorshidi, S. Aghabozorgi, T. Y. Wah, and T. Herawan. 2014. Big data clustering: A review. In Proceedings of the International Conference on Computational Science and Its Applications. Springer, Cham, 707--720.
  174. B. Zerhari, A. A. Lahcen, and S. Mouline. 2015. Big data clustering: Algorithms and challenges. In Proceedings of the International Conference on Big Data, Cloud and Applications (BDCA’15).
  175. A. Mohebi, S. Aghabozorgi, T. Ying Wah, T. Herawan, and R. Yahyapour. 2016. Iterative big data clustering algorithms: A review. Software: Pract. Exper. 46, 1 (2016), 107--129.
  176. A. Fahad, N. Alshatri, Z. Tari, A. Alamri, I. Khalil, A. Y. Zomaya, S. Foufou, and A. Bouras. 2014. A survey of clustering algorithms for big data: Taxonomy and empirical analysis. IEEE Trans. Emerg. Topics Comput. 2, 3 (2014), 267--279.
  177. D. Singh and C. K. Reddy. 2015. A survey on platforms for big data analytics. J. Big Data 2, 1 (2015), 8.
  178. H. Tong and U. Kang. 2013. Big Data Clustering. Data Clustering: Algorithms and Applications, Chapter 11. CRC Press, Taylor & Francis Group, 259--276.
  179. X. Lin. 2014. Mr-Apriori: Association rules algorithm based on MapReduce. In Proceedings of the 5th IEEE International Conference on Software Engineering and Service Science (ICSESS’14). IEEE, 141--144.
  180. Q. He, F. Zhuang, J. Li, and Z. Shi. 2010. Parallel implementation of classification algorithms based on MapReduce. In Proceedings of the International Conference on Rough Sets and Knowledge Technology. Springer, Berlin, 655--662.
  181. IBM. 2019. Bluemix is now IBM Cloud. Retrieved from https://www.ibm.com/blogs/bluemix/2017/10/bluemix-is-now-ibm-cloud/.
  182. A. Gheith et al. 2016. IBM Bluemix mobile cloud services. IBM J. Res. Dev. 60, 2--3 (Mar. 2016), 7:1--7:12.
  183. Google Cloud. 2019. Cloud Machine Learning Engine. Retrieved from https://cloud.google.com/ml-engine/.
  184. GE. 2019. Predix Platform Brief-GE. Retrieved from https://www.ge.com/digital/sites/default/files/Predix-The-Industrial-Internet-Platform-Brief.pdf.
  185. TCS. 2019. TCS Connected Universe Platform. Retrieved from https://www.tcs.com/tcs-connected-universe-platform.
  186. IBM Watson | IBM. 2019. Retrieved from https://www.ibm.com/watson/.
  187. Machine Learning Studio | Microsoft Azure. 2019. Retrieved from https://azure.microsoft.com/en-in/services/machine-learning-studio/.
  188. D. R. Krishnan, D. L. Quoc, P. Bhatotia, C. Fetzer, and R. Rodrigues. 2016. Incapprox: A data analytics system for incremental approximate computing. In Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1133--1144.
  189. Spark Streaming | Apache Spark. 2019. Retrieved from https://spark.apache.org/streaming/.
  190. A. Bifet, S. Maniu, J. Qian, G. Tian, C. He, and W. Fan. 2015. StreamDM: Advanced data mining in Spark streaming. In Proceedings of the IEEE International Conference on Data Mining Workshop (ICDMW’15). IEEE, 1608--1611.
  191. Mehdi Mohammadi, Ala Al-Fuqaha, Sameh Sorour, and Mohsen Guizani. 2018. Deep learning for IoT big data and streaming analytics: A survey. IEEE Commun. Surveys Tutor. 20, 4 (2018), 2923--2960.
  192. A. Bifet, G. Holmes, R. Kirkby, and B. Pfahringer. 2010. Moa: Massive online analysis. J. Mach. Learn. Res. 11 (May 2010), 1601--1604.
  193. B. R. Prasad and S. Agarwal. 2016. Stream data mining: Platforms, algorithms, performance evaluators, and research trends. Int. J. Database Theory Appl. 9, 9 (2016), 201--218.
  194. G. D. F. Morales and A. Bifet. 2015. SAMOA: Scalable advanced massive online analysis. J. Mach. Learn. Res. 16, 1 (2015), 149--153.
  195. A. Amini, T. Y. Wah, and H. Saboohi. 2014. On density-based data streams clustering algorithms: A survey. J. Comput. Sci. Technol. 29, 1 (2014), 116--141.
  196. H. Song and J. G. Lee. 2018. RP-DBSCAN: A superfast parallel DBSCAN algorithm based on random partitioning. In Proceedings of the International Conference on Management of Data. ACM, 1173--1187.
  197. O. Backhoff and E. Ntoutsi. 2016. Scalable online-offline stream clustering in apache spark. In Proceedings of the IEEE 16th International Conference on Data Mining Workshops (ICDMW’16). IEEE, 37--44.
  198. J. Zgraja and M. Woźniak. 2018. Drifted data stream clustering based on ClusTree algorithm. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems. Springer, Cham, 338--349.
  199. C. Sauvanaud, G. Silvestre, M. Kaâniche, and K. Kanoun. 2015. Data stream clustering for online anomaly detection in cloud applications. In Proceedings of the 11th European Dependable Computing Conference (EDCC’15). IEEE, 120--131.
  200. L. Tu and Y. Chen. 2009. Stream data clustering based on grid density and attraction. ACM Trans. Knowl. Discov. Data 3, 3 (2009), 12.
  201. R. Latif, H. Abbas, S. Latif, and A. Masood. 2015. EVFDT: An enhanced very fast decision tree algorithm for detecting distributed denial of service attack in cloud-assisted wireless body area network. Mobile Info. Syst. 2015, Article 260594 (2015), 13 pages.
  202. T. M. Al-Khateeb, M. M. Masud, L. Khan, and B. Thuraisingham. 2012. Cloud guided stream classification using class-based ensemble. In Proceedings of the IEEE 5th International Conference on Cloud Computing (CLOUD’12). IEEE, 694--701.
  203. J. Chen, K. Li, Z. Tang, K. Bilal, S. Yu, C. Weng, and K. Li. 2017. A parallel random forest algorithm for big data in a spark cloud computing environment. IEEE Trans. Parallel Distrib. Syst. 1 (2017), 1--1.
