Abstract
This work surveys and reflects on the ways visualization and data mining can be integrated to achieve effective knowledge discovery by combining the best of human and machine capabilities. Following a bottom-up bibliographic research approach, the article categorizes the observed techniques into classes, highlighting current trends, gaps, and potential future directions for research. In particular, it examines the strengths and weaknesses of information visualization (infovis) and data mining, the purposes for which infovis researchers use data mining techniques, and, conversely, how data mining researchers employ infovis techniques. On the basis of the extracted patterns, the article then proposes a series of potential extensions not found in the literature. Finally, it uses this information to analyze the discovery process by comparing the analysis steps from the perspectives of information visualization and data mining. The comparison brings to light new perspectives on how mining and visualization can best employ human and machine strengths, and leads to a series of reflections and research questions that can help advance the science of visual analytics.