ABSTRACT
We present a study of how human observers judge scatter plot similarity when presented with a large set of iconic scatter plot representations. We conducted a similarity perception study with 18 participants who have a scientific background. Participants grouped a carefully selected set of plots according to their subjective perceptual judgement of similarity, and we integrated the results into a consensus similarity grouping. We then use this consensus grouping to generate insights on similarity perception. The main output of this work is a list of concepts that describe the major perceptual features, together with a description of how these concepts relate and rank. We also evaluate scagnostics (scatter plot diagnostics), a popular and established set of scatter plot descriptors, and show that they do not reliably reproduce our participants' judgements. Finally, we discuss the major implications of this study and how the results can be used in future research.
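The abstract does not say how the individual groupings are integrated, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes each participant's result is a list of sets of plot identifiers, scores each pair of plots by the fraction of participants who grouped them together, and merges pairs whose agreement exceeds a threshold. The function names, the `threshold` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations


def co_occurrence(groupings, items):
    """Fraction of participants who placed each pair of items in the same group."""
    counts = {pair: 0 for pair in combinations(sorted(items), 2)}
    for grouping in groupings:          # one grouping per participant
        for group in grouping:          # each group is a set of plot ids
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return {pair: n / len(groupings) for pair, n in counts.items()}


def consensus_groups(similarity, items, threshold=0.5):
    """Merge items whose pairwise agreement exceeds the threshold (union-find)."""
    parent = {item: item for item in items}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(g) for g in groups.values())


# Toy data (hypothetical): three participants grouping four plots.
items = ["p1", "p2", "p3", "p4"]
groupings = [
    [{"p1", "p2"}, {"p3", "p4"}],
    [{"p1", "p2", "p3"}, {"p4"}],
    [{"p1", "p2"}, {"p3"}, {"p4"}],
]
similarity = co_occurrence(groupings, items)
groups = consensus_groups(similarity, items)
print(groups)
```

In this toy example only `p1` and `p2` are grouped together by a majority of participants, so they form the single multi-plot consensus group. A real analysis would more likely feed the pairwise agreement matrix into hierarchical clustering or multidimensional scaling than apply a fixed threshold.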
Index Terms
- Towards Understanding Human Similarity Perception in the Analysis of Large Sets of Scatter Plots