ABSTRACT
Enabling interactive visualization over new datasets at "human speed" is key to democratizing data science and maximizing human productivity. In this work, we first argue why existing analytics infrastructures do not support interactive data exploration and then outline the challenges and opportunities of building a system specifically designed for interactive data exploration. Finally, we present an Interactive Data Exploration Accelerator (IDEA), a new type of system for interactive data exploration that is specifically designed to integrate with existing data management landscapes and allow users to explore their data instantly without expensive data preparation costs.