ABSTRACT
Data volumes are growing at an accelerating rate, straining the limits of human attention. Current techniques for prioritizing user attention in this fast data rely on either cumbersome, ad hoc analysis pipelines composed of diverse analytics tools or brittle, static rule-based engines. To address this gap, we have developed MacroBase, a fast data analytics engine that acts as a search engine over fast data streams. MacroBase provides a set of highly optimized, modular operators for streaming feature transformation, classification, and explanation. Users can combine these operators to construct efficient pipelines tailored to their use case. In this demonstration, SIGMOD attendees will have the opportunity to interactively answer and refine queries using MacroBase and discover the potential benefits of an advanced engine for prioritizing attention in high-volume, real-world data streams.
Index Terms
- Demonstration: MacroBase, A Fast Data Analysis Engine