We present a pipelining, dynamically user-controllable reorder operator for use in data-intensive applications. Allowing the user to reorder data delivery on the fly increases interactivity in several contexts, such as online aggregation and large-scale spreadsheets: the user controls the processing of data by dynamically specifying preferences for different data items based on prior feedback, so that data of interest is prioritized for early processing. In this paper we describe an efficient, non-blocking mechanism for reordering, which can be used over arbitrary data streams from files, indexes, and continuous data feeds. We also investigate several reordering policies based on the performance goals of various typical applications. We present results from an implementation used in Online Aggregation in the Informix Dynamic Server with Universal Data Option, and in sorting and scrolling in a large-scale spreadsheet. Our experiments demonstrate that, for a variety of data distributions and applications, reordering is responsive to dynamic preference changes, imposes minimal overhead on overall completion time, and provides dramatic improvements in the quality of the feedback over time. Surprisingly, preliminary experiments indicate that online reordering can also be useful in traditional batch query processing, because it can serve as a form of pipelined, approximate sorting.
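To make the idea of a non-blocking, preference-driven reorder operator concrete, the following is a minimal sketch only. It is not the mechanism or any of the policies described in the paper: the class `ReorderBuffer`, its `capacity` parameter, the `group_of` callback, and the "highest weight first" rule are all hypothetical names and assumptions chosen for illustration. It shows a bounded buffer that pulls from an arbitrary input stream, emits one item per request without waiting for the whole input, and lets the caller change group preferences between requests.

```python
from collections import defaultdict, deque


class ReorderBuffer:
    """Hypothetical bounded buffer that emits items from preferred groups first.

    Illustrative assumption only; not the operator from the paper.
    """

    def __init__(self, source, group_of, capacity=4):
        self._source = iter(source)          # any iterable: file, index scan, feed
        self._group_of = group_of            # maps an item to its group key
        self._capacity = capacity            # max items held in memory at once
        self._buffers = defaultdict(deque)   # one FIFO queue per group
        self._size = 0                       # total buffered items
        self._preference = defaultdict(lambda: 1.0)  # default weight per group

    def set_preference(self, group, weight):
        """Change a group's weight; takes effect on the next emitted item."""
        self._preference[group] = weight

    def _fill(self):
        """Read ahead from the source until the buffer is full or input ends."""
        while self._size < self._capacity:
            try:
                item = next(self._source)
            except StopIteration:
                return
            self._buffers[self._group_of(item)].append(item)
            self._size += 1

    def __iter__(self):
        return self

    def __next__(self):
        self._fill()
        if self._size == 0:
            raise StopIteration
        # Choose the non-empty group with the highest current weight.
        nonempty = (g for g, q in self._buffers.items() if q)
        group = max(nonempty, key=lambda g: self._preference[g])
        self._size -= 1
        return self._buffers[group].popleft()
```

A caller would iterate over the buffer, inspect partial results, and call `set_preference` whenever interest shifts; items already buffered are re-prioritized on the next request, while items not yet read remain in source order. Because the buffer is bounded, the output is only approximately ordered by preference.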