ABSTRACT
In spite of the omnipresence of parallel (multi-core) systems, the predominant strategy to evaluate window-based stream joins is still strictly sequential, mostly just straightforward along the definition of the operation semantics.
In this work we present handshake join, a way of describing and executing window-based stream joins that is highly amenable to parallelized execution. Handshake join naturally leverages available hardware parallelism, which we demonstrate with an implementation on a modern multi-core system and on top of field-programmable gate arrays (FPGAs), an emerging technology that has shown distinctive advantages for high-throughput data processing.
On the practical side, we provide a join implementation that substantially outperforms CellJoin (the fastest published result) and that will directly turn any degree of parallelism into higher throughput or larger supported window sizes. On the semantic side, our work gives a new intuition of window semantics, which we believe could inspire other stream processing algorithms or ongoing standardization efforts for stream query languages.
- D. J. Abadi, Y. Ahmad, M. Balazinska, U. Çetintemel, M. Cherniack, J.-H. Hwang, W. Lindner, A. S. Maskey, A. Rasin, E. Ryvkina, N. Tatbul, and S. Zdonik. The Design of the Borealis Stream Processing Engine. In Proc. of the 2nd Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, CA, USA, 2005.Google Scholar
- A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, U. Srivastava, and J. Widom. STREAM: The Stanford Data Stream Management System. http://infolab.stanford.edu/ usriv/papers/streambook.pdf.Google Scholar
- A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schüpbach, and A. Singhania. The Multikernel: A new OS architecture for scalable multicore systems. In Proc. of the 22nd ACM Symposium on Operating Systems Principles (SOSP), Big Sky, MT, USA, 2009. Google ScholarDigital Library
- I. Botan, R. Derakhshan, N. Dindar, L. Haas, R. J. Miller, and N. Tatbul. SECRET: A Model for Analysis of the Execution Semantics of Stream Processing Systems. Proc. of the VLDB Endowment, 3(1), 2010. Google ScholarDigital Library
- M. Butts. Synchronization Through Communication in a Massively Parallel Processor Array. IEEE Micro, 27(5):32--40, 2007. Google ScholarDigital Library
- P. Conway, N. Kalyanasundharam, G. Donley, K. Lepak, and B. Hughes. Cache Hierarchy and Memory Subsystem of the AMD Opteron Processor. IEEE Micro, 30(2):16--29, 2010. Google ScholarDigital Library
- J. Davis, C. Thacker, and C. Chang. BEE3: Revitalizing Computer Architecture Research. Technical Report MSR-TR-2009-45, Microsoft Research, 2009.Google Scholar
- A. DeHon. Balancing Interconnect and Computation in a Reconfigurable Computing Array (or, why you don't really want 100% LUT utilization). In Proc. of the Int'l Symposium on Field Programmable Gate Arrays (FPGA), Monterey, CA, USA, 1999. Google ScholarDigital Library
- P. Frey, R. Gonçalves, M. Kersten, and J. Teubner. A Spinning Join That Does Not Get Dizzy. In Proc. of the 30th Int'l Conference on Distributed Computing Systems (ICDCS), Genoa, Italy, 2010. Google ScholarDigital Library
- B. Gedik, P. S. Yu, and R. Bordawekar. CellJoin: A Parallel Stream Join Operator for the Cell Processor. The VLDB Journal, 18(2):501--519, 2009. Google ScholarDigital Library
- T. M. Ghanem, M. A. Hammad, M. F. Mokbel, W. G. Aref, and A. K. Elmagarmid. Incremental Evaluation of Sliding-Window Queries over Data Streams. IEEE Trans. on Knowledge and Data Engineering (TKDE), 19(1):57--72, 2007. Google ScholarDigital Library
- L. Golab, S. Garg, and T. Öszu. On Indexing Sliding Windows over Online Data Streams. In Proc. of the 9th Int'l Conference on Extending Database Technology (EDBT), Crete, Greece, 2004. Google ScholarDigital Library
- S. Helmer, T. Westmann, and G. Moerkotte. Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships. In Proc. of the 24th Int'l Conference on Very Large Databases (VLDB), New York, NY, USA, 1998. Google ScholarDigital Library
- J. Howard, S. Dighe, Y. Hoskote, S. Vangal, D. Finan, G. Ruhl, D. Jenkins, H. Wilson, N. Borkar, G. Schrom, F. Pailet, S. Jain, T. Jacob, S. Yada, S. Marella, P. Salihundam, V. Erraguntla, M. Konow, M. Riepen, G. Droege, J. Lindemann, M. Gries, T. Apel, K. Henriss, T. Lund-Larsen, S. Steibl, S. Borkar, V. De, R. v. d. Wijngaart, and T. Mattson. A 48-Core IA-32 Message-Passing Processor with DVFS in 45nm CMOS. In 2010 IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, February 2010.Google ScholarCross Ref
- Z. Ives, D. Florescu, M. Friedman, A. Levy, and D. S. Weld. An Adaptive Query Execution System for Data Integration. In Proc. of the Int'l Conference on Management of Data (SIGMOD), Philadelphia, PA, USA, 1999. Google ScholarDigital Library
- N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, J. Widom, H. Balakrishnan, U. Çetintemel, M. Cherniack, R. Tibbetts, and S. Zdonik. Towards a Streaming SQL Standard. Proc. of the VLDB Endowment, 1(2), 2008. Google ScholarDigital Library
- J. Kang, J. F. Naughton, and S. D. Viglas. Evaluating Window Joins over Unbounded Streams. In Proc. of the 19th Int'l Conference on Data Engineering (ICDE), Bangalore, India, 2003.Google ScholarCross Ref
- S. Krishnamurthy, M. J. Franklin, J. Davis, D. Farina, P. Golovko, A. Li, and N. Thombre. Continuous Analytics Over Discontinuous Streams. In Proc. of the Int'l Conference on Management of Data (SIGMOD), Indidanapolis, IN, USA, 2010. Google ScholarDigital Library
- H. T. Kung and P. L. Lohman. Systolic (VLSI) Arrays for Relational Database Operations. In Proc. of the Int'l Conference on Management of Data (SIGMOD), Santa Monica, CA, USA, 1980. Google ScholarDigital Library
- I. Kuon and J. Rose. Measuring the gap between FPGAs and ASICs. In Proc. of the 14th Int'l Symposium on Field-Programmable Gate Arrays (FPGA), Monterey, CA, USA, 2006. Google ScholarDigital Library
- J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker. Semantics and Evaluation Techniques for Window Aggregates in Data Streams. In Proc. of the Int'l Conference on Management of Data (SIGMOD), Baltimore, MD, USA, 2005. Google ScholarDigital Library
- O. Mencer, K. H. Tsoi, S. Craimer, T. Todman, W. Luk, M. Y. Wong, and P. H. W. Leong. CUBE: A 512-FPGA Cluster. In Proc. of the Southern Programmable Logic Conference (SPL), Sao Carlos, Brazil, 2009.Google ScholarCross Ref
- A. Mitra, M. R. Vieira, P. Bakalov, V. J. Tsotras, and W. A. Najjar. Boosting XML Filtering Through a Scalable FPGA-Based Architecture. In Proc. of the 4th Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, CA, USA, 2009.Google Scholar
- Netezza Corp. http://www.netezza.com/.Google Scholar
- J.-B. Qian, H.-B. Xu, Y.-S. Dong, X.-J. Liu, and Y.-L. Wang. FPGA Acceleration Window Joins Over Multiple Data Streams. Journal of Circuits, Systems, and Computers, 14(4):813--830, 2005.Google Scholar
- K. A. Ross. Selection Conditions in Main Memory. ACM Trans. on Database Systems (TODS), 29:132--161, 2004. Google ScholarDigital Library
- N. Tatbul, U. Çetintemel, S. Zdonik, M. Cherniack, and M. Stonebraker. Load Shedding in a Data Stream Manager. In Proc. of the 29th Int'l Conference on Very Large Databases (VLDB), Berlin, Germany, 2003. Google ScholarDigital Library
- W. B. Teeuw and H. M. Blanken. Control Versus Data Flow in Parallel Database Machines. IEEE Trans. on Parallel and Distributed Systems (TPDS), 4(11):1265--1279, 1993. Google ScholarDigital Library
- J. Teubner, R. Mueller, and G. Alonso. FPGA Acceleration for the Frequent Item Problem. In Proc. of the 26th Int'l Conference on Data Engineering (ICDE), Long Beach, CA, USA, 2010.Google ScholarCross Ref
- A. Wilschut and P. Apers. Dataflow Query Execution in a Parallel Main-Memory Environment. In Proc. of the 1st Int'l Conference on Parallel and Distributed Information Systems (PDIS), Miami Beach, FL, USA, 1991. Google ScholarDigital Library
Index Terms
- How soccer players would do stream joins
Recommendations
CPU load shedding for binary stream joins
We present an adaptive load shedding approach for windowed stream joins. In contrast to the conventional approach of dropping tuples from the input streams, we explore the concept ofselective processing for load shedding. We allow stream tuples to be ...
Adaptive load shedding for windowed stream joins
CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge managementWe present an adaptive load shedding approach for windowed stream joins. In contrast to the conventional approach of dropping tuples from the input streams, we explore the concept of selective processing for load shedding. We allow stream tuples to be ...
Executing stream joins on the cell processor
VLDB '07: Proceedings of the 33rd international conference on Very large data basesLow-latency and high-throughput processing are key requirements of data stream management systems (DSMSs). Hence, multi-core processors that provide high aggregate processing capacity are ideal matches for executing costly DSMS operators. The recently ...
Comments