DOI: 10.1145/1142473.1142543

On-the-fly sharing for streamed aggregation

Published: 27 June 2006

ABSTRACT

Data streaming systems are becoming essential for monitoring applications such as financial analysis and network intrusion detection. These systems often have to process many similar but different queries over common data. Since executing each query separately can lead to significant scalability and performance problems, it is vital to share resources by exploiting similarities in the queries. In this paper we present ways to efficiently share streaming aggregate queries with differing periodic windows and arbitrary selection predicates. A major contribution is our sharing technique that does not require any up-front multiple query optimization. This is a significant departure from existing techniques that rely on complex static analyses of fixed query workloads. Our approach is particularly vital in streaming systems where queries can join and leave the system at any point. We present a detailed performance study that evaluates our strategies with an implementation and real data. In these experiments, our approach gives us as much as an order of magnitude performance improvement over the state of the art.


    Published in

      SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data
      June 2006
      830 pages
      ISBN: 1595934340
      DOI: 10.1145/1142473

      Copyright © 2006 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 785 of 4,003 submissions, 20%
