skip to main content
10.1145/1247480.1247510acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Fast data stream algorithms using associative memories

Published:11 June 2007Publication History

ABSTRACT

The primary goal of data stream research is to develop space and time efficient solutions for answering continuous on-line summarization queries. Research efforts over the last decade have resulted in a number of efficient algorithms with varying degrees of space and time complexities. While these techniques are developed in a standard CPU setting, many of their applications such as click-fraud detection and network-traffic summarization typically execute on special networking architectures called Network Processing Units (NPUs). These NPUs interface with special associative memories known as Ternary Content Addressable Memories (TCAMs) to provide gigabit rate forwarding at network routers. In this paper, we describe how the integrated architecture of NPU and TCAMs can be exploited towards achieving the goal of developing high-speed stream summarization solutions. We propose two TCAM-conscious solutions for the frequent elements problem in data streams and present a comprehensive evaluation of these techniques on a state-of-the-art networking platform.

References

  1. Intel network processing units. http://www.intel.com/, 2006.Google ScholarGoogle Scholar
  2. Teja networking systems. http://www.teja.com, 2006.Google ScholarGoogle Scholar
  3. D. J. Abadi, D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. The VLDB Journal, 12(2):120--139, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. N. Bandi, D. Agrawal, and A. E. Abbadi. Fast algorithms for heavy distinct hitters using associative memories. In International Conference on Distributed Computing Systems(ICDCS), 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. N. Bandi, S. Schnieder, D. Agrawal, and A. E. Abbadi. Hardware acceleration of database operations using content-addressable memories. In DaMoN, 2005.Google ScholarGoogle Scholar
  6. S. Chandrasekaran. Telegraphcq: Continuous dataflow processing for an uncertain world, 2003.Google ScholarGoogle Scholar
  7. M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. In ICALP'02, pages 693--703, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. G. Cormode and S. Muthukrishnan. What's hot and what's not: tracking most frequent items dynamically. In PODS '03, pages 296--306, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. C. Cranor, T. Johnson, O. Spataschek, and V. Shkapenyuk. Gigascope: a stream database for network applications. In SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 647--651, New York, NY, USA, 2003. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. E. D. Demaine, A. López-Ortiz, and J. I. Munro. Frequency estimation of internet packet streams with limited space. In Proceedings of the 10th Annual European Symposium on Algorithms (ESA'02), volume 2461, pages 348--360, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. C. Estan and G. Varghese. New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice. ACM Trans. Comput. Syst., 21(3):270--313, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. R. H. K. Fang Yu. Efficient Multi-Match Packet Classification with TCAM. In IEEE Hot Interconnects, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. B. T. Gold, A. Ailamaki, L. Huston, and B. Falsafi. Accelerating database operations using a network processor. In DaMoN, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Greenwald and S. Khanna. Space-efficient online computation of quantile summaries. In SIGMOD Conference, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Hershberger, N. Shrivastava, S. Suri, and C. D. Tóth. Adaptive spatial partitioning for multidimensional data streams. In ISAAC, pages 522--533, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. M. Karp, S. Shenker, and C. H. Papadimitriou. A simple algorithm for finding frequent elements in streams and bags. ACM Trans. Database Syst., 28(1):51--55, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. K. Lakshminarayanan, A. Rangarajan, and S. Venkatachary. Algorithms for advanced packet classification with ternary cams. In SIGCOMM '05, pages 193--204, New York, NY, USA, 2005. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. G. S. Manku and R. Motwani. Approximate frequency counts over data streams, 2002.Google ScholarGoogle Scholar
  19. G. S. Manku, S. Rajagopalan, and B. G. Lindsay. Approximate medians and other quantiles in one pass and with limited memory. In SIGMOD'98, pages 426--435, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. A. Metwally, D. Agrawal, and A. E. Abbadi. Efficient computation of frequent and top-k elements in data streams. In ICDT, pages 398--412, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. J. Misra and D. Gries. Finding repeated elements. Sci. Comput. Program., 2(2):143--152, 1982.Google ScholarGoogle ScholarCross RefCross Ref
  22. S. Mysore, B. Agrawal, T. Sherwood, N. Shrivastava, and S. Suri. Profiling over adaptive ranges. In CGO '06: Proceedings of the International Symposium on Code Generation and Optimization, pages 147--158, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. G. J. Narlikar, A. Basu, and F. Zane. Coolcams: Power-efficient tcams for forwarding engines. In INFOCOM, 2003.Google ScholarGoogle Scholar
  24. R. Panigrahy and S. Sharma. Sorting and Searching using Ternary CAMs. In IEEE Micro., 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. J. Widom and R. Motwani. Query processing, resource management, and approximation in a data stream management system, 2003.Google ScholarGoogle Scholar
  26. F. Yu, R. H. Katz, and T. V. Lakshman. Gigabit rate packet pattern-matching using tcam. In ICNP '04, pages 174--183, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Fast data stream algorithms using associative memories

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
      June 2007
      1210 pages
      ISBN:9781595936868
      DOI:10.1145/1247480
      • General Chairs:
      • Lizhu Zhou,
      • Tok Wang Ling,
      • Program Chair:
      • Beng Chin Ooi

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 11 June 2007

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader