skip to main content
10.1145/1281192.1281238acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
Article

A fast algorithm for finding frequent episodes in event streams

Published:12 August 2007Publication History

ABSTRACT

Frequent episode discovery is a popular framework for mining data available as a long sequence of events. An episode is essentially a short ordered sequence of event types and the frequency of an episode is some suitable measure of how often the episode occurs in the data sequence. Recently,we proposed a new frequency measure for episodes based on the notion of non-overlapped occurrences of episodes in the event sequence, and showed that, such a definition, in addition to yielding computationally efficient algorithms, has some important theoretical properties in connecting frequent episode discovery with HMM learning. This paper presents some new algorithms for frequent episode discovery under this non-overlapped occurrences-based frequency definition. The algorithms presented here are better (by a factor of N, where N denotes the size of episodes being discovered) in terms of both time and space complexities when compared to existing methods for frequent episode discovery. We show through some simulation experiments, that our algorithms are very efficient. The new algorithms presented here have arguably the least possible orders of spaceand time complexities for the task of frequent episode discovery.

References

  1. M. J. Atallah, R. Gwadera, and W. Szpankowski. Detection of significant sets of episodes in event sequences. In Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), pages 3--10, Brighton, UK, 01-04 November 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. G. Casas-Garriga. Discovering unbounded episodes in sequential data. In Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'03), pages 83--94, Cavtat-Dubvrovnik, Croatia, 2003.Google ScholarGoogle ScholarCross RefCross Ref
  3. R. Gwadera, M. J. Atallah, and W. Szpankowski. Reliable detection of episodes in event sequences. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), pages 67--74, 19-23 November 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Gwadera, M. J. Atallah, and W. Szpankowski. Markov models for identification of significant episodes. In Proceedings of the 2005 SIAM International Conference on Data Mining (SDM-05), Newport Beach, California, April 2005.Google ScholarGoogle ScholarCross RefCross Ref
  5. S. Laxman. Discovering frequent episodes in event streams: Fast algorithms, connections with HMMs and generalizations. PhD thesis, Dept. of Electrical Engineering, Indian Institute of Science, Bangalore, India, Mar. 2006.Google ScholarGoogle Scholar
  6. S. Laxman, P. S. Sastry, and K. P. Unnikrishnan. Discovering frequent episodes and learning Hidden Markov Models: A formal connection. IEEE Transactions on Knowledge and Data Engineering, 17(11):1505--1517, Nov. 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Laxman, P. S. Sastry, and K. P. Unnikrishnan. Discovering frequent generalized episodes when events persist for different durations. To appear in IEEE Transactions on Knowledge and Data Engineering, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. H. Mannila, H. Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1(3):259--289, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. N. Meger and C. Rigotti. Constraint-based mining of episode rules and optimal window sizes. In Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'04), Pisa, Italy, Sept. 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. M. Regnier and W. Szpankowski. On pattern frequency occurrences in a Markovian sequence. Algorithmica, 22:631--649, 1998.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. A fast algorithm for finding frequent episodes in event streams

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2007
      1080 pages
      ISBN:9781595936097
      DOI:10.1145/1281192

      Copyright © 2007 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 August 2007

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • Article

      Acceptance Rates

      KDD '07 Paper Acceptance Rate111of573submissions,19%Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader