ABSTRACT
Frequent episode discovery is a popular framework for mining data available as a long sequence of events. An episode is essentially a short ordered sequence of event types and the frequency of an episode is some suitable measure of how often the episode occurs in the data sequence. Recently,we proposed a new frequency measure for episodes based on the notion of non-overlapped occurrences of episodes in the event sequence, and showed that, such a definition, in addition to yielding computationally efficient algorithms, has some important theoretical properties in connecting frequent episode discovery with HMM learning. This paper presents some new algorithms for frequent episode discovery under this non-overlapped occurrences-based frequency definition. The algorithms presented here are better (by a factor of N, where N denotes the size of episodes being discovered) in terms of both time and space complexities when compared to existing methods for frequent episode discovery. We show through some simulation experiments, that our algorithms are very efficient. The new algorithms presented here have arguably the least possible orders of spaceand time complexities for the task of frequent episode discovery.
- M. J. Atallah, R. Gwadera, and W. Szpankowski. Detection of significant sets of episodes in event sequences. In Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), pages 3--10, Brighton, UK, 01-04 November 2004. Google ScholarDigital Library
- G. Casas-Garriga. Discovering unbounded episodes in sequential data. In Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'03), pages 83--94, Cavtat-Dubvrovnik, Croatia, 2003.Google ScholarCross Ref
- R. Gwadera, M. J. Atallah, and W. Szpankowski. Reliable detection of episodes in event sequences. In Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), pages 67--74, 19-23 November 2003. Google ScholarDigital Library
- R. Gwadera, M. J. Atallah, and W. Szpankowski. Markov models for identification of significant episodes. In Proceedings of the 2005 SIAM International Conference on Data Mining (SDM-05), Newport Beach, California, April 2005.Google ScholarCross Ref
- S. Laxman. Discovering frequent episodes in event streams: Fast algorithms, connections with HMMs and generalizations. PhD thesis, Dept. of Electrical Engineering, Indian Institute of Science, Bangalore, India, Mar. 2006.Google Scholar
- S. Laxman, P. S. Sastry, and K. P. Unnikrishnan. Discovering frequent episodes and learning Hidden Markov Models: A formal connection. IEEE Transactions on Knowledge and Data Engineering, 17(11):1505--1517, Nov. 2005. Google ScholarDigital Library
- S. Laxman, P. S. Sastry, and K. P. Unnikrishnan. Discovering frequent generalized episodes when events persist for different durations. To appear in IEEE Transactions on Knowledge and Data Engineering, 2007. Google ScholarDigital Library
- H. Mannila, H. Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1(3):259--289, 1997. Google ScholarDigital Library
- N. Meger and C. Rigotti. Constraint-based mining of episode rules and optimal window sizes. In Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'04), Pisa, Italy, Sept. 2004. Google ScholarDigital Library
- M. Regnier and W. Szpankowski. On pattern frequency occurrences in a Markovian sequence. Algorithmica, 22:631--649, 1998.Google ScholarCross Ref
Index Terms
- A fast algorithm for finding frequent episodes in event streams
Recommendations
Stream prediction using a generative model based on frequent episodes in event sequences
KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data miningThis paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wish to predict. The algorithm examines windows of events that precede ...
Discovery of Frequent Episodes in Event Sequences
Sequences of events describing the behavior and actions of users or systems can be collected in several domains. An episode is a collection of events that occur relatively close to each other in a given partial order. We consider the problem of discovering ...
Efficient mining of frequent episodes from complex sequences
Discovering patterns with great significance is an important problem in data mining discipline. An episode is defined to be a partially ordered set of events for consecutive and fixed-time intervals in a sequence. Most of previous studies on episodes ...
Comments