A fast algorithm for finding frequent episodes in event streams

  • Srivatsan Laxman ,
  • P. S. Sastry ,
  • K. P. Unnikrishnan

Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), San Jose, USA |

Published by Association for Computing Machinery, Inc.

Frequent episode discovery is a popular framework for mining data available as a long sequence of events. An episode is essentially a short ordered sequence of event types and the frequency of an episode is some suitable measure of how often the episode occurs in the data sequence. Recently, we proposed a new frequency measure for episodes based on the notion of non-overlapped occurrences of episodes in the event sequence, and showed that, such a definition, in addition to yielding computationally efficient algorithms, has some important theoretical properties in connecting frequent episode discovery with HMM learning. This paper presents some new algorithms for frequent episode discovery under this non-overlapped occurrences-based frequency definition. The algorithms presented here are better (by a factor of N, where N denotes the size of episodes being discovered) in terms of both time and space complexities when compared to existing methods for frequent episode discovery. We show through some simulation experiments, that our algorithms are very efficient. The new algorithms presented here have arguably the least possible orders of space and time complexities for the task of frequent episode discovery.