Mining Interesting Patterns in Multi-Relational Data
Eirini Spyropoulou and Tijl De Bie, University of Bristol
Abstract: Research on local pattern mining algorithms has focused for years on mining one table of transactions and items (market basket data). However, a large amount of real-world datasets are multi-relational, i.e., they contain more than two entities and more than one relationships. There is therefore increasing interest in developing techniques for mining such complex types of data. Defining a multi-relational pattern syntax that is suitably expressive and finding an efficient mining algorithm, is a challenging problem. Furthermore, since local pattern mining methods usually suffer from the combinatorial explosion of patterns in the output, assessing the quality of patterns is also very important in order to make them useful to the end user.
In this work we introduce a novel approach for mining patterns in multi-relational data. We propose the pattern syntax of Complete Connected Subsets, a new syntax for multi-relational patterns. While this pattern syntax is generally applicable to multi-relational data it reduces to well known itemsets when the data is a simple transaction table. We also introduce RMiner, a simple yet practically efficient algorithm to mine such patterns. We show how the interestingness of patterns of the proposed syntax can be conveniently quantified using a framework for quantifying the subjective interestingness of patterns according to the user's prior knowledge. Finally, we present results on real world data sets.