Connections between mining frequent itemsets and learning generative models

Frequent itemsets mining is a popular framework for pattern discovery. In this framework, given a database of customer transactions, the task is to unearth all patterns in the form of sets of items appearing in a sizable number of transactions. We present a class of models called Itemset Generating Models (or IGMs) that can be used to formally connect the process of frequent itemsets discovery with the learning of generative models. IGMs are specified using simple probability mass functions (over the space of transactions), peaked at specific sets of items and uniform everywhere else. Under such a connection, it is possible to rigorously associate higher frequency patterns with generative models that have greater data likelihoods. This enables a generative model-learning interpretation of frequent itemsets mining. More importantly, it facilitates a statistical significance test which prescribes the minimum frequency needed for a pattern to be considered interesting. We illustrate the effectiveness of our analysis through experiments on standard benchmark data sets.
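
As an illustration of the mining task described in the abstract, the following is a minimal brute-force sketch in Python that counts how many transactions contain each itemset and keeps those that meet a frequency threshold. It is not the paper's method and does not implement IGMs or the significance test; the function name `frequent_itemsets`, the parameters `min_support` and `max_size`, and the toy transactions are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Hypothetical helper: enumerates every subset of each transaction up to
    max_size items, so it is only practical for small toy inputs.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # de-duplicate items within a transaction
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    # Keep only itemsets whose transaction count reaches the threshold.
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy data (made up for illustration).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
result = frequent_itemsets(transactions, min_support=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Here `min_support` is chosen by hand; the abstract's point is that the proposed significance test would prescribe such a threshold rather than leave it to the user.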

fitemsets-icdm07.pdf
PDF file

In: Proceedings of the Seventh International Conference on Data Mining (ICDM 2007), Omaha, NE, USA

Publisher: Institute of Electrical and Electronics Engineers, Inc.
© 2007 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Details

Type: Inproceedings
Pages: 571–576