Efficient Evaluation of Queries with Mining Predicates

Proceedings of 18th International Conference on Data Engineering |

Published by IEEE Computer Society

Modern relational database systems are beginning to support ad hoc queries on mining models. In this paper, we explore novel techniques for optimizing queries that apply mining models to relational data. For such queries, we use the internal structure of the mining model to automatically derive traditional database predicates. We present algorithms for deriving such predicates for some popular discrete mining models: decision trees, naive Bayes, and clustering. Our experiments on Microsoft SQL Server 2000 demonstrate that these derived predicates can significantly reduce the cost of evaluating such queries.