Conditional Selectivity for Statistics on Query Expressions

Nicolas Bruno and Surajit Chaudhuri

Abstract

Cardinality estimation during query optimization relies on simplifying assumptions that usually do not hold in practice. To diminish the impact of inaccurate estimates during optimization, statistics on query expressions (SITs) have been previously proposed.

These statistics help directly model the distribution of tuples on query sub-plans. Past work in statistics on query expressions has exploited view matching technology to harness their benefits. In this paper we argue against such an approach as it overlooks significant

opportunities for improvement in cardinality estimations. We then introduce a framework to reason with SITs based on the notion of conditional selectivity. We present a dynamic programming algorithm to efficiently find the most accurate selectivity estimation

for given queries, and discuss how such an approach can be incorporated into existing optimizers with a small number of changes. Finally, we demonstrate experimentally that our technique results in superior cardinality estimations than previous approaches with

very little overhead.

Details

Publication typeInproceedings
Published inSIGMOD
PublisherAssociation for Computing Machinery, Inc.
> Publications > Conditional Selectivity for Statistics on Query Expressions