An Efficient, Cost-Driven Index Selection Tool for Microsoft SQL Server

VLDB

Published by Very Large Data Bases Endowment Inc.

In this paper we describe novel techniques that make it possible to build an industrial-strength tool for automating the choice of indexes in the physical design of a SQL database. The tool takes as input a workload of SQL queries, and suggests a set of suitable indexes. We ensure that the indexes chosen are effective in reducing the cost of the workload by keeping the index selection tool and the query optimizer “in step”. The number of index sets that must be evaluated to find the optimal configuration is very large. We reduce the complexity of this problem using three techniques. First, we remove a large number of spurious indexes from consideration by taking into account both query syntax and cost information. Second, we introduce optimizations that make it possible to cheaply evaluate the “goodness” of an index set. Third, we describe an iterative approach to handle the complexity arising from multi-column indexes. The tool has been implemented on Microsoft SQL Server 7.0. We performed extensive experiments over a range of workloads, including TPC-D. The results indicate that the tool is efficient and its choices are close to optimal.
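
To make the idea of cost-driven selection concrete, the following is a minimal sketch, not the tool's actual algorithm. It assumes a toy workload, a made-up cost model (`estimated_cost`, standing in for the query optimizer's estimate), and a plain greedy search as a stand-in for the configuration search; all names, columns, and numbers are hypothetical. It shows candidate pruning by syntax and cost, followed by selection of a small index set that lowers total workload cost.

```python
# Hypothetical toy workload: each query lists the columns it filters on
# and its estimated cost when no indexes exist.
WORKLOAD = [
    {"name": "q1", "columns": {"orders.customer_id"}, "base_cost": 100.0},
    {"name": "q2", "columns": {"orders.order_date"}, "base_cost": 80.0},
    {"name": "q3", "columns": {"orders.customer_id", "lineitem.part_id"},
     "base_cost": 120.0},
]


def estimated_cost(query, config):
    """Stand-in for an optimizer cost estimate (an assumption of this sketch):
    every indexed column the query references halves its cost."""
    useful = len(query["columns"] & config)
    return query["base_cost"] / (2 ** useful)


def workload_cost(config):
    """Total estimated cost of the workload under a set of indexes."""
    return sum(estimated_cost(q, config) for q in WORKLOAD)


def candidate_indexes(all_indexes):
    """Keep an index only if some query references its column (syntax filter)
    and the index lowers that query's estimated cost (cost filter)."""
    keep = set()
    for q in WORKLOAD:
        for idx in all_indexes & q["columns"]:
            if estimated_cost(q, {idx}) < estimated_cost(q, set()):
                keep.add(idx)
    return keep


def greedy_select(candidates, k):
    """Greedy stand-in for the configuration search: repeatedly add the index
    that lowers workload cost the most, up to k indexes."""
    config = set()
    while len(config) < k:
        best = min(
            candidates - config,
            key=lambda i: (workload_cost(config | {i}), i),  # name breaks ties
            default=None,
        )
        if best is None or workload_cost(config | {best}) >= workload_cost(config):
            break
        config.add(best)
    return config


ALL_INDEXES = {
    "orders.customer_id",
    "orders.order_date",
    "lineitem.part_id",
    "orders.comment",  # never referenced by the workload, so it is pruned
}

candidates = candidate_indexes(ALL_INDEXES)
chosen = greedy_select(candidates, k=2)
```

In this toy setup, pruning discards `orders.comment` before any configuration is costed, which is the point of the first technique: the search only ever evaluates index sets built from candidates that can help some query.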