Surajit Chaudhuri and Vivek Narasayya
In this paper we describe novel techniques that make it possible to build an industrial-strenth tool for automating the choice of indexes in the physical design of a SQL database. The tool takes as input a workload of SQL queries, and suggests a set of suitable indexes. We ensure that the indexes chosen are effective in reducing the cost of the workload by keeping the index selection tool and the query optimizer "in step". The number of index sets that must be evaluated to find the optimal configuration is very large. We reduce the complexity of this problem using three techniques. First, we remove a large number of spurious indexes from consideration by taking into account both query syntax and cost information. Second, we introduce optimizations that make it possible to cheaply evaluate the "goodness" of an index set. Third, we describe an iterative approach to handle the complexity arising from multi-column indexes. The tool has been implemented on Microsoft SQL Server 7.0. We performed extensive experiments over a range of workloads, including TPC-D. The results indicate that the tool is efficient and its choices are close to optimal.
Publisher Very Large Data Bases Endowment Inc.
All articles published in this journal are protected by copyright, which covers the exclusive rights to reproduce and distribute the article (e.g., as offprints), as well as all translation rights. No material published in this journal may be reproduced photographically or stored on microfilm, in electronic data bases, video disks, etc., without first obtaining written permission from Very Large Data Bases Endowment Inc.