Improving the Productivity of Compiler Code Quality Analysis

  • Hongbo Rong,
  • Andy Ayers,
  • David Gillies

MSR-TR-2009-18

Producing high-quality code is one of the most important goals of an optimizing compiler, so analyzing code quality is an essential activity in compiler engineering. Because it both motivates new optimizations and diagnoses regressions, this analysis is a bottleneck in the development process. In practice, however, it has been highly empirical and dependent on specific architectures and tools. This makes it difficult and time-consuming, and its productivity is unpredictable and usually low.

This paper proposes two novel approaches to code quality analysis. The first approach focuses on the key scenario in compiler construction: computation-intensive benchmarks. We observed that the workload on the processor dominates the execution time of such benchmarks. We therefore use the compiler itself to parse the workload, so that it applies its static analysis power to identify its own code quality issues and opportunities for improvement. We have implemented a software system for this approach and integrated it with our daily testing infrastructure. It automates code quality analysis for the SPEC benchmarks and provides developers with relevant information in a reasonable time.

The second approach addresses code quality regressions in operating system benchmarks. We take advantage of the operating system's built-in instrumentation to collect traces of events, and then construct a tree from each trace. Analyzing a regression is thus reduced to comparing trees. The tree structure lets this approach divide and conquer the usually huge amount of trace data and narrow the focus to a few leaves of the trees. The approach has been implemented and has been used to resolve several difficult issues, leading to visible code quality improvements.
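The tree-comparison idea can be illustrated with a minimal Python sketch. It is not the system described in the report: the event format (a tuple of nested scope names plus a cost), the names Node, build_tree and find_regressions, and the fixed cost threshold are all assumptions made for illustration. Each trace becomes a tree of inclusive costs, and the comparison descends only into subtrees whose cost grew by more than the threshold, so the result is the small set of deepest nodes that account for the regression.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the event tree; cost includes all descendants."""
    cost: float = 0.0
    children: dict = field(default_factory=dict)


def build_tree(events):
    """Build a tree from (path, cost) events.

    Hypothetical event format: path is a tuple of nested scope names,
    cost is the time attributed to the innermost scope.
    """
    root = Node()
    for path, cost in events:
        node = root
        node.cost += cost
        for name in path:
            node = node.children.setdefault(name, Node())
            node.cost += cost
    return root


def find_regressions(base, test, threshold, path=()):
    """Return [(path, delta)] for the deepest nodes whose cost grew by more than threshold."""
    if base is None:
        base = Node()  # scope absent from the baseline trace
    delta = test.cost - base.cost
    if delta <= threshold:
        return []  # prune: this subtree did not regress noticeably
    found = []
    for name, child in test.children.items():
        found += find_regressions(
            base.children.get(name), child, threshold, path + (name,)
        )
    # If no single child explains the growth, report this node itself.
    return found or [(path, delta)]


# Made-up traces, for illustration only.
baseline = build_tree([
    (("boot", "drivers", "disk"), 10.0),
    (("boot", "drivers", "net"), 5.0),
    (("boot", "services"), 20.0),
])
current = build_tree([
    (("boot", "drivers", "disk"), 18.0),
    (("boot", "drivers", "net"), 5.0),
    (("boot", "services"), 20.5),
])

print(find_regressions(baseline, current, threshold=1.0))
# -> [(('boot', 'drivers', 'disk'), 8.0)]
```

In this toy input only one leaf is reported, although the root and two intermediate nodes also grew; pruning unchanged subtrees early is what keeps the comparison manageable when traces are large.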

This paper also summarizes our experience in raising the code quality of a product compiler framework, establishes a few simple guidelines for achieving code quality objectives with minimal effort, and empirically compares various analysis approaches.