Nachiappan Nagappan and Thomas Ball
Software systems evolve over time due to changes in requirements, optimization of code, fixes for security and reliability bugs etc. Code churn, which measures the changes made to a component over a period of time, quantifies the extent of this change. We present a technique for early prediction of system defect density using a set of relative code churn measures that relate the amount of churn to other variables such as component size and the temporal extent of churn. Using statistical regression models, we show that while absolute measures of code churn are poor predictors of defect density, our set of relative measures of code churn is highly predictive of defect density. A case study performed on Windows Server 2003 indicates the validity of the relative code churn measures as early indicators of system defect density. Furthermore, our code churn metric suite is able to discriminate between fault and not fault-prone binaries with an accuracy of 89.0 percent.
Publisher Association for Computing Machinery, Inc.
Copyright © 2004 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or firstname.lastname@example.org. The definitive version of this paper can be found at ACM’s Digital Library –http://www.acm.org/dl/.