It's not a Bug, It's a Feature: How Misclassification Impacts Bug Prediction

Kim Herzig, Sascha Just, and Andreas Zeller

Abstract

In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all issue reports to be misclassified, that is, rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in bug prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. We estimate the impact of this misclassification on earlier studies and recommend manual data validation for future studies.

Details

Publication type: Inproceedings
Published in: Proceedings of the 2013 International Conference on Software Engineering
Publisher: IEEE