Article

Use of relative code churn measures to predict system defect density

Authors:
Nachiappan Nagappan, North Carolina State University, Raleigh, NC
Thomas Ball, Microsoft Research, Redmond, WA

ICSE '05: Proceedings of the 27th international conference on Software engineering, May 2005, Pages 284–292
https://doi.org/10.1145/1062455.1062514

Published: 15 May 2005

ABSTRACT

Software systems evolve over time due to changes in requirements, optimization of code, fixes for security and reliability bugs, etc. Code churn, which measures the changes made to a component over a period of time, quantifies the extent of this change. We present a technique for early prediction of system defect density using a set of relative code churn measures that relate the amount of churn to other variables such as component size and the temporal extent of churn.

Using statistical regression models, we show that while absolute measures of code churn are poor predictors of defect density, our set of relative measures of code churn is highly predictive of defect density. A case study performed on Windows Server 2003 indicates the validity of the relative code churn measures as early indicators of system defect density. Furthermore, our code churn metric suite is able to discriminate between fault-prone and not fault-prone binaries with an accuracy of 89.0 percent.
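The distinction between absolute and relative churn can be illustrated with a minimal sketch. The two ratios below (churned lines per line of code, and churned lines per week of churn) are illustrative assumptions chosen to match the abstract's description of relating churn to component size and temporal extent; they are not claimed to be the paper's exact measure definitions. The record fields, function name, and sample values are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ChurnRecord:
    """Absolute churn data for one component (all fields are hypothetical)."""
    name: str
    total_loc: int       # component size in lines of code
    churned_loc: int     # lines added or changed over the period
    weeks_of_churn: int  # number of weeks in which any change occurred


def relative_churn(rec: ChurnRecord) -> dict[str, float]:
    """Normalize absolute churn by size and by temporal extent.

    Illustrative ratios only; a zero denominator yields 0.0.
    """
    return {
        "churn_per_loc": rec.churned_loc / rec.total_loc if rec.total_loc else 0.0,
        "churn_per_week": rec.churned_loc / rec.weeks_of_churn if rec.weeks_of_churn else 0.0,
    }


# Made-up sample data: identical absolute churn, very different sizes.
small = ChurnRecord("small.dll", total_loc=1_000, churned_loc=500, weeks_of_churn=5)
large = ChurnRecord("large.dll", total_loc=100_000, churned_loc=500, weeks_of_churn=5)

for rec in (small, large):
    print(rec.name, relative_churn(rec))
```

Both sample components have the same absolute churn of 500 lines, yet the size-normalized ratio differs by a factor of 100. This is the kind of difference an absolute measure cannot express.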

Reviews

Reviewer: Elliot Jaffe

New software releases are not defect free, and such defects are expensive to fix once they have been deployed in the field. If a company can predict which components are likely to have more defects, then they can focus their quality assurance (QA) efforts more efficiently, and prepare for a higher incidence of support calls.

This paper presents a postmortem case study of the Windows Server 2003 defect rates. The focus is on a predictive model that identifies defect-prone components based on their most recent software development history. The authors looked at a number of software engineering metrics, in an attempt to correlate the defect rate of each binary with the changes made to that binary during the development process. They found a surprisingly high level of correlation between a set of eight relative change measures and the defect rate of the resulting program.

The tone of the paper is that of professional statistics. Each argument is stated clearly, along with associated caveats and exceptions. I felt that this approach showed careful analysis, and improved the overall credibility of this paper.

This paper's contribution is its argument that "code that changes many times pre-release will likely have more post-release defects than code that changes less over the same period of time." This is a very intuitive notion, and the data and statistics presented in the paper make a strong case.

One of my problems with this paper is that it is focused on proving its main thesis, and not on providing direction to developers or organizations. The metrics are shown to have predictive power, but I was left wondering how to deploy them. Due to the proprietary nature of the data, we know that the metrics are predictive, but we have no scale on which to apply them. Does modifying a file 100 times increase the error rate by one percent or 50 percent? How much change is acceptable during a development cycle, and what are the implications?

If you are a researcher in this field, then this paper is probably going to be on your citation list. It is a good, careful case study. If you are a practitioner, then there is little to gain from reading this paper.

Online Computing Reviews Service

Published in
ICSE '05: Proceedings of the 27th international conference on Software engineering
May 2005
754 pages
ISBN: 1581139632
DOI: 10.1145/1062455
General Chair: Gruia-Catalin Roman, Washington University in St. Louis
Program Chairs: William Griswold, University of California, San Diego; Bashar Nuseibeh, The Open University, UK
Copyright © 2005 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
defect density
fault-proneness
multiple regression
principal component analysis
relative code churn
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 276 of 1,856 submissions, 15%
