ABSTRACT
Background: The steady growth in software size creates challenges for release planning and maintainability. Forecasting the growth of software metrics can support proactive decisions in the areas where those metrics play a vital role. For example, source code metrics are used to automatically calculate the technical debt related to code quality, which may indicate how maintainable a piece of software is. Predicting such metrics can therefore indicate the technical debt of future releases. Objective: Software metrics can be estimated or predicted more meaningfully if the relationships between different domains of metrics, and between individual metrics and those domains, are well understood. To understand these relationships, this empirical study collected 25 metrics, classified into four domains, from 9572 software revisions of 20 open source projects from 8 well-known companies. Results: We found that size-related metrics are the most strongly correlated, both among themselves and with metrics from other domains. Complexity and documentation metrics are more correlated with size metrics than with metrics in their own domains. At the domain level, metrics in the duplications domain are more correlated with each other than with metrics from other domains. However, a metric-to-domain analysis reveals that the metrics with the most strong correlations are in fact connected to size metrics. The overall correlation ranking of duplication metrics is the lowest among all domains and metrics. Contribution: The knowledge gained from this research helps to explain the inherent relationships between metrics and domains. Together with metric-level relationships, this knowledge will allow better predictive models to be built for software code metrics.
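To make the domain-level comparison concrete, the following is a minimal sketch of how metric-to-metric correlations could be rolled up into domain-to-domain scores. It rests on several assumptions that the abstract does not state: Spearman rank correlation as the coefficient, the mean absolute coefficient as the aggregate, and metric names, domain assignments, and values that are invented purely for illustration.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson coefficient; undefined (division by zero) for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman coefficient, computed as Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def domain_correlations(data, domains):
    """Mean absolute Spearman coefficient for each unordered pair of domains."""
    buckets = {}
    for a, b in combinations(data, 2):
        key = tuple(sorted((domains[a], domains[b])))
        buckets.setdefault(key, []).append(abs(spearman(data[a], data[b])))
    return {key: sum(v) / len(v) for key, v in buckets.items()}


# Hypothetical metrics and domains; one value per software revision.
METRIC_DOMAIN = {
    "ncloc": "size",
    "functions": "size",
    "complexity": "complexity",
    "duplicated_lines": "duplications",
}
revisions = {
    "ncloc": [100, 150, 210, 260, 300],
    "functions": [10, 14, 19, 25, 31],
    "complexity": [20, 31, 40, 55, 61],
    "duplicated_lines": [5, 3, 8, 4, 6],
}

scores = domain_correlations(revisions, METRIC_DOMAIN)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

In this toy data the size metrics track each other and the complexity metric closely, while the duplication metric correlates only weakly with everything else. The data was constructed to show that pattern and is not evidence for the study's results.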
Index Terms
- Correlations of software code metrics: an empirical study