ABSTRACT
Over the past decades, numerous approaches have been proposed to help practitioners predict or locate defective files. These techniques often rely on syntactic dependencies, historical co-change relations, or semantic similarity. However, it remains unclear whether these different dependency relations yield similar accuracy in defect prediction and localization. In this paper, we present a systematic investigation of this question from the perspective of software architecture. Treating the files involved in each dependency type as an individual design space, we model each such design space as a design rule space (DRSpace). We derived three DRSpaces for each of 117 Apache open source projects, covering 643,079 revision commits and 101,364 bug reports in total, and calculated their interactions with defective files. The experimental results are surprising: the three dependency types present significantly different architectural views, and their interactions with defective files also differ drastically. This suggests that they play completely different roles when used for defect prediction and localization. The good news is that combining these structures has the potential to improve the accuracy of defect prediction and localization. In summary, our work provides a new perspective on which type(s) of relations should be used for defect prediction and localization. These quantitative and qualitative results also advance our knowledge of the relationship between software quality and the architectural views formed by different dependency types.