research-article

Linear Programming as a Baseline for Software Effort Estimation

Published: 17 September 2018

Abstract

Software effort estimation studies still suffer from discordant empirical results (i.e., conclusion instability), mainly due to the lack of rigorous benchmarking methods. So far, only one baseline model, namely the Automatically Transformed Linear Model (ATLM), has been proposed, yet it has not been extensively assessed. In this article, we propose a novel method based on Linear Programming (dubbed Linear Programming for Effort Estimation, LP4EE) and carry out a thorough empirical study to evaluate the effectiveness of both LP4EE and ATLM for benchmarking widely used effort estimation techniques. The results of our study confirm the need to benchmark every other proposal against accurate and robust baselines. They also reveal that LP4EE is more accurate than ATLM in 17% of the experiments and more robust than ATLM against different data splits and cross-validation methods in 44% of the cases. These results suggest that using LP4EE as a baseline can help reduce conclusion instability. We make an open-source implementation of LP4EE publicly available to facilitate its adoption in future studies.
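To make the core idea concrete, the sketch below shows one plausible way to cast effort estimation as a linear program: fit a linear model by minimizing the sum of absolute residuals, which becomes an LP once each absolute residual is bounded by an auxiliary variable. This is an illustrative assumption about the general technique, not the article's exact LP4EE formulation; the data, feature layout, and the use of scipy.optimize.linprog are all choices made here for demonstration. LP4EE's actual formulation and preprocessing are specified in the article and its open-source implementation.

```python
# A minimal sketch of effort estimation via linear programming:
# fit coefficients w minimizing sum_i |y_i - x_i . w| by introducing
# auxiliary variables e_i with -e_i <= y_i - x_i . w <= e_i.
# Illustrative only; not the article's exact LP4EE formulation.
import numpy as np
from scipy.optimize import linprog

def fit_lp(X, y):
    """Solve: min sum(e) subject to -e <= y - Xw <= e, e >= 0."""
    n, m = X.shape
    # Decision vector: [w_1..w_m, e_1..e_n]; only the e's are in the objective.
    c = np.concatenate([np.zeros(m), np.ones(n)])
    # y - Xw - e <= 0  ->  -Xw - e <= -y
    # Xw - y - e <= 0  ->   Xw - e <=  y
    A_ub = np.block([[-X, -np.eye(n)],
                     [ X, -np.eye(n)]])
    b_ub = np.concatenate([-y, y])
    bounds = [(None, None)] * m + [(0, None)] * n  # w free, e nonnegative
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:m]

# Tiny usage example with made-up data (not from the article):
# columns = [intercept, project size]; y = effort in person-hours.
X = np.array([[1.0, 10.0], [1.0, 25.0], [1.0, 40.0]])
y = np.array([120.0, 300.0, 470.0])
w = fit_lp(X, y)
print("coefficients:", w)  # predict new effort as X_new @ w
```

Minimizing absolute rather than squared residuals is what makes the problem linear, and it also tends to be less sensitive to the large-effort outliers common in effort datasets, which is one reason an LP-based model is attractive as a simple, deterministic baseline.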


• Published in

  ACM Transactions on Software Engineering and Methodology, Volume 27, Issue 3 (July 2018), 210 pages
  ISSN: 1049-331X
  EISSN: 1557-7392
  DOI: 10.1145/3276753

      Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 September 2018
      • Accepted: 1 June 2018
      • Revised: 1 May 2018
      • Received: 1 November 2017
Published in TOSEM Volume 27, Issue 3
