research-article

Degree-of-knowledge: Modeling a developer's knowledge of code

Published: 04 April 2014

Abstract

As a software system evolves, its codebase constantly changes, making it difficult for developers to answer questions such as who is knowledgeable about particular parts of the code or who needs to know about changes that were made. In this article, we show that an externalized model of a developer's individual knowledge of code can make it easier for developers to answer such questions. We introduce a degree-of-knowledge model that automatically computes, for each source-code element in a codebase, a real value that represents a developer's knowledge of that element based on the developer's authorship and interaction data. We present evidence that both authorship and interaction data are important in characterizing a developer's knowledge of code. We report on the usage of our model in case studies on expert finding, knowledge transfer, and identifying changes of interest. We show that our model improves upon an existing expertise-finding approach and can accurately identify changes of which a developer should likely be aware. We discuss how our model may provide a starting point for knowledge transfer, although more refinement is needed. Finally, we discuss the robustness of the model across multiple development sites.



    • Published in

      ACM Transactions on Software Engineering and Methodology, Volume 23, Issue 2
      March 2014
      319 pages
      ISSN: 1049-331X
      EISSN: 1557-7392
      DOI: 10.1145/2600788

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 April 2014
      • Accepted: 1 August 2013
      • Revised: 1 February 2012
      • Received: 1 December 2010


      Qualifiers

      • research-article
      • Research
      • Refereed
