Degree-of-knowledge: Modeling a developer's knowledge of code

Authors:
Thomas Fritz, University of Zurich
Gail C. Murphy, University of British Columbia
Emerson Murphy-Hill, North Carolina State University
Jingwen Ou, University of British Columbia
Emily Hill, Montclair State University

ACM Transactions on Software Engineering and Methodology, Volume 23, Issue 2, Article No. 14, pp. 1–42
https://doi.org/10.1145/2512207

Published: 04 April 2014

Abstract

As a software system evolves, the system's codebase constantly changes, making it difficult for developers to answer such questions as who is knowledgeable about particular parts of the code or who needs to know about changes made. In this article, we show that an externalized model of a developer's individual knowledge of code can make it easier for developers to answer such questions. We introduce a degree-of-knowledge model that automatically computes, for each source-code element in a codebase, a real value that represents a developer's knowledge of that element based on that developer's authorship and interaction data. We present evidence that both authorship and interaction data are important in characterizing a developer's knowledge of code. We report on the usage of our model in case studies on expert finding, knowledge transfer, and identifying changes of interest. We show that our model improves upon an existing expertise-finding approach and can accurately identify changes of which a developer should likely be aware. We discuss how our model may provide a starting point for knowledge transfer but that more refinement is needed. Finally, we discuss the robustness of the model across multiple development sites.
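
To make the idea of a per-element knowledge value concrete, the following is a minimal sketch of how authorship and interaction data might be combined into a single real number and then used for expert finding. It is not the model from the article: the `ElementActivity` schema, the `WEIGHTS` values, the log damping, and the function names `degree_of_knowledge` and `rank_experts` are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementActivity:
    """One developer's activity on one source-code element (hypothetical schema)."""
    first_authored: bool   # the developer created the element
    own_changes: int       # later changes made by this developer
    others_changes: int    # changes made by other developers
    interactions: int      # selections and edits recorded in the IDE


# Placeholder weights chosen for illustration; they are NOT fitted values.
WEIGHTS = {
    "first_authored": 1.0,
    "own_changes": 0.2,
    "others_changes": -0.3,
    "interactions": 0.2,
}


def degree_of_knowledge(a: ElementActivity) -> float:
    """Combine authorship and interaction signals into one real value."""
    authorship = (
        WEIGHTS["first_authored"] * float(a.first_authored)
        + WEIGHTS["own_changes"] * a.own_changes
        + WEIGHTS["others_changes"] * math.log1p(a.others_changes)
    )
    # log1p damps large interaction counts (an assumption of this sketch).
    interaction = WEIGHTS["interactions"] * math.log1p(a.interactions)
    return authorship + interaction


def rank_experts(activity_by_dev: dict[str, ElementActivity]) -> list[str]:
    """Order developers by descending score for a single element."""
    return sorted(
        activity_by_dev,
        key=lambda dev: degree_of_knowledge(activity_by_dev[dev]),
        reverse=True,
    )
```

In this sketch, a developer who authored an element and interacted with it often scores higher than one who only viewed it a few times after others changed it, so `rank_experts` would list the author first. Real weights would have to be derived empirically, as the article's evaluation suggests.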

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Development frameworks and environments

Published in
ACM Transactions on Software Engineering and Methodology, Volume 23, Issue 2
March 2014
319 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/2600788
Editor: David S. Rosenblum
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 4 April 2014
- Accepted: 1 August 2013
- Revised: 1 February 2012
- Received: 1 December 2010

Author Tags
Authorship
degree-of-interest
degree-of-knowledge
development environment
expertise
onboarding
recommendation
Qualifiers
- research-article
- Research
- Refereed