ABSTRACT
The paper presents a semi-automated technique for feature location in source code. The technique combines information from two sources: an execution trace on the one hand, and the comments and identifiers in the source code on the other.
Users execute a single partial scenario that exercises the desired feature, and all executed methods are identified from the collected trace. The source code is indexed using Latent Semantic Indexing, an Information Retrieval method, which allows users to write queries relevant to the desired feature and rank all the executed methods by their textual similarity to the query.
Two case studies on open source software (JEdit and Eclipse) indicate that the new technique has high accuracy, comparable with previously published approaches, and that it is easy to use because it considerably simplifies the dynamic analysis.
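The two-step idea described above (filter methods by a single execution trace, then rank the survivors by textual similarity to a query) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes plain term-count cosine similarity for Latent Semantic Indexing, which would additionally project the vectors into a reduced space obtained by singular value decomposition. The corpus, the method names, the trace, and the query are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split identifiers and comments into lowercase word tokens."""
    # Break camelCase boundaries first, then keep alphabetic runs only.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_executed_methods(corpus, trace, query):
    """Keep only methods seen in the trace, then rank them against the query."""
    query_vec = Counter(tokenize(query))
    executed = set(trace)
    scored = [
        (name, cosine(query_vec, Counter(tokenize(text))))
        for name, text in corpus.items()
        if name in executed
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical corpus: method name -> its identifiers and comments.
corpus = {
    "Buffer.save": "saveBuffer write file to disk",
    "Search.find": "findNext search text pattern in buffer",
    "View.paint": "paintComponent redraw text area",
    "Search.replace": "replaceAll search pattern and replace text",
}
# Hypothetical trace from one scenario; Search.replace was not executed.
trace = ["View.paint", "Search.find", "Buffer.save", "View.paint"]

ranking = rank_executed_methods(corpus, trace, "search text pattern")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch, `Search.replace` matches the query textually but never appears in the ranking because the scenario did not execute it. That is the filtering effect the abstract attributes to the trace.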
Index Terms
- Feature location via information retrieval based filtering of a single scenario execution trace