Research Article | Open Access

Reflection-aware static regression test selection

Published: 10 October 2019

Abstract

Regression test selection (RTS) aims to speed up regression testing by rerunning only the tests that are affected by code changes. RTS can be performed using static or dynamic analysis techniques. Our prior study showed that static and dynamic RTS perform similarly for medium-sized Java projects. However, that study also showed that static RTS can be unsafe, failing to select tests that dynamic RTS selects, and that reflection was the only cause of unsafety observed among the evaluated projects.
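To see why reflection can defeat class-level static analysis, consider a minimal sketch (hypothetical classes, not taken from the paper): the dependent class is named only inside a string, so a static scan of the class references in the test's bytecode never sees it.

    import java.lang.reflect.Method;

    // Hypothetical example: the constant pool of ReflectionDependencyDemo
    // contains the string "JsonConfigLoader" but no class reference to it,
    // so reflection-unaware class-level static RTS misses the dependency.
    class JsonConfigLoader {
        String load(String path) { return "config:" + path; }
    }

    public class ReflectionDependencyDemo {
        public static void main(String[] args) throws Exception {
            Class<?> clazz = Class.forName("JsonConfigLoader");
            Method load = clazz.getDeclaredMethod("load", String.class);
            Object cfg = load.invoke(
                clazz.getDeclaredConstructor().newInstance(), "app.json");
            System.out.println(cfg); // prints config:app.json
            // If JsonConfigLoader changes, a reflection-unaware static
            // technique would not re-select this test: the unsafety
            // studied in this paper.
        }
    }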

In this paper, we investigate five techniques—three purely static techniques and two hybrid static-dynamic techniques—that aim to make static RTS safe with respect to reflection. We implement these reflection-aware (RA) techniques by extending the reflection-unaware (RU) class-level static RTS technique in a tool called STARTS. To evaluate the RA techniques, we compare their end-to-end times with those of RU and of RetestAll, which reruns all tests after every code change. We also compare the safety and precision of the RA techniques with those of Ekstazi, a state-of-the-art dynamic RTS technique; precision measures how well a technique avoids selecting unaffected tests.
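For intuition, class-level static RTS can be sketched as a reachability check over a class dependency graph (our simplification, assuming the graph is precomputed; this is not the STARTS implementation): a test is selected if any changed class is in its transitive dependency closure.

    import java.util.*;

    // Minimal sketch of reflection-unaware class-level selection.
    public class ClassLevelRts {
        // deps maps a class to the classes it statically references.
        static Set<String> reachable(String from, Map<String, Set<String>> deps) {
            Set<String> seen = new HashSet<>();
            Deque<String> work = new ArrayDeque<>(List.of(from));
            while (!work.isEmpty()) {
                String c = work.pop();
                if (seen.add(c)) work.addAll(deps.getOrDefault(c, Set.of()));
            }
            return seen;
        }

        static List<String> select(List<String> tests, Set<String> changed,
                                   Map<String, Set<String>> deps) {
            List<String> selected = new ArrayList<>();
            for (String t : tests) {
                // Select t if its closure intersects the changed classes.
                if (!Collections.disjoint(reachable(t, deps), changed)) {
                    selected.add(t);
                }
            }
            return selected;
        }

        public static void main(String[] args) {
            Map<String, Set<String>> deps = Map.of(
                "FooTest", Set.of("Foo"),
                "Foo", Set.of("Util"));
            // Util changed; FooTest reaches it transitively, so it is
            // selected. A reflective edge absent from deps would be missed.
            System.out.println(select(List.of("FooTest"), Set.of("Util"), deps));
        }
    }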

Our evaluation on 1173 versions of 24 open-source Java projects shows negative results. The RA techniques improve the safety of RU but at very high costs. The purely static techniques are safe in our experiments but decrease the precision of RU, with end-to-end time at best 85.8% of RetestAll time, versus 69.1% for RU. One hybrid static-dynamic technique improves the safety of RU but at high cost, with end-to-end time that is 91.2% of RetestAll. The other hybrid static-dynamic technique provides better precision, is safer than RU, and incurs a lower end-to-end time of 75.8% of RetestAll, but it can still be unsafe in the presence of test-order dependencies. Our study highlights the challenges involved in making static RTS safe with respect to reflection.
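A test-order dependency, the remaining source of unsafety, can be sketched as follows (hypothetical tests, not from the paper): one test passes only if another test ran first, so dependencies observed dynamically under one test order need not hold under another.

    // Minimal sketch of an order-dependent test pair.
    class SharedState { static String value; }

    public class OrderDependencyDemo {
        static void testA() { SharedState.value = "ready"; }
        static void testB() { assert "ready".equals(SharedState.value); }

        public static void main(String[] args) {
            testA();
            testB();  // passes, but only because testA ran first;
                      // run testB alone (fresh JVM) and it fails
        }
    }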


Supplemental Material

a187-shi.webm (webm video, 100.7 MB)

