
Integrating Defect Prediction to Guide Search-Based Software Testing: A Comprehensive Empirical Investigation
Dr. Larian D. Venorth, Department of Software Engineering, Zurich Technical University, Zurich, Switzerland
Abstract
Background: The increasing complexity of software systems necessitates robust and efficient testing methods. While Search-Based Software Testing (SBST) has emerged as a powerful technique for automated test case generation, its effectiveness can be limited by its singular focus on code coverage. The generated tests, although structurally sound, may not target the most fault-prone areas of the code.
Aim: This study aims to address this limitation by proposing and empirically investigating a novel approach that integrates defect prediction (DP) models to guide the search process of SBST. By leveraging insights from historical code data, our method prioritizes the generation of test cases for code modules identified as having a higher likelihood of containing defects.
Method: We conducted a large-scale empirical study using 20 real-world, open-source Java projects from the Defects4J database. We developed a machine learning-based defect prediction model to identify fault-prone files. We then implemented a new fitness function for the EvoSuite test generation tool that incorporates the prediction score. The performance of this defect prediction-guided SBST approach was compared against a traditional, coverage-based SBST approach, using metrics of fault detection effectiveness and computational efficiency.
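The abstract does not specify how the prediction score enters the fitness function. As a rough sketch only (the blending scheme, the weight `alpha`, and all names here are illustrative assumptions, not the paper's actual formulation), a defect-prediction score can be folded into a coverage-based fitness as a weighted discount:

```python
# Hypothetical sketch of a defect-prediction-guided fitness function.
# 'coverage_fitness' follows the common SBST convention (0.0 = all targets
# covered; lower is better). 'dp_score' in [0, 1] is the predicted
# fault-proneness of the code under test.

def dp_guided_fitness(coverage_fitness: float, dp_score: float,
                      alpha: float = 0.5) -> float:
    """Lower is better. Discounts the fitness of suites that exercise
    fault-prone code, steering the search toward those modules."""
    if not (0.0 <= dp_score <= 1.0):
        raise ValueError("dp_score must lie in [0, 1]")
    return coverage_fitness * (1.0 - alpha * dp_score)

# With equal raw coverage, the suite targeting a high-risk module
# (dp_score = 0.9) ranks better than one targeting a low-risk module.
risky = dp_guided_fitness(0.4, 0.9)   # 0.4 * 0.55 = 0.22
safe = dp_guided_fitness(0.4, 0.1)    # 0.4 * 0.95 = 0.38
assert risky < safe
```

The multiplicative form is one design choice among several; an additive secondary objective, or a many-objective formulation with prediction as an extra objective, would serve the same guiding purpose.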
Results: Our findings indicate that the proposed DP-guided SBST approach significantly outperforms the traditional method in terms of the number of unique faults detected. Statistical analysis revealed a strong positive effect size for our approach. While there was a slight increase in computational overhead associated with the defect prediction component, it was minimal relative to the substantial gain in fault detection.
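The abstract reports a "strong positive effect size" without naming the statistic; in SBST studies the Vargha-Delaney Â12 is the conventional measure for comparing randomized algorithms, so a minimal implementation is sketched here (the run counts below are illustrative, not the study's data):

```python
def vargha_delaney_a12(xs, ys):
    """A12: probability that a random draw from xs exceeds one from ys,
    counting ties as half. 0.5 means no effect; values near 1.0 favour xs.
    Common thresholds: > 0.56 small, > 0.64 medium, > 0.71 large."""
    gt = sum(1 for x in xs for y in ys if x > y)
    eq = sum(1 for x in xs for y in ys if x == y)
    return (gt + 0.5 * eq) / (len(xs) * len(ys))

# Unique faults detected across five runs (made-up example numbers):
dp_guided = [9, 8, 10, 9, 11]
coverage_only = [6, 7, 6, 8, 5]
print(round(vargha_delaney_a12(dp_guided, coverage_only), 2))  # → 0.98
```

Because Â12 is computed from pairwise rank comparisons rather than means, it is robust to the skewed, non-normal distributions typical of repeated search-based runs.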
Conclusion: The results demonstrate that integrating defect prediction into the search-based test generation process is a highly effective strategy for improving the overall quality and fault-finding capability of automated testing. This approach represents a promising direction for enhancing software testing practices, particularly in continuous integration environments.
Keywords
Search-Based Software Testing (SBST), Defect Prediction, Automated Test Case Generation
Copyright License
Copyright (c) 2025 Dr. Larian D. Venorth

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright and Ethics:
- Authors are responsible for obtaining permission to use any copyrighted materials included in their manuscript.
- Authors are also responsible for ensuring that their research was conducted in an ethical manner and in compliance with institutional and national guidelines for the care and use of animals or human subjects.
- By submitting a manuscript to International Journal of Computer Science & Information System (IJCSIS), authors agree to transfer copyright to the journal if the manuscript is accepted for publication.