Automated Test Case Generation Using Transformers and Domain Adaptation

Abstract

Software testing is an important part of the software development cycle. It helps prevent future defects in the system and reduces maintenance costs. However, like documentation and code review, it is often neglected by developers. Automated unit test generation is the process of generating unit tests for a method or a class in a given project, which helps developers detect some potential defects automatically. Existing approaches usually rely on heuristic methods, such as random data generation (fuzzing) or genetic algorithms (search-based testing), that generate tests by optimizing a cost function such as code coverage. The main downsides of these methods are (a) the synthetic nature of the tests, which are generated from predefined templates, and (b) their focus on surrogate measures such as coverage, which might not always detect existing defects. In this study, we leverage novel machine translation models, i.e., transformers, to generate unit tests for a given program. Specifically, we take CodeT5, a state-of-the-art large code model, and fine-tune it on the downstream task of test generation using the Methods2Test dataset. We also use Defects4J for project-level domain adaptation and evaluation. The main contribution of this study is a fully automated testing framework that leverages developer-written tests and available code models to generate compilable unit tests. Results show that domain adaptation increases the line coverage of the model-generated unit tests by 49.9% (mean) and 54% (median) compared to the model without domain adaptation. Our framework can also be used as a complementary solution alongside common search-based methods, increasing overall coverage by 25.3% (mean) and 6.3% (median) and increasing the mutation score of search-based methods by 8.6% (mean) and 1% (median).
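
The complementary-coverage figures above are reported as a mean and a median across projects. The following Python sketch shows one way such figures could be computed. It is an illustration under stated assumptions, not the thesis's evaluation code: the abstract does not say whether the percentages are relative or absolute gains, so relative gain is assumed here, and the project names, line sets, and helper functions are all hypothetical.

```python
from statistics import mean, median


def line_coverage(covered, total_lines):
    """Fraction of executable lines hit by at least one test."""
    return len(covered) / total_lines


def relative_gain(baseline, improved):
    """Relative increase of `improved` over `baseline`, in percent.

    Assumes baseline > 0; a project with zero baseline coverage
    would need separate handling.
    """
    return 100.0 * (improved - baseline) / baseline


# Hypothetical per-project data (illustrative only, not from the thesis):
# line numbers covered by search-based tests and by model-generated tests.
projects = {
    "project_a": {"total": 10, "search": {1, 2, 3, 4}, "model": {3, 4, 5, 6}},
    "project_b": {"total": 20, "search": set(range(1, 11)), "model": {10, 11}},
}

gains = []
for name, p in projects.items():
    base = line_coverage(p["search"], p["total"])
    # Running both test suites covers the union of their covered lines.
    combined = line_coverage(p["search"] | p["model"], p["total"])
    gains.append(relative_gain(base, combined))

mean_gain = mean(gains)
median_gain = median(gains)
print(f"mean gain: {mean_gain:.1f}%, median gain: {median_gain:.1f}%")
```

A large gap between the mean and the median, as in the reported 25.3% versus 6.3%, would suggest that a few projects account for most of the improvement, which is why reporting both statistics is informative.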

Citation

Hashtroudi, S. P. (2023). Automated test case generation using transformers and domain adaptation (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.