1st International Workshop on Test Oracles (TORACLE 2021)
August 24, 2021, Athens, Greece
Frontmatter
Message from the Chairs
Welcome to the first edition of the International Workshop on Test Oracles (TORACLE 2021), to be held virtually on August 24, 2021, co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021).
Testing is an important activity in software engineering, especially with the growing adoption of software systems in safety-critical domains. Testing research, however, focuses mostly on determining which test inputs to use (e.g., proposing and evaluating test coverage criteria or automatic test generation tools). Regardless of the coverage criterion used, we need to know whether a given program executes correctly on a given input.
Indeed, a test execution for which we cannot distinguish between success and failure is fruitless. This corresponds to the so-called "oracle problem", the problem of knowing whether a program behaves correctly for a specific input.
While the importance of the oracle problem is well understood, few alternatives to manually derived test oracles exist. This makes test oracle automation one of the main bottlenecks for full test automation. Therefore, novel approaches and tools are needed to address this important problem.
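To make the oracle problem concrete, consider a metamorphic-relation oracle, one common alternative to manually specified expected outputs. The Python sketch below is purely illustrative (the function under test and the tolerance are our own choices): it checks a property that must hold across related executions, so no expected output is needed.

import math

def metamorphic_oracle(f, x, tol=1e-9):
    # Metamorphic relation for a sine implementation: sin(x) == sin(pi - x).
    # A faulty implementation can be flagged without knowing sin(x) itself.
    return abs(f(x) - f(math.pi - x)) <= tol

assert metamorphic_oracle(math.sin, 0.7)  # holds for a correct implementation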
Paper
Using Machine Learning to Generate Test Oracles: A Systematic Literature Review
Afonso Fontes and
Gregory Gay
(Chalmers University of Technology, Sweden; University of Gothenburg, Sweden)
Machine learning may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in this field.
Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and (most commonly) expected output oracles. Almost all studies employ a supervised or semi-supervised approach, trained on labeled system executions or code metadata, using techniques including neural networks, support vector machines, adaptive boosting, and decision trees. Oracles are evaluated using mutation score, correct classifications, accuracy, and ROC. Work to date shows great promise, but significant open challenges remain regarding the requirements imposed on training data, the complexity of the modeled functions, the ML algorithms employed and how they are applied, the benchmarks used by researchers, and the replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field.
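As a rough illustration of the supervised, test-verdict style of oracle generation surveyed in the review (not the method of any particular study; the features, data, and labels below are invented), a classifier can be trained on labeled executions and then used to predict pass/fail verdicts for new ones:

from sklearn.tree import DecisionTreeClassifier

# Each row holds features of one execution (e.g., input size, output length,
# branches taken); the label marks the run as passing (1) or failing (0).
X_train = [[10, 4, 7], [12, 5, 7], [10, 0, 2], [11, 0, 3]]
y_train = [1, 1, 0, 0]

# The trained model then acts as a test-verdict oracle for unseen executions.
oracle = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("pass" if oracle.predict([[11, 4, 6]])[0] == 1 else "fail")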
@InProceedings{TORACLE21p1,
author = {Afonso Fontes and Gregory Gay},
title = {Using Machine Learning to Generate Test Oracles: A Systematic Literature Review},
booktitle = {Proc.\ TORACLE},
publisher = {ACM},
pages = {1--10},
doi = {10.1145/3472675.3473974},
year = {2021},
}