AST 2012 – Proceedings

SECFUZZ: Fuzz-testing Security Protocols
Petar Tsankov, Mohammad Torabi Dashti, and David Basin
(ETH Zurich, Switzerland)
We propose a light-weight, yet effective, technique for fuzz-testing security protocols. Our technique is modular, it exercises (stateful) protocol implementations in depth, and handles encrypted traffic. We use a concrete implementation of the protocol to generate valid inputs, and mutate the inputs using a set of fuzz operators. A dynamic memory analysis tool monitors the execution as an oracle to detect the vulnerabilities exposed by fuzz-testing. We provide the fuzzer with the necessary keys and cryptographic algorithms in order to properly mutate encrypted messages. We present a case study on two widely used, mature implementations of the Internet Key Exchange (IKE) protocol and report on two new vulnerabilities discovered by our fuzz-testing tool. We also compare the effectiveness of our technique to two existing model-based fuzz-testing tools for IKE.
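The idea of mutating valid protocol messages with a set of fuzz operators can be sketched as follows. This is a minimal illustration only: the three operators and the names `flip_bit`, `truncate`, `duplicate_chunk`, and `fuzz` are assumptions made for this sketch, not the operators used by SecFuzz, and the handling of encrypted payloads (decrypt, mutate, re-encrypt) is omitted.

```python
import random

def flip_bit(msg: bytes, rng: random.Random) -> bytes:
    """Flip one randomly chosen bit of the message."""
    if not msg:
        return msg
    i = rng.randrange(len(msg))
    bit = 1 << rng.randrange(8)
    return msg[:i] + bytes([msg[i] ^ bit]) + msg[i + 1:]

def truncate(msg: bytes, rng: random.Random) -> bytes:
    """Cut the message at a random position, dropping at least one byte."""
    if not msg:
        return msg
    return msg[:rng.randrange(len(msg))]

def duplicate_chunk(msg: bytes, rng: random.Random) -> bytes:
    """Repeat a random non-empty slice of the message in place."""
    if not msg:
        return msg
    i = rng.randrange(len(msg))
    j = rng.randrange(i, len(msg)) + 1
    return msg[:j] + msg[i:j] + msg[j:]

# Hypothetical operator set; a real fuzzer would be protocol-aware.
FUZZ_OPERATORS = [flip_bit, truncate, duplicate_chunk]

def fuzz(msg: bytes, seed: int) -> bytes:
    """Apply one randomly selected operator; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return rng.choice(FUZZ_OPERATORS)(msg, rng)
```

Seeding the generator keeps each mutated message reproducible, which helps when a crash reported by the memory-analysis oracle has to be replayed.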

Testing of PolPA Authorization Systems
Antonia Bertolino, Said Daoudagh, Francesca Lonetti, Eda Marchetti, Fabio Martinelli, and Paolo Mori
(ISTI-CNR, Italy; IIT-CNR, Italy)
The implementation of an authorization system is a difficult and error-prone activity that requires a careful verification and testing process. In this paper, we focus on testing the implementation of the PolPA authorization system and in particular its Policy Decision Point (PDP), used to decide whether an access should be allowed or not. Exploiting the PolPA policy specification, we present a fault model and a test strategy able to highlight the problems, vulnerabilities and faults that could occur during the PDP implementation, and a testing framework for the automatic generation of a test suite that covers the fault model. Preliminary results of applying the test framework to a realistic case study are presented.

Grammar Based Oracle for Security Testing of Web Applications
Andrea Avancini and Mariano Ceccato
(Fondazione Bruno Kessler, Italy)
The goal of security testing is to detect those defects that could be exploited to conduct attacks. Existing works, however, address security testing mostly from the point of view of automatic generation of test cases. Less attention is paid to the problem of developing and integrating with a security oracle.
In this paper we address the problem of the security oracle, in particular for Cross-Site Scripting vulnerabilities. We rely on existing test cases to collect HTML pages in safe conditions, i.e. when no attack is run. Pages are then used to construct the safe model of the application under analysis, a model that describes the structure of an application response page for safe input values. The oracle eventually detects a successful attack when a test makes the application display a web page that is not compliant with the safe model.
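The safe-model oracle described above can be sketched roughly as follows. This is a deliberately simplified stand-in: the safe model here is just the set of opening-tag sequences observed on safe pages, whereas the paper title indicates a grammar-based model. The names `structure`, `build_safe_model`, and `is_attack` are hypothetical.

```python
from html.parser import HTMLParser

class _TagCollector(HTMLParser):
    """Records the sequence of opening tags seen in a page."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def structure(page: str) -> tuple:
    """Reduce a page to its tag sequence, ignoring text content."""
    collector = _TagCollector()
    collector.feed(page)
    collector.close()
    return tuple(collector.tags)

def build_safe_model(safe_pages) -> set:
    """Assumed model: the set of structures seen when no attack is run."""
    return {structure(p) for p in safe_pages}

def is_attack(page: str, model: set) -> bool:
    """Flag a response whose structure was never observed under safe input."""
    return structure(page) not in model
```

A page that merely echoes a different user name keeps the same structure and passes, while an injected `<script>` element changes the tag sequence and is flagged.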

A Whitebox Approach for Automated Security Testing of Android Applications on the Cloud
Riyadh Mahmood, Naeem Esfahani, Thabet Kacem, Nariman Mirzaei, Sam Malek, and Angelos Stavrou
(George Mason University, USA)
By changing the way software is delivered to end users, markets for mobile apps create a false sense of security: apps are downloaded from a market that can potentially be regulated. In practice, this is far from the truth; instead, there has been evidence that security is not one of the primary design tenets for the mobile app stores. Recent studies have indicated that mobile markets are harboring apps that are either malicious or vulnerable, leading to compromises of millions of devices. The key technical obstacle for the organizations overseeing these markets is the lack of practical and automated mechanisms to assess the security of mobile apps, given that thousands of apps are added and updated on a daily basis. In this paper, we provide an overview of a multi-faceted project targeted at automatically testing the security and robustness of Android apps in a scalable manner. We describe an Android-specific program analysis technique capable of generating a large number of test cases for fuzzing an app, as well as a test bed that, given the generated test cases, executes them in parallel on numerous emulated Androids running on the cloud.

Surveys

Software Testing of Mobile Applications: Challenges and Future Research Directions
Henry Muccini, Antonio Di Francesco, and Patrizio Esposito
(University of L'Aquila, Italy)
While mobile applications are becoming extraordinarily widely adopted, it is still unclear whether they deserve any specific testing approach for their verification and validation. This paper investigates new research directions on mobile application testing automation by answering three research questions: (RQ1) are mobile applications (so) different from traditional ones as to require different and specialized new testing techniques? (RQ2) what are the new challenges and research directions on testing mobile applications? and (RQ3) which role may automation play in testing mobile applications? We answer those questions by analyzing the current state of the art in mobile application development and testing, and by proposing our view on the topic.

Benefits and Limitations of Automated Software Testing: Systematic Literature Review and Practitioner Survey
Dudekula Mohammad Rafi, Katam Reddy Kiran Moses, Kai Petersen, and Mika V. Mäntylä
(Blekinge Institute of Technology, Sweden; Ericsson, Sweden; Lund University, Sweden)
There is a documented gap between academic and practitioner views on software testing. This paper tries to close the gap by investigating both views regarding the benefits and limits of test automation. The academic views are studied with a systematic literature review while the practitioners' views are assessed with a survey, where we received responses from 115 software professionals. The results of the systematic literature review show that the source of evidence regarding benefits and limitations is quite shallow, as only 25 papers provide the evidence. Furthermore, it was found that benefits often originated from stronger sources of evidence (experiments and case studies), while limitations often originated from experience reports. We believe that this is caused by publication bias of positive results. The survey showed that benefits of test automation were related to test reusability, repeatability, test coverage and effort saved in test executions. The limitations were high initial investments in automation setup, tool selection and training. Additionally, 45% of the respondents agreed that available tools in the market offer a poor fit for their needs. Finally, it was found that 80% of the practitioners disagreed with the vision that automated testing would fully replace manual testing.

Industrial Case Studies

Introducing Model-Based Testing in an Industrial Scrum Project
Vladimir Entin, Mathias Winder, Bo Zhang, and Stephan Christmann
(Omicron Electronics, Austria)
Various approaches for automated test case generation in the area of graphical user interface (GUI) testing have emerged in recent years. A notable trend is model-based testing (MBT). In this experience report we shed light on the challenges faced during the introduction and everyday use of a concrete technique which leverages MBT in a Scrum project, along with the practical solutions found. Topics discussed include the process of model definition and maintenance for the purposes of regression and risk-based testing of GUIs, suitable test case derivation algorithms, human factors, and the choice of an appropriate architecture.

An Industrial Case Study of the Effectiveness of Test Generators
Pietro Braione, Giovanni Denaro, Andrea Mattavelli, Mattia Vivanti, and Ali Muhammad
(University of Milano-Bicocca, Italy; University of Lugano, Switzerland; VTT Technical Research Center of Finland, Finland)
Automatic test generators pursue some type of systematic coverage of the program code or heuristic sampling of the program inputs. Test generators are effective under the assumption, often (enthusiastically) embraced by researchers, that the generated test cases produce informative data for domain experts, e.g., pinpoint important bugs. This paper investigates the validity of this assumption through a case study of using test generators on industrial software with nontrivial domain-specific peculiarities. Our results add to the available body of knowledge on the strengths and weaknesses of test generators.

Software Test Automation Practices in Agile Development Environment: An Industry Experience Report
Eliane Figueiredo Collins and Vicente Ferreira De Lucena, Jr.
(Nokia Institute of Technology, Brazil; Federal University of Amazonas, Brazil)
The increased importance of Test Automation in software engineering is evident from the number of companies now investing in automated testing tools, with the main aim of preventing defects during the development process. Test Automation is considered an essential activity for agile methodologies, being key to speeding up the quality assurance process. This paper presents empirical observations and the challenges of a test team new to agile practices and Test Automation using open source testing tools integrated in software projects that use the Scrum methodology. The results highlight some important issues for discussion and present the Test Automation practices collected from the team's experiences and lessons learned.

Input Generation and Selection I

Category Partition Method and Satisfiability Modulo Theories for Test Case Generation
Valentin Chimisliu and Franz Wotawa
(TU Graz, Austria)
In this paper we focus on test case generation for large database applications in the telecommunication industry domain. In particular, we present an approach that is based on the Category Partition Method and uses the SMT solver Z3 for automatically generating input test data values for the obtained test cases. For the generation process, we make use of different test case generation strategies. Initial results show that the strategy based on genetic programming delivers the fewest test cases while retaining choice coverage. Moreover, the obtained results indicate that the presented approach is feasible for the intended application domain.
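To make the notions of test frames and choice coverage concrete, here is a minimal sketch of the Category Partition Method. It enumerates valid combinations of choices and greedily selects frames until every choice is covered. The greedy strategy, the function names, and the example categories are assumptions for illustration; they are not the paper's strategies, and the Z3-based generation of concrete data values is not reproduced.

```python
from itertools import product

def test_frames(categories: dict, is_valid):
    """Yield every combination of choices that satisfies the constraint."""
    names = list(categories)
    for combo in product(*(categories[n] for n in names)):
        frame = dict(zip(names, combo))
        if is_valid(frame):
            yield frame

def cover_choices(frames) -> list:
    """Greedily pick frames until each (category, choice) pair is covered."""
    frames = list(frames)
    uncovered = {(k, v) for f in frames for k, v in f.items()}
    selected = []
    while uncovered:
        # Take the frame covering the most still-uncovered choices.
        best = max(frames, key=lambda f: len(uncovered & set(f.items())))
        selected.append(best)
        uncovered -= set(best.items())
    return selected

# Hypothetical telecom-style categories and constraint.
CATEGORIES = {
    "plan": ["prepaid", "postpaid"],
    "roaming": ["on", "off"],
    "balance": ["zero", "positive"],
}

def no_zero_postpaid(frame: dict) -> bool:
    return not (frame["plan"] == "postpaid" and frame["balance"] == "zero")
```

With these inputs, six of the eight combinations are valid, and two frames are enough to cover all six choices.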

Scalable Automated Test Generation Using Coverage Guidance and Random Search
TheAnh Do, Alvis C. M. Fong, and Russel Pears
(Auckland University of Technology, New Zealand)
Dynamic symbolic execution has been shown to be an effective technique for automated test input generation. When applied to large-scale programs, however, its scalability is limited due to the combinatorial explosion of the path space and the high cost of computation. Several sophisticated search strategies have been proposed to better guide dynamic symbolic execution towards achieving high code coverage. While confirmed effective, these techniques may deteriorate in practical situations because of the large computation cost involved. In this paper, we propose a search heuristic which is directed by coverage information and interleaved with random search to perform dynamic symbolic execution for coverage improvements and cost-effectiveness. We conducted two evaluations to assess the effectiveness of our proposed approach and to study the impact of computation costs on its practical capabilities.

Automated EFSM-Based Test Case Generation with Scatter Search
Jie Zhang, Rui Yang, Zhenyu Chen, Zhihong Zhao, and Baowen Xu
(Nanjing University, China)
The Extended Finite State Machine (EFSM) is widely used to represent system specifications. Automated test data generation based on EFSM models is still a challenging task due to the complexity of transition paths. In this paper, we introduce a new approach to generate test cases automatically for given transition paths of an EFSM model. An executable EFSM model is used to provide run-time feedback information as a fitness function. A scatter search algorithm is then used to search for test data that can trigger the given transition paths. Based on the executable model, the expected outputs associated with the test data are also collected to construct test oracles automatically. Finally, test data (inputs) and test oracles (expected outputs) are combined into test cases. The experimental results show that our approach can effectively generate test cases to exercise the feasible transition paths.

GUI Testing

BlackHorse: Creating Smart Test Cases from Brittle Recorded Tests
Santo Carino, James H. Andrews, Sheldon Goulding, Pradeepan Arunthavarajah, Tony Florio, and Jakub Hertyk
(University of Western Ontario, Canada; Research In Motion, Canada)
Testing software with a GUI is difficult. Manual testing is costly and error-prone, but recorded test cases frequently "break" due to changes in the GUI. Test cases intended to test business logic must therefore be converted to a less "brittle" form to lengthen their useful lifespan. In this paper, we describe BlackHorse, an approach to doing this that converts a recorded test case to Java code that bypasses the GUI. The approach was implemented within the testing environment of Research In Motion. We describe the design of the toolset and discuss lessons learned during the course of the project.

Declarative Automated Test
Niels Hallenberg and Philip Lykke Carlsen
(SimCorp, Denmark)
Automated tests at the business level can be expensive to develop and maintain. One common approach is to have a domain expert instruct a QA developer to implement what she would do manually in the application. Though there exist record-replay tools specifically developed for this, these tend to scale poorly for more complicated test scenarios.
We present a different solution: An Embedded Domain Specific Language (EDSL) in F#, containing the means to model the user interface, and the various manipulations of it. We hope that this DSL will bridge the gap between the business domain and technical domain of applications to such a degree that domain experts may be able to construct automatic tests without depending on QA developers, and that these tests will prove more maintainable.

Beyond Plain Video Recording of GUI Tests: Linking Test Case Instructions with Visual Response Documentation
Raphael Pham, Helge Holzmann, Kurt Schneider, and Christian Brüggemann
(Leibniz Universität Hannover, Germany; Capgemini, Germany)
Information systems with sophisticated graphical user interfaces are still difficult to test and debug. As a detailed and reproducible report of test case execution is essential, we advocate the documentation of test case execution on several levels. We present an approach to video-based documentation of automated GUI testing that is linked to the test execution procedure. Viewing currently executed test case instructions alongside actual onscreen responses of the application under test facilitates understanding of the failure. This approach is tailored to the challenges of automated GUI testing and debugging with respect to technical and usability aspects. Screen recording is optimized for speed and memory consumption while all relevant details are captured. Additional browsing capabilities for easier debugging are introduced. Our concepts are evaluated by a working implementation, a series of performance measurements during a technical experiment, and industrial experience from 370 real-world test cases carried out in a large software company.

A Methodology for Energy Performance Testing of Smartphone Applications
Abdulhakim Abogharaf, Rajesh Palit, Kshirasagar Naik, and Ajit Singh
(University of Waterloo, Canada)
Smartphones are becoming increasingly popular among users. They are equipped with an enormous number of applications, and these applications drain the smartphones’ batteries. Moreover, battery capacity is significantly restricted due to constraints on size and weight of the device. It is important for smartphone applications to be energy efficient. Thus, a methodology to conduct energy performance testing is needed for two reasons: (i) evaluate the power consumption of a single application on a given device; (ii) compare the power consumption of different smartphones or platforms running the same application. In our earlier work “Selection and execution of user level test cases for energy cost evaluation of smartphones” (Proceedings of the 6th AST, 2011), we have developed a testing methodology that significantly reduces the number of test cases. In addition, we have introduced the concepts of primary and standalone test configurations. However, ordering of the executions of those two kinds of tests is non-trivial, and it was not studied in that paper.
In this paper, we introduce a methodology to interleave the identification of those two kinds of test configurations in order to reduce the total number of configurations. We express the methodology in the form of a detailed flow chart that application developers can easily follow. We apply the methodology to a specific smartphone, namely the HTC Nexus One, in order to illustrate the process. We show that the total number of test configurations obtained by the given methodology is the same as the number predicted by numerical expressions.

Design for Test

Input Generation and Selection II

Test Case Prioritization Incorporating Ordered Sequence of Program Elements
Kun Wu, Chunrong Fang, Zhenyu Chen, and Zhihong Zhao
(Nanjing University, China)
Test suites often grow very large over many releases, such that it is impractical to re-execute all test cases within limited resources. Test case prioritization, which rearranges test cases, is a key technique to improve regression testing. Code coverage information has been widely used in test case prioritization. However, other important information, such as the ordered sequence of program elements measured by execution frequencies, was ignored by previous studies. Ignoring it risks missing difficult-to-find bugs. Therefore, this paper improves similarity-based test case prioritization using the ordered sequence of program elements measured by execution counts. The empirical results show that our new technique increases the rate of fault detection more significantly than the coverage-based ART technique. Moreover, our technique can detect bugs in loops more quickly and is more cost-effective than the traditional ones.
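One way to picture similarity-based prioritization over ordered sequences is sketched below. It rests on several assumptions that are not taken from the paper: each test's ordered sequence lists the executed elements by descending execution count, the dissimilarity between two tests is the edit distance between their sequences, and the next test chosen is the one farthest from those already selected. All function names are hypothetical.

```python
def ordered_sequence(counts: dict) -> tuple:
    """Executed elements sorted by descending count, ties broken by name."""
    executed = [(n, e) for e, n in counts.items() if n > 0]
    return tuple(e for n, e in sorted(executed, key=lambda p: (-p[0], p[1])))

def edit_distance(a, b) -> int:
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def prioritize(profiles: dict) -> list:
    """Order tests so that each next test is farthest from those chosen."""
    seqs = {t: ordered_sequence(c) for t, c in profiles.items()}
    remaining = sorted(seqs)
    first = max(remaining, key=lambda t: len(seqs[t]))
    order = [first]
    remaining.remove(first)
    while remaining:
        nxt = max(
            remaining,
            key=lambda t: min(edit_distance(seqs[t], seqs[s]) for s in order),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

In this sketch, two tests that cover the same elements but with different execution frequencies (for example, a loop body run once versus many times) yield different sequences, so they are treated as dissimilar even though plain coverage cannot tell them apart.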

G-RankTest: Regression Testing of Controller Applications
Leonardo Mariani, Oliviero Riganelli, Mauro Santoro, and Muhammad Ali
(University of Milano-Bicocca, Italy; VTT Technical Research Center of Finland, Finland)
Since controller applications must typically satisfy real-time constraints while manipulating real-world variables, their implementation often results in programs that run extremely fast and manipulate numerical inputs and outputs. These characteristics make them particularly suitable for test case generation. In fact, a large number of test cases can be easily created, due to the simplicity of numerical inputs, and executed, due to the speed of computations. In this paper we present G-RankTest, a technique for test case generation and prioritization. The key idea is that test case generation can run for long sessions (e.g., days) to accurately sample the behavior of a controller application, and then the generated test cases can be prioritized according to different strategies and used for regression testing every time the application is modified. In this work we investigate the feasibility of using the gradient of the output as a criterion for selecting the test cases that activate the trickiest behaviors, which we expect to be easier to break when a change occurs and which thus deserve priority in regression testing.
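Ranking test inputs by the gradient of the output might look like the sketch below. It assumes a single numeric input and estimates the gradient with a central finite difference; the toy `saturating_controller` and the function names are hypothetical and do not come from G-RankTest.

```python
def gradient_magnitude(controller, x: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of |d(output)/d(input)| at x."""
    return abs(controller(x + h) - controller(x - h)) / (2 * h)

def rank_by_gradient(controller, inputs) -> list:
    """Sort test inputs so that the steepest output changes come first."""
    return sorted(
        inputs,
        key=lambda x: gradient_magnitude(controller, x),
        reverse=True,
    )

def saturating_controller(error: float) -> float:
    """Hypothetical proportional controller clipped to the range [-1, 1]."""
    return max(-1.0, min(1.0, 2.0 * error))
```

For this toy controller, inputs inside the linear region get a gradient of about 2 and are ranked first, while inputs deep in the saturated region get a gradient of 0 and fall to the end of the list.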

All-Values Symbolic Execution
Giovanni Denaro
(University of Milano-Bicocca, Italy)
This paper discusses and exemplifies our ideas on all-values symbolic execution, an alternative strategy to the traditional all-paths style of symbolic execution. All-values symbolic execution focuses on enumerating the (symbolic) values that may derive from the symbolic execution of program statements. It exploits program dependencies to optimize the symbolic execution of those statements that can be executed with the same symbolic inputs on multiple (up to infinite) paths. Although a fully working implementation and a thorough evaluation are yet to come, this paper illustrates with simple but representative examples that the proposed technique can boost the efficiency of symbolic execution and suit interesting new applications.

On the Role of Diversity Measures for Multi-objective Test Case Selection
Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, and Annibale Panichella
(University of Salerno, Italy; University of Sannio, Italy; University of Molise, Italy)
Test case selection has recently been formulated as a multi-objective optimization problem that tries to satisfy conflicting goals, such as code coverage and computational cost. This paper introduces the concept of asymmetric distance preserving, useful to improve the diversity of non-dominated solutions produced by multi-objective Pareto efficient genetic algorithms, and proposes two techniques to achieve this objective. Results of an empirical study conducted over four programs from the SIR benchmark show how the proposed techniques (i) obtain non-dominated solutions having a higher diversity than the previously proposed multi-objective Pareto genetic algorithms; and (ii) improve the convergence speed of the genetic algorithms.
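The notion of non-dominated solutions for two conflicting objectives can be illustrated with the short sketch below. Each candidate test selection is summarized as a hypothetical `(coverage, cost)` pair, with coverage to be maximized and cost to be minimized. The sketch covers only Pareto dominance; the asymmetric distance preserving techniques and the genetic algorithm itself are not reproduced.

```python
def dominates(a: tuple, b: tuple) -> bool:
    """True if a is at least as good as b on both objectives and not equal.

    Each solution is (coverage, cost): higher coverage and lower cost win.
    """
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def non_dominated(solutions) -> list:
    """Keep the solutions that no other solution dominates (the Pareto front)."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions)
    ]
```

For example, among `(0.9, 10)`, `(0.8, 5)`, `(0.7, 8)` and `(0.9, 12)`, the third is dominated by the second (less coverage at higher cost) and the fourth by the first (same coverage at higher cost), leaving a front of two solutions.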