2014 International Symposium on Software Testing and Analysis (ISSTA),
July 21–25, 2014,
San Jose, CA, USA
Frontmatter
Message from the Chairs
It is our great pleasure to welcome you to ISSTA 2014, the 23rd International Symposium on Software Testing and Analysis, to be held in San Jose, California, on July 21-25, 2014. ISSTA is the leading research symposium on software testing and analysis, bringing together academics, industrial researchers, and practitioners to exchange new ideas, problems, and experience on how to analyze and test software systems.
Main Research
Concurrency and Verification
Wed, Jul 23, 10:30 - 12:10, Almaden Ballroom (Chair: Sarfraz Khurshid)
Runtime Prevention of Concurrency Related Type-State Violations in Multithreaded Applications
Lu Zhang and Chao Wang
(Virginia Tech, USA)
We propose a new method for runtime prevention of type-state violations in multithreaded applications due to erroneous thread interleavings. The new method employs a combination of static and dynamic program analysis techniques to control the execution order of method calls and suppress illegal call sequences. The legal behavior of a shared object is specified by a type-state automaton, which guides our method in delaying certain method calls at run time. Our main contribution is a new theoretical framework for ensuring that the runtime prevention strategy is always safe, i.e., it does not introduce new erroneous interleavings. Furthermore, whenever the static program analysis is precise enough, our method is guaranteed to steer the program to a failure-free interleaving as long as such an interleaving exists. We have implemented the new method in a tool based on the LLVM compiler framework. Our experiments on a set of multithreaded C/C++ applications show that the method is both efficient and effective in suppressing concurrency-related type-state violations.
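To make the setting concrete, here is a minimal hand-written sketch (in Java rather than the paper's C/C++ benchmarks, and purely illustrative) of the kind of concurrency-related type-state violation the method prevents: one thread closes a shared resource while another is about to read from it, violating a type-state automaton in which read() is only legal before close().

```java
import java.io.BufferedReader;
import java.io.StringReader;

// Illustrative only: the type-state automaton for a reader is "open -> read* -> closed";
// calling read() after close() is a type-state violation. With the interleaving forced
// below, the read in t1 happens after the close in t2 and fails at run time.
public class TypestateViolationDemo {
    public static void main(String[] args) throws Exception {
        BufferedReader shared = new BufferedReader(new StringReader("hello"));

        Thread t1 = new Thread(() -> {
            try {
                Thread.sleep(100);                      // lose the race on purpose
                System.out.println(shared.readLine());  // read() in state "closed" -> IOException
            } catch (Exception e) {
                System.out.println("type-state violation observed: " + e);
            }
        });

        Thread t2 = new Thread(() -> {
            try {
                shared.close();                         // moves the object to state "closed"
            } catch (Exception e) {
                e.printStackTrace();
            }
        });

        t1.start(); t2.start();
        t1.join();  t2.join();
    }
}
```

A prevention strategy in the spirit of the paper would delay the close() call until the pending read has completed; the scheduling and analysis machinery itself is not shown here.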
@InProceedings{ISSTA14p1,
author = {Lu Zhang and Chao Wang},
title = {Runtime Prevention of Concurrency Related Type-State Violations in Multithreaded Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {1--12},
doi = {},
year = {2014},
}
Performance Regression Testing of Concurrent Classes
Michael Pradel, Markus Huggler, and Thomas R. Gross
(University of California at Berkeley, USA; ETH Zurich, Switzerland)
Developers of thread-safe classes struggle with two opposing goals. The class must be correct, which requires synchronizing concurrent accesses, and the class should provide reasonable performance, which is difficult to realize in the presence of unnecessary synchronization. Validating the performance of a thread-safe class is challenging because it requires diverse workloads that use the class, because existing performance analysis techniques focus on individual bottleneck methods, and because reliably measuring the performance of concurrent executions is difficult. This paper presents SpeedGun, an automatic performance regression testing technique for thread-safe classes. The key idea is to generate multi-threaded performance tests and to compare two versions of a class with each other. The analysis notifies developers when changing a thread-safe class significantly influences the performance of clients of this class. An evaluation with 113 pairs of classes from popular Java projects shows that the analysis effectively identifies 13 performance differences, including performance regressions that the respective developers were not aware of.
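As a rough illustration of the kind of multi-threaded performance test such a tool targets (this is a hand-written sketch in the spirit of the generated tests, not SpeedGun output), the harness below times the same concurrent workload against two versions of a thread-safe counter and reports the relative speedup.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hand-written sketch of a multi-threaded performance comparison between two
// versions of a thread-safe class; SpeedGun generates such workloads automatically.
public class ConcurrentPerfCompare {

    // "Old" version: coarse synchronization.
    static class SyncCounter {
        private long value;
        synchronized void inc() { value++; }
        synchronized long get() { return value; }
    }

    // "New" version: lock-free.
    static class AtomicCounter {
        private final AtomicLong value = new AtomicLong();
        void inc() { value.incrementAndGet(); }
        long get() { return value.get(); }
    }

    static long timeWorkload(Runnable perThreadWork, int threads) throws InterruptedException {
        Thread[] ts = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) { ts[i] = new Thread(perThreadWork); ts[i].start(); }
        for (Thread t : ts) t.join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        final int OPS = 1_000_000, THREADS = 4;
        SyncCounter oldVersion = new SyncCounter();
        AtomicCounter newVersion = new AtomicCounter();

        long oldNs = timeWorkload(() -> { for (int i = 0; i < OPS; i++) oldVersion.inc(); }, THREADS);
        long newNs = timeWorkload(() -> { for (int i = 0; i < OPS; i++) newVersion.inc(); }, THREADS);

        System.out.printf("old: %.1f ms, new: %.1f ms, speedup: %.2fx%n",
                oldNs / 1e6, newNs / 1e6, (double) oldNs / newNs);
    }
}
```

In practice one would repeat the measurement many times and apply statistical tests, as the paper discusses, since single runs of concurrent code are noisy.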
@InProceedings{ISSTA14p13,
author = {Michael Pradel and Markus Huggler and Thomas R. Gross},
title = {Performance Regression Testing of Concurrent Classes},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {13--25},
doi = {},
year = {2014},
}
Verifying Atomicity via Data Independence
Ohad Shacham, Eran Yahav, Guy Golan Gueta,
Alex Aiken, Nathan Bronson,
Mooly Sagiv, and
Martin Vechev
(Yahoo Labs, Israel; Technion, Israel; Stanford University, USA; Tel Aviv University, Israel; ETH Zurich, Switzerland)
We present a technique for automatically verifying atomicity of composed concurrent operations. The main observation behind our approach is that many composed concurrent operations which occur in practice are data-independent. That is, the control-flow of the composed operation does not depend on specific input values. While verifying data-independence is undecidable in the general case, we provide succinct sufficient conditions that can be used to establish a composed operation as data-independent. We show that for the common case of concurrent maps, data-independence reduces the hard problem of verifying linearizability to a verification problem that can be solved efficiently with a bounded number of keys and values.
We implemented our approach in a tool called VINE and evaluated it on all composed operations from 57 real-world applications (112 composed operations). We show that many composed operations (49 out of 112) are data-independent, and we automatically verify 30 of them as linearizable and the remaining 19 as having linearizability violations that could be repaired and then automatically verified. Moreover, we show that the remaining 63 operations are not linearizable, thus indicating that data independence does not limit the expressiveness of writing realistic linearizable composed operations.
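A classic instance of the composed operations the paper targets (illustrative Java, not taken from the paper's corpus) is a check-then-act sequence over a concurrent map that is data-independent (its control flow depends only on whether the key is present, not on the particular key or value) yet is not atomic as written.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative composed operation over a concurrent map: each individual call is
// thread-safe, but the composition is not atomic. Its control flow depends only on
// key presence, not on concrete key/value data, i.e., it is data-independent.
public class ComposedMapOperation {
    private final Map<String, Integer> map = new ConcurrentHashMap<>();

    // Non-atomic "put if absent": another thread may insert the key between
    // containsKey() and put(), so the later put() silently overwrites it.
    Integer putIfAbsentComposed(String key, Integer value) {
        if (!map.containsKey(key)) {     // check
            map.put(key, value);         // act (race window between check and act)
            return null;
        }
        return map.get(key);
    }

    // Linearizable alternative: delegate the whole composition to one atomic call.
    Integer putIfAbsentAtomic(String key, Integer value) {
        return map.putIfAbsent(key, value);
    }
}
```

Because the composition is data-independent, the paper's result implies that its linearizability can be checked with a small, bounded set of keys and values.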
@InProceedings{ISSTA14p26,
author = {Ohad Shacham and Eran Yahav and Guy Golan Gueta and Alex Aiken and Nathan Bronson and Mooly Sagiv and Martin Vechev},
title = {Verifying Atomicity via Data Independence},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {26--36},
doi = {},
year = {2014},
}
Verification-Aided Regression Testing
Fabrizio Pastore, Leonardo Mariani, Antti E. J. Hyvärinen, Grigory Fedyukovich, Natasha Sharygina, Stephan Sehestedt, and Ali Muhammad
(University of Milano-Bicocca, Italy; University of Lugano, Switzerland; ABB Research, Germany; VTT Technical Research, Finland)
In this paper we present Verification-Aided Regression Testing (VART), a novel extension of regression testing that uses model checking to increase the fault revealing capability of existing test suites. The key idea in VART is to extend the use of test case executions from the conventional direct fault discovery to the generation of behavioral properties specific to the upgrade, by (i) automatically producing properties that are proved to hold for the base version of a program, (ii) automatically identifying and checking on the upgraded program only the properties that, according to the developers’ intention, must be preserved by the upgrade, and (iii) reporting the faults and the corresponding counter-examples that are not revealed by the regression tests. Our empirical study on both open source and industrial software systems shows that VART automatically produces properties that increase the effectiveness of testing by automatically detecting faults unnoticed by the existing regression test suites.
@InProceedings{ISSTA14p37,
author = {Fabrizio Pastore and Leonardo Mariani and Antti E. J. Hyvärinen and Grigory Fedyukovich and Natasha Sharygina and Stephan Sehestedt and Ali Muhammad},
title = {Verification-Aided Regression Testing},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {37--48},
doi = {},
year = {2014},
}
Web Testing
Wed, Jul 23, 13:30 - 15:10, Almaden Ballroom (Chair: Gregg Rothermel)
Hybrid Security Analysis of Web JavaScript Code via Dynamic Partial Evaluation
Omer Tripp, Pietro Ferrara, and Marco Pistoia
(IBM Research, USA)
This paper addresses the problem of detecting JavaScript security vulnerabilities in the client side of Web applications. Such vulnerabilities are becoming a source of growing concern due to the rapid migration of server-side business logic to the client side, combined with new JavaScript-backed Web technologies, such as AJAX and HTML5. Detection of client-side vulnerabilities is challenging given the dynamic and event-driven nature of JavaScript. We present a hybrid form of JavaScript analysis, which augments static analysis with (semi-)concrete information by applying partial evaluation to JavaScript functions according to dynamic data recorded by the Web crawler. The dynamic component rewrites the program per the enclosing HTML environment, and the static component then explores all possible behaviors of the partially evaluated program (while treating user-controlled aspects of the environment conservatively).
We have implemented this hybrid architecture as the JSA analysis tool, which is integrated into the IBM AppScan Standard Edition product. We formalize the static analysis and prove useful properties over it. We also tested the system across a set of 170,000 Web pages, comparing it with purely static and dynamic alternatives. The results provide conclusive evidence in favor of our hybrid approach. Only 10% of the reports by JSA are false alarms compared to 63% of the alarms flagged by its purely static counterpart, while not a single true warning is lost. This represents a reduction of 94% in false alarms. Compared with a commercial testing algorithm, JSA detects vulnerabilities in >4x more Web sites with only 4 false alarms.
@InProceedings{ISSTA14p49,
author = {Omer Tripp and Pietro Ferrara and Marco Pistoia},
title = {Hybrid Security Analysis of Web JavaScript Code via Dynamic Partial Evaluation},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {49--59},
doi = {},
year = {2014},
}
Virtual DOM Coverage for Effective Testing of Dynamic Web Applications
Yunxiao Zou, Zhenyu Chen, Yunhui Zheng,
Xiangyu Zhang, and Zebao Gao
(Nanjing University, China; Purdue University, USA; University of Maryland at College Park, USA)
Test adequacy criteria are fundamental in software testing. Among them, code coverage criterion is widely used due to its simplicity and effectiveness. However, in dynamic web application testing, merely covering server-side script code is inadequate because it neglects client-side execution, which plays an important role in triggering client-server interactions to reach important execution states. Similarly, a criterion aiming at covering the UI elements on client-side pages ignores the server-side execution, leading to insufficiency.
In this paper, we propose Virtual DOM (V-DOM) Coverage, a novel criterion for effective web application testing. With static analysis, we first aggregate all the DOM objects that may be produced by a piece of server script to construct a V-DOM tree. The tree models execution on both the client and server sides, such that V-DOM coverage is more effective than existing coverage criteria in web application testing. We conduct an empirical study on five real-world dynamic web applications. We find that the V-DOM tree can model many more DOM objects than a web-crawling-based technique. Test selection based on the V-DOM tree criterion substantially outperforms existing code coverage and UI element coverage criteria by detecting more faults.
@InProceedings{ISSTA14p60,
author = {Yunxiao Zou and Zhenyu Chen and Yunhui Zheng and Xiangyu Zhang and Zebao Gao},
title = {Virtual DOM Coverage for Effective Testing of Dynamic Web Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {60--70},
doi = {},
year = {2014},
}
DOM-Based Test Adequacy Criteria for Web Applications
Mehdi Mirzaaghaei and
Ali Mesbah
(University of British Columbia, Canada)
To assess the quality of web application test cases, web developers currently measure code coverage. Although code coverage has traditionally been a popular test adequacy criterion, we believe it alone is not adequate for assessing the quality of web application test cases. We propose a set of novel DOM-based test adequacy criteria for web applications. These criteria aim at measuring coverage at two granularity levels, (1) the percentage of DOM states and transitions covered in the total state space of the web application under test, and (2) the percentage of elements covered in each particular DOM state. We present a technique and tool, called DomCovery, which automatically extracts and measures the proposed adequacy criteria and generates a visual DOM coverage report. Our evaluation shows that there is no correlation between code coverage and DOM coverage. A controlled experiment illustrates that participants using DomCovery completed coverage related tasks 22% more accurately and 66% faster.
@InProceedings{ISSTA14p71,
author = {Mehdi Mirzaaghaei and Ali Mesbah},
title = {DOM-Based Test Adequacy Criteria for Web Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {71--81},
doi = {},
year = {2014},
}
Cross-Platform Feature Matching for Web Applications
Shauvik Roy Choudhary, Mukul R. Prasad, and
Alessandro Orso
(Georgia Tech, USA; Fujitsu Labs, USA)
With the emergence of new computing platforms, software written for traditional platforms is being re-targeted to reach the users on these new platforms. In particular, due to the proliferation of mobile computing devices, it is common practice for companies to build mobile-specific versions of their existing web applications to provide mobile users with a better experience. Because the differences between desktop and mobile versions of a web application are not only cosmetic, but can also include substantial rewrites of key components, it is not uncommon for these different versions to provide different sets of features. Whereas some of these differences are intentional, such as the addition of location-based features on mobile devices, others are not and can negatively affect the user experience, as confirmed by numerous user reports and complaints. Unfortunately, checking and maintaining the consistency of different versions of an application by hand is not only time consuming, but also error prone. To address this problem, and help developers in this difficult task, we propose an automated technique for matching features across different versions of a multi-platform web application. We implemented our technique in a tool, called FMAP, and used it to perform a preliminary empirical evaluation on nine real-world multi-platform web applications. The results of our evaluation are promising, as FMAP was able to correctly identify missing features between desktop and mobile versions of the web applications considered, as confirmed by our analysis of user reports and software fixes for these applications.
@InProceedings{ISSTA14p82,
author = {Shauvik Roy Choudhary and Mukul R. Prasad and Alessandro Orso},
title = {Cross-Platform Feature Matching for Web Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {82--92},
doi = {},
year = {2014},
}
Artifact Evaluated badge
Artifact Studies
Wed, Jul 23, 15:40 - 16:30, Almaden Ballroom (Chair: Milos Gligoric)
Covrig: A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software
Paul Marinescu, Petr Hosek, and
Cristian Cadar
(Imperial College London, UK)
Software repositories provide rich information about the construction and evolution of software systems. While static data that can be mined directly from version control systems has been extensively studied, dynamic metrics concerning the execution of the software have received much less attention, due to the inherent difficulty of running and monitoring a large number of software versions.
In this paper, we present Covrig, a flexible infrastructure that can be used to run each version of a system in isolation and collect static and dynamic software metrics, using a lightweight virtual machine environment that can be deployed on a cluster of local or cloud machines.
We use Covrig to conduct an empirical study examining how code and tests co-evolve in six popular open-source systems. We report the main characteristics of software patches, analyse the evolution of program and patch coverage, assess the impact of nondeterminism on the execution of test suites, and investigate whether the coverage of code containing bugs and bug fixes is higher than average.
@InProceedings{ISSTA14p93,
author = {Paul Marinescu and Petr Hosek and Cristian Cadar},
title = {Covrig: A Framework for the Analysis of Code, Test, and Coverage Evolution in Real Software},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {93--104},
doi = {},
year = {2014},
}
Artifact Evaluated badge
CoREBench: Studying Complexity of Regression Errors
Marcel Böhme and
Abhik Roychoudhury
(Saarland University, Germany; National University of Singapore, Singapore)
Intuitively, we know that some software errors are more complex than others. If an error can be fixed by changing one faulty statement, it is a simple error. The more substantial the fix must be, the more complex we consider the error.
In this work, we formally define and quantify the complexity of an error w.r.t. the complexity of the error's least complex, correct fix. As a concrete measure of complexity for such fixes, we introduce Cyclomatic Change Complexity, which is inspired by existing program complexity metrics.
Moreover, we introduce CoREBench, a collection of 70 regression errors systematically extracted from several open-source C projects, and compare their complexity with that of the seeded errors in the two most popular error benchmarks, SIR and the Siemens Suite. We find that seeded errors are significantly less complex, i.e., require significantly less substantial fixes, than actual regression errors. For example, more than 42% of the seeded errors are simple, compared to 8% of the actual ones. This is a concern for the external validity of studies based on seeded errors, and we propose CoREBench for the controlled study of regression testing, debugging, and repair techniques.
@InProceedings{ISSTA14p105,
author = {Marcel Böhme and Abhik Roychoudhury},
title = {CoREBench: Studying Complexity of Regression Errors},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {105--115},
doi = {},
year = {2014},
}
Artifact Evaluated badge
Static Analyses and Transformations
Thu, Jul 24, 10:30 - 12:10, Almaden Ballroom (Chair: Satish Chandra)
ARC++: Effective Typestate and Lifetime Dependency Analysis
Xusheng Xiao, Gogul Balakrishnan, Franjo Ivančić, Naoto Maeda, Aarti Gupta, and Deepak Chhetri
(NEC Labs, USA; North Carolina State University, USA; Google, USA; NEC, Japan; NEC, India)
The ever-increasing reliance of today's society on software requires scalable and precise techniques for checking the correctness, reliability, and robustness of software. Object-oriented languages such as Java and C++ have been used extensively to build large-scale systems. While many scalable static analysis approaches for C and Java have been proposed, there has been comparatively little work on the static analysis of C++ programs. In this paper, we provide an abstract representation to model C++ objects, containers, references, raw pointers, and smart pointers. Further, we present a new analysis called lifetime dependency analysis, which allows us to precisely track the complex lifetime semantics of temporary objects in C++. Finally, we describe an implementation of our techniques and present promising experimental results on a large variety of open-source software.
@InProceedings{ISSTA14p116,
author = {Xusheng Xiao and Gogul Balakrishnan and Franjo Ivančić and Naoto Maeda and Aarti Gupta and Deepak Chhetri},
title = {ARC++: Effective Typestate and Lifetime Dependency Analysis},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {116--126},
doi = {},
year = {2014},
}
A Type System for Format Strings
Konstantin Weitz, Gene Kim, Siwakorn Srisakaokul, and Michael D. Ernst
(University of Washington, USA)
Most programming languages support format strings, but their use is error-prone. Using the wrong format string syntax, or passing the wrong number or type of arguments, leads to unintelligible text output, program crashes, or security vulnerabilities.
This paper presents a type system that guarantees that calls to format string APIs will never fail. In Java, this means that the API will not throw exceptions. In C, this means that the API will not return negative values, corrupt memory, etc.
We instantiated this type system for Java's Formatter API, and evaluated it on 6 large and well-maintained open-source projects. Format string bugs are common in practice (our type system found 104 bugs), and the annotation burden on the user of our type system is low (on average, for every bug found, only 1.0 annotations need to be written).
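The failure modes mentioned above are easy to reproduce in plain Java; the snippet below (illustrative, not taken from the paper's case studies) shows the two classic bug classes such a type system rules out at compile time: a conversion/argument type mismatch and a missing argument.

```java
import java.util.IllegalFormatException;

// Two classic Java format-string bugs that a format-string type system rejects
// at compile time; without it, both fail only at run time.
public class FormatStringBugs {
    public static void main(String[] args) {
        try {
            // Bug 1: '%d' expects an integer, but a String is passed.
            System.out.printf("%d items%n", "three");
        } catch (IllegalFormatException e) {
            System.out.println("type mismatch: " + e);
        }
        try {
            // Bug 2: two conversions, only one argument.
            System.out.printf("%s bought %d items%n", "alice");
        } catch (IllegalFormatException e) {
            System.out.println("missing argument: " + e);
        }
        // Well-typed call: conversions and arguments line up.
        System.out.printf("%s bought %d items%n", "alice", 3);
    }
}
```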
@InProceedings{ISSTA14p127,
author = {Konstantin Weitz and Gene Kim and Siwakorn Srisakaokul and Michael D. Ernst},
title = {A Type System for Format Strings},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {127--137},
doi = {},
year = {2014},
}
Artifact Evaluated badge
Scalable Detection of Missed Cross-Function Refactorings
Narcisa Andreea Milea,
Lingxiao Jiang, and Siau-Cheng Khoo
(National University of Singapore, Singapore; Singapore Management University, Singapore)
Refactoring is an important way to improve the design of existing code. Identifying refactoring opportunities (i.e., code fragments that can be refactored) in large code bases is a challenging task. In this paper, we propose a novel, automated and scalable technique for identifying cross-function refactoring opportunities that span more than one function (e.g., Extract Method and Inline Method). The key to our technique is the design of efficient vector inlining operations that emulate the effect of method inlining among code fragments, so that the problem of identifying cross-function refactorings can be reduced to the problem of finding similar vectors before and after inlining. We have implemented our technique in a prototype tool named ReDex, which encodes Java programs as particular vectors. We have applied the tool to a large code base of 4.5 million lines of code, comprising 200 bundle projects in the Eclipse ecosystem (e.g., Eclipse JDT, Eclipse PDE, Apache Commons, Hamcrest, etc.). Also, different from many other studies on detecting refactorings, ReDex only searches for code fragments that can be, but have not yet been, refactored in a way similar to some refactoring that happened in the code base. Our results show that ReDex can find 277 cross-function refactoring opportunities in 2 minutes; 223 of these were labelled as true opportunities by users, and they cover many categories of cross-function refactoring operations from classical refactoring books, such as Self Encapsulate Field, Decompose Conditional Expression, Hide Delegate, Preserve Whole Object, etc.
@InProceedings{ISSTA14p138,
author = {Narcisa Andreea Milea and Lingxiao Jiang and Siau-Cheng Khoo},
title = {Scalable Detection of Missed Cross-Function Refactorings},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {138--148},
doi = {},
year = {2014},
}
Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants
Benoit Baudry, Simon Allier, and Martin Monperrus
(INRIA, France; IRISA, France; University of Lille, France)
The predictability of program execution provides attackers with a rich source of knowledge, which they can exploit to spy on or remotely control the program. Moving target defense addresses this issue by constantly switching between many diverse variants of a program, which reduces the certainty that an attacker can have about the program execution. The effectiveness of this approach relies on the availability of a large number of software variants that exhibit different executions. However, current approaches rely on the natural diversity provided by off-the-shelf components, which is very limited. In this paper, we explore the automatic synthesis of large sets of program variants, called sosies. Sosies provide the same expected functionality as the original program, while exhibiting different executions. They are said to be computationally diverse.
This work addresses two objectives: comparing different transformations for increasing the likelihood of sosie synthesis (densifying the search space for sosies), and demonstrating computational diversity in synthesized sosies. We synthesized 30,184 sosies in total, for 9 large, real-world, open-source applications. For all these programs we identified one type of program analysis that systematically increases the density of sosies; we measured computational diversity for sosies of 3 programs and found diversity in method calls or data in more than 40% of sosies. This is a step towards controlled massive unpredictability of software.
@InProceedings{ISSTA14p149,
author = {Benoit Baudry and Simon Allier and Martin Monperrus},
title = {Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {149--159},
doi = {},
year = {2014},
}
Test Selection and Reduction
Thu, Jul 24, 13:30 - 15:10, Almaden Ballroom (Chair: Neha Rungta)
Using Test Case Reduction and Prioritization to Improve Symbolic Execution
Chaoqiang Zhang,
Alex Groce, and Mohammad Amin Alipour
(Oregon State University, USA)
Scaling symbolic execution to large programs or programs with complex inputs remains difficult due to path explosion and complex constraints, as well as external method calls. Additionally, creating an effective test structure with symbolic inputs can be difficult. A popular symbolic execution strategy in practice is to perform symbolic execution not “from scratch” but based on existing test cases. This paper proposes that the effectiveness of this approach to symbolic execution can be enhanced by (1) reducing the size of seed test cases and (2) prioritizing seed test cases to maximize exploration efficiency. The proposed test case reduction strategy is based on a recently introduced generalization of delta debugging, and our prioritization techniques include novel methods that, for this purpose, can outperform some traditional regression testing algorithms. We show that applying these methods can significantly improve the effectiveness of symbolic execution based on existing test cases.
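The reduction step can be pictured with a much simpler stand-in than the paper's generalized delta debugging: the sketch below (illustrative only; the `stillUseful` predicate is a placeholder for "the reduced seed still reaches the behavior of interest") greedily drops elements of a seed input as long as the predicate holds, yielding a smaller seed for symbolic exploration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Greedy 1-minimal reduction of a seed test input. This is a simplified stand-in for
// the generalized delta debugging used in the paper: it repeatedly removes single
// elements while a user-supplied predicate (e.g., "still covers the target branch")
// keeps holding.
public class SeedReducer {

    static <T> List<T> reduce(List<T> seed, Predicate<List<T>> stillUseful) {
        List<T> current = new ArrayList<>(seed);
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (int i = 0; i < current.size(); i++) {
                List<T> candidate = new ArrayList<>(current);
                candidate.remove(i);
                if (stillUseful.test(candidate)) {  // keep the smaller seed
                    current = candidate;
                    shrunk = true;
                    break;                           // restart the scan
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Toy predicate: the seed is "useful" as long as it still contains 'x' and 'y'.
        List<Character> seed = List.of('a', 'x', 'b', 'c', 'y', 'd');
        List<Character> reduced = reduce(seed, s -> s.contains('x') && s.contains('y'));
        System.out.println(reduced);                 // prints [x, y]
    }
}
```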
@InProceedings{ISSTA14p160,
author = {Chaoqiang Zhang and Alex Groce and Mohammad Amin Alipour},
title = {Using Test Case Reduction and Prioritization to Improve Symbolic Execution},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {160--170},
doi = {},
year = {2014},
}
FLOWER: Optimal Test Suite Reduction as a Network Maximum Flow
Arnaud Gotlieb and Dusica Marijan
(Simula Research Laboratory, Norway)
A trend in software testing is reducing the size of a test suite while preserving its overall quality. Given a test suite and a set of requirements covered by the suite, test suite reduction aims at selecting a subset of test cases that cover the same set of requirements. Even though this problem has received considerable attention, finding the smallest subset of test cases is still challenging and commonly-used approaches address this problem only with approximated solutions. When executing a single test case requires much manual effort (e.g., hours of preparation), finding the minimal subset is needed to reduce the testing costs.
In this paper, we introduce a radically new approach to test suite reduction, called FLOWER, based on a search among network maximum flows. From a given test suite and the requirements covered by the suite, FLOWER forms a flow network (with specific constraints) that is then traversed to find its maximum flows. FLOWER leverages the Ford-Fulkerson method to compute maximum flows and Constraint Programming techniques to search among optimal flows. FLOWER is an exact method that computes a minimum-sized test suite, preserving the coverage of requirements.
The experimental results show that FLOWER outperforms a non-optimized implementation of the Integer Linear Programming approach by a factor of 15 to 3,000 in the time needed to find an optimal solution, and a simple greedy approach by 5-15% in the size of the reduced test suite.
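To make the underlying problem concrete, the sketch below illustrates test suite reduction on a tiny coverage matrix by exhaustive search; FLOWER itself solves the same problem exactly with maximum flows and constraint programming, which are not shown here.

```java
import java.util.*;

// Exhaustive illustration of the test suite reduction problem: find the smallest
// subset of tests that covers all requirements. FLOWER solves this exactly with
// maximum flows plus constraint programming; brute force is shown only to make
// the optimization objective concrete on a toy example.
public class MinimalSuiteDemo {
    public static void main(String[] args) {
        // Coverage matrix: test name -> requirements it covers.
        Map<String, Set<String>> coverage = new LinkedHashMap<>();
        coverage.put("t1", Set.of("r1", "r2"));
        coverage.put("t2", Set.of("r2", "r3"));
        coverage.put("t3", Set.of("r3", "r4"));
        coverage.put("t4", Set.of("r1", "r4"));
        coverage.put("t5", Set.of("r1", "r2", "r3"));

        Set<String> allReqs = new HashSet<>();
        coverage.values().forEach(allReqs::addAll);
        List<String> tests = new ArrayList<>(coverage.keySet());

        List<String> best = null;
        for (int mask = 1; mask < (1 << tests.size()); mask++) {   // enumerate all subsets
            List<String> subset = new ArrayList<>();
            Set<String> covered = new HashSet<>();
            for (int i = 0; i < tests.size(); i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(tests.get(i));
                    covered.addAll(coverage.get(tests.get(i)));
                }
            }
            if (covered.containsAll(allReqs) && (best == null || subset.size() < best.size())) {
                best = subset;
            }
        }
        System.out.println("minimal covering suite: " + best);    // prints a smallest subset, here [t1, t3]
    }
}
```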
@InProceedings{ISSTA14p171,
author = {Arnaud Gotlieb and Dusica Marijan},
title = {FLOWER: Optimal Test Suite Reduction as a Network Maximum Flow},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {171--180},
doi = {},
year = {2014},
}
Coverage and Fault Detection of the Output-Uniqueness Test Selection Criteria
Nadia Alshahwan and Mark Harman
(University College London, UK)
This paper studies the whitebox coverage and fault detection achieved by Output Uniqueness, a newly proposed blackbox test criterion, using 6 web applications. We find that output uniqueness exhibits average correlation coefficients of 0.85, 0.83 and 0.97 with statement, branch and path coverage respectively. More interestingly, output uniqueness finds 92% of the real faults found by branch coverage (and a further 47% that remained undetected by such whitebox techniques). These results suggest that output uniqueness may provide a useful surrogate when whitebox techniques are inapplicable and an effective complement where they are.
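A minimal reading of the criterion (an illustrative sketch, not the paper's tooling) is to keep only those test inputs whose observable output has not been seen before.

```java
import java.util.*;
import java.util.function.Function;

// Illustrative output-uniqueness selection: keep a test input only if the output it
// produces (here, a string the system under test returns) differs from all outputs
// seen so far. The 'systemUnderTest' function is a placeholder for running the
// web application and capturing its output page.
public class OutputUniquenessSelector {
    static <I, O> List<I> select(List<I> inputs, Function<I, O> systemUnderTest) {
        Set<O> seenOutputs = new HashSet<>();
        List<I> selected = new ArrayList<>();
        for (I input : inputs) {
            if (seenOutputs.add(systemUnderTest.apply(input))) {  // new output observed
                selected.add(input);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Toy SUT: even numbers all produce the same page, odd numbers a distinct one.
        Function<Integer, String> sut = n -> (n % 2 == 0) ? "<p>even</p>" : "<p>odd " + n + "</p>";
        System.out.println(select(List.of(1, 2, 3, 4, 5), sut));  // prints [1, 2, 3, 5]
    }
}
```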
@InProceedings{ISSTA14p181,
author = {Nadia Alshahwan and Mark Harman},
title = {Coverage and Fault Detection of the Output-Uniqueness Test Selection Criteria},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {181--192},
doi = {},
year = {2014},
}
Dodona: Automated Oracle Data Set Selection
Pablo Loyola, Matt Staats, In-Young Ko, and Gregg Rothermel
(University of Chile, Chile; University of Luxembourg, Luxembourg; KAIST, South Korea; University of Nebraska-Lincoln, USA)
Software complexity has increased the need for automated software testing. Most research on automating testing, however, has focused on creating test input data. While careful selection of input data is necessary to reach faulty states in a system under test, test oracles are needed to actually detect failures. In this work, we describe Dodona, a system that supports the generation of test oracles. Dodona ranks program variables based on the interactions and dependencies observed between them during program execution. Using this ranking, Dodona proposes a set of variables to be monitored, that can be used by engineers to construct assertion-based oracles. Our empirical study of Dodona reveals that it is more effective and efficient than the current state-of-the-art approach for generating oracle data sets, and can often yield oracles that are almost as effective as oracles hand-crafted by engineers without support.
@InProceedings{ISSTA14p193,
author = {Pablo Loyola and Matt Staats and In-Young Ko and Gregg Rothermel},
title = {Dodona: Automated Oracle Data Set Selection},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {193--203},
doi = {},
year = {2014},
}
Localization and Repair
Thu, Jul 24, 15:40 - 17:20, Almaden Ballroom (Chair: Indradeep Ghosh)
CrashLocator: Locating Crashing Faults Based on Crash Stacks
Rongxin Wu, Hongyu Zhang,
Shing-Chi Cheung, and Sunghun Kim
(Hong Kong University of Science and Technology, China; Microsoft Research, China)
Software crashes are common. When a crash occurs, software developers can receive a report upon user permission. A crash report typically includes a call stack at the time of the crash. An important step of debugging a crash is to identify faulty functions, which is often a tedious and labor-intensive task. In this paper, we propose CrashLocator, a method to locate faulty functions using the crash stack information in crash reports. It deduces possible crash traces (the failing execution traces that lead to the crash) by expanding the crash stack with functions in the static call graph. It then calculates the suspiciousness of each function in the approximate crash traces. The functions are then ranked by their suspiciousness scores and are recommended to developers for further investigation. We evaluate our approach using real-world Mozilla crash data. The results show that our approach is effective: we can locate 50.6%, 63.7% and 67.5% of crashing faults by examining the top 1, 5 and 10 functions recommended by CrashLocator, respectively. Our approach outperforms conventional stack-only methods significantly.
@InProceedings{ISSTA14p204,
author = {Rongxin Wu and Hongyu Zhang and Shing-Chi Cheung and Sunghun Kim},
title = {CrashLocator: Locating Crashing Faults Based on Crash Stacks},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {204--214},
doi = {},
year = {2014},
}
Efficient Predicated Bug Signature Mining via Hierarchical Instrumentation
Zhiqiang Zuo, Siau-Cheng Khoo, and Chengnian Sun
(National University of Singapore, Singapore; University of California at Davis, USA)
Debugging is known to be a notoriously painstaking and time-consuming task. An essential and yet expensive process in debugging is bug isolation. As one major family of automatic bug isolation, statistical bug isolation approaches have been well studied in the past decade. A recent advancement in this area is the introduction of bug signatures that provide contextual information to assist in debugging, and several bug signature mining approaches have been reported. All these approaches instrument the entire buggy program to produce profiles for debugging. Consequently, they often incur hefty instrumentation and analysis cost. However, as a major part of the program code is in fact error-free, full-scale program instrumentation is wasteful and unnecessary. In this paper, we devise a novel hierarchical instrumentation (HI) technique to perform selective instrumentation so as to enhance the efficiency of statistical debugging. We apply the HI technique to a recently developed predicated bug signature mining approach (called MPS) and propose an approach called HIMPS. The empirical study reveals that our technique can achieve around 40% to 60% savings in disk storage usage, time, and memory consumption, and performs especially well on large programs. It greatly improves the efficiency of bug signature mining, taking a step towards painless debugging.
@InProceedings{ISSTA14p215,
author = {Zhiqiang Zuo and Siau-Cheng Khoo and Chengnian Sun},
title = {Efficient Predicated Bug Signature Mining via Hierarchical Instrumentation},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {215--224},
doi = {},
year = {2014},
}
Semantic Differential Repair for Input Validation and Sanitization
Muath Alkhalaf, Abdulbaki Aydin, and
Tevfik Bultan
(University of California at Santa Barbara, USA)
Correct validation and sanitization of user input is crucial in web applications for avoiding security vulnerabilities and erroneous application behavior. We present an automated differential repair technique for input validation and sanitization functions. Differential repair can be used within an application to repair client- and server-side code with respect to each other, or across applications in order to strengthen the validation and sanitization checks. Given a reference and a target function, our differential repair technique strengthens the validation and sanitization operations in the target function based on the reference function. It does this by synthesizing three patches: a validation, a length, and a sanitization patch. Our automated patch synthesis algorithms are based on forward and backward symbolic string analyses that use automata as a symbolic representation. Composition of the three automatically synthesized patches with the original target function results in the repaired function, which provides stronger validation and sanitization than both the target and the reference functions.
@InProceedings{ISSTA14p225,
author = {Muath Alkhalaf and Abdulbaki Aydin and Tevfik Bultan},
title = {Semantic Differential Repair for Input Validation and Sanitization},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {225--236},
doi = {},
year = {2014},
}
Artifact Evaluated badge
Automatic Repair for Multi-threaded Programs with Deadlock/Livelock using Maximum Satisfiability
Yiyan Lin and Sandeep S. Kulkarni
(Michigan State University, USA)
Deadlock-freedom is a major challenge in developing multi-threaded programs, as a deadlock cannot be resolved until one restarts the program (mostly by using manual intervention). To avoid the risk of blocking, a program may use the trylock operations rather than lock operations. In this case, if a thread fails to acquire a lock using trylock, since trylock is non-blocking, the thread can release acquired locks to avoid a deadlock after trylock returns. Although this approach avoids deadlocks, it may also introduce bugs such as livelock and deadlivelock. Moreover, when such bugs are identified in a program, revising the program manually is error-prone.
With this motivation, in this paper, we propose an approach for avoiding deadlocks, livelocks and deadlivelocks in the given multi-threaded program. In our approach, we first identify cyclic lock dependencies that may lead to deadlocks, livelocks or deadlivelocks. Subsequently, we map the problem of ensuring freedom from deadlocks, livelocks and deadlivelocks to the weighted partial maximum satisfiability problem. To ensure that the repaired program preserves most of original design, our approach attempts to make minimal changes to the original program.
@InProceedings{ISSTA14p237,
author = {Yiyan Lin and Sandeep S. Kulkarni},
title = {Automatic Repair for Multi-threaded Programs with Deadlock/Livelock using Maximum Satisfiability},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {237--247},
doi = {},
year = {2014},
}
Security
Fri, Jul 25, 09:20 - 10:10, Almaden Ballroom (Chair: Alex Orso)
Make It Work, Make It Right, Make It Fast: Building a Platform-Neutral Whole-System Dynamic Binary Analysis Platform
Andrew Henderson, Aravind Prakash, Lok Kwong Yan, Xunchao Hu, Xujiewen Wang, Rundong Zhou, and Heng Yin
(Syracuse University, USA; Rome Laboratory, USA)
Dynamic binary analysis is a prevalent and indispensable technique in program analysis. While several dynamic binary analysis tools and frameworks have been proposed, all suffer from one or more of the following: prohibitive performance degradation, a semantic gap between the analysis code and the program being analyzed, architecture/OS specificity, being user-mode only, lacking APIs, etc. We present DECAF, a virtual machine based, multi-target, whole-system dynamic binary analysis framework built on top of QEMU. DECAF provides Just-In-Time Virtual Machine Introspection combined with a novel TCG instruction-level tainting at bit granularity, backed by a plugin based, simple-to-use event driven programming interface. DECAF exercises fine control over the TCG instructions to accomplish on-the-fly optimizations. We present three platform-neutral plugins (Instruction Tracer, Keylogger Detector, and API Tracer) to demonstrate the ease of use and effectiveness of DECAF in writing cross-platform and system-wide analysis tools. The implementation of DECAF consists of 9,550 lines of C++ code and 10,270 lines of C code. We evaluate DECAF using the SPEC CPU2006 benchmarks and observe an average overhead of 605% for system-wide tainting and 12% for VMI.
@InProceedings{ISSTA14p248,
author = {Andrew Henderson and Aravind Prakash and Lok Kwong Yan and Xunchao Hu and Xujiewen Wang and Rundong Zhou and Heng Yin},
title = {Make It Work, Make It Right, Make It Fast: Building a Platform-Neutral Whole-System Dynamic Binary Analysis Platform},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {248--258},
doi = {},
year = {2014},
}
Artifact Evaluated badge
Automated Testing for SQL Injection Vulnerabilities: An Input Mutation Approach
Dennis Appelt, Cu Duy Nguyen,
Lionel C. Briand, and Nadia Alshahwan
(University of Luxembourg, Luxembourg; University College London, UK)
Web services are increasingly adopted in various domains, from finance and e-government to social media. As they are built on top of web technologies, they also suffer an unprecedented number of attacks and exploits, much like the Web itself. Among these attacks, those that target SQL injection vulnerabilities have consistently been top-ranked in recent years. Testing to detect such vulnerabilities before making web services public is crucial. We present in this paper an automated testing approach, namely μ4SQLi, and its underpinning set of mutation operators. μ4SQLi can produce effective inputs that lead to executable and harmful SQL statements. Executability is key, as otherwise no injection vulnerability can be exploited. Our evaluation demonstrates that the approach is effective at detecting SQL injection vulnerabilities and at producing inputs that bypass application firewalls, which are a common configuration in the real world.
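To illustrate the input-mutation idea (the operators below are generic textbook transformations, not the paper's μ4SQLi operator set), a benign parameter value can be mutated into candidate attack inputs; a test driver would send each mutant to the service and flag responses that indicate a changed SQL outcome.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative input mutation for SQL injection testing. The operators below are
// generic textbook transformations, not the mutation operators defined in the paper;
// a real harness would submit each mutant to the web service and compare responses.
public class SqlInjectionMutants {
    static List<String> mutate(String benign) {
        List<String> mutants = new ArrayList<>();
        mutants.add(benign + "'");                             // break out of a quoted literal
        mutants.add(benign + "' OR '1'='1");                   // tautology that widens the WHERE clause
        mutants.add(benign + "'; --");                         // terminate the statement, comment out the rest
        mutants.add(benign + "' UNION SELECT null, null --");  // probe for UNION-based extraction
        return mutants;
    }

    public static void main(String[] args) {
        for (String m : mutate("alice")) {
            System.out.println("candidate input: " + m);
            // A real test driver would submit 'm' as the parameter value and check
            // whether the resulting SQL statement is still executable and harmful.
        }
    }
}
```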
@InProceedings{ISSTA14p259,
author = {Dennis Appelt and Cu Duy Nguyen and Lionel C. Briand and Nadia Alshahwan},
title = {Automated Testing for SQL Injection Vulnerabilities: An Input Mutation Approach},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {259--269},
doi = {},
year = {2014},
}
UI Testing
Fri, Jul 25, 10:30 - 12:10, Almaden Ballroom (Chair: Andreas Zeller)
Reducing GUI Test Suites via Program Slicing
Stephan Arlt,
Andreas Podelski, and Martin Wehrle
(University of Luxembourg, Luxembourg; University of Freiburg, Germany; University of Basel, Switzerland)
A crucial problem in GUI testing is the identification of accurate event sequences that encode corresponding user interactions with the GUI. Ultimately, event sequences should be both feasible (i.e., executable on the GUI) and relevant (i.e., cover as much of the code as possible). So far, most work on GUI testing has focused on approaches to generate feasible event sequences. In addition, based on event dependency analyses, a recently proposed static analysis approach systematically aims at selecting both relevant and feasible event sequences. However, statically analyzing event dependencies can cause the generation of a huge number of event sequences, leading to unmanageable GUI test suites that are not executable within reasonable time. In this paper we propose a refined static analysis approach based on program slicing. On the theoretical side, our approach identifies and eliminates redundant event sequences in GUI test suites. Redundant event sequences have the property that they are guaranteed not to affect the test effectiveness. On the practical side, we have implemented a slicing-based test suite reduction algorithm that approximately identifies redundant event sequences. Our experiments on six open source GUI applications show that our reduction algorithm significantly reduces the size of GUI test suites. As a result, the overall execution time could be significantly reduced without losing test effectiveness.
@InProceedings{ISSTA14p270,
author = {Stephan Arlt and Andreas Podelski and Martin Wehrle},
title = {Reducing GUI Test Suites via Program Slicing},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {270--281},
doi = {},
year = {2014},
}
SunCat: Helping Developers Understand and Predict Performance Problems in Smartphone Applications
Adrian Nistor and Lenin Ravindranath
(Chapman University, USA; Massachusetts Institute of Technology, USA)
The number of smartphones shipped in 2014 will be four times larger than the number of PCs. Compared to PCs, smartphones have limited computing resources, and smartphone applications are more prone to performance problems. Traditionally, developers use profilers to detect performance problems by running applications with relatively large inputs. Unfortunately, for smartphone applications, the developer cannot easily control the input, because smartphone applications interact heavily with the environment.
Given a run on a small input, how can a developer detect performance problems that would occur for a run with a large input? We present SUNCAT, a novel technique that helps developers understand and predict performance problems in smartphone applications. The developer runs the application using a common input, typically small, and SUNCAT presents a prioritized list of repetition patterns that summarize the current run, plus additional information to help the developer understand how these patterns may grow in future runs with large inputs. We implemented SUNCAT for Windows Phone systems and used it to understand the performance characteristics of 29 usage scenarios in 5 popular applications. We found one performance problem that was confirmed and fixed, four problems that were confirmed, one confirmed problem that was a duplicate of an older report, and three more potential performance problems that developers agree may be improved.
@InProceedings{ISSTA14p282,
author = {Adrian Nistor and Lenin Ravindranath},
title = {SunCat: Helping Developers Understand and Predict Performance Problems in Smartphone Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {282--292},
doi = {},
year = {2014},
}
A Variability-Based Testing Approach for Synthesizing Video Sequences
José A. Galindo, Mauricio Alférez,
Mathieu Acher, Benoit Baudry, and
David Benavides
(INRIA, France; University of Rennes 1, France; University of Seville, Spain)
A key problem when developing video processing software is the difficulty to test different input combinations. In this paper, we present VANE, a variability-based testing approach to derive video sequence variants. The ideas of VANE are i) to encode in a variability model what can vary within a video sequence; ii) to exploit the variability model to generate testable configurations; iii) to synthesize variants of video sequences corresponding to configurations. VANE computes T-wise covering sets while optimizing a function over attributes. Also, we present a preliminary validation of the scalability and practicality of VANE in the context of an industrial project involving the test of video processing algorithms.
@InProceedings{ISSTA14p293,
author = {José A. Galindo and Mauricio Alférez and Mathieu Acher and Benoit Baudry and David Benavides},
title = {A Variability-Based Testing Approach for Synthesizing Video Sequences},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {293--303},
doi = {},
year = {2014},
}
Robust Test Automation using Contextual Clues
Rahulkrishna Yandrapally, Suresh Thummalapenta, Saurabh Sinha, and
Satish Chandra
(IBM Research, India; Microsoft, USA; Samsung Research, USA)
Despite the seemingly obvious advantage of test automation, significant skepticism exists in the industry regarding its cost-benefit tradeoffs. Test scripts for web applications are fragile: even small changes in the page layout can break a number of tests, requiring the expense of re-automating them. Moreover, a test script created for one browser cannot be relied upon to run on a different web browser: it requires duplicate effort to create and maintain versions of tests for a variety of browsers. Because of these hidden costs, organizations often fall back to manual testing.
We present a fresh solution to the problem of test-script fragility. Often, the root cause of test-script fragility is that, to identify UI elements on a page, tools typically record some metadata that depends on the internal representation of the page in a browser. Our technique eliminates metadata almost entirely. Instead, it identifies UI elements relative to other prominent elements on the page. The core of our technique automatically identifies a series of contextual clues that unambiguously identify a UI element, without recording anything about the internal representation.
Empirical evidence shows that our technique is highly accurate in computing contextual clues, and outperforms existing techniques in its resilience to UI changes as well as browser changes.
@InProceedings{ISSTA14p304,
author = {Rahulkrishna Yandrapally and Suresh Thummalapenta and Saurabh Sinha and Satish Chandra},
title = {Robust Test Automation using Contextual Clues},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {304--314},
doi = {},
year = {2014},
}
Efficiency and Optimizations
Fri, Jul 25, 13:30 - 15:10, Almaden Ballroom (Chair: Zhendong Su)
Efficient Mutation Analysis by Propagating and Partitioning Infected Execution States
René Just,
Michael D. Ernst, and Gordon Fraser
(University of Washington, USA; University of Sheffield, UK)
Mutation analysis evaluates a testing technique by measuring how well it detects seeded faults (mutants). Mutation analysis is hampered by inherent scalability problems: a test suite is executed for each of a large number of mutants. Despite numerous optimizations presented in the literature, this scalability issue remains, and this is one of the reasons why mutation analysis is hardly used in practice.
Whereas most previous optimizations attempted to statically reduce the number of executions or their computational overhead, this paper exploits information available only at run time to further reduce the number of executions.
First, state infection conditions can reveal, with a single test execution of the unmutated program, which mutants would lead to a different state, thus avoiding unnecessary test executions. Second, determining whether an infected execution state propagates can further reduce the number of executions. Mutants that are embedded in compound expressions may infect the state locally without affecting the outcome of the compound expression. Third, those mutants that do infect the state can be partitioned based on the resulting infected state: if two mutants lead to the same infected state, only one needs to be executed as the result of the other can be inferred.
We have implemented these optimizations in the Major mutation framework and empirically evaluated them on 14 open source programs. The optimizations reduced the mutation analysis time by 40% on average.
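The infection-without-propagation case is easy to see on a compound boolean expression (a hand-written illustration, not output of the Major framework): for the input below, mutating the right operand changes its local value but not the value of the whole expression, so running the test against that mutant cannot change the test outcome.

```java
// Illustrative infection vs. propagation check for a mutant inside a compound
// expression. With a = 5 and b = 0, mutating "b > 0" to "b >= 0" infects the local
// state (false becomes true), but the overall '||' result stays true, so the mutant's
// effect does not propagate and the test execution can be skipped.
public class InfectionDemo {
    static boolean original(int a, int b) { return a > 0 || b > 0; }
    static boolean mutant(int a, int b)   { return a > 0 || b >= 0; }  // seeded fault

    public static void main(String[] args) {
        int a = 5, b = 0;

        boolean localOriginal = b > 0;   // false
        boolean localMutant   = b >= 0;  // true  -> the mutant infects the local state
        System.out.println("infected: " + (localOriginal != localMutant));

        // ...but the compound expression evaluates identically, so nothing propagates.
        System.out.println("propagates: " + (original(a, b) != mutant(a, b)));
    }
}
```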
@InProceedings{ISSTA14p315,
author = {René Just and Michael D. Ernst and Gordon Fraser},
title = {Efficient Mutation Analysis by Propagating and Partitioning Infected Execution States},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {315--326},
doi = {},
year = {2014},
}
Lightweight Automated Detection of Unsafe Information Leakage via Exceptions
Benwen Zhang and James Clause
(University of Delaware, USA)
Unintended information leakage is one of the most common and severe problems facing modern applications. To help developers detect information leaks before they can be leveraged by attackers, we present a new static analysis-based technique for detecting a specific type of information leak: information leaks via exceptions. Because it focuses on a specific type of leak, the technique is able to be efficient, effective, and easy to use, qualities that are often lacking in more general techniques. We implemented our technique in a prototype tool, UDLD, and performed an extensive empirical evaluation using 19 real web applications. The results of the evaluation show that UDLD is both efficient and effective at detecting unsafe information leaks via exceptions; for the subjects that we considered, UDLD is the fastest among several alternative tools. Moreover, it reported more true leaks than existing state-of-the-art tools with no known false negatives and no false positives.
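A typical instance of the leak class the paper targets (a hand-written illustration, not one of the paper's subjects) is an exception message flowing into the response shown to the user, exposing internal details such as driver names or connection strings.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Illustrative information leak via an exception: internal details (driver, SQL
// state, connection string fragments) end up in the page shown to the user.
public class ExceptionLeakDemo {

    // Unsafe: the raw exception message is embedded in the user-visible response.
    static String handleRequestUnsafe(String jdbcUrl) {
        try (Connection c = DriverManager.getConnection(jdbcUrl)) {
            return "<p>connected</p>";
        } catch (SQLException e) {
            return "<p>Error: " + e.getMessage() + "</p>";   // leaks internal details
        }
    }

    // Safer: log the details server-side, show the user a generic message.
    static String handleRequestSafe(String jdbcUrl) {
        try (Connection c = DriverManager.getConnection(jdbcUrl)) {
            return "<p>connected</p>";
        } catch (SQLException e) {
            System.err.println("db error: " + e);            // stays in the server log
            return "<p>Something went wrong. Please try again later.</p>";
        }
    }

    public static void main(String[] args) {
        String badUrl = "jdbc:unknown://internal-db:5432/prod";
        System.out.println(handleRequestUnsafe(badUrl));
        System.out.println(handleRequestSafe(badUrl));
    }
}
```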
@InProceedings{ISSTA14p327,
author = {Benwen Zhang and James Clause},
title = {Lightweight Automated Detection of Unsafe Information Leakage via Exceptions},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {327--338},
doi = {},
year = {2014},
}
Integrated Energy-Directed Test Suite Optimization
Ding Li, Yuchen Jin, Cagri Sahin, James Clause, and
William G. J. Halfond
(University of Southern California, USA; University of Delaware, USA)
In situ testing techniques have become an important means of ensuring the reliability of embedded systems after they are deployed in the field. However, these techniques do not help testers optimize the energy consumption of their in situ test suites, which can needlessly waste the limited battery power of these systems. In this work, we extend prior techniques for test suite minimization in such a way as to allow testers to generate energy-efficient, minimized test suites with only minimal modifications to their existing work flow. We perform an extensive empirical evaluation of our approach using the test suites provided for real world applications. The results of the evaluation show that our technique is effective at generating, in less than one second, test suites that consume up to 95% less energy while maintaining coverage of the testing requirements.
@InProceedings{ISSTA14p339,
author = {Ding Li and Yuchen Jin and Cagri Sahin and James Clause and William G. J. Halfond},
title = {Integrated Energy-Directed Test Suite Optimization},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {339--350},
doi = {},
year = {2014},
}
Identifying Optimal Trade-Offs between CPU Time Usage and Temporal Constraints Using Search
Shiva Nejati and
Lionel C. Briand
(University of Luxembourg, Luxembourg)
Integration of software from different sources is a critical activity in many embedded systems across most industry sectors. Software integrators are responsible for producing reliable systems that fulfil various functional and performance requirements. In many situations, these requirements inversely impact one another. In particular, embedded system integrators often need to make compromises regarding some of the functional system properties to optimize the use of various resources, such as CPU time. In this paper, motivated by challenges faced by industry, we introduce a multi-objective decision support approach to help balance the minimization of CPU time usage and the satisfaction of temporal constraints in automotive systems. We develop a multi-objective, search-based optimization algorithm, specifically designed to work for large search spaces, to identify optimal trade-off solutions fulfilling these two objectives. We evaluated our algorithm by applying it to a large automotive system. Our results show that our algorithm can find solutions that are very close to the estimated ideal optimal values, and further, it finds significantly better solutions than a random strategy while being faster. Finally, our approach efficiently identifies a large number of diverse solutions, helping domain experts and other stakeholders negotiate the solutions to reach an agreement.
@InProceedings{ISSTA14p351,
author = {Shiva Nejati and Lionel C. Briand},
title = {Identifying Optimal Trade-Offs between CPU Time Usage and Temporal Constraints Using Search},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {351--361},
doi = {},
year = {2014},
}
Generation and Propagation
Fri, Jul 25, 15:40 - 17:20, Almaden Ballroom (Chair: Oksana Tkachuk)
Feedback-Driven Dynamic Invariant Discovery
Lingming Zhang, Guowei Yang, Neha Rungta, Suzette Person, and Sarfraz Khurshid
(University of Texas at Austin, USA; Texas State University, USA; NASA Ames Research Center, USA; NASA Langley Research Center, USA)
Program invariants can help software developers identify program properties that must be preserved as the software evolves; however, formulating correct invariants can be challenging. In this work, we introduce iDiscovery, a technique which leverages symbolic execution to improve the quality of dynamically discovered invariants computed by Daikon. Candidate invariants generated by Daikon are synthesized into assertions and instrumented onto the program. The instrumented code is executed symbolically to generate new test cases that are fed back to Daikon to help further refine the set of candidate invariants. This feedback loop is executed until a fix-point is reached. To mitigate the cost of symbolic execution, we present optimizations to prune the symbolic state space and to reduce the complexity of the generated path conditions. We also leverage recent advances in constraint solution reuse techniques to avoid computing results for the same constraints across iterations. Experimental results show that iDiscovery converges to a set of higher quality invariants compared to the initial set of candidate invariants in a small number of iterations.
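The overall loop can be sketched as follows; the `InvariantDetector` and `SymbolicExecutor` interfaces are hypothetical placeholders standing in for Daikon and a symbolic execution engine, not real APIs of either tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch of an iDiscovery-style feedback loop. Both interfaces are hypothetical
// stand-ins: 'detect' abstracts dynamic invariant detection over a test suite, and
// 'newTestsFor' abstracts symbolic execution of the program instrumented with the
// candidate invariants as assertions.
public class InvariantFeedbackLoop {

    interface InvariantDetector { Set<String> detect(List<String> tests); }
    interface SymbolicExecutor  { List<String> newTestsFor(Set<String> candidateInvariants); }

    static Set<String> discover(List<String> initialTests,
                                InvariantDetector detector,
                                SymbolicExecutor symexec) {
        List<String> tests = new ArrayList<>(initialTests);
        Set<String> invariants = detector.detect(tests);
        while (true) {
            // Instrument candidate invariants as assertions and explore symbolically.
            List<String> newTests = symexec.newTestsFor(invariants);
            if (newTests.isEmpty()) return invariants;       // nothing new to learn
            tests.addAll(newTests);
            Set<String> refined = detector.detect(tests);    // re-run detection on the grown suite
            if (refined.equals(invariants)) return refined;  // fix-point reached
            invariants = refined;
        }
    }
}
```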
@InProceedings{ISSTA14p362,
author = {Lingming Zhang and Guowei Yang and Neha Rungta and Suzette Person and Sarfraz Khurshid},
title = {Feedback-Driven Dynamic Invariant Discovery},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {362--372},
doi = {},
year = {2014},
}
Link: Exploiting the Web of Data to Generate Test Inputs
Leonardo Mariani, Mauro Pezzè, Oliviero Riganelli, and Mauro Santoro
(University of Milano-Bicocca, Italy; University of Lugano, Switzerland)
Applications that process complex data, such as maps, personal data, book information, and travel data, are becoming extremely common. Testing such applications is hard, because they require realistic and coherent test inputs that are expensive to generate manually and difficult to synthesize automatically. So far, research on test case generation techniques has focused mostly on generating test sequences and synthetic test inputs, and has paid little attention to the generation of complex test inputs.
This paper presents Link, a technique to automatically generate test cases for applications that process complex data. The novel idea of Link is to exploit the Web of Data to generate test data that match the semantics of the related fields and satisfy the semantic constraints that arise among interrelated fields. Link automatically analyzes the GUI of the application under test, generates a model of the required inputs, queries DBpedia to extract data that can be used in the tests, and uses the extracted data to generate complex system test inputs.
The experimental results show that Link can generate realistic and coherent test inputs that can exercise behaviors difficult to exercise with currently available techniques.
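As a concrete illustration of drawing test data from the Web of Data, the sketch below queries DBpedia's public SPARQL endpoint for coherent (city, country) pairs that could fill two interrelated form fields. It uses Apache Jena, which is an assumption made here for illustration and not necessarily the library Link is built on.

```java
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;

// Illustrative only: fetch semantically coherent (city, country) pairs from DBpedia
// that could populate two interrelated input fields of an application under test.
public class DbpediaTestData {
    public static void main(String[] args) {
        String query =
            "PREFIX dbo: <http://dbpedia.org/ontology/> " +
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
            "SELECT ?cityName ?countryName WHERE { " +
            "  ?city a dbo:City ; rdfs:label ?cityName ; dbo:country ?country . " +
            "  ?country rdfs:label ?countryName . " +
            "  FILTER (lang(?cityName) = 'en' && lang(?countryName) = 'en') " +
            "} LIMIT 5";
        QueryExecution qexec =
            QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
        try {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                // Each row is a coherent pair of values for two interrelated fields.
                System.out.println(row.getLiteral("cityName").getString()
                        + " / " + row.getLiteral("countryName").getString());
            }
        } finally {
            qexec.close();
        }
    }
}
```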
@InProceedings{ISSTA14p373,
author = {Leonardo Mariani and Mauro Pezzè and Oliviero Riganelli and Mauro Santoro},
title = {Link: Exploiting the Web of Data to Generate Test Inputs},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {373--384},
doi = {},
year = {2014},
}
Empirically Revisiting the Test Independence Assumption
Sai Zhang, Darioush Jalali, Jochen Wuttke, Kıvanç Muşlu, Wing Lam, Michael D. Ernst, and David Notkin
(University of Washington, USA)
In a test suite, all the test cases should be independent: no test should affect any other test’s result, and running the tests in any order should produce the same test results. Techniques such as test prioritization generally assume that the tests in a suite are independent. Test dependence is a little-studied phenomenon. This paper presents five results related to test dependence.
First, we characterize the test dependence that arises in practice. We studied 96 real-world dependent tests from 5 issue tracking systems. Our study shows that test dependence can be hard for programmers to identify. It also shows that test dependence can cause non-trivial consequences, such as masking program faults and leading to spurious bug reports.
Second, we formally define test dependence in terms of test suites as ordered sequences of tests along with explicit environments in which these tests are executed. We formulate the problem of detecting dependent tests and prove that a useful special case is NP-complete.
Third, guided by the study of real-world dependent tests, we propose and compare four algorithms to detect dependent tests in a test suite.
Fourth, we applied our dependent test detection algorithms to 4 real-world programs and found dependent tests in each human-written and automatically-generated test suite.
Fifth, we empirically assessed the impact of dependent tests on five test prioritization techniques. Dependent tests affect the output of all five techniques; that is, the reordered suite fails even though the original suite did not.
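A brute-force way to expose an order dependence between two tests is to run one of them in isolation and again after the other, and compare the verdicts; the paper's detection algorithms are smarter than this, but the sketch below (plain JUnit 4, with hypothetical test class and method names) shows the underlying check.

```java
import org.junit.runner.JUnitCore;
import org.junit.runner.Request;
import org.junit.runner.Result;

// Illustrative pairwise check: if a test's verdict changes when it runs after another
// test in the same JVM, the pair is order-dependent. Names are hypothetical.
public class PairwiseDependenceCheck {
    static boolean passes(Class<?> testClass, String method) {
        Result r = new JUnitCore().run(Request.method(testClass, method));
        return r.wasSuccessful();
    }

    // Run 'first', then report whether 'second' passes in the same JVM afterwards.
    static boolean secondPassesAfterFirst(Class<?> testClass, String first, String second) {
        new JUnitCore().run(Request.method(testClass, first));
        return passes(testClass, second);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> suite = Class.forName("com.example.AccountTest");  // hypothetical test class
        boolean alone = passes(suite, "testWithdraw");
        boolean afterDeposit = secondPassesAfterFirst(suite, "testDeposit", "testWithdraw");
        if (alone != afterDeposit) {
            System.out.println("testWithdraw is order-dependent on testDeposit");
        }
    }
}
```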
@InProceedings{ISSTA14p385,
author = {Sai Zhang and Darioush Jalali and Jochen Wuttke and Kıvanç Muşlu and Wing Lam and Michael D. Ernst and David Notkin},
title = {Empirically Revisiting the Test Independence Assumption},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {385--396},
doi = {},
year = {2014},
}
An Empirical Study of Injected versus Actual Interface Errors
Anna Lanzaro, Roberto Natella, Stefan Winter, Domenico Cotroneo, and Neeraj Suri
(Federico II University of Naples, Italy; TU Darmstadt, Germany)
The reuse of software components is a common practice in commercial applications and is increasingly appearing in safety-critical systems, driven in part by cost considerations. This practice puts dependability at risk, as differing operating conditions in different reuse scenarios may expose residual software faults in the components. Consequently, software fault injection techniques are used to assess how residual faults of reused software components may affect the system, and to identify appropriate counter-measures.
As fault injection in components' code suffers from a number of practical disadvantages, it is often replaced by error injection at the component interface level. However, it is still an open issue whether such injected errors are actually representative of the effects of residual faults. To this end, we propose a method for analyzing how software faults turn into interface errors, with the ultimate aim of supporting more representative interface error injection experiments.
Our analysis in the context of widely used software libraries reveals that existing interface error models are not suitable for emulating software faults, and it provides useful insights for improving the representativeness of interface error injection.
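Interface error injection perturbs the values crossing a component boundary instead of mutating the component's code. The sketch below is a generic illustration of that idea, not the injection tooling used in the study: a dynamic proxy wraps a hypothetical component interface and corrupts integer return values with a given probability.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Random;

// Generic illustration of interface-level error injection (not the paper's tooling).
public class InterfaceErrorInjector {

    interface Inventory { int stockOf(String item); }   // hypothetical component interface

    @SuppressWarnings("unchecked")
    static <T> T corruptReturns(Class<T> iface, T target, double probability) {
        Random rnd = new Random();
        InvocationHandler handler = (proxy, method, args) -> {
            Object result = method.invoke(target, args);
            if (result instanceof Integer && rnd.nextDouble() < probability) {
                return -1;                               // injected interface error: wrong return value
            }
            return result;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[]{iface}, handler);
    }

    public static void main(String[] args) {
        Inventory real = item -> 42;
        Inventory faulty = corruptReturns(Inventory.class, real, 0.5);
        System.out.println(faulty.stockOf("bolts"));     // sometimes 42, sometimes the injected -1
    }
}
```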
@InProceedings{ISSTA14p397,
author = {Anna Lanzaro and Roberto Natella and Stefan Winter and Domenico Cotroneo and Neeraj Suri},
title = {An Empirical Study of Injected versus Actual Interface Errors},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {397--408},
doi = {},
year = {2014},
}
Tool Demonstrations
Wed, Jul 23, 18:00 - 22:00, Winchester Room
Legend: An Agile DSL Toolset for Web Acceptance Testing
Tariq M. King, Gabriel Nunez, Dionny Santiago, Adam Cando, and Cody Mack
(Ultimate Software, USA)
Agile development emphasizes collaboration among customers, business analysts, domain experts, developers, and testers. However, the large scale and rapid pace of many agile projects present challenges during testing activities. Large sets of test artifacts must be comprehensible and available to various stakeholders, traceable to requirements, and easily maintainable as the software evolves. In this paper we describe Legend, a toolset that leverages a domain-specific language to streamline functional testing in agile projects. Key features of the toolset include test template generation from user stories, model-based automation, test inventory synchronization, and centralized test tagging.
@InProceedings{ISSTA14p409,
author = {Tariq M. King and Gabriel Nunez and Dionny Santiago and Adam Cando and Cody Mack},
title = {Legend: An Agile DSL Toolset for Web Acceptance Testing},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {409--412},
doi = {},
year = {2014},
}
ProCrawl: Mining Test Models from Multi-user Web Applications
Matthias Schur, Andreas Roth, and Andreas Zeller
(SAP, Germany; Saarland University, Germany)
Today's web applications demand very short release cycles and, consequently, frequent tests. Automating these tests typically requires a behavior model: a description of the states the application can be in, the transitions between these states, and the expected results. Furthermore, one needs scripts to make the abstract actions (transitions) in the model executable. However, specifying such behavior models and writing the necessary scripts manually is a hard task. We present ProCrawl (Process Crawler), a tool that automatically mines (extended) finite-state machines from (multi-user) web applications and generates executable test scripts. ProCrawl explores the behavior of the application by systematically generating program runs and observing changes on the application's user interface. The resulting models can be directly used for effective model-based testing, in particular regression testing.
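The mined behavior model is, at its core, a finite-state machine whose states abstract the observed user interface and whose transitions are user actions. A minimal sketch of such a model follows; it is illustrative only and not ProCrawl's internal representation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Minimal behavior-model sketch (not ProCrawl's internal representation):
// states are abstracted UI observations, transitions are labeled with user actions.
class BehaviorModel {
    private final Map<String, Map<String, String>> transitions = new HashMap<>();

    /** Record that performing 'action' in 'fromState' led the UI to 'toState'. */
    void observe(String fromState, String action, String toState) {
        transitions.computeIfAbsent(fromState, s -> new HashMap<>()).put(action, toState);
    }

    /** Actions already explored from a state; a crawler would prioritize the unexplored ones. */
    Set<String> exploredActions(String state) {
        return transitions.getOrDefault(state, new HashMap<>()).keySet();
    }
}
```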
@InProceedings{ISSTA14p413,
author = {Matthias Schur and Andreas Roth and Andreas Zeller},
title = {ProCrawl: Mining Test Models from Multi-user Web Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {413--416},
doi = {},
year = {2014},
}
X-PERT: A Web Application Testing Tool for Cross-Browser Inconsistency Detection
Shauvik Roy Choudhary, Mukul R. Prasad, and Alessandro Orso
(Georgia Tech, USA; Fujitsu Labs, USA)
Web applications are popular among developers because of the ease of development and deployment through the ubiquitous web browsing platform. However, differences in a web application's execution across different web browsers manifest as Cross-browser Inconsistencies (XBIs), which are a serious concern for web developers. Testing for XBIs manually is a laborious and error-prone process. In this demo we present X-PERT, which is a tool to identify XBIs in web applications automatically, without requiring any effort from the developer. X-PERT implements a comprehensive technique to identify XBIs and has been found to be effective in detecting real-world XBIs in our empirical evaluation. The source code of X-PERT and XBI reports from our evaluation are available at http://gatech.github.io/xpert.
@InProceedings{ISSTA14p417,
author = {Shauvik Roy Choudhary and Mukul R. Prasad and Alessandro Orso},
title = {X-PERT: A Web Application Testing Tool for Cross-Browser Inconsistency Detection},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {417--420},
doi = {},
year = {2014},
}
Extending a Search-Based Test Generator with Adaptive Dynamic Symbolic Execution
Juan Pablo Galeotti, Gordon Fraser, and Andrea Arcuri
(Saarland University, Germany; University of Sheffield, UK; Simula Research Laboratory, Norway)
Automatic unit test generation aims to support developers by alleviating the burden of test writing. Different techniques have been proposed over the years, each with distinct limitations. To overcome these limitations, we present an extension to the EvoSuite unit test generator that combines two of the most popular techniques for test case generation: Search-Based Software Testing (SBST) and Dynamic Symbolic Execution (DSE). A novel integration of DSE as a step of local improvement in a genetic algorithm results in an adaptive approach, such that the best test generation technique for the problem at hand is favoured, resulting in overall higher code coverage.
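The combination can be pictured as a memetic search: a genetic loop over test suites, with DSE applied occasionally as a local-improvement operator and its application rate adapted according to whether it actually improved coverage. The skeleton below is a simplification with hypothetical helpers, not EvoSuite's actual code.

```java
// Simplified memetic-search skeleton (hypothetical helpers; not EvoSuite's code).
abstract class HybridTestGenerator<Suite> {
    abstract Suite[] evolve(Suite[] population);   // selection, crossover, mutation
    abstract Suite best(Suite[] population);       // highest-coverage individual
    abstract Suite applyDse(Suite suite);          // DSE on concrete executions of the suite
    abstract double coverage(Suite suite);

    double dseRate = 0.5;                          // adapted over the course of the search

    Suite[] step(Suite[] population) {
        population = evolve(population);
        if (Math.random() < dseRate) {
            Suite candidate = applyDse(best(population));
            boolean improved = coverage(candidate) > coverage(best(population));
            dseRate = improved ? Math.min(1.0, dseRate + 0.1)   // reward DSE when it helps
                               : Math.max(0.1, dseRate - 0.1);  // back off when it does not
            if (improved) population[0] = candidate;            // keep the improved individual
        }
        return population;
    }
}
```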
@InProceedings{ISSTA14p421,
author = {Juan Pablo Galeotti and Gordon Fraser and Andrea Arcuri},
title = {Extending a Search-Based Test Generator with Adaptive Dynamic Symbolic Execution},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {421--424},
doi = {},
year = {2014},
}
Canalyze: A Static Bug-Finding Tool for C Programs
Zhenbo Xu, Jian Zhang, Zhongxing Xu, and Jiteng Wang
(University of Science and Technology of China, China; Institute of Software at Chinese Academy of Sciences, China; Beijing University of Posts and Telecommunications, China)
Symbolic analysis is a commonly used approach for static bug finding. It usually performs a precise path-by-path symbolic simulation from program inputs. A major challenge is its scalability and precision on interprocedural analysis: the former limits its application to large programs, and the latter may lead to many false alarms.
This paper presents a flexible, scalable, and practical static bug detection tool, called Canalyze, for C programs. The flexibility is embodied in our modular design, which supports constraint solvers and interprocedural analyses at different precision levels. Based on these options, one can enable the less precise options to achieve a more scalable analysis, or the more time-consuming options to perform a more precise analysis. Our tool is also practical enough to analyze real-world applications: it has been applied to industrial systems and open source programs such as httpd and lighttpd, and hundreds of newly found bugs were confirmed by the maintainers of our benchmarks.
@InProceedings{ISSTA14p425,
author = {Zhenbo Xu and Jian Zhang and Zhongxing Xu and Jiteng Wang},
title = {Canalyze: A Static Bug-Finding Tool for C Programs},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {425--428},
doi = {},
year = {2014},
}
MuCheck: An Extensible Tool for Mutation Testing of Haskell Programs
Duc Le, Mohammad Amin Alipour, Rahul Gopinath, and Alex Groce
(Oregon State University, USA)
This paper presents MuCheck, a mutation testing tool for Haskell programs. MuCheck is a counterpart to the widely used QuickCheck random testing tool for functional programs, and can be used to evaluate the efficacy of QuickCheck property definitions. The tool implements mutation operators that are specifically designed for functional programs, and makes use of the type system of Haskell to achieve a more relevant set of mutants than otherwise possible. Mutation coverage is particularly valuable for functional programs due to highly compact code, referential transparency, and clean semantics; these make augmenting a test suite or specification based on surviving mutants a practical method for improved testing.
@InProceedings{ISSTA14p429,
author = {Duc Le and Mohammad Amin Alipour and Rahul Gopinath and Alex Groce},
title = {MuCheck: An Extensible Tool for Mutation Testing of Haskell Programs},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {429--432},
doi = {},
year = {2014},
}
The Major Mutation Framework: Efficient and Scalable Mutation Analysis for Java
René Just
(University of Washington, USA)
Mutation analysis seeds artificial faults (mutants) into a program and evaluates testing techniques by measuring how well they detect those mutants. Mutation analysis is well-established in software engineering research but hardly used in practice due to inherent scalability problems and the lack of proper tool support. In response to those challenges, this paper presents Major, a framework for mutation analysis and fault seeding. Major provides a compiler-integrated mutator and a mutation analyzer for JUnit tests.
Major implements a large set of optimizations to enable efficient and scalable mutation analysis of large software systems. It has already been applied to programs with more than 200,000 lines of code and 150,000 mutants. Moreover, Major features its own domain-specific language and is designed to be highly configurable to support fundamental research in software engineering. Due to its efficiency and flexibility, the Major mutation framework is suitable for the application of mutation analysis in research and practice. It is publicly available at http://mutation-testing.org.
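At its core, mutation analysis runs the test suite against every mutant and reports the fraction of mutants killed. The sketch below shows that core loop with JUnit 4; it is a generic illustration (assuming each mutant is exposed through its own class loader) and does not use Major's compiler-integrated driver.

```java
import java.util.List;
import org.junit.runner.JUnitCore;
import org.junit.runner.Result;

// Generic mutation-analysis core loop (illustration only; not Major's driver API).
// Assumption: each mutant is a pre-built class loader exposing the mutated classes.
public class MutationScore {
    public static double score(List<ClassLoader> mutants, String testClassName) throws Exception {
        int killed = 0;
        for (ClassLoader mutant : mutants) {
            Class<?> testClass = Class.forName(testClassName, true, mutant);
            Result result = new JUnitCore().run(testClass);   // run the full JUnit suite on this mutant
            if (!result.wasSuccessful()) {
                killed++;                                     // at least one test detects the mutant
            }
        }
        return mutants.isEmpty() ? 0.0 : (double) killed / mutants.size();
    }
}
```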
@InProceedings{ISSTA14p433,
author = {René Just},
title = {The Major Mutation Framework: Efficient and Scalable Mutation Analysis for Java},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {433--436},
doi = {},
year = {2014},
}
Defects4J: A Database of Existing Faults to Enable Controlled Testing Studies for Java Programs
René Just, Darioush Jalali, and Michael D. Ernst
(University of Washington, USA)
Empirical studies in software testing research may not be comparable, reproducible, or characteristic of practice. One reason is that real bugs are too infrequently used in software testing research. Extracting and reproducing real bugs is challenging, and as a result hand-seeded faults or mutants are commonly used as a substitute.
This paper presents Defects4J, a database and extensible framework providing real bugs to enable reproducible studies in software testing research. The initial version of Defects4J contains 357 real bugs from 5 real-world open source programs. Each real bug is accompanied by a comprehensive test suite that can expose (demonstrate) that bug. Defects4J is extensible and builds on top of each program's version control system. Once a program is configured in Defects4J, new bugs can be added to the database with little or no effort. Defects4J features a framework to easily access faulty and fixed program versions and corresponding test suites. This framework also provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies. Defects4J is publicly available at http://defects4j.org.
@InProceedings{ISSTA14p437,
author = {René Just and Darioush Jalali and Michael D. Ernst},
title = {Defects4J: A Database of Existing Faults to Enable Controlled Testing Studies for Java Programs},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {437--440},
doi = {},
year = {2014},
}
A Format String Checker for Java
Konstantin Weitz, Siwakorn Srisakaokul, Gene Kim, and Michael D. Ernst
(University of Washington, USA)
Java supports format strings, but their use is error prone because: Java's type system does not find any but the most trivial mistakes, Java's format methods fail silently, and format methods are often executed infrequently.
This paper presents the Format String Checker that is based on the format string type system presented in [3]. The Format String Checker guarantees that calls to Java's Formatter API will not throw exceptions.
We evaluate the Format String Checker on 6 large and well-maintained open-source projects. Format string bugs are common in practice (we found 104 bugs), and the annotation burden on the user of our type system is low (on average, for every bug found, only 1.0 annotations need to be written).
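The kind of defect targeted here compiles without complaint and only fails when the formatting call is actually executed, which is exactly why infrequently executed format calls hide bugs. A minimal self-contained example:

```java
import java.util.IllegalFormatException;

public class FormatBug {
    public static void main(String[] args) {
        int errorCount = 3;
        try {
            // Compiles fine, but "%d" cannot format a String: the call throws
            // IllegalFormatConversionException only when this line is reached.
            System.out.println(String.format("Found %d errors in %d files", "three", errorCount));
        } catch (IllegalFormatException e) {
            System.out.println("runtime format failure: " + e);
        }
    }
}
```

A checker of this kind rejects such a call at compile time instead of letting it fail at run time.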
@InProceedings{ISSTA14p441,
author = {Konstantin Weitz and Siwakorn Srisakaokul and Gene Kim and Michael D. Ernst},
title = {A Format String Checker for Java},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {441--444},
doi = {},
year = {2014},
}
Constructing Coding Duels in Pex4Fun and Code Hunt
Nikolai Tillmann, Jonathan de Halleux, Tao Xie, and Judith Bishop
(Microsoft Research, USA; University of Illinois at Urbana-Champaign, USA)
Pex is an automatic white-box test-generation tool for .NET. We have established that games can be built on top of Pex to open the tool to students and to the general public. In particular, we have released Pex4Fun (www.pexforfun.com) and its successor Code Hunt (www.codehunt.com) as web-based educational gaming environments for teaching and learning programming and software engineering. In Pex4Fun and Code Hunt, the main game type is a coding duel, where a player writes code in a method to achieve the same functionality as the secret method implementation, based on feedback provided by the underlying Pex tool. Players iteratively modify their code to match the functional behavior of the secret method. The scope of duels extends from the simplest one-line method to those including advanced concepts such as writing parameterized unit tests and code contracts. We have also used the game type for competitions with thousands of players, and have found that it differentiates well between beginners and top coders. This tool demonstration shows how coding duels in Pex4Fun and Code Hunt can be constructed and used in teaching and training programming and software engineering.
@InProceedings{ISSTA14p445,
author = {Nikolai Tillmann and Jonathan de Halleux and Tao Xie and Judith Bishop},
title = {Constructing Coding Duels in Pex4Fun and Code Hunt},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {445--448},
doi = {},
year = {2014},
}
Doctoral Symposium
Tue, Jul 22, 11:20 - 14:50, Pacific Room
Reusing Constraint Proofs for Scalable Program Analysis
Meixian Chen
(University of Lugano, Switzerland)
Despite the recent advances of research and technology in the area of automated constraint solvers, constraint solving remains a major bottleneck for the scalability of many program analysis techniques. Recent studies indicate that this problem can be mitigated by caching the proofs for constraints that occur repeatedly during the analysis of the same or similar programs, and they show preliminary evidence that this can significantly reduce analysis time.
My PhD research draws on these initial results and aims to bring the technology for storing and reusing constraint proofs to an entirely new stage of maturity. We believe that equivalent constraints occur frequently across programs, and we aim to turn the problem of solving constraints into a fast and reliable search over proofs shared among different projects and teams through distributed data stores.
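The reuse idea rests on normalizing each constraint into a canonical key and consulting a store of previously computed verdicts before invoking a solver. A minimal in-memory sketch follows; a real realization would canonicalize variable names properly and back the map with a distributed store, which is precisely the subject of this research.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal sketch of constraint-proof reuse: canonicalize, then look up before solving.
// The canonicalization and the underlying store are deliberately simplified.
class ProofCache {
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    /** Returns the cached SAT/UNSAT verdict, or solves and stores it on a miss. */
    boolean isSatisfiable(String constraint, Function<String, Boolean> solver) {
        String key = canonicalize(constraint);
        return cache.computeIfAbsent(key, k -> solver.apply(constraint));
    }

    /** Placeholder canonicalization: a real one would rename variables to a normal form. */
    private String canonicalize(String constraint) {
        return constraint.replaceAll("\\s+", "");
    }
}
```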
@InProceedings{ISSTA14p449,
author = {Meixian Chen},
title = {Reusing Constraint Proofs for Scalable Program Analysis},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {449--452},
doi = {},
year = {2014},
}
Effective Test Generation and Adequacy Assessment for JavaScript-Based Web Applications
Shabnam Mirshokraie
(University of British Columbia, Canada)
Modern web applications rely heavily on JavaScript and client-side runtime manipulation of the DOM (Document Object Model) tree. However, JavaScript is loosely typed, dynamic, and challenging to analyze and test. We propose an automated technique to generate regression test cases at two complementary levels: (1) individual JavaScript functions, and (2) DOM event sequences. Moreover, to assess the quality of the test cases we propose a mutation testing technique that leverages static and dynamic program analysis to guide the mutation generation process towards parts of the code that are error-prone or likely to influence the program's output.
@InProceedings{ISSTA14p453,
author = {Shabnam Mirshokraie},
title = {Effective Test Generation and Adequacy Assessment for JavaScript-Based Web Applications},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {453--456},
doi = {},
year = {2014},
}
Efficient Statistical Debugging via Hierarchical Instrumentation
Zhiqiang Zuo
(National University of Singapore, Singapore)
Debugging is known to be a notoriously painstaking and time-consuming task. As one major family of automated debugging techniques, statistical debugging approaches have been well investigated over the past decade to assist in debugging. All these approaches instrument the entire buggy program to produce execution profiles for debugging; consequently, they often incur hefty instrumentation and analysis cost. However, since the major part of the program code is in fact error-free, full-scale program instrumentation is wasteful and unnecessary. In this doctoral research, a novel hierarchical instrumentation (HI) technique is devised to perform selective instrumentation so as to make statistical debugging more efficient while upholding its effectiveness. We apply HI to two different categories of statistical debugging: in-house and cooperative debugging. The experiments validate that HI can greatly improve the efficiency of debugging.
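Hierarchical instrumentation can be pictured as a two-phase process: a cheap, coarse-grained (e.g., function-level) profiling pass over the whole program, followed by fine-grained predicate instrumentation only in the regions that the coarse data implicates. The sketch below leaves the profiling and ranking steps abstract and is an illustration, not the tool described here.

```java
import java.util.List;

// Schematic two-phase sketch of hierarchical instrumentation; the profiling,
// ranking, and instrumentation steps are left abstract.
abstract class HierarchicalDebugger<Program, Function> {
    abstract List<Function> coarseProfile(Program p);                  // phase 1: cheap function-level profiling
    abstract List<Function> rankSuspicious(List<Function> profiled);   // statistical ranking over coarse data
    abstract void instrumentPredicates(Program p, List<Function> fs);  // phase 2: fine-grained, selective

    void debug(Program p, int k) {
        List<Function> ranked = rankSuspicious(coarseProfile(p));
        List<Function> suspects = ranked.subList(0, Math.min(k, ranked.size()));
        instrumentPredicates(p, suspects);   // expensive instrumentation only where the coarse data points
    }
}
```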
@InProceedings{ISSTA14p457,
author = {Zhiqiang Zuo},
title = {Efficient Statistical Debugging via Hierarchical Instrumentation},
booktitle = {Proc.\ ISSTA},
publisher = {ACM},
pages = {457--460},
doi = {},
year = {2014},
}