2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE),
November 11-15, 2013,
Palo Alto, USA
Preface
Message from the Chairs
This conference publication contains the proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE 2013), held at the Crowne Plaza Cabana Hotel in Palo Alto, California on November 11–15, 2013. The IEEE/ACM Automated Software Engineering (ASE) Conference series is the premier research forum for automating software engineering. Each year, it brings together researchers and practitioners from academia and industry to discuss foundations, techniques and tools for automating the analysis, design, implementation, testing, and maintenance of large software systems.
Invited Papers
BOOM: Experiences in Language and Tool Design for Distributed Systems (Keynote)
Joseph M. Hellerstein
(University of California at Berkeley, USA)
With the rapid expansion of cloud infrastructure and mobile devices, distributed systems have quickly emerged as a dominant computing platform. Distributed systems bring significant complexity to programming, due to platform issues including asynchrony, concurrency, and partial failure. Meanwhile, scalable distributed infrastructure—notably "NoSQL" systems—has put additional burdens on programmers by sacrificing traditional infrastructure contracts like linearizable or transactional I/O in favor of high availability. A growing segment of the developer community needs to deal with these issues today, and for the most part developers are still using languages and tools designed for sequential computation on tightly coupled architectures. This has led to software that is increasingly hard to test and hard to trust.
Over the past 5 years, the BOOM project at Berkeley has focused on making it easier to write correct and maintainable code for distributed systems. Our work has taken a number of forms, including the development of the Bloom programming language for distributed systems, tools for testing and checking distributed programs, and the CALM Theorem, which connects programmer level concerns of determinism to system-level concerns about the need for distributed coordination. This talk will reflect on this work, and highlight opportunities for improved collaboration between the software engineering and distributed systems research communities.
@InProceedings{ASE13p1,
author = {Joseph M. Hellerstein},
title = {BOOM: Experiences in Language and Tool Design for Distributed Systems (Keynote)},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1--1},
doi = {},
year = {2013},
}
The Challenges of Verification and Validation of Automated Planning Systems (Keynote)
Jeremy Frank
(NASA Ames Research Center, USA)
Mission planning is central to space mission operations, and has benefited from advances in model-based planning software. A model is a description of the objects, actions, constraints and preferences that the planner reasons over to generate plans. Developing, verifying and validating a planning model is, however, a difficult task. Mission planning constraints and preferences arise from many sources, including simulators and engineering specification documents. As mission constraints evolve, planning domain modelers must add and update model constraints efficiently using the available source data, catching errors quickly, and correcting the model. The consequences of erroneous models are very high, especially in the space operations environment.
We first describe the space operations environment, particularly the role of the mission planning system. We then describe model-based planning, and briefly review the current state of the practice in designing model-based mission planning tools and the challenges facing model developers. We then describe an Interactive Model Development Environment (IMDE) approach to developing mission planning systems. This approach integrates modeling and simulation environments to reduce model editing time, generate simulations automatically to evaluate plans, and identify modeling errors automatically by evaluating simulation output. The IMDE approach was tested on a small subset of the Lunar Atmosphere and Dust Environment Explorer (LADEE) flight software to demonstrate how to develop the LADEE mission planning system.
@InProceedings{ASE13p2,
author = {Jeremy Frank},
title = {The Challenges of Verification and Validation of Automated Planning Systems (Keynote)},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {2--2},
doi = {},
year = {2013},
}
Big Problems in Industry (Panel)
John Penix
(Google, USA)
Software Engineering in practice deals with scale in a variety of
dimensions. We build large-scale systems operating on vast amounts of
data. We have millions of customers with billions of queries and
transactions. We have distributed teams making thousands of changes,
running millions of tests and releasing multiple times per day.
These dimensions of scale interact to provide challenges for software
development tools and processes. The panelists will describe the
challenging aspects of scale in their specific problem domains and
discuss which software engineering methods work and which leave room
for improvement.
@InProceedings{ASE13p3,
author = {John Penix},
title = {Big Problems in Industry (Panel)},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {3--3},
doi = {},
year = {2013},
}
Technical Research Track
Concurrency
Round-Up: Runtime Checking Quasi Linearizability of Concurrent Data Structures
Lu Zhang, Arijit Chattopadhyay, and Chao Wang
(Virginia Tech, USA)
We propose a new method for runtime checking of a relaxed consistency
property called quasi linearizability for concurrent data
structures. Quasi linearizability generalizes the standard notion of
linearizability by intentionally introducing nondeterminism
into the parallel computations and exploiting such nondeterminism to
improve performance. However, ensuring the quantitative aspects
of this correctness condition in low-level code is a
difficult task.
Our method is the first fully automated method for checking quasi
linearizability in the unmodified C/C++ code of concurrent data
structures. It guarantees that all the reported quasi linearizability
violations are real violations.
We have implemented our method in a software tool based on LLVM and a
concurrency testing tool called Inspect.
Our experimental evaluation shows that the new method is effective in
detecting quasi linearizability violations in the source code of
concurrent data structures.
@InProceedings{ASE13p4,
author = {Lu Zhang and Arijit Chattopadhyay and Chao Wang},
title = {Round-Up: Runtime Checking Quasi Linearizability of Concurrent Data Structures},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {4--14},
doi = {},
year = {2013},
}
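For readers unfamiliar with the relaxed property, the following is a minimal, self-contained Python sketch (not the authors' tool) of one common flavor of quasi linearizability for queues: each dequeued element may deviate from its strict FIFO position by at most a quantitative bound k.

def is_k_fifo(enqueue_order, dequeue_order, k):
    """Check a simple quasi-linearizability condition for a queue:
    every element is dequeued within k positions of its FIFO position.
    Assumes every enqueued element is eventually dequeued exactly once."""
    fifo_pos = {v: i for i, v in enumerate(enqueue_order)}
    if sorted(enqueue_order) != sorted(dequeue_order):
        return False  # lost or duplicated elements are always violations
    return all(abs(fifo_pos[v] - i) <= k for i, v in enumerate(dequeue_order))

# Example: a 1-quasi queue may swap adjacent elements but not reorder further.
assert is_k_fifo([1, 2, 3, 4], [2, 1, 4, 3], k=1)
assert not is_k_fifo([1, 2, 3, 4], [4, 1, 2, 3], k=1)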
Constraint-Based Automatic Symmetry Detection
Shao Jie Zhang, Jun Sun, Chengnian Sun,
Yang Liu, Junwei Ma, and Jin Song Dong
(Singapore University of Technology and Design, Singapore; National University of Singapore, Singapore; Nanyang Technological University, Singapore)
We present an automatic approach to detecting symmetry relations for general concurrent models. Despite the success of symmetry reduction in mitigating the state explosion problem, one essential step towards its soundness and effectiveness, i.e., how to discover sufficient symmetries with the least human effort, is often either overlooked or oversimplified. In this work, we show how a concurrent model can be viewed as a constraint satisfaction problem (CSP), and present an algorithm capable of detecting symmetries arising from the CSP which induce automorphisms of the model. To the best of our knowledge, our method is the first approach that can automatically detect both process and data symmetries as demonstrated via a number of systems.
@InProceedings{ASE13p15,
author = {Shao Jie Zhang and Jun Sun and Chengnian Sun and Yang Liu and Junwei Ma and Jin Song Dong},
title = {Constraint-Based Automatic Symmetry Detection},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {15--25},
doi = {},
year = {2013},
}
Proving MCAPI Executions Are Correct using SMT
Yu Huang, Eric Mercer, and Jay McCarthy
(Brigham Young University, USA)
Asynchronous message passing is an important paradigm in writing applications for embedded heterogeneous multicore systems. The Multicore Association (MCA), an industry consortium promoting
multicore technology, is working to standardize message passing into a single API, MCAPI, for bare-metal implementation and portability across platforms. Correctness in such an API is difficult to reason about manually, and testing against reference solutions is equally difficult, as reference solutions implement an unknown set of allowed behaviors, and programmers have no way to directly control API internals to expose or reproduce errors. This paper provides a way to encode an MCAPI execution as a Satisfiability Modulo Theories (SMT) problem which, if satisfiable, yields a feasible execution schedule on the same trace that resolves non-determinism in the MCAPI runtime in a way that fails user-provided assertions. The paper proves the problem is NP-complete. The encoding is useful for test, debug, and verification of MCAPI program execution. The novelty in
the encoding is the direct use of match pairs (potential send and receive couplings). Match-pair encoding for MCAPI executions, when compared to other encoding strategies, is simpler to reason about, results in significantly fewer terms in the SMT problem, and captures feasible behaviors that are ignored in previously published techniques. Further, to our knowledge, this is the first SMT encoding that is able to run in infinite-buffer semantics, meaning the runtime has unlimited internal buffering as opposed to no internal buffering. Results demonstrate that the SMT encoding, restricted to zero-buffer semantics, uses fewer clauses when compared to another zero-buffer technique, and it runs faster and uses less memory. As a result the encoding scales well for programs with high levels of non-determinism in how sends and receives may potentially match.
@InProceedings{ASE13p26,
author = {Yu Huang and Eric Mercer and Jay McCarthy},
title = {Proving MCAPI Executions Are Correct using SMT},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {26--36},
doi = {},
year = {2013},
}
Efficient Data Race Prediction with Incremental Reasoning on Time-Stamped Lock History
Malay K. Ganai
(NEC Labs, USA)
We present an efficient data race prediction algorithm that uses lock-reordering-based incremental search on time-stamped lock histories to solve multiple races effectively. We balance prediction accuracy, coverage, and performance with a specially designed pairwise reachability algorithm that can store and re-use past search results, thereby amortizing the cost of reasoning over redundant and overlapping search space. Compared to graph-based search algorithms, our algorithm incurs much smaller overhead due to amortization, and can potentially be used while a program under test is executing. To demonstrate such a possibility, we implemented our approach as an (iPA) module in a predictive testing framework. Our approach can handle traces with a few hundred to half a million events, and predict known/unknown real data races with a performance penalty of less than 4
@InProceedings{ASE13p37,
author = {Malay K. Ganai},
title = {Efficient Data Race Prediction with Incremental Reasoning on Time-Stamped Lock History},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {37--47},
doi = {},
year = {2013},
}
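As background for readers, the following is a minimal Python sketch of the classical lockset-style predicate for race candidates (two accesses to the same location from different threads, at least one write, no common lock held); the paper's contribution lies in doing this kind of reasoning incrementally over time-stamped lock histories, which this toy baseline does not attempt.

from collections import namedtuple
from itertools import combinations

# A simplified access event: thread id, memory location, access type, locks held.
Access = namedtuple("Access", "tid loc is_write locks")

def race_candidates(trace):
    """Return pairs of accesses that are potential data races under the
    basic lockset criterion: same location, different threads, at least
    one write, and disjoint lock sets."""
    return [(a, b) for a, b in combinations(trace, 2)
            if a.loc == b.loc and a.tid != b.tid
            and (a.is_write or b.is_write)
            and not (a.locks & b.locks)]

trace = [Access(1, "x", True, frozenset({"L1"})),
         Access(2, "x", False, frozenset({"L1"})),   # protected: no candidate
         Access(2, "x", True, frozenset())]          # unprotected write: candidate
print(race_candidates(trace))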
Dynamic Analysis
PIEtrace: Platform Independent Executable Trace
Yonghwi Kwon, Xiangyu Zhang, and Dongyan Xu
(Purdue University, USA)
To improve software dependability, a large number
of software engineering tools have been developed over the years.
Many of them are difficult to apply in practice because their
system and library requirements are incompatible with those of
the subject software. We propose a technique called platform
independent executable trace. Our technique traces and virtualizes
a regular program execution that is platform dependent, and
generates a stand-alone program called the trace program. Running
the trace program re-generates the original execution. More
importantly, trace program execution is completely independent
of the underlying operating system and libraries such that it can
be compiled and executed on arbitrary platforms. As such, it
can be analyzed by a third party tool on a platform preferred
by the tool. We have implemented the technique on x86 and
sensor platforms. We show that buggy executions of 10 real-world
Windows and sensor applications can be traced and virtualized,
and later analyzed by existing Linux tools. We also
demonstrate how the technique can be used in cross-platform
malware analysis.
@InProceedings{ASE13p48,
author = {Yonghwi Kwon and Xiangyu Zhang and Dongyan Xu},
title = {PIEtrace: Platform Independent Executable Trace},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {48--58},
doi = {},
year = {2013},
}
ACM SIGSOFT Distinguished Paper Award
Improving Efficiency of Dynamic Analysis with Dynamic Dependence Summaries
Vijay Krishna Palepu, Guoqing Xu, and
James A. Jones
(University of California at Irvine, USA)
Modern applications make heavy use of third-party libraries and components, which poses new challenges for efficient dynamic analysis. To perform such analyses, transitively dependent components at all layers of the call stack must be monitored and analyzed, which may be prohibitively expensive for systems with large libraries and components. To address this expense, we record, summarize, and reuse dynamic dataflows between inputs and outputs of components, based on dynamic control and data traces. These summarized dataflows are computed at a fine-grained instruction level; the result of which we call “dynamic dependence summaries.” Although static summaries have been proposed, to the best of our knowledge, this work presents the first technique for dynamic dependence summaries. The efficiency benefits of such summarization may come at a cost in accuracy. As such, we evaluate the degree of accuracy loss and the degree of efficiency gain when using dynamic dependence summaries of library methods. On five large programs from the DaCapo benchmark (for which no existing whole-program dynamic dependence analyses have been shown to scale) and 21 versions of NANOXML, the summarized dependence analysis provided 90% accuracy and a speed-up of 100% (i.e., ×2), on average, when compared to traditional exhaustive dynamic dependence analysis.
@InProceedings{ASE13p59,
author = {Vijay Krishna Palepu and Guoqing Xu and James A. Jones},
title = {Improving Efficiency of Dynamic Analysis with Dynamic Dependence Summaries},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {59--69},
doi = {},
year = {2013},
}
Efficient Parametric Runtime Verification with Deterministic String Rewriting
Patrick Meredith and
Grigore Roşu
(University of Illinois at Urbana-Champaign, USA)
Early efforts in runtime verification show
that parametric regular and temporal logic specifications
can be monitored efficiently. These approaches, however,
have limited expressiveness: their specifications always
reduce to monitors with finite state.
More recent developments
showed that parametric context-free properties can be efficiently
monitored with overheads generally lower than 12-15%.
While context-free grammars are more expressive than finite-state
languages, they still do not allow every computable safety property.
This paper presents a monitor synthesis algorithm for string rewriting
systems (SRS).
SRSs are well known to be Turing complete, allowing
for the formal specification of any computable safety property.
Earlier attempts at Turing complete monitoring have been relatively
inefficient. This paper demonstrates that monitoring parametric
SRSs is practical. The presented algorithm uses
a modified version of Aho-Corasick string searching
for quick pattern matching with an incremental rewriting approach
that avoids reexamining parts of the string known to contain no
redexes.
@InProceedings{ASE13p70,
author = {Patrick Meredith and Grigore Roşu},
title = {Efficient Parametric Runtime Verification with Deterministic String Rewriting},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {70--80},
doi = {},
year = {2013},
}
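To give a flavor of SRS-based monitoring, here is a deliberately naive Python sketch (no Aho-Corasick matching and no incremental redex tracking, which are what make the paper's algorithm efficient): events are appended to a string, rewrite rules are applied to a fixpoint, and the monitor reports a violation when a designated error symbol appears. The rules and event alphabet below are invented for illustration.

def rewrite_to_fixpoint(s, rules, max_steps=10_000):
    """Apply the first matching rule anywhere in s, repeatedly.
    rules is a list of (lhs, rhs) string pairs."""
    for _ in range(max_steps):
        for lhs, rhs in rules:
            if lhs in s:
                s = s.replace(lhs, rhs, 1)
                break
        else:
            return s  # no rule applies: normal form reached
    raise RuntimeError("rewriting did not terminate within max_steps")

# Toy safety property: every acquire 'a' must be released 'r' before close 'c'.
rules = [("ar", ""),      # a matched acquire/release pair disappears
         ("ac", "ERR")]   # closing while a lock is still held is an error

def monitor(events):
    state = ""
    for e in events:
        state = rewrite_to_fixpoint(state + e, rules)
        if "ERR" in state:
            return "violation"
    return "ok"

print(monitor("ararc"))  # ok: all acquires released before close
print(monitor("aarc"))   # violation: one acquire never released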
Identifying Execution Points for Dynamic Analyses
William N. Sumner and
Xiangyu Zhang
(Simon Fraser University, Canada; Purdue University, USA)
Dynamic analyses rely on the ability to identify points within or across
executions.
In spite of this being a core task for dynamic analyses, new solutions are
frequently developed without an awareness of existing solutions,
their strengths, their weaknesses, or their caveats.
This paper surveys the existing approaches for identifying execution points and
examines their analytical and empirical properties that researchers and
developers should be aware of when using them within an analysis.
In addition, based on limitations in precision, correctness, and efficiency
for techniques that identify corresponding execution points across multiple
executions, we designed and implemented a new technique, Precise Execution Point
IDs.
This technique avoids correctness and precision issues in prior
solutions, enabling analyses that use our approach to also produce more correct
results.
Empirical comparison with the surveyed techniques shows that our approach has
25% overhead on average, several times less than existing solutions.
@InProceedings{ASE13p81,
author = {William N. Sumner and Xiangyu Zhang},
title = {Identifying Execution Points for Dynamic Analyses},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {81--91},
doi = {},
year = {2013},
}
Testing
Operator-Based and Random Mutant Selection: Better Together
Lingming Zhang, Milos Gligoric,
Darko Marinov, and Sarfraz Khurshid
(University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA)
Mutation testing is a powerful methodology for evaluating the quality of a test suite. However, the methodology is also very costly, as the test suite may have to be executed for each mutant. Selective mutation testing is a well-studied technique to reduce this cost by selecting a subset of all mutants, which would otherwise have to be considered in their entirety. Two common approaches are operator-based mutant selection, which only generates mutants using a subset of mutation operators, and random mutant selection, which selects a subset of mutants generated using all mutation operators. While each of the two approaches provides some reduction in the number of mutants to execute, applying either of the two to medium-sized, real-world programs can still generate a huge number of mutants, which makes their execution too expensive. This paper presents eight random sampling strategies defined on top of operator-based mutant selection, and empirically validates that operator-based selection and random selection can be applied in tandem to further reduce the cost of mutation testing. The experimental results show that even sampling only 5% of mutants generated by operator-based selection can still provide precise mutation testing results, while reducing the average mutation testing time to 6.54% (i.e., on average less than 5 minutes for this study).
@InProceedings{ASE13p92,
author = {Lingming Zhang and Milos Gligoric and Darko Marinov and Sarfraz Khurshid},
title = {Operator-Based and Random Mutant Selection: Better Together},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {92--102},
doi = {},
year = {2013},
}
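A minimal sketch of the combined strategy (operator-based selection followed by random sampling), with made-up operator names; the paper evaluates eight concrete sampling strategies, which this toy code does not reproduce.

import random

def select_mutants(all_mutants, selected_operators, sampling_ratio, seed=0):
    """Keep only mutants produced by the chosen operators, then randomly
    sample a fraction of them (e.g., 0.05 for 5%)."""
    pool = [m for m in all_mutants if m["operator"] in selected_operators]
    k = max(1, round(sampling_ratio * len(pool)))
    return random.Random(seed).sample(pool, k)

# Hypothetical mutants tagged with the operator that produced them.
mutants = [{"id": i, "operator": op}
           for i, op in enumerate(["AOR", "ROR", "LVR", "SDL"] * 50)]
sampled = select_mutants(mutants, selected_operators={"AOR", "ROR"},
                         sampling_ratio=0.05)
print(len(mutants), "mutants ->", len(sampled), "to execute")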
Testing Properties of Dataflow Program Operators
Zhihong Xu, Martin Hirzel, Gregg Rothermel, and Kun-Lung Wu
(University of Nebraska-Lincoln, USA; IBM Research, USA)
Dataflow programming languages, which represent
programs as graphs of data streams and operators, are becoming
increasingly popular and being used to create a wide array of
commercial software applications. The dependability of programs
written in these languages, as well as the systems used to
compile and run these programs, hinges on the correctness of
the semantic properties associated with operators. Unfortunately,
these properties are often poorly defined and frequently not
checked, which can lead to a wide range of problems in the
programs that use the operators. In this paper we present an
approach for improving the dependability of dataflow programs
by checking operators for necessary properties. Our approach is
dynamic, and involves generating tests whose results are checked
to determine whether specific properties hold or not. We present
empirical data that shows that our approach is both effective
and efficient at assessing the status of properties.
@InProceedings{ASE13p103,
author = {Zhihong Xu and Martin Hirzel and Gregg Rothermel and Kun-Lung Wu},
title = {Testing Properties of Dataflow Program Operators},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {103--113},
doi = {},
year = {2013},
}
Bita: Coverage-Guided, Automatic Testing of Actor Programs
Samira Tasharofi,
Michael Pradel, Yu Lin, and Ralph E. Johnson
(University of Illinois at Urbana-Champaign, USA; ETH Zurich, Switzerland)
Actor programs are concurrent programs where concurrent entities communicate
asynchronously by exchanging messages.
Testing actor programs is challenging because the order of message receives
depends on the non-deterministic scheduler and because exploring all
schedules does not scale to large programs.
This paper presents Bita, a scalable, automatic approach for testing
non-deterministic behavior of actor programs.
The key idea is to generate and explore schedules that are likely to reveal
concurrency bugs because these schedules increase the schedule coverage.
We present three schedule coverage criteria for actor programs, an algorithm
to generate feasible schedules that increase coverage, and a technique to
force a program to comply with a schedule.
Applying Bita to real-world actor programs implemented in Scala reveals
eight previously unknown concurrency bugs, of which six have already been
fixed by the developers.
Furthermore, we show that our approach finds bugs 122x faster than random
scheduling, on average.
@InProceedings{ASE13p114,
author = {Samira Tasharofi and Michael Pradel and Yu Lin and Ralph E. Johnson},
title = {Bita: Coverage-Guided, Automatic Testing of Actor Programs},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {114--124},
doi = {},
year = {2013},
}
SABRINE: State-Based Robustness Testing of Operating Systems
Domenico Cotroneo, Domenico Di Leo, Francesco Fucci, and
Roberto Natella
(Università degli Studi di Napoli Federico II, Italy; Critiware, Italy)
The assessment of operating system robustness with respect to unexpected or anomalous events is a fundamental requirement for mission-critical systems. Robustness can be tested by deliberately exposing the system to erroneous events during its execution, and then analyzing the OS behavior to evaluate its ability to gracefully handle these events. Since OSs are complex and stateful systems, robustness testing needs to account for the timing of erroneous events, in order to evaluate the robust behavior of the OS under different states. This paper presents SABRINE (StAte-Based Robustness testIng of operatiNg systEms), an approach for state-aware robustness testing of OSs. SABRINE automatically extracts state models from execution traces, and generates a set of test cases that cover different OS states. We evaluate the approach on a Linux-based Real-Time Operating System adopted in the avionic domain. Experimental results show that SABRINE can automatically identify relevant OS states, and find robustness vulnerabilities while keeping the number of test cases low.
@InProceedings{ASE13p125,
author = {Domenico Cotroneo and Domenico Di Leo and Francesco Fucci and Roberto Natella},
title = {SABRINE: State-Based Robustness Testing of Operating Systems},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {125--135},
doi = {},
year = {2013},
}
Verification
Blitz: Compositional Bounded Model Checking for Real-World Programs
Chia Yuan Cho, Vijay D’Silva, and Dawn Song
(University of California at Berkeley, USA)
Bounded Model Checking (BMC) for software is a
precise bug-finding technique that builds upon the efficiency of
modern SAT and SMT solvers. BMC currently does not scale
to large programs because the size of the generated formulae
exceeds the capacity of existing solvers. We present a new,
compositional and property-sensitive algorithm that enables BMC
to automatically find bugs in large programs. A novel feature of
our technique is to decompose the behaviour of a program into
a sequence of BMC instances and use a combination of satisfying
assignments and unsatisfiability proofs to propagate information
across instances. A second novelty is to use the control- and
data-flow of the program as well as information from proofs
to prune the set of variables and procedures considered and
hence, generate smaller instances. Our tool BLITZ outperforms
existing tools and scales to programs with over 100,000 lines
of code. BLITZ automatically and efficiently discovers bugs
in widely deployed software including new vulnerabilities in
Internet infrastructure software.
@InProceedings{ASE13p136,
author = {Chia Yuan Cho and Vijay D’Silva and Dawn Song},
title = {Blitz: Compositional Bounded Model Checking for Real-World Programs},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {136--146},
doi = {},
year = {2013},
}
Ranger: Parallel Analysis of Alloy Models by Range Partitioning
Nicolás Rosner, Junaid H. Siddiqui,
Nazareno Aguirre, Sarfraz Khurshid, and
Marcelo F. Frias
(Universidad de Buenos Aires, Argentina; LUMS School of Science and Engineering, Pakistan; Universidad Nacional de Río Cuarto, Argentina; University of Texas at Austin, USA; Instituto Tecnológico de Buenos Aires, Argentina)
We present a novel approach for parallel analysis of models written in Alloy, a declarative extension of first-order logic based on relations. The Alloy language is supported by the fully automatic Alloy Analyzer, which translates models into propositional formulas and uses off-the-shelf SAT technology to solve them. Our key insight is that the underlying constraint satisfaction problem can be split into subproblems of lesser complexity by using ranges of candidate solutions, which partition the space of all candidate solutions. Conceptually, we define a total ordering among the candidate solutions, split this space of candidates into ranges, and let independent SAT searches take place within these ranges’ endpoints. Our tool, Ranger, embodies our insight. Experimental evaluation shows that Ranger provides substantial speedups (in several cases, superlinear ones) for a variety of hard-to-solve Alloy models, and that adding more hardware reduces analysis costs almost linearly.
@InProceedings{ASE13p147,
author = {Nicolás Rosner and Junaid H. Siddiqui and Nazareno Aguirre and Sarfraz Khurshid and Marcelo F. Frias},
title = {Ranger: Parallel Analysis of Alloy Models by Range Partitioning},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {147--157},
doi = {},
year = {2013},
}
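The key insight can be illustrated without Alloy or SAT at all: impose a total order on candidate solutions, cut the ordered space into contiguous ranges, and let independent searches run within each range. The Python sketch below does this by brute force over bit-vector candidates; Ranger itself, as described above, performs the split over the propositional search space handed to the SAT solver.

def search_range(lo, hi, is_solution):
    """Exhaustively search candidates lo..hi-1 (integers viewed as bit-vectors)."""
    return [c for c in range(lo, hi) if is_solution(c)]

def parallel_style_search(num_bits, is_solution, num_ranges):
    """Split the totally ordered candidate space into contiguous ranges.
    Each call to search_range is independent and could run on its own core."""
    total = 1 << num_bits
    step = -(-total // num_ranges)  # ceiling division
    solutions = []
    for lo in range(0, total, step):
        solutions += search_range(lo, min(lo + step, total), is_solution)
    return solutions

# Toy constraint: exactly three of the eight bits are set.
is_sol = lambda c: bin(c).count("1") == 3
print(len(parallel_style_search(8, is_sol, num_ranges=4)))  # 56 = C(8,3)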
Automated Verification of Pattern-Based Interaction Invariants in Ajax Applications
Yuta Maezawa, Hironori Washizaki, Yoshinori Tanabe, and Shinichi Honiden
(University of Tokyo, Japan; Waseda University, Japan; National Institute of Informatics, Japan)
When developing asynchronous JavaScript and XML (Ajax) applications, developers implement Ajax design patterns for increasing the usability of the applications. However, unpredictable contexts of running applications might conceal faults that will break the design patterns, which decreases usability. We propose a support tool called JSVerifier that automatically verifies interaction invariants; the applications handle their interactions in invariant occurrence and order. We also present a selective set of interaction invariants derived from Ajax design patterns, as input. If the application behavior breaks the design patterns, JSVerifier automatically outputs faulty execution paths for debugging. The results of our case studies show that JSVerifier can verify the interaction invariants in a feasible amount of time, and we conclude that it can help developers increase the usability of Ajax applications.
@InProceedings{ASE13p158,
author = {Yuta Maezawa and Hironori Washizaki and Yoshinori Tanabe and Shinichi Honiden},
title = {Automated Verification of Pattern-Based Interaction Invariants in Ajax Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {158--168},
doi = {},
year = {2013},
}
Software Model Checking for Distributed Systems with Selector-Based, Non-blocking Communication
Cyrille Artho, Masami Hagiya, Richard Potter, Yoshinori Tanabe, Franz Weitl, and Mitsuharu Yamamoto
(AIST, Japan; University of Tokyo, Japan; National Institute of Informatics, Japan; Chiba University, Japan)
Many modern software systems are implemented as client/server architectures, where a server handles multiple clients concurrently. Testing does not cover the outcomes of all possible thread and communication schedules reliably. Software model checking, on the other hand, covers all possible outcomes but is often limited to subsets of commonly used protocols and libraries.
Earlier work in cache-based software model checking handles implementations using socket-based TCP/IP networking, with one thread per client connection using blocking input/output. Recently, servers using non-blocking, selector-based input/output have become prevalent. This paper describes our work extending the Java PathFinder extension net-iocache to such software, and the application of our tool to modern server software.
@InProceedings{ASE13p169,
author = {Cyrille Artho and Masami Hagiya and Richard Potter and Yoshinori Tanabe and Franz Weitl and Mitsuharu Yamamoto},
title = {Software Model Checking for Distributed Systems with Selector-Based, Non-blocking Communication},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {169--179},
doi = {},
year = {2013},
}
Evolution
A Study of Repetitiveness of Code Changes in Software Evolution
Hoan Anh Nguyen, Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen, and
Hridesh Rajan
(Iowa State University, USA)
In this paper, we present a large-scale study of repetitiveness of code changes in software evolution. We collected a large data set of 2,841 Java projects, with 1.7 billion source lines of code (SLOC) at the latest revisions, 1.8 million code change revisions (0.4 million fixes), 6.2 million changed files, and 2.5 billion changed SLOCs. A change is considered repeated within a project or across projects if it matches another change that occurred earlier in the history of the same project or of another project, respectively. We report the following important findings. First, repetitiveness of changes could be as high as 70-100% at small sizes and decreases exponentially as size increases. Second, repetitiveness is higher and more stable in the cross-project setting than in the within-project one. Third, fixing changes repeat similarly to general changes. Importantly, learning code changes and recommending them during software evolution is beneficial, with an accuracy of over 30% for top-1 recommendations and nearly 35% for top-3. Repeated fixing changes could also be useful for automatic program repair.
@InProceedings{ASE13p180,
author = {Hoan Anh Nguyen and Anh Tuan Nguyen and Tung Thanh Nguyen and Tien N. Nguyen and Hridesh Rajan},
title = {A Study of Repetitiveness of Code Changes in Software Evolution},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {180--190},
doi = {},
year = {2013},
}
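A minimal sketch of how repetitiveness of changes can be measured, under the simplifying assumption that a change is represented as a normalized text fragment; the study itself matches changes at finer granularity and across many projects, which this sketch does not model.

def repetitiveness(changes):
    """Fraction of changes that match an earlier change in the history.
    A change 'matches' if its normalized form was seen before."""
    seen, repeated = set(), 0
    for change in changes:
        key = " ".join(change.split())  # crude normalization: collapse whitespace
        if key in seen:
            repeated += 1
        seen.add(key)
    return repeated / len(changes) if changes else 0.0

history = ["if (x == null) return;",
           "if (y == null) return;",
           "if (x == null)   return;",   # repeats the first change
           "log.debug(msg);"]
print(repetitiveness(history))  # 0.25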
Consistency-Preserving Edit Scripts in Model Versioning
Timo Kehrer, Udo Kelter, and Gabriele Taentzer
(University of Siegen, Germany; Philipps-Universität Marburg, Germany)
In model-based software development, models are
iteratively
evolved. To optimally support
model evolution, developers need adequate tools for model
versioning tasks, including comparison, patching, and
merging of models.
A significant disadvantage of tools currently available
is that they display, and operate with, low-level model
changes which refer to internal model representations
and which can lead to intermediate inconsistent states.
Higher-level consistency-preserving edit operations
including refactorings are better suited to explain
changes or to resolve conflicts.
This paper presents an automatic procedure which
transforms a low-level difference into an executable
edit script which uses consistency-preserving
edit operations only.
Edit scripts support consistent model patching and
merging on a higher abstraction level.
Our approach to edit script generation has been evaluated in a
larger real-world case study.
@InProceedings{ASE13p191,
author = {Timo Kehrer and Udo Kelter and Gabriele Taentzer},
title = {Consistency-Preserving Edit Scripts in Model Versioning},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {191--201},
doi = {},
year = {2013},
}
JFlow: Practical Refactorings for Flow-Based Parallelism
Nicholas Chen and Ralph E. Johnson
(University of Illinois at Urbana-Champaign, USA)
Emerging applications in the domains of recognition, mining and synthesis (RMS); image and video processing; data warehousing; and automatic financial trading admit a particular style of parallelism termed flow-based parallelism. To help developers exploit flow-based parallelism, popular parallel libraries such as Groovy's GPars, Intel's TBB Flow Graph and Microsoft's TPL Dataflow have begun introducing many new and useful constructs. However, to reap the benefits of such constructs, developers must first use them. This involves refactoring their existing sequential code to incorporate these constructs – a manual process that overwhelms even experts. To alleviate this burden, we introduce a set of novel analyses and transformations targeting flow-based parallelism. We implemented these ideas in JFlow, an interactive refactoring tool integrated into the Eclipse IDE. We used JFlow to parallelize seven applications: four from a previously known benchmark and three from a suite of large open source projects. JFlow, with minimal interaction from the developer, can successfully parallelize applications from the aforementioned domains with good performance (offering up to 3.45x speedup on a 4-core machine) and is fast enough to be used interactively as part of a developer's workflow.
@InProceedings{ASE13p202,
author = {Nicholas Chen and Ralph E. Johnson},
title = {JFlow: Practical Refactorings for Flow-Based Parallelism},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {202--212},
doi = {},
year = {2013},
}
Automated Planning for Software Architecture Evolution
Jeffrey M. Barnes, Ashutosh Pandey, and David Garlan
(Carnegie Mellon University, USA)
In previous research, we have developed a theoretical framework to help software architects make better decisions when planning software evolution. Our approach is based on representation and analysis of candidate evolution paths--sequences of transitional architectures leading from the current system to a desired target architecture. One problem with this kind of approach is that it imposes a heavy burden on the software architect, who must explicitly define and model these candidate paths. In this paper, we show how automated planning techniques can be used to support automatic generation of evolution paths, relieving this burden on the architect. We illustrate our approach by applying it to a data migration scenario, showing how this architecture evolution problem can be translated into a planning problem and solved using existing automated planning tools.
@InProceedings{ASE13p213,
author = {Jeffrey M. Barnes and Ashutosh Pandey and David Garlan},
title = {Automated Planning for Software Architecture Evolution},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {213--223},
doi = {},
year = {2013},
}
Generation and Synthesis
Automatically Synthesizing SQL Queries from Input-Output Examples
Sai Zhang and Yuyin Sun
(University of Washington, USA)
Many computer end-users, such as research scientists and business analysts, need to frequently query a database, yet lack enough programming knowledge to write a correct SQL query. To alleviate this problem, we present a programming by example technique (and its tool implementation, called SQLSynthesizer) to help end-users automate such query tasks. SQLSynthesizer takes from users an example input and output of how the database should be queried, and then synthesizes a SQL query that reproduces the example output from the example input. If the synthesized SQL query is applied to another, potentially larger, database with a similar schema, the synthesized SQL query produces a corresponding result that is similar to the example output.
We evaluated SQLSynthesizer on 23 exercises from a classic database textbook and 5 forum questions about writing SQL queries. SQLSynthesizer synthesized correct answers for 15 textbook exercises and all 5 forum questions, and it did so from relatively small examples.
@InProceedings{ASE13p224,
author = {Sai Zhang and Yuyin Sun},
title = {Automatically Synthesizing SQL Queries from Input-Output Examples},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {224--234},
doi = {},
year = {2013},
}
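To illustrate the programming-by-example idea (not SQLSynthesizer's actual algorithm, which handles joins, aggregation, and ranking of candidate queries), the sketch below enumerates very simple SELECT ... WHERE column = constant queries over an in-memory table and keeps the ones that reproduce the example output. The table and column names are invented.

def synthesize(table, columns, example_output):
    """Enumerate queries of the form
         SELECT <proj> FROM t WHERE <col> = <val>
    over a table given as a list of dicts, and return the SQL strings whose
    result equals the example output (a list of single-column values)."""
    candidates = []
    for proj in columns:
        for col in columns:
            for val in {row[col] for row in table}:
                result = [row[proj] for row in table if row[col] == val]
                if result == example_output:
                    candidates.append(
                        f"SELECT {proj} FROM t WHERE {col} = {val!r}")
    return candidates

table = [{"name": "Ann", "dept": "CS", "salary": 100},
         {"name": "Bob", "dept": "EE", "salary": 90},
         {"name": "Cat", "dept": "CS", "salary": 120}]
print(synthesize(table, ["name", "dept", "salary"], ["Ann", "Cat"]))
# e.g. ["SELECT name FROM t WHERE dept = 'CS'"]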
SEDGE: Symbolic Example Data Generation for Dataflow Programs
Kaituo Li, Christoph Reichenbach,
Yannis Smaragdakis, Yanlei Diao, and
Christoph Csallner
(University of Massachusetts at Amherst, USA; Goethe University Frankfurt, Germany; University of Athens, Greece; University of Texas at Arlington, USA)
Exhaustive, automatic testing of dataflow (esp. map-reduce) programs has emerged as an important challenge. Past work demonstrated effective ways to generate small example data sets that exercise operators in the Pig platform, used to generate Hadoop map-reduce programs. Although such prior techniques attempt to cover all cases of operator use, in practice they often fail. Our SEDGE system addresses these completeness problems: for every dataflow operator, we produce data aiming to cover all cases that arise in the dataflow program (e.g., both passing and failing a filter). SEDGE relies on transforming the program into symbolic constraints, and solving the constraints using a symbolic reasoning engine (a powerful SMT solver), while using input data as concrete aids in the solution process. The approach resembles dynamic-symbolic (a.k.a. ``concolic'') execution in a conventional programming language, adapted to the unique features of the dataflow domain.
In third-party benchmarks, SEDGE achieves higher coverage than past techniques for 5 out of 20 PigMix benchmarks and 7 out of 11 SDSS benchmarks (with equal coverage for the rest of the benchmarks). We also show that our targeting of the high-level dataflow language pays off: for complex programs, state-of-the-art dynamic-symbolic execution at the level of the generated map-reduce code (instead of the original dataflow program) requires many more test cases or achieves much lower coverage than our approach.
@InProceedings{ASE13p235,
author = {Kaituo Li and Christoph Reichenbach and Yannis Smaragdakis and Yanlei Diao and Christoph Csallner},
title = {SEDGE: Symbolic Example Data Generation for Dataflow Programs},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {235--245},
doi = {},
year = {2013},
}
Characteristic Studies of Loop Problems for Structural Test Generation via Symbolic Execution
Xusheng Xiao, Sihan Li, Tao Xie, and Nikolai Tillmann
(North Carolina State University, USA; University of Illinois at Urbana-Champaign, USA; Microsoft Research, USA)
Dynamic Symbolic Execution (DSE) is a state-of-the-art test-generation approach
that systematically explores program paths to generate high-covering tests.
In DSE, the presence of loops (especially unbound loops)
can cause an enormous or even infinite number of paths to be explored.
There exist techniques (such as bounded iteration, heuristics, and summarization)
that assist DSE in addressing loop problems.
However, there exists no literature-survey or empirical work that shows the pervasiveness of loop problems or
identifies challenges faced by these techniques on real-world open-source applications.
To fill this gap,
we provide characteristic studies to guide future research on addressing loop problems for DSE.
Our proposed study methodology starts with conducting a literature-survey study to
investigate how technical problems such as loop problems
compromise automated software-engineering tasks such as test generation,
and which existing techniques are proposed to deal with such technical problems.
Then the study methodology continues with conducting an empirical study of applying the existing techniques
on real-world software applications sampled based on the literature-survey results and major open-source project hosting sites. This empirical study investigates the pervasiveness of the technical problems
and how well existing techniques can address such problems among real-world software applications.
Based on such study methodology,
our two-phase characteristic studies
identify that bounded iteration and heuristics are effective in addressing loop problems when used properly.
Our studies further identify challenges faced by these techniques
and provide guidelines for effectively addressing these challenges.
@InProceedings{ASE13p246,
author = {Xusheng Xiao and Sihan Li and Tao Xie and Nikolai Tillmann},
title = {Characteristic Studies of Loop Problems for Structural Test Generation via Symbolic Execution},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {246--256},
doi = {},
year = {2013},
}
Entropy-Based Test Generation for Improved Fault Localization
José Campos,
Rui Abreu, Gordon Fraser, and Marcelo d'Amorim
(University of Porto, Portugal; University of Sheffield, UK; Federal University of Pernambuco, Brazil)
Spectrum-based Bayesian reasoning can effectively rank candidate fault locations based on passing/failing test cases, but the diagnostic quality highly depends on the size and diversity of the underlying test suite. As test suites in practice often do not exhibit the necessary properties, we present a technique to extend existing test suites with new test cases that optimize the diagnostic quality. We apply probability theory concepts to guide test case generation using entropy, such that the amount of uncertainty in the diagnostic ranking is minimized. Our EntBug prototype extends the search-based test generation tool EvoSuite to use entropy in the fitness function of its underlying genetic algorithm, and we applied it to seven real faults. Empirical results show that our approach reduces the entropy of the diagnostic ranking by 49% on average (compared to using the original test suite), leading to a 91% average reduction in the number of diagnosis candidates that must be inspected to find the true faulty one.
@InProceedings{ASE13p257,
author = {José Campos and Rui Abreu and Gordon Fraser and Marcelo d'Amorim},
title = {Entropy-Based Test Generation for Improved Fault Localization},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {257--267},
doi = {},
year = {2013},
}
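For reference, a short Python sketch of the quantity being minimized: the Shannon entropy of the probability distribution that spectrum-based Bayesian diagnosis assigns to candidate fault locations. A test that reduces this entropy is informative; the fitness function EntBug wires into EvoSuite is more involved than this.

import math

def entropy(diagnosis_probs):
    """Shannon entropy (in bits) of a diagnostic distribution over
    candidate fault locations. Lower entropy = less uncertainty."""
    return -sum(p * math.log2(p) for p in diagnosis_probs if p > 0)

# Before adding tests: four equally likely suspects -> maximal uncertainty.
print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits
# After an informative test: one suspect dominates -> low uncertainty.
print(entropy([0.85, 0.05, 0.05, 0.05]))   # ~0.85 bits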
Recommendations
Detecting Bad Smells in Source Code using Change History Information
Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta,
Rocco Oliveto,
Andrea De Lucia, and Denys Poshyvanyk
(University of Salerno, Italy; University of Sannio, Italy; University of Molise, Italy; College of William and Mary, USA)
Code smells represent symptoms of poor implementation choices. Previous studies found that these smells make source code more difficult to maintain, possibly also increasing its fault-proneness. There are several approaches that identify smells based on code analysis techniques. However, we observe that many code smells are intrinsically characterized by how code elements change over time. Thus, relying solely on structural information may not be sufficient to detect all the smells accurately.
We propose an approach to detect five different code smells, namely Divergent Change, Shotgun Surgery, Parallel Inheritance, Blob, and Feature Envy, by exploiting change history information mined from versioning systems. We applied our approach, coined HIST (Historical Information for Smell deTection), to eight software projects written in Java and, wherever possible, compared it with existing state-of-the-art smell detectors based on source code analysis. The results indicate that HIST's precision ranges between 61% and 80%, and its recall ranges between 61% and 100%. More importantly, the results confirm that HIST is able to identify code smells that cannot be identified through approaches solely based on code analysis.
@InProceedings{ASE13p268,
author = {Fabio Palomba and Gabriele Bavota and Massimiliano Di Penta and Rocco Oliveto and Andrea De Lucia and Denys Poshyvanyk},
title = {Detecting Bad Smells in Source Code using Change History Information},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {268--278},
doi = {},
year = {2013},
}
ACM SIGSOFT Distinguished Paper Award
Personalized Defect Prediction
Tian Jiang, Lin Tan, and Sunghun Kim
(University of Waterloo, Canada; Hong Kong University of Science and Technology, China)
Many defect prediction techniques have been proposed. While they often take the author of the code into consideration, none of these techniques build a separate prediction model for each developer. Different developers have different coding styles, commit frequencies, and experience levels, causing different defect patterns. When the defects of different developers are combined, such differences are obscured, hurting prediction performance.
This paper proposes personalized defect prediction—building a separate prediction model for each developer to predict software defects. As a proof of concept, we apply our personalized defect prediction to classify defects at the file change level. We evaluate our personalized change classification technique on six large software projects written in C and Java—the Linux kernel, PostgreSQL, Xorg, Eclipse, Lucene and Jackrabbit. Our personalized approach can discover up to 155 more bugs than the traditional change classification (210 versus 55) if developers inspect the top 20% lines of code that are predicted buggy. In addition, our approach improves the F1-score by 0.01–0.06 compared to the traditional change classification.
@InProceedings{ASE13p279,
author = {Tian Jiang and Lin Tan and Sunghun Kim},
title = {Personalized Defect Prediction},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {279--289},
doi = {},
year = {2013},
}
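A minimal sketch of the core idea only (not the paper's feature set or classifier): train one model per developer on that developer's own change history instead of a single model over everyone's changes. The tiny counting-based scorer below stands in for whatever learner is used; the tokens and developer names are invented.

from collections import defaultdict

class PerDeveloperModel:
    """Toy personalized defect predictor: for each developer, remember how
    often each token of their past changes appeared in buggy vs. clean
    changes, and score a new change by the average 'bugginess' of its tokens."""
    def __init__(self):
        self.buggy = defaultdict(lambda: defaultdict(int))
        self.total = defaultdict(lambda: defaultdict(int))

    def train(self, developer, change_tokens, is_buggy):
        for t in change_tokens:
            self.total[developer][t] += 1
            if is_buggy:
                self.buggy[developer][t] += 1

    def score(self, developer, change_tokens):
        rates = [self.buggy[developer][t] / self.total[developer][t]
                 for t in change_tokens if self.total[developer][t] > 0]
        return sum(rates) / len(rates) if rates else 0.5  # unknown tokens: neutral

m = PerDeveloperModel()
m.train("alice", ["lock", "thread"], is_buggy=True)
m.train("alice", ["doc", "comment"], is_buggy=False)
m.train("bob",   ["lock", "thread"], is_buggy=False)   # same tokens, different history
print(m.score("alice", ["lock"]), m.score("bob", ["lock"]))  # 1.0 vs 0.0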
Automatic Recommendation of API Methods from Feature Requests
Ferdian Thung, Shaowei Wang,
David Lo, and Julia Lawall
(Singapore Management University, Singapore; Inria, France; Lip6, France)
Developers often receive many feature requests. To implement these features, developers can leverage various methods from third party libraries. In this work, we propose an automated approach that takes as input a textual description of a feature request. It then recommends methods in library APIs that developers can use to implement the feature. Our recommendation approach learns from records of other changes made to software systems, and compares the textual description of the requested feature with the textual descriptions of various API methods. We have evaluated our approach on more than 500 feature requests of Axis2/Java, CXF, Hadoop Common, HBase, and Struts 2. Our experiments show that our approach is able to recommend the right methods from 10 libraries with an average recall-rate@5 of 0.690 and recall-rate@10 of 0.779 respectively. We also show that the state-of-the-art approach by Chan et al., that recommends API methods based on precise text phrases, is unable to handle feature requests.
@InProceedings{ASE13p290,
author = {Ferdian Thung and Shaowei Wang and David Lo and Julia Lawall},
title = {Automatic Recommendation of API Methods from Feature Requests},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {290--300},
doi = {},
year = {2013},
}
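A minimal sketch of the text-matching component only: compare the bag-of-words of a feature request against textual descriptions of API methods with cosine similarity and recommend the top-k. The actual approach also learns from records of past changes, which this sketch omits; the method names and descriptions below are invented.

import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two bags of words (Counters)."""
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(feature_request, api_docs, k=3):
    req = Counter(feature_request.lower().split())
    scored = [(cosine(req, Counter(doc.lower().split())), name)
              for name, doc in api_docs.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k] if score > 0]

api_docs = {  # hypothetical API descriptions
    "FileUtils.copyFile":      "copy a file to a new location preserving the date",
    "FileUtils.deleteQuietly": "delete a file never throwing an exception",
    "StringUtils.join":        "join elements of an array into a single string",
}
print(recommend("add support to copy configuration file to backup location", api_docs))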
Variability-Aware Performance Prediction: A Statistical Learning Approach
Jianmei Guo, Krzysztof Czarnecki, Sven Apel, Norbert Siegmund, and
Andrzej Wąsowski
(University of Waterloo, Canada; University of Passau, Germany; IT University of Copenhagen, Denmark)
Configurable software systems allow stakeholders to derive program variants by selecting features. Understanding the correlation between feature selections and performance is important for stakeholders to be able to derive a program variant that meets their requirements. A major challenge in practice is to accurately predict performance based on a small sample of measured variants, especially when features interact. We propose a variability-aware approach to performance prediction via statistical learning. The approach works progressively with random samples, without additional effort to detect feature interactions. Empirical results on six real-world case studies demonstrate an average of 94% prediction accuracy based on small random samples. Furthermore, we investigate why the approach works by a comparative analysis of performance distributions. Finally, we compare our approach to an existing technique and guide users to choose one or the other in practice.
@InProceedings{ASE13p301,
author = {Jianmei Guo and Krzysztof Czarnecki and Sven Apel and Norbert Siegmund and Andrzej Wąsowski},
title = {Variability-Aware Performance Prediction: A Statistical Learning Approach},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {301--311},
doi = {},
year = {2013},
}
Security
A Scalable Approach for Malware Detection through Bounded Feature Space Behavior Modeling
Mahinthan Chandramohan, Hee Beng Kuan Tan,
Lionel C. Briand, Lwin Khin Shar, and Bindu Madhavi Padmanabhuni
(Nanyang Technological University, Singapore; University of Luxembourg, Luxembourg)
In recent years, malware (malicious software) has greatly evolved and has become very sophisticated. The evolution of malware makes it difficult to detect using traditional signature-based malware detectors. Thus, researchers have proposed various behavior-based malware detection techniques to mitigate this problem. However, there are still serious shortcomings, related to scalability and computational complexity, in existing malware behavior modeling techniques. This raises questions about the practical applicability of these techniques.
This paper proposes and evaluates a bounded feature space behavior modeling (BOFM) framework for scalable malware detection. BOFM models the interactions between software (which can be malware or benign) and security-critical OS resources in a scalable manner. Information collected at run-time according to this model is then used by machine learning algorithms to learn how to accurately classify software as malware or benign. One of the key problems with simple malware behavior modeling (e.g., n-gram model) is that the number of malware features (i.e., signatures) grows in proportion to the size of execution traces, with a resulting malware feature space that is so large that it makes the detection process very challenging. On the other hand, in BOFM, the malware feature space is bounded by an upper limit N, a constant, and the results of our experiments show that its computation time and memory usage are vastly lower than in currently reported malware detection techniques, while preserving or even improving their high detection accuracy.
@InProceedings{ASE13p312,
author = {Mahinthan Chandramohan and Hee Beng Kuan Tan and Lionel C. Briand and Lwin Khin Shar and Bindu Madhavi Padmanabhuni},
title = {A Scalable Approach for Malware Detection through Bounded Feature Space Behavior Modeling},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {312--322},
doi = {},
year = {2013},
}
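A minimal sketch of the bounded-feature-space idea: instead of letting the feature space grow with trace length (as an n-gram model does), hash each observed behavior into a fixed-length vector of dimension N. The event encoding below is invented; the paper's model is defined over interactions with security-critical OS resources.

def bounded_features(trace, N=64):
    """Map an arbitrarily long behavior trace (e.g., syscall/resource events)
    into a fixed-size feature vector of length N via feature hashing, so the
    representation no longer grows with trace length."""
    vec = [0] * N
    for event in trace:
        vec[hash(event) % N] += 1
    return vec

short = ["open:/etc/passwd", "read", "connect:example.org"]
long_ = short * 1000
print(len(bounded_features(short)), len(bounded_features(long_)))  # 64 64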
Automatically Partition Software into Least Privilege Components using Dynamic Data Dependency Analysis
Yongzheng Wu, Jun Sun,
Yang Liu, and Jin Song Dong
(Singapore University of Technology and Design, Singapore; Nanyang Technological University, Singapore; National University of Singapore, Singapore)
The principle of least privilege requires that software components should be
granted only necessary privileges, so that compromising one component does not
lead to compromising others. However, writing privilege separated software is
difficult and, as a result, a large amount of software is monolithic, i.e., it
runs as a whole without separation. Manually rewriting monolithic software
into privilege separated software requires significant effort and can be error
prone. We propose ProgramCutter, a novel approach to automatically
partitioning monolithic software using dynamic data dependency analysis.
ProgramCutter works by constructing a data dependency graph whose nodes are
functions and edges are data dependencies between functions. The graph is then
partitioned into subgraphs where each subgraph represents a least privilege
component. The privilege separated software runs each component in a separated
process with confined system privileges. We evaluate ProgramCutter by applying it to four
open-source software systems. We can reduce the privileged part of the program from
100% to below 22%, while having a reasonable execution time overhead. Since
ProgramCutter does not require any expert knowledge of the software, it not
only can be used by its developers for software refactoring, but also by end
users or system administrators. Our contributions are threefold: (i) we define
a quantitative measure of the security and performance of privilege separation;
(ii) we propose a graph-based approach to compute the optimal separation based
on dynamic information flow analysis; and (iii) the separation process is
automatic and does not require expert knowledge of the software.
@InProceedings{ASE13p323,
author = {Yongzheng Wu and Jun Sun and Yang Liu and Jin Song Dong},
title = {Automatically Partition Software into Least Privilege Components using Dynamic Data Dependency Analysis},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {323--333},
doi = {},
year = {2013},
}
Finding Architectural Flaws using Constraints
Radu Vanciu and Marwan Abi-Antoun
(Wayne State University, USA)
During Architectural Risk Analysis (ARA), security architects use a runtime architecture to look for security vulnerabilities that are architectural flaws rather than coding defects. The current ARA process, however, is mostly informal and manual. In this paper, we propose Scoria, a semi-automated approach for finding architectural flaws. Scoria uses a sound, hierarchical object graph with abstract objects and dataflow edges, where edges can refer to nodes in the graph. The architects can augment the object graph with security properties, which can express security information unavailable in code. Scoria allows architects to write queries on the graph in terms of the hierarchy, reachability, and provenance of a dataflow object. Based on the query results, the architects enhance their knowledge of the system security and write expressive constraints. The expressiveness is richer than previous approaches that check only for the presence or absence of communication or do not track a dataflow as an object. To evaluate Scoria, we apply these constraints to several extended examples adapted from the CERT standard for Java to confirm that Scoria can detect injected architectural flaws. Next, we write constraints to enforce an Android security policy and find one architectural flaw in one Android application.
@InProceedings{ASE13p334,
author = {Radu Vanciu and Marwan Abi-Antoun},
title = {Finding Architectural Flaws using Constraints},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {334--344},
doi = {},
year = {2013},
}
Debugging
Improving Bug Localization using Structured Information Retrieval
Ripon K. Saha, Matthew Lease, Sarfraz Khurshid, and Dewayne E. Perry
(University of Texas at Austin, USA)
Locating bugs is important, difficult, and expensive, particularly for large-scale systems. To address this, natural language information retrieval techniques are increasingly being used to suggest potential faulty source files given bug reports. While these techniques are very scalable, in practice their effectiveness remains low in accurately localizing bugs to a small number of files. Our key insight is that structured information retrieval based on code constructs, such as class and method names, enables more accurate bug localization. We present BLUiR, which embodies this insight, requires only the source code and bug reports, and takes advantage of bug similarity data if available. We build BLUiR on a proven, open source IR toolkit that anyone can use. Our work provides a thorough grounding of IR-based bug localization research in fundamental IR theoretical and empirical knowledge and practice. We evaluate BLUiR on four open source projects with approximately 3,400 bugs. Results show that BLUiR matches or outperforms a current state-of-the-art tool across applications considered, even when BLUiR does not use bug similarity data used by the other tool.
@InProceedings{ASE13p345,
author = {Ripon K. Saha and Matthew Lease and Sarfraz Khurshid and Dewayne E. Perry},
title = {Improving Bug Localization using Structured Information Retrieval},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {345--355},
doi = {},
year = {2013},
}
Leveraging Program Equivalence for Adaptive Program Repair: Models and First Results
Westley Weimer, Zachary P. Fry, and Stephanie Forrest
(University of Virginia, USA; University of New Mexico, USA)
Software bugs remain a compelling problem. Automated program repair is a
promising approach for reducing cost, and many methods have recently
demonstrated positive results. However, success on any particular
bug is variable, as is the cost to find a repair. This paper focuses
on generate-and-validate repair methods that enumerate candidate repairs
and use test cases to define correct behavior. We formalize repair cost in
terms of test executions, which dominate most test-based repair algorithms.
Insights from this model lead to a novel deterministic repair algorithm
that computes a patch quotient space with respect to an approximate
semantic equivalence relation. This allows syntactic and dataflow
analysis techniques to dramatically reduce the repair search space.
Generate-and-validate program repair is
shown to be a dual of mutation testing, suggesting several possible
cross-fertilizations. Evaluating on 105 real-world bugs in programs
totaling 5MLOC and involving 10,000 tests, our new algorithm requires an
order-of-magnitude fewer test evaluations than the previous
state-of-the-art and is over three times more efficient monetarily.
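The core idea of validating only one representative per (approximate) equivalence class can be sketched as follows; this is an illustrative reconstruction, not the paper's algorithm, and the fingerprint below (normalized patch text) is a deliberately crude stand-in for the syntactic and dataflow equivalence checks described above:

import re

def fingerprint(patch_text):
    # Crude normalization: strip comments and whitespace so trivially different
    # candidates (formatting or comment changes) fall into the same class.
    body = re.sub(r"//.*", "", patch_text)
    return re.sub(r"\s+", "", body)

def repair(candidates, passes_tests):
    tested = {}
    for patch in candidates:
        key = fingerprint(patch)
        if key in tested:                      # an equivalent candidate was already evaluated
            continue
        tested[key] = passes_tests(patch)      # dominant cost: running the test suite
        if tested[key]:
            return patch
    return None

candidates = ["x = x + 1;", "x = x + 1; // retry", "x=x+1;", "x = x - 1;"]
print(repair(candidates, lambda p: "- 1" in p))  # only two equivalence classes are ever tested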
@InProceedings{ASE13p356,
author = {Westley Weimer and Zachary P. Fry and Stephanie Forrest},
title = {Leveraging Program Equivalence for Adaptive Program Repair: Models and First Results},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {356--366},
doi = {},
year = {2013},
}
Detecting and Characterizing Semantic Inconsistencies in Ported Code
Baishakhi Ray, Miryung Kim, Suzette Person, and Neha Rungta
(University of Texas at Austin, USA; NASA Langley Research Center, USA; NASA Ames Research Center, USA)
Adding similar features and bug fixes often requires porting program patches from reference implementations and adapting them to target implementations. Porting errors may result from faulty adaptations or inconsistent updates. This paper investigates (1) the types of porting errors found in practice, and (2) how to detect and characterize potential porting errors. Analyzing version histories, we define five categories of porting errors, including incorrect control- and data-flow, code redundancy, inconsistent identifier renaming, etc. Leveraging this categorization, we design a static control- and data-dependence analysis technique, SPA, to detect and characterize porting inconsistencies. Our evaluation on code from four open-source projects shows that SPA can detect porting inconsistencies with 65% to 73% precision and 90% recall, and identify inconsistency types with 58% to 63% precision and 92% to 100% recall. In a comparison with two existing error detection tools, SPA improves precision by 14 to 17 percentage points.
@InProceedings{ASE13p367,
author = {Baishakhi Ray and Miryung Kim and Suzette Person and Neha Rungta},
title = {Detecting and Characterizing Semantic Inconsistencies in Ported Code},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {367--377},
doi = {},
year = {2013},
}
Lightweight Control-Flow Instrumentation and Postmortem Analysis in Support of Debugging
Peter Ohmann and Ben Liblit
(University of Wisconsin-Madison, USA)
Debugging is difficult and costly. As a human programmer looks for a bug, it would be helpful to see a complete trace of events leading to the point of failure. Unfortunately, full tracing is simply too slow to use in deployment, and may even be impractical during testing.
We aid post-deployment debugging by giving programmers additional information about program activity shortly before failure. We use latent information in post-failure memory dumps, augmented by low-overhead, tunable run-time tracing. Our results with a realistically-tuned tracing scheme show low enough overhead (0-5%) to be used in production runs. We demonstrate several potential uses of this enhanced information, including a novel postmortem static slice restriction technique and a reduced view of potentially-executed code. Experimental evaluation shows our approach to be very effective, such as shrinking stack-sensitive interprocedural static slices by 49-78% in larger applications.
@InProceedings{ASE13p378,
author = {Peter Ohmann and Ben Liblit},
title = {Lightweight Control-Flow Instrumentation and Postmortem Analysis in Support of Debugging},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {378--388},
doi = {},
year = {2013},
}
ACM SIGSOFT Distinguished Paper Award
Resources
Characterizing and Detecting Resource Leaks in Android Applications
Chaorong Guo,
Jian Zhang, Jun Yan, Zhiqiang Zhang, and Yanli Zhang
(Institute of Software at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China)
Android phones come with a host of hardware
components embedded in them, such as Camera, Media Player
and Sensor. Most of these components are exclusive resources
or consume more memory or energy than typical components, and they
should be explicitly released by developers. Missing release
operations of these resources might cause serious problems such
as performance degradation or system crash. These kinds of
defects are called resource leaks. This paper focuses on resource
leak problems in Android apps, and presents our lightweight
static analysis tool called Relda, which can automatically analyze
an application’s resource operations and locate the resource leaks.
We propose an automatic method for detecting resource leaks
based on a modified Function Call Graph, which handles the
features of event-driven mobile programming by analyzing the
callbacks defined in Android framework. Our experimental data
shows that Relda is effective in detecting resource leaks in real
Android apps.
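A minimal sketch of the underlying idea (not Relda itself; the API names, the callback-only call graph, and the purely reachability-based check are simplifications invented for illustration):

# Toy leak check over a call graph: report resource-acquiring APIs whose matching
# release is never reached from any lifecycle callback. A real analysis must also
# consider paths, callbacks registered at run time, and aliasing.
PAIRS = {"Camera.open": "Camera.release",
         "MediaPlayer.create": "MediaPlayer.release",
         "SensorManager.registerListener": "SensorManager.unregisterListener"}

def reachable_calls(call_graph, entry_points):
    seen, stack, calls = set(), list(entry_points), set()
    while stack:
        m = stack.pop()
        if m in seen:
            continue
        seen.add(m)
        for callee in call_graph.get(m, []):
            calls.add(callee)
            stack.append(callee)
    return calls

def find_leaks(call_graph, entry_points):
    calls = reachable_calls(call_graph, entry_points)
    return [acq for acq, rel in PAIRS.items() if acq in calls and rel not in calls]

graph = {"onCreate": ["initCamera"], "initCamera": ["Camera.open"], "onPause": []}
print(find_leaks(graph, ["onCreate", "onPause"]))  # ['Camera.open'] is acquired but never released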
@InProceedings{ASE13p389,
author = {Chaorong Guo and Jian Zhang and Jun Yan and Zhiqiang Zhang and Yanli Zhang},
title = {Characterizing and Detecting Resource Leaks in Android Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {389--398},
doi = {},
year = {2013},
}
Dangling References in Multi-configuration and Dynamic PHP-Based Web Applications
Hung Viet Nguyen, Hoan Anh Nguyen, Tung Thanh Nguyen, Anh Tuan Nguyen, and Tien N. Nguyen
(Iowa State University, USA)
PHP is a dynamic language popularly used in Web development for writing server-side code to dynamically create multiple versions of client-side pages at run time for different configurations. A PHP program contains code to be executed or produced for multiple configurations/versions. That dynamism and multi-configuration nature leads to dangling references. Specifically, in the execution for a configuration, a reference to a variable or a call to a function is dangling if its corresponding declaration cannot be found. We conducted an exploratory study to confirm the existence of such dangling reference errors including dangling cross-language and embedded references in the client-side HTML/JavaScript code and in data-accessing SQL code that are embedded in scattered PHP code. Dangling references have caused run-time fatal failures and security vulnerabilities.
We developed DRC, a static analysis method to detect such dangling references. DRC uses symbolic execution to collect PHP declarations/references and to approximate all versions of the generated output, and then extracts embedded declarations/references. It associates each detected declaration/reference with a conditional constraint that represents the execution paths (i.e., configurations/versions) containing that declaration/reference. It then validates references against declarations via a novel dangling reference detection algorithm. Our empirical evaluation shows that DRC detects dangling references with high accuracy. It revealed 83 previously undiscovered defects caused by dangling references.
@InProceedings{ASE13p399,
author = {Hung Viet Nguyen and Hoan Anh Nguyen and Tung Thanh Nguyen and Anh Tuan Nguyen and Tien N. Nguyen},
title = {Dangling References in Multi-configuration and Dynamic PHP-Based Web Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {399--409},
doi = {},
year = {2013},
}
Dynamically Transforming Data Structures
Erik Österlund and Welf Löwe
(Linnaeus University, Sweden)
Fine-tuning which data structure implementation to use for a given problem is sometimes tedious work since the optimum solution depends on the context, i.e., on the operation sequences, actual parameters as well as on the hardware available at run time. Sometimes a data structure with higher asymptotic time complexity performs better in certain contexts because of lower constants. The optimal solution may not even be possible to determine at compile time.
We introduce transformation data structures that dynamically change their internal representation variant based on a possibly changing context. The most suitable variant is selected at run time rather than at compile time.
We demonstrate the effect on performance with a transformation ArrayList data structure using an array variant and a linked hash bag variant as alternative internal representations. Using our transformation ArrayList, the standard DaCapo benchmark suite shows a performance gain of 5.19% on average.
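A minimal sketch of the idea (in Python rather than Java, and far simpler than the paper's transformation ArrayList, whose variants and selection heuristics are not reproduced here; the switching threshold is arbitrary): a collection that silently switches its internal representation once the observed operation mix changes.

class AdaptiveBag:
    SWITCH_THRESHOLD = 100  # arbitrary: membership tests observed before switching

    def __init__(self):
        self._items = []          # list variant: cheap appends and iteration
        self._as_set = None       # set variant: built lazily for fast lookups
        self._contains_calls = 0

    def add(self, item):
        self._items.append(item)
        if self._as_set is not None:
            self._as_set.add(item)

    def __contains__(self, item):
        self._contains_calls += 1
        if self._as_set is None and self._contains_calls > self.SWITCH_THRESHOLD:
            self._as_set = set(self._items)   # transform the representation at run time
        if self._as_set is not None:
            return item in self._as_set       # O(1) on average
        return item in self._items            # O(n)

    def __iter__(self):
        return iter(self._items)

bag = AdaptiveBag()
for i in range(10000):
    bag.add(i)
for i in range(200):
    _ = i in bag   # after 100 lookups the bag transparently switches to the set variant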
@InProceedings{ASE13p410,
author = {Erik Österlund and Welf Löwe},
title = {Dynamically Transforming Data Structures},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {410--420},
doi = {},
year = {2013},
}
Towards Precise Metrics for Predicting Graph Query Performance
Benedek Izsó, Zoltán Szatmári,
Gábor Bergmann, Ákos Horváth, and István Ráth
(Budapest University of Technology and Economics, Hungary)
Queries are the foundations of data intensive applications. In model-driven software engineering (MDSE), model queries are core technologies of tools and transformations. As software models are rapidly increasing in size and complexity, most MDSE tools frequently exhibit scalability issues that decrease developer productivity and increase costs. As a result, choosing the right model representation and query evaluation approach is a significant challenge for tool engineers. In the current paper, we aim to provide a benchmarking framework for the systematic investigation of query evaluation performance. More specifically, we experimentally evaluate (existing and novel) query and instance model metrics to highlight which provide sufficient performance estimates for different MDSE scenarios in various model query tools. For that purpose, we also present a comparative benchmark, which is designed to differentiate model representation and graph query evaluation approaches according to their performance when using large models and complex queries.
@InProceedings{ASE13p421,
author = {Benedek Izsó and Zoltán Szatmári and Gábor Bergmann and Ákos Horváth and István Ráth},
title = {Towards Precise Metrics for Predicting Graph Query Performance},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {421--431},
doi = {},
year = {2013},
}
Specification Mining
TzuYu: Learning Stateful Typestates
Hao Xiao, Jun Sun,
Yang Liu, Shang-Wei Lin, and Chengnian Sun
(Nanyang Technological University, Singapore; Singapore University of Technology and Design, Singapore; National University of Singapore, Singapore)
Behavioral models are useful for various software engineering tasks. They are, however, often missing in practice. Thus, specification mining was proposed to tackle this problem. Existing work either focuses on learning simple behavioral models such as finite-state automata, or relies on techniques (e.g., symbolic execution) to infer finite-state machines equipped with data states, referred to as stateful typestates. The former is often inadequate as finite-state automata lack expressiveness in capturing behaviors of data-rich programs, whereas the latter is often not scalable. In this work, we propose a fully automated approach to learn stateful typestates by extending the classic active learning process to generate transition guards (i.e., propositions on data states). The proposed approach has been implemented in a tool called TzuYu and evaluated against a number of Java classes. The evaluation results show that TzuYu is capable of learning correct stateful typestates more efficiently.
@InProceedings{ASE13p432,
author = {Hao Xiao and Jun Sun and Yang Liu and Shang-Wei Lin and Chengnian Sun},
title = {TzuYu: Learning Stateful Typestates},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {432--442},
doi = {},
year = {2013},
}
Mining Branching-Time Scenarios
Dirk Fahland,
David Lo, and
Shahar Maoz
(Eindhoven University of Technology, Netherlands; Singapore Management University, Singapore; Tel Aviv University, Israel)
Specification mining extracts candidate specifications from existing systems, to
be used for downstream tasks such as testing and verification. Specifically, we
are interested in the extraction of behavior models from execution traces.
In this paper we introduce mining of branching-time scenarios in the form of
existential, conditional Live Sequence Charts, using a statistical
data-mining algorithm. We show the power of branching scenarios to reveal
alternative scenario-based behaviors, which could not be mined by previous
approaches.
The work contrasts and complements previous works on mining linear-time
scenarios. An implementation and evaluation over execution trace sets recorded
from several real-world applications shows the unique contribution of mining
branching-time scenarios to the state-of-the-art in specification mining.
@InProceedings{ASE13p443,
author = {Dirk Fahland and David Lo and Shahar Maoz},
title = {Mining Branching-Time Scenarios},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {443--453},
doi = {},
year = {2013},
}
Models and Complexity
Measuring the Structural Complexity of Feature Models
Richard Pohl, Vanessa Stricker, and Klaus Pohl
(University of Duisburg-Essen, Germany)
The automated analysis of feature models (FM) is
based on SAT, BDD, and CSP – known NP-complete problems.
Therefore, the analysis could have an exponential worst-case
execution time. However, for many practically relevant analysis
cases, state-of-the-art (SOTA) analysis tools quite successfully
master the problem of exponential worst-case execution time
based on heuristics. So far, however, very little is known about
the structure of FMs that cause the cases in which the execution
time (hardness) for analyzing a given FM increases unpredictably
for SOTA analysis tools. In this paper, we propose to use
width measures from graph theory to characterize the structural
complexity of FMs as a basis for an estimation of the hardness of
analysis operations on FMs with SOTA analysis tools. We present
an experiment that we use to analyze the suitability of graph
width measures as a metric for the structural complexity of FMs
and the hardness of FM analysis. Such a complexity metric can
be used as a basis for a unified method to systematically improve
SOTA analysis tools.
@InProceedings{ASE13p454,
author = {Richard Pohl and Vanessa Stricker and Klaus Pohl},
title = {Measuring the Structural Complexity of Feature Models},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {454--464},
doi = {},
year = {2013},
}
Scalable Product Line Configuration: A Straw to Break the Camel’s Back
Abdel Salam Sayyad, Joseph Ingram, Tim Menzies, and Hany Ammar
(West Virginia University, USA)
Software product lines are hard to configure. Techniques that work for medium sized product lines
fail for much larger product lines such as the Linux kernel with 6000+ features. This paper presents simple heuristics that help the Indicator-Based Evolutionary Algorithm (IBEA) in finding sound and optimum configurations of very large variability models in the presence of competing objectives. We employ a combination of static and evolutionary learning of model structure, in addition to utilizing a pre-computed solution used as a “seed” in the midst of a randomly-generated initial population. The seed solution works like a single straw that is enough to break the camel’s back, given that it is a feature-rich seed. We show promising results where we can find 30 sound solutions for configuring upward of 6000 features within 30 minutes.
@InProceedings{ASE13p465,
author = {Abdel Salam Sayyad and Joseph Ingram and Tim Menzies and Hany Ammar},
title = {Scalable Product Line Configuration: A Straw to Break the Camel’s Back},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {465--474},
doi = {},
year = {2013},
}
Experience Track
Experience: Software Analysis
Software Analytics for Incident Management of Online Services: An Experience Report
Jian-Guang Lou,
Qingwei Lin, Rui Ding, Qiang Fu,
Dongmei Zhang, and Tao Xie
(Microsoft Research, China; University of Illinois at Urbana-Champaign, USA)
As online services become more and more popular, incident management has become a critical task that aims to minimize the service downtime and to ensure high quality of the provided services. In practice, incident management is conducted through analyzing a huge amount of monitoring data collected during the runtime of a service. Such data-driven incident management faces several significant challenges such as the large data scale, complex problem space, and incomplete knowledge. To address these challenges, we carried out a two-year software-analytics research effort in which we designed a set of novel data-driven techniques and developed an industrial system called the Service Analysis Studio (SAS) targeting real scenarios in a large-scale online service of Microsoft. SAS has been deployed to worldwide product datacenters and is widely used by on-call engineers for incident management. This paper shares our experience of using software analytics to solve engineers’ pain points in incident management, the developed data-analysis techniques, and the lessons learned from the process of research development and technology transfer.
@InProceedings{ASE13p475,
author = {Jian-Guang Lou and Qingwei Lin and Rui Ding and Qiang Fu and Dongmei Zhang and Tao Xie},
title = {Software Analytics for Incident Management of Online Services: An Experience Report},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {475--485},
doi = {},
year = {2013},
}
A Comparative Analysis of Software Architecture Recovery Techniques
Joshua Garcia, Igor Ivkovic, and
Nenad Medvidovic
(University of Southern California, USA; Wilfrid Laurier University, Canada)
Many automated techniques of varying accuracy
have been developed to help recover the architecture of a software
system from its implementation. However, rigorously assessing
these techniques has been hampered by the lack of architectural
“ground truths”. Over the past several years, we have collected
a set of eight architectures that have been recovered from open-source
systems and independently, carefully verified. In this paper,
we use these architectures as ground truths in performing a
comparative analysis of six state-of-the-art software architecture
recovery techniques. We use a number of metrics to assess
each technique for its ability to identify a system’s architectural
components and overall architectural structure. Our results
suggest that two of the techniques routinely outperform the rest,
but even the best of the lot has surprisingly low accuracy. Based
on the empirical data, we identify several avenues of future
research in software architecture recovery.
@InProceedings{ASE13p486,
author = {Joshua Garcia and Igor Ivkovic and Nenad Medvidovic},
title = {A Comparative Analysis of Software Architecture Recovery Techniques},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {486--496},
doi = {},
year = {2013},
}
Towards Contextual and On-Demand Code Clone Management by Continuous Monitoring
Gang Zhang,
Xin Peng, Zhenchang Xing, Shihai Jiang, Hai Wang, and Wenyun Zhao
(Fudan University, China; Nanyang Technological University, Singapore)
Effective clone management is essential for developers to recognize the introduction and evolution of code clones, to judge their impact on software quality, and to take appropriate measures if required. Our previous study shows that cloning practice is not simply a technical issue. It must be interpreted and considered in a larger context from technical, personal, and organizational perspectives. In this paper, we propose a contextual and on-demand code clone management approach called CCEvents (Code Cloning Events). Our approach provides timely notification about relevant code cloning events for different stakeholders through continuous monitoring of code repositories. It supports on-demand customization of clone monitoring strategies in specific technical, personal, and organizational contexts using a domain-specific language. We implemented the proposed approach and conducted an empirical study with an industrial project. The results confirm the requirements for contextual and on-demand code clone management and show the effectiveness of CCEvents in providing timely code cloning notifications and in helping to achieve effective clone management.
@InProceedings{ASE13p497,
author = {Gang Zhang and Xin Peng and Zhenchang Xing and Shihai Jiang and Hai Wang and Wenyun Zhao},
title = {Towards Contextual and On-Demand Code Clone Management by Continuous Monitoring},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {497--507},
doi = {},
year = {2013},
}
The Potential of Polyhedral Optimization: An Empirical Study
Andreas Simbürger, Sven Apel, Armin Größlinger, and Christian Lengauer
(University of Passau, Germany)
Present-day automatic optimization relies on powerful static (i.e.,
compile-time) analysis and transformation methods. One popular
platform for automatic optimization is the polyhedron model. Yet,
after several decades of development, there remains a lack of empirical
evidence of the model's benefits for real-world
software systems. We report on an empirical study in which we analyzed
a set of popular software systems, distributed across various application
domains. We found that polyhedral analysis at compile time often lacks
the information necessary to exploit the potential for optimization
of a program's execution. However, when conducted also at run time,
polyhedral analysis shows greater relevance for real-world applications.
On average, the share of the execution time amenable to polyhedral
optimization is increased by a factor of nearly 3.
Based on our experimental results, we discuss the merits and potential
of polyhedral optimization at compile time and run time.
@InProceedings{ASE13p508,
author = {Andreas Simbürger and Sven Apel and Armin Größlinger and Christian Lengauer},
title = {The Potential of Polyhedral Optimization: An Empirical Study},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {508--518},
doi = {},
year = {2013},
}
Experience: Testing and Verification
Automated Unit Testing of Large Industrial Embedded Software using Concolic Testing
Yunho Kim, Youil Kim, Taeksu Kim, Gunwoo Lee, Yoonkyu Jang, and Moonzoo Kim
(KAIST, South Korea; Samsung Electronics, South Korea)
Current testing practice in industry is often ineffective
and slow to detect bugs, since most projects utilize
manually generated test cases. Concolic testing alleviates this
problem by automatically generating test cases that achieve
high coverage. However, specialized execution platforms and
resource constraints of embedded software hinder application
of concolic testing to embedded software. To overcome these
limitations, we have developed the CONcrete and symBOLic (CONBOL)
testing framework to unit test large industrial embedded
software automatically. To address the aforementioned
limitations, CONBOL tests target units on a host PC platform
by generating symbolic unit testing drivers/stubs automatically
and applying heuristics to reduce false alarms caused by the
imprecise drivers/stubs. We have applied CONBOL to a four-million-line
industrial embedded software system and detected 24 new crash bugs.
Furthermore, the development team of the target software has adopted
CONBOL into its development process and applies it regularly to the
revised target software.
@InProceedings{ASE13p519,
author = {Yunho Kim and Youil Kim and Taeksu Kim and Gunwoo Lee and Yoonkyu Jang and Moonzoo Kim},
title = {Automated Unit Testing of Large Industrial Embedded Software using Concolic Testing},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {519--528},
doi = {},
year = {2013},
}
Minimizing CPU Time Shortage Risks in Integrated Embedded Software
Shiva Nejati, Morayo Adedjouma,
Lionel C. Briand, Jonathan Hellebaut, Julien Begey, and Yves Clement
(University of Luxembourg, Luxembourg; Delphi, Luxembourg)
A major activity in many industries is to integrate software artifacts such that the functional and performance requirements are properly taken care of. In this paper, we focus on the problem of minimizing the risk of CPU time shortage in integrated embedded systems. In order to minimize this risk, we manipulate the start time (offset) of the software executables such that the system real-time constraints are satisfied, and further, the maximum CPU time usage is minimized. We develop a number of search-based optimization algorithms, specifically designed to work for large search spaces, to compute offsets for concurrent software executables with the objective of minimizing CPU usage. We evaluated and compared our algorithms by applying them to a large automotive software system. Our experience shows that our algorithms can automatically generate offsets such that the maximum CPU usage is very close to the known lower bound imposed by the domain constraints. Further, our approach finds limits on the maximum CPU usage lower than those found by a random strategy, and is not slower than a random strategy. Finally, our work achieves better results than the CPU usage minimization techniques devised by domain experts.
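To make the optimization setting concrete, the following hedged sketch replaces the paper's objective (maximum CPU time usage under real-time constraints, explored with specially designed search algorithms) with a toy proxy, the peak number of simultaneous task activations, and attacks it with a plain hill climb; all task data is invented:

import random

def peak_load(offsets, periods, horizon):
    # Peak number of executables activated in the same time slot over one horizon.
    load = [0] * horizon
    for off, per in zip(offsets, periods):
        for t in range(off, horizon, per):
            load[t] += 1
    return max(load)

def search(periods, horizon, iters=2000, seed=0):
    rng = random.Random(seed)
    offsets = [0] * len(periods)
    best = peak_load(offsets, periods, horizon)
    for _ in range(iters):
        cand = list(offsets)
        i = rng.randrange(len(cand))
        cand[i] = rng.randrange(periods[i])   # keep each offset within its task's period
        score = peak_load(cand, periods, horizon)
        if score <= best:                     # accept ties to keep the search moving
            offsets, best = cand, score
    return offsets, best

periods = [10, 10, 20, 20, 40]   # toy task periods (ms); horizon = hyperperiod
print(search(periods, horizon=40))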
@InProceedings{ASE13p529,
author = {Shiva Nejati and Morayo Adedjouma and Lionel C. Briand and Jonathan Hellebaut and Julien Begey and Yves Clement},
title = {Minimizing CPU Time Shortage Risks in Integrated Embedded Software},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {529--539},
doi = {},
year = {2013},
}
Model Based Test Validation and Oracles for Data Acquisition Systems
Daniel Di Nardo, Nadia Alshahwan,
Lionel C. Briand, Elizabeta Fourneret, Tomislav Nakić-Alfirević, and Vincent Masquelier
(University of Luxembourg, Luxembourg; SES, Luxembourg)
This paper presents an automated, model based test validation and oracle approach for systems with complex input and output structures, such as Data Acquisition (DAQ) systems, which are common in many sectors including the satellite communications industry. We present a customised modelling methodology for such systems and a tool that automatically validates test inputs and, after test execution, applies an oracle that is based on mappings between the input and output. We also apply our proposed approach and tool to a complex industrial DAQ system and investigate the scalability and effectiveness of the approach in validating test cases, the DAQ system, or its specifications (captured as models). The results of the case study show that the approach is indeed scalable with respect to two dimensions: (1) model size and (2) test validation and oracle execution time. The size of the model for the DAQ system under study remains within practical bounds, and far below that of typical system models, as it includes a class diagram with 68 classes and 49 constraints. The developed test validation and oracles tool can handle satellite transmission files up to two GB within practical time constraints, taking, on a standard PC, less than three minutes for test validation and less than 50 minutes for applying the oracle. The approach was also effective in automatically applying the oracle successfully for the actual test suite of the DAQ system, accurately identifying all issues and violations that were expected, thus showing that an approach based on models can be sufficiently accurate.
@InProceedings{ASE13p540,
author = {Daniel Di Nardo and Nadia Alshahwan and Lionel C. Briand and Elizabeta Fourneret and Tomislav Nakić-Alfirević and Vincent Masquelier},
title = {Model Based Test Validation and Oracles for Data Acquisition Systems},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {540--550},
doi = {},
year = {2013},
}
Automated Verification of Interactive Rule-Based Configuration Systems
Deepak Dhungana, Ching Hoo Tang, Christoph Weidenbach, and Patrick Wischnewski
(Siemens, Austria; Max-Planck-Institute for Informatics, Germany; Logic4Business, Germany)
Rule-based specifications of systems have again become
common in the context of product line variability modeling
and configuration systems. In this paper, we define a logical foundation
for rule-based specifications that has enough expressivity
and operational behavior to be practically useful and at the same
time enables decidability of important overall properties such
as consistency or cycle-freeness. Our logic supports rule-based
interactive user transitions as well as the definition of a domain
theory via rule transitions. As a running example, we model
DOPLER, a rule-based configuration system currently in use at
Siemens.
@InProceedings{ASE13p551,
author = {Deepak Dhungana and Ching Hoo Tang and Christoph Weidenbach and Patrick Wischnewski},
title = {Automated Verification of Interactive Rule-Based Configuration Systems},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {551--561},
doi = {},
year = {2013},
}
New Ideas Track
New Ideas: Adaptation and Transformation
AutoComment: Mining Question and Answer Sites for Automatic Comment Generation
Edmund Wong, Jinqiu Yang, and Lin Tan
(University of Waterloo, Canada)
Code comments improve software maintainability. To address the comment scarcity issue, we propose a new automatic comment generation approach, which mines comments from a large programming Question and Answer (Q&A) site. Q&A sites allow programmers to post questions and receive solutions, which contain code segments together with their descriptions, referred to as code-description mappings. We develop AutoComment to extract such mappings, and leverage them to generate description comments automatically for similar code segments matched in open-source projects. We apply AutoComment to analyze Java and Android tagged Q&A posts to extract 132,767 code-description mappings, which help AutoComment to generate 102 comments automatically for 23 Java and Android projects. The user study results show that the majority of the participants consider the generated comments accurate, adequate, concise, and useful in helping them understand the code.
@InProceedings{ASE13p562,
author = {Edmund Wong and Jinqiu Yang and Lin Tan},
title = {AutoComment: Mining Question and Answer Sites for Automatic Comment Generation},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {562--567},
doi = {},
year = {2013},
}
Detecting System Use Cases and Validations from Documents
Smita Ghaisas, Manish Motwani, and Preethu Rose Anish
(Tata Consultancy Services, India)
Identifying system use cases and corresponding validations involves analyzing large requirement documents to understand the descriptions of business processes, rules and policies. This consumes a significant amount of effort and time. We discuss an approach to automate the detection of system use cases and corresponding validations from documents. We have devised a representation that allows for capturing the essence of rule statements as a composition of atomic ‘Rule intents’ and key phrases associated with the intents. Rule intents that co-occur frequently constitute ‘Rule acts’ analogous to the Speech acts in Linguistics. Our approach is based on NLP techniques designed around this Rule Model. We employ syntactic and semantic NL analyses around the model to identify and classify rules and annotate them with Rule acts. We map the Rule acts to business process steps and highlight the combinations as potential system use cases and validations for human supervision.
@InProceedings{ASE13p568,
author = {Smita Ghaisas and Manish Motwani and Preethu Rose Anish},
title = {Detecting System Use Cases and Validations from Documents},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {568--573},
doi = {},
year = {2013},
}
Multi-user Variability Configuration: A Game Theoretic Approach
Jesús García-Galán, Pablo Trinidad, and
Antonio Ruiz-Cortés
(University of Seville, Spain)
Multi-user configuration is a neglected problem in the area of variability-intensive systems. The appearance of conflicts among user configurations is a major concern. Current approaches focus on avoiding such conflicts by applying the mutual exclusion principle. However, this perspective has a negative impact on
user satisfaction, because users cannot make decisions fairly. In this work, we propose an interpretation of multi-user configuration as a game-theoretic problem. Game theory is a well-known discipline that analyzes conflicts and cooperation among intelligent rational decision-makers. We present a taxonomy of multi-user configuration approaches and show how they can be interpreted as different problems in game theory. We focus on cooperative game theory to propose and automate a tradeoff-based bargaining approach as a way to resolve conflicts and maximize user satisfaction at the same time.
@InProceedings{ASE13p574,
author = {Jesús García-Galán and Pablo Trinidad and Antonio Ruiz-Cortés},
title = {Multi-user Variability Configuration: A Game Theoretic Approach},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {574--579},
doi = {},
year = {2013},
}
From Comparison Matrix to Variability Model: The Wikipedia Case Study
Nicolas Sannier, Mathieu Acher, and Benoit Baudry
(University of Rennes 1, France; Inria, France; Irisa, France)
Product comparison matrices (PCMs) provide a convenient way to document the discriminant features of a family of related products and now abound on the internet.
Despite their apparent simplicity, the information present in existing PCMs can be very heterogeneous, partial, ambiguous, and hard to exploit by users who want to choose an appropriate product.
Variability Models (VMs) can be employed to formulate in a more precise way the semantics of PCMs and enable automated reasoning such as assisted configuration. Yet, the gap between PCMs and VMs should be precisely understood and automated techniques should support the transition between the two.
In this paper, we propose variability patterns that describe PCMs content and conduct an empirical analysis of 300+ PCMs mined from Wikipedia. Our findings are a first step toward better engineering techniques for maintaining and configuring PCMs.
@InProceedings{ASE13p580,
author = {Nicolas Sannier and Mathieu Acher and Benoit Baudry},
title = {From Comparison Matrix to Variability Model: The Wikipedia Case Study},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {580--585},
doi = {},
year = {2013},
}
Learning Effective Query Transformations for Enhanced Requirements Trace Retrieval
Timothy Dietrich, Jane Cleland-Huang, and Yonghee Shin
(DePaul University, USA)
In automated requirements traceability, significant
improvements can be realized through incorporating user feedback
into the trace retrieval process. However, existing feedback
techniques are designed to improve results for individual queries.
In this paper we present a novel technique designed to extend
the benefits of user feedback across multiple trace queries. Our
approach, named Trace Query Transformation (TQT), utilizes
a novel form of Association Rule Mining to learn a set of
query transformation rules which are used to improve the
efficacy of future trace queries. We evaluate TQT using two
different kinds of training sets. The first represents an initial
set of queries directly modified by human analysts, while the
second represents a set of queries generated by applying a
query optimization process based on initial relevance feedback
for trace links between a set of source and target documents. Both
techniques are evaluated using requirements from the WorldVista
Healthcare system, traced against certification requirements for
the Commission for Healthcare Information Technology. Results
show that the TQT technique returns significant improvements
in the quality of generated trace links.
@InProceedings{ASE13p586,
author = {Timothy Dietrich and Jane Cleland-Huang and Yonghee Shin},
title = {Learning Effective Query Transformations for Enhanced Requirements Trace Retrieval},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {586--591},
doi = {},
year = {2013},
}
Environment Rematching: Toward Dependability Improvement for Self-Adaptive Applications
Chang Xu, Wenhua Yang,
Xiaoxing Ma,
Chun Cao, and Jian Lü
(Nanjing University, China)
Self-adaptive applications can easily contain faults. Existing approaches detect faults, but can still leave some undetected, which then manifest as failures at runtime. In this paper, we study the correlation between occurrences of application failure and those of consistency failure. We propose fixing consistency failure to reduce application failure at runtime. We name this environment rematching, which can systematically reconnect a self-adaptive application to its environment in a consistent way. We also propose enforcing atomicity for application semantics during the rematching to avoid its side effects. We evaluated our approach using 12 self-adaptive robot-car applications in both simulated and real experiments. The experimental results confirmed our approach’s effectiveness in improving dependability for all applications by 12.5-52.5%.
@InProceedings{ASE13p592,
author = {Chang Xu and Wenhua Yang and Xiaoxing Ma and Chun Cao and Jian Lü},
title = {Environment Rematching: Toward Dependability Improvement for Self-Adaptive Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {592--597},
doi = {},
year = {2013},
}
Cloud Twin: Native Execution of Android Applications on the Windows Phone
Ethan Holder, Eeshan Shah, Mohammed Davoodi, and
Eli Tilevich
(Virginia Tech, USA)
To successfully compete in the software marketplace, modern mobile applications must run on multiple competing platforms, such as Android, iOS, and Windows Phone. Companies producing mobile applications spend substantial amounts of time, effort, and money to port applications across platforms. Creating individual program versions for different platforms further exacerbates the maintenance burden. This paper presents Cloud Twin, a novel approach to natively executing the functionality of a mobile application written for another platform. The functionality is accessed by means of dynamic cross-platform replay, in which the source application’s execution in the cloud
is mimicked natively on the target platform. The reference implementation of Cloud Twin natively emulates the behavior of Android applications on a Windows Phone. Specifically, Cloud
Twin transmits, via web sockets, the UI actions performed on the Windows Phone to the cloud server, which then mimics the received actions on the Android emulator. The UI updates on the emulator are efficiently captured by means of Aspect Oriented Programming and sent back to be replayed on the Windows Phone. Our case studies with third-party applications indicate that the Cloud Twin approach can become a viable solution to the heterogeneity of the mobile application market.
@InProceedings{ASE13p598,
author = {Ethan Holder and Eeshan Shah and Mohammed Davoodi and Eli Tilevich},
title = {Cloud Twin: Native Execution of Android Applications on the Windows Phone},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {598--603},
doi = {},
year = {2013},
}
SBFR: A Search Based Approach for Reproducing Failures of Programs with Grammar Based Input
Fitsum Meshesha Kifetew, Wei Jin, Roberto Tiella,
Alessandro Orso, and Paolo Tonella
(Fondazione Bruno Kessler, Italy; Georgia Institute of Technology, USA)
Reproducing field failures in-house, a step developers must perform when assigned a bug report, is an arduous task. In most cases, developers must be able to reproduce a reported failure using only a stack trace and/or some informal description of the failure. The problem becomes even harder for the large class of programs whose input is highly structured and strictly specified by a grammar. To address this problem, we present SBFR, a search-based failure-reproduction technique for programs with structured input. SBFR formulates failure reproduction as a search problem. Starting from a reported failure and a limited amount of dynamic information about the failure, SBFR exploits the potential of genetic programming to iteratively find legal inputs that can trigger the failure.
@InProceedings{ASE13p604,
author = {Fitsum Meshesha Kifetew and Wei Jin and Roberto Tiella and Alessandro Orso and Paolo Tonella},
title = {SBFR: A Search Based Approach for Reproducing Failures of Programs with Grammar Based Input},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {604--609},
doi = {},
year = {2013},
}
New Ideas: Testing and Debugging
Pythia: Generating Test Cases with Oracles for JavaScript Applications
Shabnam Mirshokraie,
Ali Mesbah, and
Karthik Pattabiraman
(University of British Columbia, Canada)
Web developers often write test cases manually using testing frameworks such as Selenium. Testing JavaScript-based applications is challenging as manually exploring various execution paths of the application is difficult. Also JavaScript’s highly dynamic nature as well as its complex interaction with the DOM make it difficult for the tester to achieve high coverage. We present a framework to automatically generate unit test cases for individual JavaScript functions. These test cases are strengthened by automatically generated test oracles capable of detecting faults in JavaScript code. Our approach is implemented in a tool called PYTHIA. Our preliminary evaluation results point to the efficacy of the approach in achieving high coverage and detecting faults.
@InProceedings{ASE13p610,
author = {Shabnam Mirshokraie and Ali Mesbah and Karthik Pattabiraman},
title = {Pythia: Generating Test Cases with Oracles for JavaScript Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {610--615},
doi = {},
year = {2013},
}
Randomizing Regression Tests using Game Theory
Nupul Kukreja,
William G. J. Halfond, and Milind Tambe
(University of Southern California, USA)
As software evolves, the number of test-cases in the regression test
suites continues to increase, requiring testers to prioritize their
execution. Usually only a subset of the test cases is executed due
to limited testing resources. This subset is often known to the
developers who may try to ``game'' the system by committing
insufficiently tested code for parts of the software that will
not be tested. In this new ideas paper, we propose a novel approach
for randomizing regression test scheduling, based on Stackelberg
games for deployment of scarce resources. We apply this approach
to randomizing test cases in such a way as to maximize the
testers' expected payoff when executing the test cases. Our
approach accounts for resource limitations (number of testers)
and provides a probabilistic distribution for scheduling test
cases. We provide an example application of our approach
showcasing the idea of using Stackelberg games for randomized
regression test scheduling.
@InProceedings{ASE13p616,
author = {Nupul Kukreja and William G. J. Halfond and Milind Tambe},
title = {Randomizing Regression Tests using Game Theory},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {616--621},
doi = {},
year = {2013},
}
Automated Inference of Classifications and Dependencies for Combinatorial Testing
Cu Duy Nguyen and Paolo Tonella
(Fondazione Bruno Kessler, Italy)
Even for small programs, the input space is huge – often unbounded. Partition testing divides the input space into disjoint equivalence classes and combinatorial testing selects a subset of all possible input class combinations, according to criteria such as pairwise coverage. The down side of this approach is that the partitioning of the input space into equivalence classes (input classification) is done manually. It is expensive and requires deep domain and implementation understanding. In this paper, we propose a novel approach to classify test inputs and their dependencies automatically. Firstly, random (or automatically generated) input vectors are sent to the system under test (SUT). For each input vector, an observed “hit vector” is produced by monitoring the execution of the SUT. Secondly, hit vectors are grouped into clusters using machine learning. Each cluster contains similar hit vectors, i.e., similar behaviors, and from them we obtain corresponding clusters of input vectors. Input classes are then extracted for each input parameter straightforwardly. Our experiments with a number of subjects show good results as the automatically generated classifications are the same or very close to the expected ones.
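A rough sketch of the clustering step (illustrative only: the monitoring that produces hit vectors and the exact learning procedure of the paper are not reproduced, and k-means with the toy vectors below is merely a placeholder; scikit-learn is assumed to be available):

from sklearn.cluster import KMeans

# One input value per execution, plus the branch/function "hit vector" observed
# while running the SUT on it (invented data).
inputs      = [-5, -1, 0, 3, 7, 42]
hit_vectors = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1]]

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(hit_vectors)

classes = {}
for value, label in zip(inputs, labels):
    classes.setdefault(label, []).append(value)

# Each cluster of similar behaviors yields one input equivalence class,
# e.g. negatives, zero, and positives for this toy SUT.
for label, values in sorted(classes.items()):
    print(label, values)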
@InProceedings{ASE13p622,
author = {Cu Duy Nguyen and Paolo Tonella},
title = {Automated Inference of Classifications and Dependencies for Combinatorial Testing},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {622--627},
doi = {},
year = {2013},
}
Adding Context to Fault Localization with Integration Coverage
Higor Amario de Souza and Marcos Lordello Chaim
(University of Sao Paulo, Brazil)
Fault localization is a costly task in the debugging process. Several techniques to automate fault localization have been proposed, aiming to reduce the effort and time spent. Some techniques use heuristics based on code coverage data. The goal is to indicate program code excerpts more likely to contain faults. The coverage data mostly used in automated debugging is based on white-box unit testing (e.g., statements, basic blocks, predicates). This paper presents a technique which uses integration coverage data to guide the fault localization process. By ranking the most suspicious pairs of method invocations, roadmaps - sorted lists of methods to be investigated - are created. At each method, unit coverage (e.g., basic blocks) is used to locate the fault site. Fifty-five bugs in four programs containing 2K to 80K lines of code (LOC) were analyzed. The results indicate that, by using the roadmaps, the effectiveness of the fault localization process is improved: 78% of all the faults are reached within a fixed number of basic blocks, 40% more than an approach based on the Tarantula technique. Furthermore, fewer blocks have to be investigated until reaching the fault.
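As an illustration of ranking method-invocation pairs by suspiciousness (a sketch using a generic Tarantula-style formula rather than the paper's exact heuristic; the coverage counts are invented):

# Rank (caller, callee) pairs by how often they are covered in failing vs. passing runs.
def suspiciousness(fail_cov, pass_cov, total_fail, total_pass):
    f = fail_cov / total_fail if total_fail else 0.0
    p = pass_cov / total_pass if total_pass else 0.0
    return f / (f + p) if (f + p) > 0 else 0.0

# pair -> (times covered in failing runs, times covered in passing runs)
coverage = {
    ("Cart.checkout", "Payment.charge"): (4, 1),
    ("Cart.checkout", "Stock.reserve"):  (4, 9),
    ("Ui.render",     "Cart.total"):     (1, 10),
}
total_fail, total_pass = 4, 10

roadmap = sorted(coverage,
                 key=lambda pair: suspiciousness(*coverage[pair], total_fail, total_pass),
                 reverse=True)
for pair in roadmap:
    print(pair, round(suspiciousness(*coverage[pair], total_fail, total_pass), 2))

The resulting roadmap then directs unit-level (basic-block) inspection inside the top-ranked methods.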
@InProceedings{ASE13p628,
author = {Higor Amario de Souza and Marcos Lordello Chaim},
title = {Adding Context to Fault Localization with Integration Coverage},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {628--633},
doi = {},
year = {2013},
}
Using Automatically Generated Invariants for Regression Testing and Bug Localization
Parth Sagdeo, Nicholas Ewalt, Debjit Pal, and Shobha Vasudevan
(University of Illinois at Urbana-Champaign, USA)
We present PREAMBL, an approach that applies automatically
generated invariants to regression testing and bug localization.
Our invariant generation methodology is PRECIS, an automatic and
scalable engine that uses program predicates to guide clustering of
dynamically obtained path information. In this paper, we apply
it to regression testing and to capturing program predicate
information to guide statistical-analysis-based bug localization. We
present a technique to localize bugs in paths of variable lengths. We are able to map the localized post-deployment bugs on a
path to pre-release invariants generated along that path. Our experimental results demonstrate
the efficacy of the use of PRECIS for regression testing, as well as
the ability of PREAMBL to zero in on relevant segments of program
paths.
@InProceedings{ASE13p634,
author = {Parth Sagdeo and Nicholas Ewalt and Debjit Pal and Shobha Vasudevan},
title = {Using Automatically Generated Invariants for Regression Testing and Bug Localization},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {634--639},
doi = {},
year = {2013},
}
Class Level Fault Prediction using Software Clustering
Giuseppe Scanniello, Carmine Gravino, Andrian Marcus, and Tim Menzies
(University of Basilicata, Italy; University of Salerno, Italy; Wayne State University, USA; West Virginia University, USA)
Defect prediction approaches use software metrics and fault data to learn which software properties associate with faults in classes. Existing techniques predict fault-prone classes in the same release (intra) or in subsequent releases (inter) of a subject software system. We propose an intra-release fault prediction technique, which learns from clusters of related classes, rather than from the entire system. Classes are clustered using structural information and fault prediction models are built using the properties of the classes in each cluster. We present an empirical investigation on data from 29 releases of eight open source software systems from the PROMISE repository, with predictors built using multivariate linear regression. The results indicate that the prediction models built on clusters outperform those built on all the classes of the system.
@InProceedings{ASE13p640,
author = {Giuseppe Scanniello and Carmine Gravino and Andrian Marcus and Tim Menzies},
title = {Class Level Fault Prediction using Software Clustering},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {640--645},
doi = {},
year = {2013},
}
ExPort: Detecting and Visualizing API Usages in Large Source Code Repositories
Evan Moritz, Mario Linares-Vásquez,
Denys Poshyvanyk, Mark Grechanik,
Collin McMillan, and Malcom Gethers
(College of William and Mary, USA; University of Illinois at Chicago, USA; University of Notre Dame, USA; University of Maryland in Baltimore County, USA)
This paper presents a technique for automatically mining and visualizing API usage examples. In contrast to previous approaches, our technique is capable of finding examples of API usage that occur across several functions in a program. This distinction is important because of a gap between what current API learning tools provide and what programmers need: current tools extract relatively small examples from single files/functions, even though programmers use APIs to build large software. The small examples are helpful in the initial stages of API learning, but leave out details that are helpful in later stages. Our technique is intended to fill this gap. It works by representing software as a Relational Topic Model, where API calls and the functions that use them are modeled as a document network. Given a starting API, our approach can recommend complex API usage examples mined from a repository of over 14 million Java methods.
@InProceedings{ASE13p646,
author = {Evan Moritz and Mario Linares-Vásquez and Denys Poshyvanyk and Mark Grechanik and Collin McMillan and Malcom Gethers},
title = {ExPort: Detecting and Visualizing API Usages in Large Source Code Repositories},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {646--651},
doi = {},
year = {2013},
}
Flow Permissions for Android
Shashank Holavanalli, Don Manuel, Vishwas Nanjundaswamy, Brian Rosenberg, Feng Shen, Steven Y. Ko, and
Lukasz Ziarek
(SUNY Buffalo, USA)
This paper proposes Flow Permissions, an extension to the Android permission mechanism. Unlike the existing permission mechanism, our permission mechanism
contains semantic information based on information flows. Flow Permissions
allow users to examine and grant explicit information flows within an
application (e.g., a permission for reading the phone number and sending it
over the network) as well as implicit information flows across multiple
applications (e.g., a permission for reading the phone number and sending it
to another application already installed on the user's phone). Our goal with
Flow Permissions is to provide visibility into the holistic behavior of the
applications installed on a user's phone. Our evaluation compares our approach to
dynamic flow tracking techniques; our results with 600 popular applications
and 1,200 malicious applications show that our approach is practical
and effective in deriving Flow Permissions statically.
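A minimal sketch of how explicit and implicit flows might compose into permissions (purely illustrative: the flow labels, the SEND_TO convention, and the per-app flow tables are invented, and the actual static derivation is not reproduced):

# Derive explicit (within-app) and implicit (cross-app) flow permissions from
# per-app source -> sink flows; a send to another installed app composes with
# that app's own flows.
flows = {
    "ContactsApp": [("PHONE_NUMBER", "SEND_TO:MessengerApp")],
    "MessengerApp": [("RECEIVED_DATA", "NETWORK")],
}

def flow_permissions(flows):
    perms = []
    for app, pairs in flows.items():
        for source, sink in pairs:
            if sink.startswith("SEND_TO:"):
                receiver = sink.split(":", 1)[1]
                # implicit flow: data handed to another installed app may reach its sinks
                for _, downstream in flows.get(receiver, []):
                    perms.append((app, source, receiver + " -> " + downstream))
            else:
                perms.append((app, source, sink))   # explicit flow inside one app
    return perms

for perm in flow_permissions(flows):
    print(perm)
# ('ContactsApp', 'PHONE_NUMBER', 'MessengerApp -> NETWORK'): the phone number can
# leave the device via another app, which is what a Flow Permission would surface.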
@InProceedings{ASE13p652,
author = {Shashank Holavanalli and Don Manuel and Vishwas Nanjundaswamy and Brian Rosenberg and Feng Shen and Steven Y. Ko and Lukasz Ziarek},
title = {Flow Permissions for Android},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {652--657},
doi = {},
year = {2013},
}
New Ideas: Models and Requirements
A Pattern-Based Approach to Parametric Specification Mining
Giles Reger, Howard Barringer, and David Rydeheard
(University of Manchester, UK)
This paper presents a technique for using execution traces to mine parametric temporal specifications in the form of quantified event automata (QEA) - previously introduced as an expressive and efficient formalism for runtime verification. We consider a pattern-based mining approach that uses a pattern library to generate and check potential properties over given traces, and then combines successful patterns. By using predefined models to measure the tool’s precision and recall we demonstrate that our approach can effectively and efficiently extract specifications in realistic scenarios.
@InProceedings{ASE13p658,
author = {Giles Reger and Howard Barringer and David Rydeheard},
title = {A Pattern-Based Approach to Parametric Specification Mining},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {658--663},
doi = {},
year = {2013},
}
Semi-automatic Generation of Metamodels from Model Sketches
Dustin Wüest, Norbert Seyff, and Martin Glinz
(University of Zurich, Switzerland)
Traditionally, metamodeling is an upfront activity performed by experts for defining modeling languages. Modeling tools then typically restrict modelers to using only constructs defined in the metamodel. This is inappropriate when users want to sketch graphical models without any restrictions and only later assign meanings to the sketched elements. Upfront metamodeling also complicates the creation of domain-specific languages, as it requires experts with both domain and metamodeling expertise.
In this paper we present a new approach that supports modelers in creating metamodels for diagrams they have sketched or are currently sketching. Metamodels are defined in a semi-automatic, interactive way by annotating diagram elements and by automated model analysis. Our approach requires no metamodeling expertise and supports the co-evolution of models and metamodels.
@InProceedings{ASE13p664,
author = {Dustin Wüest and Norbert Seyff and Martin Glinz},
title = {Semi-automatic Generation of Metamodels from Model Sketches},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {664--669},
doi = {},
year = {2013},
}
Assessing the Maturity of Requirements through Argumentation: A Good Enough Approach
Varsha Veerappa and Rachel Harrison
(Oxford Brookes University, UK)
Requirements engineers need to be confident that enough requirements analysis has been done before a project can move forward. In the context of KAOS, this information can be derived from the soundness of the refinements: sound refinements indicate that the requirements in the goal-graph are mature enough or good enough for implementation. We can estimate how close we are to ‘good enough’ requirements using the judgments of experts and other data from the goals. We apply Toulmin’s model of argumentation to evaluate how sound refinements are. We then implement the resulting argumentation model using Bayesian Belief Networks and provide a semi-automated way, aided by Natural Language Processing techniques, to carry out the proposed evaluation. We have performed an initial validation of our work using a small case study involving an electronic document management system.
@InProceedings{ASE13p670,
author = {Varsha Veerappa and Rachel Harrison},
title = {Assessing the Maturity of Requirements through Argumentation: A Good Enough Approach},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {670--675},
doi = {},
year = {2013},
}
Natural Language Requirements Quality Analysis Based on Business Domain Models
Annervaz K.M.,
Vikrant Kaulgud, Shubhashis Sengupta, and Milind Savagaonkar
(Accenture Technology Labs, India)
Quality of requirements written in natural language has always been a critical concern in software engineering. Poorly written requirements lead to ambiguity and false interpretation in different phases of a software delivery project. Further, incomplete requirements lead to partial implementation of the desired system behavior. In this paper, we present a model for harvesting domain (functional or business) knowledge. Subsequently, we present natural language processing and ontology-based techniques for leveraging the model to analyze requirements quality and for requirements comprehension. The prototype also provides an advisory to business analysts so that the requirements can be aligned to the expected domain standard. The prototype developed is currently being used in practice, and the initial results are very encouraging.
@InProceedings{ASE13p676,
author = {Annervaz K.M. and Vikrant Kaulgud and Shubhashis Sengupta and Milind Savagaonkar},
title = {Natural Language Requirements Quality Analysis Based on Business Domain Models},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {676--681},
doi = {},
year = {2013},
}
Model/Code Co-Refactoring: An MDE Approach
Jens von Pilgrim, Bastian Ulke, Andreas Thies, and Friedrich Steimann
(Fernuniversität in Hagen, Germany)
Model-driven engineering suggests that models are the primary artefacts of software development. This means that models may be refactored even after code has been generated from them, in which case the code must be changed to reflect the refactoring. However, as we show, neither re-generating the code from the refactored model nor applying an equivalent refactoring to the generated code is sufficient to keep model and code in sync; rather, model and code need to be refactored jointly. To enable this, we investigate the technical requirements of model/code co-refactoring, and implement a model-driven solution that we evaluate using a set of open-source programs and their structural models. Results suggest that our approach is feasible.
@InProceedings{ASE13p682,
author = {Jens von Pilgrim and Bastian Ulke and Andreas Thies and Friedrich Steimann},
title = {Model/Code Co-Refactoring: An MDE Approach},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {682--687},
doi = {},
year = {2013},
}
Recovering Model Transformation Traces using Multi-Objective Optimization
Hajer Saada, Marianne Huchard, Clémentine Nebut, and Houari Sahraoui
(Université Montpellier 2, France; CNRS, France; Université de Montréal, Canada)
Model Driven Engineering (MDE) is based on a large set of models that are used and manipulated throughout the development cycle. These models are manually or automatically produced and/or exploited using model transformations. To allow engineers to maintain the models and track their changes, recovering transformation traces is essential. In this paper, we propose an automated approach, based on multi-objective optimization, to recover transformation traces between models. Our approach takes as input a source model in the form of a set of fragments (fragments are defined using the source meta-model cardinalities and OCL constraints), and a target model. The recovered transformation traces take the form of many-to-many mappings between the constructs of the two models.
@InProceedings{ASE13p688,
author = {Hajer Saada and Marianne Huchard and Clémentine Nebut and Houari Sahraoui},
title = {Recovering Model Transformation Traces using Multi-Objective Optimization},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {688--693},
doi = {},
year = {2013},
}
Tool Demonstrations
Model Repair and Transformation with Echo
Nuno Macedo, Tiago Guimarães, and Alcino Cunha
(HASLab at INESC TEC, Portugal; Universidade do Minho, Portugal)
Models are paramount in model-driven engineering. In a software project many models may coexist, capturing different views of the system or different levels of abstraction. A key and arduous task in this development method is to keep all such models consistent, both with their meta-models (and the respective constraints) and among themselves. This paper describes Echo, a tool that aims at simplifying this task by automating inconsistency detection and repair using a solver-based engine. Consistency between different models can be specified by bidirectional model transformations, and is guaranteed to be recovered by minimal updates on the inconsistent models. The tool is freely available as an Eclipse plugin, developed on top of the popular EMF framework, and supports constraints and transformations specified in the OMG standard languages OCL and QVT-R, respectively.
@InProceedings{ASE13p694,
author = {Nuno Macedo and Tiago Guimarães and Alcino Cunha},
title = {Model Repair and Transformation with Echo},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {694--697},
doi = {},
year = {2013},
}
Smart Cloud Broker: Finding Your Home in the Clouds
Mohan Baruwal Chhetri, Sergei Chichin, Quoc Bao Vo, and Ryszard Kowalczyk
(Swinburne University of Technology, Australia)
As the rate of cloud computing adoption grows, so does the need for consumption assistance. Enterprises looking to migrate their IT systems to the cloud require assistance in identifying providers that offer resources with the most appropriate pricing and performance levels to match their specific business needs. In this paper, we present Smart Cloud Broker – a suite of software tools that allows cloud infrastructure consumers to evaluate and compare the performance of different Infrastructure as a Service (IaaS) offerings from competing cloud service providers, and consequently supports selection of the cloud configuration and provider with the specifications that best meet the user’s requirements. Using Smart Cloud Broker, prospective cloud users can estimate the performance of the different cloud platforms by running live tests against representative benchmark applications under representative load conditions.
@InProceedings{ASE13p698,
author = {Mohan Baruwal Chhetri and Sergei Chichin and Quoc Bao Vo and Ryszard Kowalczyk},
title = {Smart Cloud Broker: Finding Your Home in the Clouds},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {698--701},
doi = {},
year = {2013},
}
OCRA: A Tool for Checking the Refinement of Temporal Contracts
Alessandro Cimatti, Michele Dorigatti, and Stefano Tonetta
(Fondazione Bruno Kessler, Italy)
Contract-based design enriches a component model with properties structured in pairs of assumptions and guarantees. These properties are expressed in terms of the variables at the interface of the components, and specify how a component interacts with its environment: the assumption is a property that must be satisfied by the environment of the component, while the guarantee is a property that the component must satisfy in response. Contract-based design has recently been proposed in many methodologies for taming the complexity of embedded systems. In fact, contract-based design enables stepwise refinement, compositional verification, and reuse of components. However, only a few tools exist to support the formal verification underlying these methods.
OCRA (Othello Contracts Refinement Analysis) is a new tool that provides means for checking the refinement of contracts specified in a linear-time temporal logic. The specification language allows discrete as well as metric real-time constraints to be expressed. The underlying reasoning engine checks whether the contract refinement is correct. OCRA has been used in different projects and integrated in CASE tools.
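As a generic illustration of one assumption/guarantee pair – written in plain LTL notation rather than OCRA’s Othello syntax, with the signals req and ack assumed purely for the example – a simple request/acknowledge contract could be
$A \;=\; \mathbf{G}\bigl(\mathit{req} \rightarrow \mathbf{X}(\neg \mathit{req}\ \mathbf{U}\ \mathit{ack})\bigr), \qquad G \;=\; \mathbf{G}\bigl(\mathit{req} \rightarrow \mathbf{F}\,\mathit{ack}\bigr).$
Checking refinement then amounts to verifying that the composition of the sub-components’ contracts entails the parent contract under its assumption.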
@InProceedings{ASE13p702,
author = {Alessandro Cimatti and Michele Dorigatti and Stefano Tonetta},
title = {OCRA: A Tool for Checking the Refinement of Temporal Contracts},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {702--705},
doi = {},
year = {2013},
}
The Bounded Model Checker LLBMC
Stephan Falke, Florian Merz, and Carsten Sinz
(KIT, Germany)
This paper presents LLBMC, a tool for finding bugs and runtime errors in sequential C/C++ programs. LLBMC employs bounded model checking using an SMT-solver for the theory of bitvectors and arrays and thus achieves precision down to the level of single bits. The two main features of LLBMC that distinguish it from other bounded model checking tools for C/C++ are (i) its bit-precise memory model, which makes it possible to support arbitrary type conversions via stores and loads; and (ii) that it operates on a compiler intermediate representation and not directly on the source code.
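To make the notion of a bit-precise memory model concrete, here is a small illustrative C fragment (an assumed example, not taken from the paper): the assertion can only be discharged by an analysis that tracks the exact bit pattern stored for the float and reloaded at an integer type.

/* Illustrative only: type conversion via store/load at the bit level. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    float f = 1.0f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the float's bits */
    assert(bits == 0x3f800000u);      /* IEEE-754 encoding of 1.0f */
    return 0;
}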
@InProceedings{ASE13p706,
author = {Stephan Falke and Florian Merz and Carsten Sinz},
title = {The Bounded Model Checker LLBMC},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {706--709},
doi = {},
year = {2013},
}
CSeq: A Concurrency Pre-processor for Sequential C Verification Tools
Bernd Fischer, Omar Inverso, and Gennaro Parlato
(Stellenbosch University, South Africa; University of Southampton, UK)
Sequentialization translates concurrent programs into equivalent nondeterministic sequential programs so that the different concurrent schedules no longer need to be handled explicitly. It can thus be used as a concurrency preprocessing technique for automated sequential program verification tools. Our CSeq tool implements a novel sequentialization for C programs using pthreads, which extends the Lal/Reps sequentialization to support dynamic thread creation. CSeq now works with three different backend tools, CBMC, ESBMC, and LLBMC, and is competitive with state-of-the-art verification tools for concurrent programs.
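For concreteness, a minimal pthreads program of the kind such a sequentialization targets (an assumed example, not taken from the paper): threads are created dynamically in a loop, and the final assertion is the property to be checked on the sequentialized program.

/* Illustrative pthreads program with dynamic thread creation. */
#include <pthread.h>
#include <assert.h>

int counter = 0;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    counter++;
    pthread_mutex_unlock(&lock);
    return 0;
}

int main(void) {
    pthread_t t[2];
    for (int i = 0; i < 2; i++)        /* threads created at run time */
        pthread_create(&t[i], 0, worker, 0);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], 0);
    assert(counter == 2);              /* property to verify */
    return 0;
}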
@InProceedings{ASE13p710,
author = {Bernd Fischer and Omar Inverso and Gennaro Parlato},
title = {CSeq: A Concurrency Pre-processor for Sequential C Verification Tools},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {710--713},
doi = {},
year = {2013},
}
Automated Testing of Cloud-Based Elastic Systems with AUToCLES
Alessio Gambi, Waldemar Hummer, and Schahram Dustdar
(University of Lugano, Switzerland; Vienna University of Technology, Austria)
Cloud-based elastic computing systems dynamically change their resource allocation to provide consistent quality of service and minimal usage of resources in the face of workload fluctuations. As elastic systems are increasingly adopted to implement business-critical functions in a cost-efficient way, their reliability is becoming a key concern for developers. Without proper testing, cloud-based systems might fail to provide the required functionalities with the expected service level and costs. Using system testing techniques, developers can expose problems that escaped previous quality assurance activities and have a last chance to fix bugs before releasing the system into production.
System testing of cloud-based systems accounts for a series of complex and time-consuming activities, from the deployment and configuration of the elastic system, to the execution of synthetic clients, and the collection and persistence of execution data. Furthermore, clouds enable parallel executions of the same elastic system that can reduce the overall test execution time. However, manually managing the concurrent testing of multiple system instances might quickly overwhelm developers' capabilities, and automatic support for test generation, system test execution, and management of execution data is needed.
In this demo we showcase AUToCLES, our tool for automatic testing of cloud-based elastic systems. Given specifications of the test suite and the system under test, AUToCLES implements testing as a service (TaaS): it automatically instantiates the SUT, configures the testing scaffolding, and automatically executes test suites. Designers can inspect executions both during and after the tests.
@InProceedings{ASE13p714,
author = {Alessio Gambi and Waldemar Hummer and Schahram Dustdar},
title = {Automated Testing of Cloud-Based Elastic Systems with AUToCLES},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {714--717},
doi = {},
year = {2013},
}
Tool Support for Automatic Model Transformation Specification using Concrete Visualisations
Iman Avazpour, John Grundy, and Lars Grunske
(Swinburne University of Technology, Australia; University of Stuttgart, Germany)
Complex model transformation is crucial in several domains, including Model-Driven Engineering (MDE), information visualisation and data mapping. Most current approaches use meta-model-driven transformation specification via coding in textual scripting languages. This paper demonstrates a novel approach and tool support that instead provides for specification of correspondences between models using concrete visualisations of source and target models, and generates transformation scripts from these by-example model correspondence specifications.
@InProceedings{ASE13p718,
author = {Iman Avazpour and John Grundy and Lars Grunske},
title = {Tool Support for Automatic Model Transformation Specification using Concrete Visualisations},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {718--721},
doi = {},
year = {2013},
}
CCmutator: A Mutation Generator for Concurrency Constructs in Multithreaded C/C++ Applications
Markus Kusano and Chao Wang
(Virginia Tech, USA)
We introduce CCmutator, a mutation generation tool for multithreaded C/C++ programs written using POSIX threads and the recently standardized C++11 concurrency constructs. CCmutator is capable of performing partial mutations and generating higher order mutants, which allow for more focused and complex combinations of elementary mutation operators leading to higher quality mutants. We have implemented CCmutator based on the popular Clang/LLVM compiler framework, which allows CCmutator to be extremely scalable and robust in handling real-world C/C++ applications. CCmutator is also designed in such a way that all mutants of the same order can be generated in parallel, which allows the tool to be easily parallelized on commodity multicore hardware to improve performance.
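As a hedged illustration of the kind of fault a concurrency mutation can inject (an assumed example; not claimed to be CCmutator’s exact operators or output), a mutant might drop the lock/unlock pair guarding a shared update, introducing a data race:

/* Original function and a possible concurrency mutant (illustrative). */
#include <pthread.h>
#include <stdio.h>

static int shared = 0;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void deposit_original(int amount) {
    pthread_mutex_lock(&m);
    shared += amount;
    pthread_mutex_unlock(&m);
}

static void deposit_mutant(int amount) {
    shared += amount;   /* mutation: synchronization removed */
}

int main(void) {
    deposit_original(10);
    deposit_mutant(5);
    printf("shared = %d\n", shared);
    return 0;
}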
@InProceedings{ASE13p722,
author = {Markus Kusano and Chao Wang},
title = {CCmutator: A Mutation Generator for Concurrency Constructs in Multithreaded C/C++ Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {722--725},
doi = {},
year = {2013},
}
Crushinator: A Framework towards Game-Independent Testing
Christopher Schaefer, Hyunsook Do, and Brian M. Slator
(North Dakota State University, USA)
Testing game applications relies heavily on beta testing methods. The effectiveness of beta testing depends on how well beta testers represent typical game-application users and on whether users are willing to participate in the beta test. An automated testing framework could reduce most companies' dependence on beta testing for analyzing their game applications. This paper presents Crushinator as one such framework: a game-independent testing tool that implements multiple testing methods which can assist, and possibly replace, the use of beta testing.
@InProceedings{ASE13p726,
author = {Christopher Schaefer and Hyunsook Do and Brian M. Slator},
title = {Crushinator: A Framework towards Game-Independent Testing},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {726--729},
doi = {},
year = {2013},
}
Pex4Fun: A Web-Based Environment for Educational Gaming via Automated Test Generation
Nikolai Tillmann, Jonathan de Halleux, Tao Xie, and Judith Bishop
(Microsoft Research, USA; University of Illinois at Urbana-Champaign, USA)
Pex4Fun (http://www.pex4fun.com/) is a web-based educational gaming environment for teaching and learning programming and software engineering. Pex4Fun can be used to teach and learn programming and software engineering at many levels, from high school all the way through graduate courses. With Pex4Fun, a student edits code in any browser – with Intellisense – and Pex4Fun executes it and analyzes it in the cloud. Pex4Fun connects teachers, curriculum authors, and students in a unique social experience, tracking and streaming progress updates in real time. In particular, Pex4Fun finds interesting and unexpected input values (with Pex, an advanced test-generation tool) that help students understand what their code is actually doing. The real fun starts with coding duels where a student writes code to implement a teacher’s secret specification (in the form of sample-solution code not visible to the student). Pex4Fun finds any discrepancies in behavior between the student’s code and the secret specification. Such discrepancies are given as feedback to the student to guide how to fix the student’s code to match the behavior of the secret specification.
This tool demonstration shows how Pex4Fun can be used in teaching and learning, such as solving coding duels, exploring course materials in feature courses, creating and teaching a course, creating and publishing coding duels, and learning advanced topics behind Pex4Fun.
@InProceedings{ASE13p730,
author = {Nikolai Tillmann and Jonathan de Halleux and Tao Xie and Judith Bishop},
title = {Pex4Fun: A Web-Based Environment for Educational Gaming via Automated Test Generation},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {730--733},
doi = {},
year = {2013},
}
Developing Self-Verifying Service-Based Systems
Radu Calinescu, Kenneth Johnson, and Yasmin Rafiq
(University of York, UK)
We present a tool-supported framework for the engineering of service-based systems (SBSs) capable of self-verifying their compliance with developer-specified reliability requirements. These self-verifying systems select their services dynamically by using a combination of continual quantitative verification and online updating of the verified models. Our framework enables the practical exploitation of recent theoretical advances in the development of self-adaptive SBSs through (a) automating the generation of the software components responsible for model updating, continual verification and service selection; and (b) employing standard SBS development processes.
@InProceedings{ASE13p734,
author = {Radu Calinescu and Kenneth Johnson and Yasmin Rafiq},
title = {Developing Self-Verifying Service-Based Systems},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {734--737},
doi = {},
year = {2013},
}
TRAM: A Tool for Transforming Textual Requirements into Analysis Models
Keletso J. Letsholo, Liping Zhao, and Erol-Valeriu Chioasca
(University of Manchester, UK)
Tool support for automatically constructing analysis models from the natural language specification of requirements (NLR) is critical to model driven development (MDD), as it can bring forward the use of precise formal languages from the coding to the specification phase in the MDD lifecycle. TRAM provides such a support through a novel approach. By using a set of conceptual patterns to facilitate the transformation of an NLR to its target software model, TRAM has shown its potential as an automated tool to support the earliest phase of MDD. This paper describes TRAM and evaluates the tool against three benchmark approaches.
@InProceedings{ASE13p738,
author = {Keletso J. Letsholo and Liping Zhao and Erol-Valeriu Chioasca},
title = {TRAM: A Tool for Transforming Textual Requirements into Analysis Models},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {738--741},
doi = {},
year = {2013},
}
iProbe: A Lightweight User-Level Dynamic Instrumentation Tool
Nipun Arora, Hui Zhang, Junghwan Rhee, Kenji Yoshihira, and Guofei Jiang
(NEC Labs, USA)
We introduce iProbe, a new hybrid tool for dynamic application instrumentation that is flexible and has low overhead. iProbe adopts a novel two-stage design and offloads much of the dynamic instrumentation complexity to an offline compilation stage. It leverages standard compiler flags to introduce “place-holders” for hooks in the program executable. It then utilizes an efficient user-space “HotPatching” mechanism which modifies the functions to be traced and enables execution of instrumented code in a safe and secure manner. In an evaluation on a micro-benchmark and the SPEC CPU2006 benchmark applications, the iProbe prototype achieved instrumentation overhead an order of magnitude lower than existing state-of-the-art dynamic instrumentation tools such as SystemTap and DynInst.
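The following is only a conceptual source-level sketch of the idea of a cheap placeholder that can be redirected at run time; iProbe itself patches compiler-generated place-holders in the executable, and the function-pointer hook shown here is an assumption made purely for illustration.

/* Conceptual sketch of a redirectable, initially no-op probe. */
#include <stdio.h>

static void hook_noop(const char *fn) { (void)fn; }              /* disabled probe */
static void hook_trace(const char *fn) { printf("enter %s\n", fn); }

static void (*probe)(const char *) = hook_noop;                   /* place-holder */

static void compute(void) {
    probe("compute");      /* instrumentation point */
    /* ... application work ... */
}

int main(void) {
    compute();             /* probe disabled: near-zero overhead */
    probe = hook_trace;    /* enable tracing at run time */
    compute();
    return 0;
}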
@InProceedings{ASE13p742,
author = {Nipun Arora and Hui Zhang and Junghwan Rhee and Kenji Yoshihira and Guofei Jiang},
title = {iProbe: A Lightweight User-Level Dynamic Instrumentation Tool},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {742--745},
doi = {},
year = {2013},
}
Doctoral Symposium
Detecting and Fixing Emergent Behaviors in Distributed Software Systems using a Message Content Independent Method
Fatemeh Hendijani Fard
(University of Calgary, Canada)
This research aims to automatically detect emergent behaviors of scenario-based Distributed Software Systems (DSS) in the design phase. The direct significance of our work is reducing the cost of verifying DSS for unexpected behavior at execution time. Existing approaches have drawbacks that we try to address in our work. The main contributions are modeling the DSS components as a social network rather than using behavioral modeling, detecting components with no emergent behavior, and investigating the interactions of instances of one type.
@InProceedings{ASE13p746,
author = {Fatemeh Hendijani Fard},
title = {Detecting and Fixing Emergent Behaviors in Distributed Software Systems using a Message Content Independent Method},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {746--749},
doi = {},
year = {2013},
}
Synthesizing Fault-Tolerant Programs from Deontic Logic Specifications
Ramiro Demasi
(McMaster University, Canada)
We study the problem of synthesizing fault-tolerant components from specifications, i.e., the problem of automatically constructing a fault-tolerant component implementation from a logical specification of the component, and the system's required level of fault-tolerance. In our approach, the logical specification of the component is given in dCTL, a branching time temporal logic with deontic operators, especially designed for fault-tolerant component specification. The synthesis algorithm takes the component specification, and a user-defined level of fault-tolerance (masking, nonmasking, failsafe), and automatically determines whether a component with the required fault-tolerance is realizable. Moreover, if the answer is positive, then the algorithm produces such a fault-tolerant implementation. Our technique for synthesis is based on the use of (bi)simulation algorithms for capturing different fault-tolerance classes, and the extension of a synthesis algorithm for CTL to cope with dCTL specifications.
@InProceedings{ASE13p750,
author = {Ramiro Demasi},
title = {Synthesizing Fault-Tolerant Programs from Deontic Logic Specifications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {750--753},
doi = {},
year = {2013},
}
Supporting Bug Investigation using History Analysis
Francisco Servant
(University of California at Irvine, USA)
In my research, I propose an automated technique to support bug investigation by using a novel analysis of the history of the source code. During the bug-fixing process, developers spend considerable manual effort investigating the bug in order to answer a series of questions about it. My research will support developers in answering the following questions about a bug: Who is the most suitable developer to fix the bug? Where is the bug located? When was the bug inserted? And why was the bug inserted?
@InProceedings{ASE13p754,
author = {Francisco Servant},
title = {Supporting Bug Investigation using History Analysis},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {754--757},
doi = {},
year = {2013},
}
Context-Aware Task Allocation for Distributed Agile Team
Jun Lin
(Nanyang Technological University, Singapore)
The philosophy of Agile software development advocates the spirit of open discussion and coordination among team members to adapt to incremental changes encountered during the process. Based on our observations of 20 agile student development teams over an 8-week study at Beihang University, China, we found that the task allocation strategy that results from following the Agile process depends heavily on the experience of the users, and cannot be guaranteed to result in efficient utilization of team resources. In this research, we propose a context-aware task allocation decision support system that balances the considerations for quality and timeliness to improve the overall utility derived from an agile software development project. We formulate the agile process as a distributed constraint optimization problem, and propose a technology framework that assesses individual developers’ situations based on data collected from a Scrum-based agile process, and helps individual developers make situation-aware decisions on which tasks from the backlog to select in real time. Preliminary analysis and simulation results show that it can achieve close to optimally efficient utilization of the developers’ collective capacity. We plan to build the framework into a computer-supported collaborative development platform and refine the method through more realistic projects.
@InProceedings{ASE13p758,
author = {Jun Lin},
title = {Context-Aware Task Allocation for Distributed Agile Team},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {758--761},
doi = {},
year = {2013},
}
Preventing Erosion of Architectural Tactics through Their Strategic Implementation, Preservation, and Visualization
Mehdi Mirakhorli
(DePaul University, USA)
Nowadays, successful software production increasingly depends on how well the final deployed system addresses customers’ and users’ quality concerns such as security, reliability, availability, interoperability, performance, and many other types of such requirements. In order to satisfy such quality concerns, software architects are accountable for devising and comparing alternative solutions, assessing the trade-offs, and finally adopting strategic design decisions that optimize the degree to which each quality concern is satisfied. Although designing and implementing a good architecture is necessary, it is not usually enough. Even a good architecture can deteriorate in subsequent releases and then fail to address the concerns for which it was initially designed. In this work, we present a novel traceability approach for automating the construction of traceability links for architectural tactics and utilizing those links to implement a change impact analysis infrastructure that mitigates the problem of architecture degradation. Our approach utilizes machine learning methods to detect tactic-related classes. The detected tactic-related classes are then mapped to a Tactic Traceability Pattern. We train our trace algorithm using code extracted from fifty performance-centric and safety-critical open source software systems and then evaluate it against a real case study.
@InProceedings{ASE13p762,
author = {Mehdi Mirakhorli},
title = {Preventing Erosion of Architectural Tactics through Their Strategic Implementation, Preservation, and Visualization},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {762--765},
doi = {},
year = {2013},
}