ESEC/FSE 2013 – Author Index |
Contents -
Abstracts -
Authors
|
A B C D E F G H J K L M N O P Q R S T V W X Y Z
Acher, Mathieu |
ESEC/FSE '13: "Feature Model Extraction from ..."
Feature Model Extraction from Large Collections of Informal Product Descriptions
Jean-Marc Davril, Edouard Delfosse, Negar Hariri, Mathieu Acher, Jane Cleland-Huang, and Patrick Heymans (University of Namur, Belgium; DePaul University, USA; University of Rennes I, France; INRIA, France) Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia. @InProceedings{ESEC/FSE13p290, author = {Jean-Marc Davril and Edouard Delfosse and Negar Hariri and Mathieu Acher and Jane Cleland-Huang and Patrick Heymans}, title = {Feature Model Extraction from Large Collections of Informal Product Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {290--300}, doi = {}, year = {2013}, } |
|
Amornborvornwong, Sorawit |
ESEC/FSE '13: "Improving Trace Accuracy through ..."
Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features
Sugandha Lohar, Sorawit Amornborvornwong, Andrea Zisman, and Jane Cleland-Huang (DePaul University, USA; Open University, UK) Software traceability is a sought-after, yet often elusive quality in large software-intensive systems primarily because the cost and effort of tracing can be overwhelming. State-of-the art solutions address this problem through utilizing trace retrieval techniques to automate the process of creating and maintaining trace links. However, there is no simple one- size-fits all solution to trace retrieval. As this paper will show, finding the right combination of tracing techniques can lead to significant improvements in the quality of generated links. We present a novel approach to trace retrieval in which the underlying infrastructure is configured at runtime to optimize trace quality. We utilize a machine-learning approach to search for the best configuration given an initial training set of validated trace links, a set of available tracing techniques specified in a feature model, and an architecture capable of instantiating all valid configurations of features. We evaluate our approach through a series of experiments using project data from the transportation, healthcare, and space exploration domains, and discuss its implementation in an industrial environment. Finally, we show how our approach can create a robust baseline against which new tracing techniques can be evaluated. @InProceedings{ESEC/FSE13p378, author = {Sugandha Lohar and Sorawit Amornborvornwong and Andrea Zisman and Jane Cleland-Huang}, title = {Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {378--388}, doi = {}, year = {2013}, } |
|
Apel, Sven |
ESEC/FSE '13: "Scalable Analysis of Variable ..."
Scalable Analysis of Variable Software
Jörg Liebig, Alexander von Rhein, Christian Kästner , Sven Apel, Jens Dörre, and Christian Lengauer (University of Passau, Germany; CMU, USA) The advent of variability management and generator technology enables users to derive individual variants from a variable code base based on a selection of desired configuration options. This approach gives rise to the generation of possibly billions of variants that, however, cannot be efficiently analyzed for errors with classic analysis techniques. To address this issue, researchers and practitioners usually apply sampling heuristics. While sampling reduces the analysis effort significantly, the information obtained is necessarily incomplete and it is unknown whether sampling heuristics scale to billions of variants. Recently, researchers have begun to develop variability-aware analyses that analyze the variable code base directly exploiting the similarities among individual variants to reduce analysis effort. However, while being promising, so far, variability-aware analyses have been applied mostly only to small academic systems. To learn about the mutual strengths and weaknesses of variability-aware and sampling-based analyses of software systems, we compared the two strategies by means of two concrete analysis implementations (type checking and liveness analysis), applied them to three subject systems: Busybox, the x86 Linux kernel, and OpenSSL. Our key finding is that variability-aware analysis outperforms most sampling heuristics with respect to analysis time while preserving completeness. @InProceedings{ESEC/FSE13p81, author = {Jörg Liebig and Alexander von Rhein and Christian Kästner and Sven Apel and Jens Dörre and Christian Lengauer}, title = {Scalable Analysis of Variable Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {81--91}, doi = {}, year = {2013}, } Info |
|
Barros, Paulo |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
Batory, Don |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
Bavota, Gabriele |
ESEC/FSE '13: "API Change and Fault Proneness: ..."
API Change and Fault Proneness: A Threat to the Success of Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto , and Denys Poshyvanyk (College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy) During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criteria: users easily get frustrated by repeated failures, crashes, and other bugs; hence, abandoning some apps in favor of others. This paper reports a study analyzing how the fault- and change-proneness of APIs used by 7,097 (free) Android apps relates to applications' lack of success, estimated from user ratings. Results of this study provide important insights into a crucial issue: making heavy use of fault- and change-prone APIs can negatively impact the success of these apps. @InProceedings{ESEC/FSE13p477, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Bernal-Cárdenas and Massimiliano Di Penta and Rocco Oliveto and Denys Poshyvanyk}, title = {API Change and Fault Proneness: A Threat to the Success of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {477--487}, doi = {}, year = {2013}, } Info |
|
Bernal-Cárdenas, Carlos |
ESEC/FSE '13: "API Change and Fault Proneness: ..."
API Change and Fault Proneness: A Threat to the Success of Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto , and Denys Poshyvanyk (College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy) During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criteria: users easily get frustrated by repeated failures, crashes, and other bugs; hence, abandoning some apps in favor of others. This paper reports a study analyzing how the fault- and change-proneness of APIs used by 7,097 (free) Android apps relates to applications' lack of success, estimated from user ratings. Results of this study provide important insights into a crucial issue: making heavy use of fault- and change-prone APIs can negatively impact the success of these apps. @InProceedings{ESEC/FSE13p477, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Bernal-Cárdenas and Massimiliano Di Penta and Rocco Oliveto and Denys Poshyvanyk}, title = {API Change and Fault Proneness: A Threat to the Success of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {477--487}, doi = {}, year = {2013}, } Info |
|
Bertolino, Antonia |
ESEC/FSE '13: "Adequate Monitoring of Service ..."
Adequate Monitoring of Service Compositions
Antonia Bertolino, Eda Marchetti, and Andrea Morichetta (ISTI-CNR, Italy) Monitoring is essential to validate the runtime behaviour of dynamic distributed systems. However, monitors can inform of relevant events as they occur, but by their very nature they will not report about all those events that are not happening. In service-oriented applications it would be desirable to have means to assess the thoroughness of the interactions among the services that are being monitored. In case some events or message sequences or interaction patterns have not been observed for a while, in fact, one could timely check whether this happens because something is going wrong. In this paper, we introduce the novel notion of monitoring adequacy, which is generic and can be defined on different entities. We then define two adequacy criteria for service compositions and implement a proof-of-concept adequate monitoring framework. We validate the approach on two case studies, the Travel Reservation System and the Future Market choreographies. @InProceedings{ESEC/FSE13p59, author = {Antonia Bertolino and Eda Marchetti and Andrea Morichetta}, title = {Adequate Monitoring of Service Compositions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {59--69}, doi = {}, year = {2013}, } |
|
Beyer, Dirk |
ESEC/FSE '13: "Precision Reuse for Efficient ..."
Precision Reuse for Efficient Regression Verification
Dirk Beyer , Stefan Löwe, Evgeny Novikov, Andreas Stahlbauer, and Philipp Wendler (University of Passau, Germany; ISP RAS, Russia) Continuous testing during development is a well-established technique for software-quality assurance. Continuous model checking from revision to revision is not yet established as a standard practice, because the enormous resource consumption makes its application impractical. Model checkers compute a large number of verification facts that are necessary for verifying if a given specification holds. We have identified a category of such intermediate results that are easy to store and efficient to reuse: abstraction precisions. The precision of an abstract domain specifies the level of abstraction that the analysis works on. Precisions are thus a precious result of the verification effort and it is a waste of resources to throw them away after each verification run. In particular, precisions are reasonably small and thus easy to store; they are easy to process and have a large impact on resource consumption. We experimentally show the impact of precision reuse on industrial verification problems created from 62 Linux kernel device drivers with 1119 revisions. @InProceedings{ESEC/FSE13p389, author = {Dirk Beyer and Stefan Löwe and Evgeny Novikov and Andreas Stahlbauer and Philipp Wendler}, title = {Precision Reuse for Efficient Regression Verification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {389--399}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Bird, Christian |
ESEC/FSE '13: "Diversity in Software Engineering ..."
Diversity in Software Engineering Research
Meiyappan Nagappan, Thomas Zimmermann , and Christian Bird (Queen’s University, Canada; Microsoft Research, USA) One of the goals of software engineering research is to achieve generality: Are the phenomena found in a few projects reflective of others? Will a technique perform as well on projects other than the projects it is evaluated on? While it is common sense to select a sample that is representative of a population, the importance of diversity is often overlooked, yet as important. In this paper, we combine ideas from representativeness and diversity and introduce a measure called sample coverage, defined as the percentage of projects in a population that are similar to the given sample. We introduce algorithms to compute the sample coverage for a given set of projects and to select the projects that increase the coverage the most. We demonstrate our technique on research presented over the span of two years at ICSE and FSE with respect to a population of 20,000 active open source projects monitored by Ohloh.net. Knowing the coverage of a sample enhances our ability to reason about the findings of a study. Furthermore, we propose reporting guidelines for research: in addition to coverage scores, papers should discuss the target population of the research (universe) and dimensions that potentially can influence the outcomes of a research (space). @InProceedings{ESEC/FSE13p466, author = {Meiyappan Nagappan and Thomas Zimmermann and Christian Bird}, title = {Diversity in Software Engineering Research}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {466--476}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation ESEC/FSE '13: "Convergent Contemporary Software ..." Convergent Contemporary Software Peer Review Practices Peter C. Rigby and Christian Bird (Concordia University, Canada; Microsoft Research, USA) Software peer review is practiced on a diverse set of software projects that have drastically different settings, cultures, incentive systems, and time pressures. In an effort to characterize and understand these differences we examine two Google-led projects, Android and Chromium OS, three Microsoft projects, Bing, Office, and MS SQL, and projects internal to AMD. We contrast our findings with data taken from traditional software inspection conducted on a Lucent project and from open source software peer review on six projects, including Apache, Linux, and KDE. Our measures of interest include the review interval, the number of developers involved in review, and proxy measures for the number of defects found during review. We find that despite differences among projects, many of the characteristics of the review process have independently converged to similar values which we think indicate general principles of code review practice. We also introduce a measure of the degree to which knowledge is shared during review. This is an aspect of review practice that has traditionally only had experiential support. Our knowledge sharing measure shows that conducting peer review increases the number of distinct files a developer knows about by 66% to 150% depending on the project. This paper is one of the first studies of contemporary review in software firms and the most diverse study of peer review to date. @InProceedings{ESEC/FSE13p202, author = {Peter C. Rigby and Christian Bird}, title = {Convergent Contemporary Software Peer Review Practices}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--212}, doi = {}, year = {2013}, } |
|
Blincoe, Kelly |
ESEC/FSE '13: "Do All Task Dependencies Require ..."
Do All Task Dependencies Require Coordination? The Role of Task Properties in Identifying Critical Coordination Needs in Software Projects
Kelly Blincoe, Giuseppe Valetto, and Daniela Damian (Drexel University, USA; University of Victoria, Canada) Several methods exist to detect the coordination needs within software teams. Evidence exists that developers’ awareness about coordination needs improves work performance. Distinguishing with certainty between critical and trivial coordination needs and identifying and prioritizing which specific tasks a pair of developers should coordinate about remains an open problem. We investigate what work dependencies should be considered when establishing coordination needs within a development team. We use our conceptualization of work dependencies named Proximity and leverage machine learning techniques to analyze what additional task properties are indicative of coordination needs. In a case study of the Mylyn project, we were able to identify from all potential coordination requirements a subset of 17% that are most critical. We define critical coordination requirements as those that can cause the most disruption to task duration when left unmanaged. These results imply that coordination awareness tools could be enhanced to make developers aware of only the coordination needs that can bring about the highest performance benefit. @InProceedings{ESEC/FSE13p213, author = {Kelly Blincoe and Giuseppe Valetto and Daniela Damian}, title = {Do All Task Dependencies Require Coordination? The Role of Task Properties in Identifying Critical Coordination Needs in Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {213--223}, doi = {}, year = {2013}, } |
|
Böhme, Marcel |
ESEC/FSE '13: "Regression Tests to Expose ..."
Regression Tests to Expose Change Interaction Errors
Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury (National University of Singapore, Singapore) Changes often introduce program errors, and hence recent software testing literature has focused on generating tests which stress changes. In this paper, we argue that changes cannot be treated as isolated program artifacts which are stressed via testing. Instead, it is the complex dependency across multiple changes which introduce subtle errors. Furthermore, the complex dependence structures, that need to be exercised to expose such errors, ensure that they remain undiscovered even in well tested and deployed software. We motivate our work based on empirical evidence from a well tested and stable project - Linux GNU Coreutils - where we found that one third of the regressions take more than two (2) years to be fixed, and that two thirds of such long-standing regressions are introduced due to change interactions for the utilities we investigated. To combat change interaction errors, we first define a notion of change interaction where several program changes are found to affect the result of a program statement via program dependencies. Based on this notion, we propose a change sequence graph (CSG) to summarize the control-flow and dependencies across changes. The CSG is then used as a guide during program path exploration via symbolic execution - thereby efficiently producing test cases which witness change interaction errors. Our experimental infra-structure was deployed on various utilities of GNU Coreutils, which have been distributed with Linux for almost twenty years. Apart from finding five (5) previously unknown errors in the utilities, we found that only one in five generated test cases exercises a sequence that is critical to exposing a change-interaction error, while being an order of magnitude more likely to expose an error. On the other hand, stressing changes in isolation only exposed half of the change interaction errors. These results demonstrate the importance and difficulty of change dependence-aware regression testing. @InProceedings{ESEC/FSE13p334, author = {Marcel Böhme and Bruno C. d. S. Oliveira and Abhik Roychoudhury}, title = {Regression Tests to Expose Change Interaction Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {334--344}, doi = {}, year = {2013}, } |
|
Borgström, Johannes |
ESEC/FSE '13: "Bayesian Inference using Data ..."
Bayesian Inference using Data Flow Analysis
Guillaume Claret, Sriram K. Rajamani , Aditya V. Nori, Andrew D. Gordon, and Johannes Borgström (INRIA, France; Microsoft Research, India; Microsoft Research, UK; Uppsala University, Sweden) We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform inference directly on probabilistic programs with loops. Even for loop-free programs, we show that data flow analysis offers better precision and better performance benefits over existing techniques. We also describe heuristics that are crucial for our inference to scale, and present an empirical evaluation of our algorithm over a range of benchmarks. @InProceedings{ESEC/FSE13p92, author = {Guillaume Claret and Sriram K. Rajamani and Aditya V. Nori and Andrew D. Gordon and Johannes Borgström}, title = {Bayesian Inference using Data Flow Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {92--102}, doi = {}, year = {2013}, } |
|
Braione, Pietro |
ESEC/FSE '13: "Enhancing Symbolic Execution ..."
Enhancing Symbolic Execution with Built-In Term Rewriting and Constrained Lazy Initialization
Pietro Braione, Giovanni Denaro, and Mauro Pezzè (University of Milano-Bicocca, Italy; University of Lugano, Switzerland) Symbolic execution suffers from problems when analyzing programs that handle complex data structures as their inputs and take decisions over non-linear expressions. For these programs, symbolic execution may incur invalid inputs or unidentified infeasible traces, and may raise large amounts of false alarms. Some symbolic executors tackle these problems by introducing executable preconditions to exclude invalid inputs, and some solvers exploit rewrite rules to address non linear problems. In this paper, we discuss the core limitations of executable preconditions, and address these limitations by proposing invariants specifically designed to harmonize with the lazy initialization algorithm. We exploit rewrite rules applied within the symbolic executor, to address simplifications of inverse relationships fostered from either program-specific calculations or the logic of the verification tasks. We present a symbolic executor that integrates the two techniques, and validate our approach against the verification of a relevant set of properties of the Tactical Separation Assisted Flight Environment. The empirical data show that the integrated approach can improve the effectiveness of symbolic execution. @InProceedings{ESEC/FSE13p411, author = {Pietro Braione and Giovanni Denaro and Mauro Pezzè}, title = {Enhancing Symbolic Execution with Built-In Term Rewriting and Constrained Lazy Initialization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {411--421}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Brenner, Christian |
ESEC/FSE '13: "Incrementally Synthesizing ..."
Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications
Joel Greenyer, Christian Brenner, Maxime Cordy, Patrick Heymans, and Erika Gressi (Leibniz Universität Hannover, Germany; University of Paderborn, Germany; University of Namur, Belgium; Politecnico di Milano, Italy) Many software-intensive systems consist of components that interact to fulfill complex functionality. Moreover, often many variants of such systems have to be designed at once. This adds complexity to the design task. Recently, we proposed a scenario-based approach to design product lines, which combines feature diagrams and Modal Sequence Diagrams. We proposed a consistency-checking technique based on a dedicated product line model checker. One limitation of this technique is that it is incomplete, i.e., it may fail to show the consistency of some consistent specifications. In this paper we propose a new game-based approach that overcomes this incompleteness and, in addition, automatically synthesizes controllers for the consistent product specifications. We exploit the fact that many variants are similar and efficiently synthesize product controllers incrementally. We provide a prototype tool and evaluate the efficiency of the approach. @InProceedings{ESEC/FSE13p433, author = {Joel Greenyer and Christian Brenner and Maxime Cordy and Patrick Heymans and Erika Gressi}, title = {Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {433--443}, doi = {}, year = {2013}, } |
|
Brun, Yuriy |
ESEC/FSE '13: "Making Offline Analyses Continuous ..."
Making Offline Analyses Continuous
Kıvanç Muşlu, Yuriy Brun , Michael D. Ernst , and David Notkin (University of Washington, USA; University of Massachusetts at Amherst, USA) Developers use analysis tools to help write, debug, and understand software systems under development. A developer's change to the system source code may affect analysis results. Typically, to learn those effects, the developer must explicitly initiate the analysis. This may interrupt the developer's workflow and/or the delay until the developer learns the implications of the change. The situation is even worse for impure analyses — ones that modify the code on which it runs — because such analyses block the developer from working on the code. This paper presents Codebase Replication, a novel approach to easily convert an offline analysis — even an impure one — into a continuous analysis that informs the developer of the implications of recent changes as quickly as possible after the change is made. Codebase Replication copies the developer's codebase, incrementally keeps this copy codebase in sync with the developer's codebase, makes that copy codebase available for offline analyses to run without disturbing the developer and without the developer's changes disturbing the analyses, and makes analysis results available to be presented to the developer. We have implemented Codebase Replication in Solstice, an open-source, publicly-available Eclipse plug-in. We have used Solstice to convert three offline analyses — FindBugs, PMD, and unit testing — into continuous ones. Each conversion required on average 436 NCSL and took, on average, 18 hours. Solstice-based analyses experience no more than 2.5 milliseconds of runtime overhead per developer action. @InProceedings{ESEC/FSE13p323, author = {Kıvanç Muşlu and Yuriy Brun and Michael D. Ernst and David Notkin}, title = {Making Offline Analyses Continuous}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {323--333}, doi = {}, year = {2013}, } |
|
Brutch, Tasneem |
ESEC/FSE '13: "Jalangi: A Selective Record-Replay ..."
Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript
Koushik Sen, Swaroop Kalasapur, Tasneem Brutch, and Simon Gibbs (UC Berkeley, USA; Samsung Research, USA) JavaScript is widely used for writing client-side web applications and is getting increasingly popular for writing mobile applications. However, unlike C, C++, and Java, there are not that many tools available for analysis and testing of JavaScript applications. In this paper, we present a simple yet powerful framework, called Jalangi, for writing heavy-weight dynamic analyses. Our framework incorporates two key techniques: 1) selective record-replay, a technique which enables to record and to faithfully replay a user-selected part of the program, and 2) shadow values and shadow execution, which enables easy implementation of heavy-weight dynamic analyses. Our implementation makes no special assumption about JavaScript, which makes it applicable to real-world JavaScript programs running on multiple platforms. We have implemented concolic testing, an analysis to track origins of nulls and undefined, a simple form of taint analysis, an analysis to detect likely type inconsistencies, and an object allocation profiler in Jalangi. Our evaluation of Jalangi on the SunSpider benchmark suite and on five web applications shows that Jalangi has an average slowdown of 26X during recording and 30X slowdown during replay and analysis. The slowdowns are comparable with slowdowns reported for similar tools, such as PIN and Valgrind for x86 binaries. We believe that the techniques proposed in this paper are applicable to other dynamic languages. @InProceedings{ESEC/FSE13p488, author = {Koushik Sen and Swaroop Kalasapur and Tasneem Brutch and Simon Gibbs}, title = {Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {488--498}, doi = {}, year = {2013}, } |
|
Buy, Ugo |
ESEC/FSE '13: "Preventing Database Deadlocks ..."
Preventing Database Deadlocks in Applications
Mark Grechanik, B. M. Mainul Hossain, Ugo Buy, and Haisheng Wang (University of Illinois at Chicago, USA; Oracle, USA) Many organizations deploy applications that use databases by sending Structured Query Language (SQL) statements to them and obtaining data that result from the execution of these statements. Since applications often share the same databases concurrently, database deadlocks routinely occur in these databases resulting in major performance degradation in these applications. Database engines do not prevent database deadlocks for the same reason that the schedulers of operating system kernels do not preempt processes in a way to avoid race conditions and deadlocks - it is not feasible to find an optimal context switching schedule quickly for multiple processes (and SQL statements), and the overhead of doing it is prohibitive. We created a novel approach that combines run-time monitoring, which automatically prevents database deadlocks, with static analysis, which detects hold-and-wait cycles that specify how resources (e.g., database tables) are held in contention during executions of SQL statements. We rigorously evaluated our approach. For a realistic case of over 1,200 SQL statements, our algorithm detects all hold-and-wait cycles in less than two seconds. We built a toolset and experimented with three applications. Our tool prevented all existing database deadlocks in these applications and increased their throughputs by up to three orders of magnitude. @InProceedings{ESEC/FSE13p356, author = {Mark Grechanik and B. M. Mainul Hossain and Ugo Buy and Haisheng Wang}, title = {Preventing Database Deadlocks in Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {356--366}, doi = {}, year = {2013}, } Info |
|
Cadar, Cristian |
ESEC/FSE '13: "KATCH: High-Coverage Testing ..."
KATCH: High-Coverage Testing of Software Patches
Paul Dan Marinescu and Cristian Cadar (Imperial College London, UK) One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or affect its security. In this paper, we address this problem using a technique for automatically testing code patches. Our technique combines symbolic execution with several novel heuristics based on static and dynamic program analysis which allow it to quickly reach the code of the patch. We have implemented our approach in a tool called KATCH, which we have applied to all the patches written in a combined period of approximately six years for nineteen mature programs from the popular GNU diffutils, GNU binutils and GNU findutils utility suites, which are shipped with virtually all UNIX-based distributions. Our results show that KATCH can automatically synthesise inputs that significantly increase the patch coverage achieved by the existing manual test suites, and find bugs at the moment they are introduced. @InProceedings{ESEC/FSE13p235, author = {Paul Dan Marinescu and Cristian Cadar}, title = {KATCH: High-Coverage Testing of Software Patches}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {235--245}, doi = {}, year = {2013}, } Info Distinguished Artifact |
|
Chechik, Marsha |
ESEC/FSE '13: "N-Way Model Merging ..."
N-Way Model Merging
Julia Rubin and Marsha Chechik (IBM Research, Israel; University of Toronto, Canada) Model merging is widely recognized as an essential step in a variety of software development activities. During the process of combining a set of related products into a product line or consolidating model views of multiple stakeholders, we need to merge multiple input models into one; yet, most of the existing approaches are applicable to merging only two models. In this paper, we define the n-way merge problem. We show that it can be reduced to the known and widely studied NP-hard problem of weighted set packing. Yet, the approximation solutions for that problem do not scale for real-sized software models. We thus evaluate alternative approaches of merging models that incrementally process input models in small subsets and propose our own algorithm that considerably improves precision over such approaches without sacrificing performance. @InProceedings{ESEC/FSE13p301, author = {Julia Rubin and Marsha Chechik}, title = {N-Way Model Merging}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {301--311}, doi = {}, year = {2013}, } |
|
Claret, Guillaume |
ESEC/FSE '13: "Bayesian Inference using Data ..."
Bayesian Inference using Data Flow Analysis
Guillaume Claret, Sriram K. Rajamani , Aditya V. Nori, Andrew D. Gordon, and Johannes Borgström (INRIA, France; Microsoft Research, India; Microsoft Research, UK; Uppsala University, Sweden) We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform inference directly on probabilistic programs with loops. Even for loop-free programs, we show that data flow analysis offers better precision and better performance benefits over existing techniques. We also describe heuristics that are crucial for our inference to scale, and present an empirical evaluation of our algorithm over a range of benchmarks. @InProceedings{ESEC/FSE13p92, author = {Guillaume Claret and Sriram K. Rajamani and Aditya V. Nori and Andrew D. Gordon and Johannes Borgström}, title = {Bayesian Inference using Data Flow Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {92--102}, doi = {}, year = {2013}, } |
|
Cleland-Huang, Jane |
ESEC/FSE '13: "Improving Trace Accuracy through ..."
Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features
Sugandha Lohar, Sorawit Amornborvornwong, Andrea Zisman, and Jane Cleland-Huang (DePaul University, USA; Open University, UK) Software traceability is a sought-after, yet often elusive quality in large software-intensive systems primarily because the cost and effort of tracing can be overwhelming. State-of-the art solutions address this problem through utilizing trace retrieval techniques to automate the process of creating and maintaining trace links. However, there is no simple one- size-fits all solution to trace retrieval. As this paper will show, finding the right combination of tracing techniques can lead to significant improvements in the quality of generated links. We present a novel approach to trace retrieval in which the underlying infrastructure is configured at runtime to optimize trace quality. We utilize a machine-learning approach to search for the best configuration given an initial training set of validated trace links, a set of available tracing techniques specified in a feature model, and an architecture capable of instantiating all valid configurations of features. We evaluate our approach through a series of experiments using project data from the transportation, healthcare, and space exploration domains, and discuss its implementation in an industrial environment. Finally, we show how our approach can create a robust baseline against which new tracing techniques can be evaluated. @InProceedings{ESEC/FSE13p378, author = {Sugandha Lohar and Sorawit Amornborvornwong and Andrea Zisman and Jane Cleland-Huang}, title = {Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {378--388}, doi = {}, year = {2013}, } ESEC/FSE '13: "Feature Model Extraction from ..." Feature Model Extraction from Large Collections of Informal Product Descriptions Jean-Marc Davril, Edouard Delfosse, Negar Hariri, Mathieu Acher, Jane Cleland-Huang, and Patrick Heymans (University of Namur, Belgium; DePaul University, USA; University of Rennes I, France; INRIA, France) Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia. @InProceedings{ESEC/FSE13p290, author = {Jean-Marc Davril and Edouard Delfosse and Negar Hariri and Mathieu Acher and Jane Cleland-Huang and Patrick Heymans}, title = {Feature Model Extraction from Large Collections of Informal Product Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {290--300}, doi = {}, year = {2013}, } |
|
Cohen, Myra B. |
ESEC/FSE '13: "Efficiency and Early Fault ..."
Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing
Justyna Petke , Shin Yoo, Myra B. Cohen, and Mark Harman (University College London, UK; University of Nebraska-Lincoln, USA) Combinatorial Interaction Testing (CIT) is important because it tests the interactions between the many features and parameters that make up the configuration space of software systems. However, in order to be practically applicable, it must be able to cater for soft and hard real-world constraints and should, ideally, report a test priority order that maximises earliest fault detection. We show that we can achieve the highest strength CIT in 5.65 minutes on average. This was previously thought to be too computationally expensive to be feasible. Furthermore, we show that higher strength suites find more faults, while prioritisations using lower strengths are no worse at achieving early fault revelation. @InProceedings{ESEC/FSE13p26, author = {Justyna Petke and Shin Yoo and Myra B. Cohen and Mark Harman}, title = {Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {26--36}, doi = {}, year = {2013}, } Info |
|
Cordy, Maxime |
ESEC/FSE '13: "Incrementally Synthesizing ..."
Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications
Joel Greenyer, Christian Brenner, Maxime Cordy, Patrick Heymans, and Erika Gressi (Leibniz Universität Hannover, Germany; University of Paderborn, Germany; University of Namur, Belgium; Politecnico di Milano, Italy) Many software-intensive systems consist of components that interact to fulfill complex functionality. Moreover, often many variants of such systems have to be designed at once. This adds complexity to the design task. Recently, we proposed a scenario-based approach to design product lines, which combines feature diagrams and Modal Sequence Diagrams. We proposed a consistency-checking technique based on a dedicated product line model checker. One limitation of this technique is that it is incomplete, i.e., it may fail to show the consistency of some consistent specifications. In this paper we propose a new game-based approach that overcomes this incompleteness and, in addition, automatically synthesizes controllers for the consistent product specifications. We exploit the fact that many variants are similar and efficiently synthesize product controllers incrementally. We provide a prototype tool and evaluate the efficiency of the approach. @InProceedings{ESEC/FSE13p433, author = {Joel Greenyer and Christian Brenner and Maxime Cordy and Patrick Heymans and Erika Gressi}, title = {Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {433--443}, doi = {}, year = {2013}, } |
|
Damian, Daniela |
ESEC/FSE '13: "Do All Task Dependencies Require ..."
Do All Task Dependencies Require Coordination? The Role of Task Properties in Identifying Critical Coordination Needs in Software Projects
Kelly Blincoe, Giuseppe Valetto, and Daniela Damian (Drexel University, USA; University of Victoria, Canada) Several methods exist to detect the coordination needs within software teams. Evidence exists that developers’ awareness about coordination needs improves work performance. Distinguishing with certainty between critical and trivial coordination needs and identifying and prioritizing which specific tasks a pair of developers should coordinate about remains an open problem. We investigate what work dependencies should be considered when establishing coordination needs within a development team. We use our conceptualization of work dependencies named Proximity and leverage machine learning techniques to analyze what additional task properties are indicative of coordination needs. In a case study of the Mylyn project, we were able to identify from all potential coordination requirements a subset of 17% that are most critical. We define critical coordination requirements as those that can cause the most disruption to task duration when left unmanaged. These results imply that coordination awareness tools could be enhanced to make developers aware of only the coordination needs that can bring about the highest performance benefit. @InProceedings{ESEC/FSE13p213, author = {Kelly Blincoe and Giuseppe Valetto and Daniela Damian}, title = {Do All Task Dependencies Require Coordination? The Role of Task Properties in Identifying Critical Coordination Needs in Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {213--223}, doi = {}, year = {2013}, } |
|
D'Amorim, Marcelo |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
Davril, Jean-Marc |
ESEC/FSE '13: "Feature Model Extraction from ..."
Feature Model Extraction from Large Collections of Informal Product Descriptions
Jean-Marc Davril, Edouard Delfosse, Negar Hariri, Mathieu Acher, Jane Cleland-Huang, and Patrick Heymans (University of Namur, Belgium; DePaul University, USA; University of Rennes I, France; INRIA, France) Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia. @InProceedings{ESEC/FSE13p290, author = {Jean-Marc Davril and Edouard Delfosse and Negar Hariri and Mathieu Acher and Jane Cleland-Huang and Patrick Heymans}, title = {Feature Model Extraction from Large Collections of Informal Product Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {290--300}, doi = {}, year = {2013}, } |
|
Delac, Goran |
ESEC/FSE '13: "Prediction of Atomic Web Services ..."
Prediction of Atomic Web Services Reliability Based on K-Means Clustering
Marin Silic, Goran Delac, and Sinisa Srbljic (University of Zagreb, Croatia) Contemporary web applications are often designed as composite services built by coordinating atomic services with the aim of providing the appropriate functionality. Although functional properties of each atomic service assure correct functionality of the entire application, nonfunctional properties such as availability, reliability, or security might significantly influence the user-perceived quality of the application. In this paper, we present CLUS, a model for reliability prediction of atomic web services that improves state-of-the-art approaches used in modern recommendation systems. CLUS predicts the reliability for the ongoing service invocation using the data collected from previous invocations. We improve the accuracy of the current state-of-the-art prediction models by considering user-, service- and environment-specific parameters of the invocation context. To address the computational performance related to scalability issues, we aggregate the available previous invocation data using K-means clustering algorithm. We evaluated our model by conducting experiments on services deployed in different regions of the Amazon cloud. The evaluation results suggest that our model improves both performance and accuracy of the prediction when compared to the current state-of-the-art models. @InProceedings{ESEC/FSE13p70, author = {Marin Silic and Goran Delac and Sinisa Srbljic}, title = {Prediction of Atomic Web Services Reliability Based on K-Means Clustering}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {70--80}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Delfosse, Edouard |
ESEC/FSE '13: "Feature Model Extraction from ..."
Feature Model Extraction from Large Collections of Informal Product Descriptions
Jean-Marc Davril, Edouard Delfosse, Negar Hariri, Mathieu Acher, Jane Cleland-Huang, and Patrick Heymans (University of Namur, Belgium; DePaul University, USA; University of Rennes I, France; INRIA, France) Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia. @InProceedings{ESEC/FSE13p290, author = {Jean-Marc Davril and Edouard Delfosse and Negar Hariri and Mathieu Acher and Jane Cleland-Huang and Patrick Heymans}, title = {Feature Model Extraction from Large Collections of Informal Product Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {290--300}, doi = {}, year = {2013}, } |
|
Denaro, Giovanni |
ESEC/FSE '13: "Enhancing Symbolic Execution ..."
Enhancing Symbolic Execution with Built-In Term Rewriting and Constrained Lazy Initialization
Pietro Braione, Giovanni Denaro, and Mauro Pezzè (University of Milano-Bicocca, Italy; University of Lugano, Switzerland) Symbolic execution suffers from problems when analyzing programs that handle complex data structures as their inputs and take decisions over non-linear expressions. For these programs, symbolic execution may incur invalid inputs or unidentified infeasible traces, and may raise large amounts of false alarms. Some symbolic executors tackle these problems by introducing executable preconditions to exclude invalid inputs, and some solvers exploit rewrite rules to address non linear problems. In this paper, we discuss the core limitations of executable preconditions, and address these limitations by proposing invariants specifically designed to harmonize with the lazy initialization algorithm. We exploit rewrite rules applied within the symbolic executor, to address simplifications of inverse relationships fostered from either program-specific calculations or the logic of the verification tasks. We present a symbolic executor that integrates the two techniques, and validate our approach against the verification of a relevant set of properties of the Tactical Separation Assisted Flight Environment. The empirical data show that the integrated approach can improve the effectiveness of symbolic execution. @InProceedings{ESEC/FSE13p411, author = {Pietro Braione and Giovanni Denaro and Mauro Pezzè}, title = {Enhancing Symbolic Execution with Built-In Term Rewriting and Constrained Lazy Initialization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {411--421}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Detlefs, Dave |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } |
|
Devanbu, Premkumar |
ESEC/FSE '13: "Sample Size vs. Bias in Defect ..."
Sample Size vs. Bias in Defect Prediction
Foyzur Rahman, Daryl Posnett, Israel Herraiz, and Premkumar Devanbu (UC Davis, USA; Universidad Politécnica de Madrid, Spain) Most empirical disciplines promote the reuse and sharing of datasets, as it leads to greater possibility of replication. While this is increasingly the case in Empirical Software Engineering, some of the most popular bug-fix datasets are now known to be biased. This raises two significant concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of the studies based on biased datasets may be suspect. This issue has raised considerable consternation in the ESE literature in recent years. However, there is a confounding factor of these datasets that has not been examined carefully: size. Biased datasets are sampling only some of the data that could be sampled, and doing so in a biased fashion; but biased samples could be smaller, or larger. Smaller data sets in general provide less reliable bases for estimating models, and thus could lead to inferior model performance. In this setting, we ask the question, what affects performance more, bias, or size? We conduct a detailed, large-scale meta-analysis, using simulated datasets sampled with bias from a high-quality dataset which is relatively free of bias. Our results suggest that size always matters just as much bias direction, and in fact much more than bias direction when considering information-retrieval measures such as AUCROC and F-score. This indicates that at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue, and raises further issues to be explored in the future. @InProceedings{ESEC/FSE13p147, author = {Foyzur Rahman and Daryl Posnett and Israel Herraiz and Premkumar Devanbu}, title = {Sample Size vs. Bias in Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {147--157}, doi = {}, year = {2013}, } |
|
Dhoolia, Pankaj |
ESEC/FSE '13: "Distributed Program Tracing ..."
Distributed Program Tracing
Diptikalyan Saha, Pankaj Dhoolia, and Gaurab Paul (IBM Research, India; IIT Kharagpur, India) Dynamic program analysis techniques depend on accurate program traces. Program instrumentation is commonly used to collect these traces, which causes overhead to the program execution. Various techniques have addressed this problem by minimizing the number of probes/witnesses used to collect traces. In this paper, we present a novel distributed trace collection framework wherein, a program is executed multiple times with the same input for different sets of witnesses. The partial traces such obtained are then merged to create the whole program trace. Such divide-and-conquer strategy enables parallel collection of partial traces, thereby reducing the total time of collection. The problem is particularly challenging as arbitrary distribution of witnesses cannot guarantee correct formation of traces. We provide and prove a necessary and sufficient condition for distributing the witnesses which ensures correct formation of trace. Moreover, we describe witness distribution strategies that are suitable for parallel collection. We use the framework to collect traces of field SAP-ABAP programs using breakpoints as witnesses as instrumentation cannot be performed due to practical constraints. To optimize such collection, we extend Ball-Larus' optimal edge-based profiling algorithm to an optimal node-based algorithm. We demonstrate the effectiveness of the framework for collecting traces of SAP-ABAP programs. @InProceedings{ESEC/FSE13p180, author = {Diptikalyan Saha and Pankaj Dhoolia and Gaurab Paul}, title = {Distributed Program Tracing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {180--190}, doi = {}, year = {2013}, } |
|
Dig, Danny |
ESEC/FSE '13: "Crossing the Gap from Imperative ..."
Crossing the Gap from Imperative to Functional Programming through Refactoring
Alex Gyori, Lyle Franklin, Danny Dig, and Jan Lahoda (University of Illinois, USA; Ball State University, USA; Oregon State University, USA; Oracle, Czech Republic) Java 8 introduces two functional features: lambda expressions and functional operations like map or filter that apply a lambda expression over the elements of a Collection. Refactoring existing code to use these new features enables explicit but unobtrusive parallelism and makes the code more succinct. However, refactoring is tedious: it requires changing many lines of code. It is also error-prone: the programmer must reason about the control-, data-flow, and side-effects. Fortunately, refactorings can be automated. We designed and implemented LambdaFicator, a tool which automates two refactorings. The first refactoring converts anonymous inner classes to lambda expressions. The second refactoring converts for loops that iterate over Collections to functional operations that use lambda expressions. Using 9 open-source projects, we have applied these two refactorings 1263 and 1709 times, respectively. The results show that LambdaFicator is useful: (i) it is widely applicable, (ii) it reduces the code bloat, (iii) it increases programmer productivity, and (iv) it is accurate. @InProceedings{ESEC/FSE13p543, author = {Alex Gyori and Lyle Franklin and Danny Dig and Jan Lahoda}, title = {Crossing the Gap from Imperative to Functional Programming through Refactoring}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {543--553}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Di Penta, Massimiliano |
ESEC/FSE '13: "API Change and Fault Proneness: ..."
API Change and Fault Proneness: A Threat to the Success of Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto , and Denys Poshyvanyk (College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy) During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criteria: users easily get frustrated by repeated failures, crashes, and other bugs; hence, abandoning some apps in favor of others. This paper reports a study analyzing how the fault- and change-proneness of APIs used by 7,097 (free) Android apps relates to applications' lack of success, estimated from user ratings. Results of this study provide important insights into a crucial issue: making heavy use of fault- and change-prone APIs can negatively impact the success of these apps. @InProceedings{ESEC/FSE13p477, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Bernal-Cárdenas and Massimiliano Di Penta and Rocco Oliveto and Denys Poshyvanyk}, title = {API Change and Fault Proneness: A Threat to the Success of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {477--487}, doi = {}, year = {2013}, } Info |
|
Dolby, Julian |
ESEC/FSE '13: "Finding Incorrect Compositions ..."
Finding Incorrect Compositions of Atomicity
Peng Liu, Julian Dolby , and Charles Zhang (Hong Kong University of Science and Technology, China; IBM Research, USA) In object-oriented code, atomicity is ideally isolated in a library which encapsulates shared program state and provides atomic APIs for access. The library provides a convenient way for programmers to reason about the needed synchronization. However, as the library exports a limited set of APIs, it cannot satisfy every unplanned atomicity demand; therefore, clients may have to compose invocations of the library APIs to obtain new atomic functionality. This process is error-prone due to the complexity of reasoning required, hence tool support for uncovering incorrect compositions (i.e., atomic compositions that are implemented incorrectly) would be very helpful. A key difficulty is how to determine the intended atomic compositions, which are rarely documented. Existing inference techniques cannot be used to infer the atomic compositions because they cannot recognize the library and the client, which requires understanding the related program state. Even if extended to support the library/client, they lead to many false positives or false negatives because they miss the key program logic which reflects programmers’ coding paradigms for atomic compositions. We define a new inference technique which identifies intended atomic compositions using two key symptoms based on program dependence. We then check dynamically whether these atomic compositions are implemented incorrectly as non-atomic. Evaluation on thirteen large applications shows that our approach finds around 50 previously unknown incorrect compositions. Further study on Tomcat shows that almost half (5 out of 12) of discovered incorrect compositions are confirmed as bugs by the developers. Given that Tomcat is heavily used in 250,000 sites including Linkedin.com and Ebay.com, we believe finding multiple new bugs in it automatically with relatively few false positives supports our heuristics for determining intended atomicity. @InProceedings{ESEC/FSE13p158, author = {Peng Liu and Julian Dolby and Charles Zhang}, title = {Finding Incorrect Compositions of Atomicity}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {158--168}, doi = {}, year = {2013}, } Artifact Accepted for Presentation |
|
Dörre, Jens |
ESEC/FSE '13: "Scalable Analysis of Variable ..."
Scalable Analysis of Variable Software
Jörg Liebig, Alexander von Rhein, Christian Kästner , Sven Apel, Jens Dörre, and Christian Lengauer (University of Passau, Germany; CMU, USA) The advent of variability management and generator technology enables users to derive individual variants from a variable code base based on a selection of desired configuration options. This approach gives rise to the generation of possibly billions of variants that, however, cannot be efficiently analyzed for errors with classic analysis techniques. To address this issue, researchers and practitioners usually apply sampling heuristics. While sampling reduces the analysis effort significantly, the information obtained is necessarily incomplete and it is unknown whether sampling heuristics scale to billions of variants. Recently, researchers have begun to develop variability-aware analyses that analyze the variable code base directly exploiting the similarities among individual variants to reduce analysis effort. However, while being promising, so far, variability-aware analyses have been applied mostly only to small academic systems. To learn about the mutual strengths and weaknesses of variability-aware and sampling-based analyses of software systems, we compared the two strategies by means of two concrete analysis implementations (type checking and liveness analysis), applied them to three subject systems: Busybox, the x86 Linux kernel, and OpenSSL. Our key finding is that variability-aware analysis outperforms most sampling heuristics with respect to analysis time while preserving completeness. @InProceedings{ESEC/FSE13p81, author = {Jörg Liebig and Alexander von Rhein and Christian Kästner and Sven Apel and Jens Dörre and Christian Lengauer}, title = {Scalable Analysis of Variable Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {81--91}, doi = {}, year = {2013}, } Info |
|
Elbaum, Sebastian |
ESEC/FSE '13: "Cascading Verification: An ..."
Cascading Verification: An Integrated Method for Domain-Specific Model Checking
Fokion Zervoudakis, David S. Rosenblum, Sebastian Elbaum, and Anthony Finkelstein (University College London, UK; National University of Singapore, Singapore; University of Nebraska-Lincoln, USA) Model checking is an established method for verifying behavioral properties of system models. But model checkers tend to support low-level modeling languages that require intricate models to represent even the simplest systems. Modeling complexity arises in part from the need to encode domain knowledge at relatively low levels of abstraction. In this paper, we demonstrate that formalized domain knowledge can be reused to raise the abstraction level of model and property specifications, and hence the effectiveness of model checking. We describe a novel method for domain-specific model checking called cascading verification that uses composite reasoning over high-level system specifications and formalized domain knowledge to synthesize both low-level system models and their behavioral properties for verification. In particular, model builders use a high-level domain-specific language (DSL) based on YAML to express system specifications that can be verified with probabilistic model checking. Domain knowledge is encoded in the Web Ontology Language (OWL), the Semantic Web Rule Language (SWRL) and Prolog, which are used in combination to overcome their individual limitations. A compiler then synthesizes models and properties for verification by the probabilistic model checker PRISM. We illustrate cascading verification for the domain of uninhabited aerial vehicles (UAVs), for which we have constructed a prototype implementation. An evaluation of this prototype reveals nontrivial reductions in the size and complexity of input specifications compared to the artifacts synthesized for PRISM. @InProceedings{ESEC/FSE13p400, author = {Fokion Zervoudakis and David S. Rosenblum and Sebastian Elbaum and Anthony Finkelstein}, title = {Cascading Verification: An Integrated Method for Domain-Specific Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {400--410}, doi = {}, year = {2013}, } |
|
Ernst, Michael D. |
ESEC/FSE '13: "Making Offline Analyses Continuous ..."
Making Offline Analyses Continuous
Kıvanç Muşlu, Yuriy Brun , Michael D. Ernst , and David Notkin (University of Washington, USA; University of Massachusetts at Amherst, USA) Developers use analysis tools to help write, debug, and understand software systems under development. A developer's change to the system source code may affect analysis results. Typically, to learn those effects, the developer must explicitly initiate the analysis. This may interrupt the developer's workflow and/or the delay until the developer learns the implications of the change. The situation is even worse for impure analyses — ones that modify the code on which it runs — because such analyses block the developer from working on the code. This paper presents Codebase Replication, a novel approach to easily convert an offline analysis — even an impure one — into a continuous analysis that informs the developer of the implications of recent changes as quickly as possible after the change is made. Codebase Replication copies the developer's codebase, incrementally keeps this copy codebase in sync with the developer's codebase, makes that copy codebase available for offline analyses to run without disturbing the developer and without the developer's changes disturbing the analyses, and makes analysis results available to be presented to the developer. We have implemented Codebase Replication in Solstice, an open-source, publicly-available Eclipse plug-in. We have used Solstice to convert three offline analyses — FindBugs, PMD, and unit testing — into continuous ones. Each conversion required on average 436 NCSL and took, on average, 18 hours. Solstice-based analyses experience no more than 2.5 milliseconds of runtime overhead per developer action. @InProceedings{ESEC/FSE13p323, author = {Kıvanç Muşlu and Yuriy Brun and Michael D. Ernst and David Notkin}, title = {Making Offline Analyses Continuous}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {323--333}, doi = {}, year = {2013}, } |
|
Fanning, Michael |
ESEC/FSE '13: "Practical Static Analysis ..."
Practical Static Analysis of JavaScript Applications in the Presence of Frameworks and Libraries
Magnus Madsen , Benjamin Livshits, and Michael Fanning (Aarhus University, Denmark; Microsoft Research, USA; Microsoft, USA) JavaScript is a language that is widely-used for both web- based and standalone applications such as those in the upcoming Windows 8 operating system. Analysis of JavaScript has long been known to be challenging due to its dynamic nature. On top of that, most JavaScript applications rely on large and complex libraries and frameworks, often written in a combination of JavaScript and native code such as C and C++. Stubs have been commonly employed as a partial specification mechanism to address the library problem; however, they are tedious to write, incomplete, and occasionally incorrect. However, the manner in which library code is used within applications often sheds light on what library APIs return or consume as parameters. In this paper, we propose a technique which combines pointer analysis with use analysis to handle many challenges posed by large JavaScript libraries. Our approach enables a variety of applications, ranging from call graph discovery to auto-complete to supporting runtime optimizations. Our techniques have been implemented and empirically validated on a set of 25 Windows 8 JavaScript applications, averaging 1,587 lines of code, demonstrating a combination of scalability and precision. @InProceedings{ESEC/FSE13p499, author = {Magnus Madsen and Benjamin Livshits and Michael Fanning}, title = {Practical Static Analysis of JavaScript Applications in the Presence of Frameworks and Libraries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {499--509}, doi = {}, year = {2013}, } |
|
Farzan, Azadeh |
ESEC/FSE '13: "Con2colic Testing ..."
Con2colic Testing
Azadeh Farzan , Andreas Holzer, Niloofar Razavi, and Helmut Veith (University of Toronto, Canada; Vienna University of Technology, Austria) In this paper, we describe (con)2colic testing - a systematic testing approach for concurrent software. Based on concrete and symbolic executions of a concurrent program, (con)2colic testing derives inputs and schedules such that the execution space of the program under investigation is systematically explored. We introduce interference scenarios as key concept in (con)2colic testing. Interference scenarios capture the flow of data among different threads and enable a unified representation of path and interference constraints. We have implemented a (con)2colic testing engine and demonstrate the effectiveness of our approach by experiments. @InProceedings{ESEC/FSE13p37, author = {Azadeh Farzan and Andreas Holzer and Niloofar Razavi and Helmut Veith}, title = {Con2colic Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {37--47}, doi = {}, year = {2013}, } |
|
Fernando, Lakshan |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } |
|
Finkelstein, Anthony |
ESEC/FSE '13: "Cascading Verification: An ..."
Cascading Verification: An Integrated Method for Domain-Specific Model Checking
Fokion Zervoudakis, David S. Rosenblum, Sebastian Elbaum, and Anthony Finkelstein (University College London, UK; National University of Singapore, Singapore; University of Nebraska-Lincoln, USA) Model checking is an established method for verifying behavioral properties of system models. But model checkers tend to support low-level modeling languages that require intricate models to represent even the simplest systems. Modeling complexity arises in part from the need to encode domain knowledge at relatively low levels of abstraction. In this paper, we demonstrate that formalized domain knowledge can be reused to raise the abstraction level of model and property specifications, and hence the effectiveness of model checking. We describe a novel method for domain-specific model checking called cascading verification that uses composite reasoning over high-level system specifications and formalized domain knowledge to synthesize both low-level system models and their behavioral properties for verification. In particular, model builders use a high-level domain-specific language (DSL) based on YAML to express system specifications that can be verified with probabilistic model checking. Domain knowledge is encoded in the Web Ontology Language (OWL), the Semantic Web Rule Language (SWRL) and Prolog, which are used in combination to overcome their individual limitations. A compiler then synthesizes models and properties for verification by the probabilistic model checker PRISM. We illustrate cascading verification for the domain of uninhabited aerial vehicles (UAVs), for which we have constructed a prototype implementation. An evaluation of this prototype reveals nontrivial reductions in the size and complexity of input specifications compared to the artifacts synthesized for PRISM. @InProceedings{ESEC/FSE13p400, author = {Fokion Zervoudakis and David S. Rosenblum and Sebastian Elbaum and Anthony Finkelstein}, title = {Cascading Verification: An Integrated Method for Domain-Specific Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {400--410}, doi = {}, year = {2013}, } |
|
Franklin, Lyle |
ESEC/FSE '13: "Crossing the Gap from Imperative ..."
Crossing the Gap from Imperative to Functional Programming through Refactoring
Alex Gyori, Lyle Franklin, Danny Dig, and Jan Lahoda (University of Illinois, USA; Ball State University, USA; Oregon State University, USA; Oracle, Czech Republic) Java 8 introduces two functional features: lambda expressions and functional operations like map or filter that apply a lambda expression over the elements of a Collection. Refactoring existing code to use these new features enables explicit but unobtrusive parallelism and makes the code more succinct. However, refactoring is tedious: it requires changing many lines of code. It is also error-prone: the programmer must reason about the control-, data-flow, and side-effects. Fortunately, refactorings can be automated. We designed and implemented LambdaFicator, a tool which automates two refactorings. The first refactoring converts anonymous inner classes to lambda expressions. The second refactoring converts for loops that iterate over Collections to functional operations that use lambda expressions. Using 9 open-source projects, we have applied these two refactorings 1263 and 1709 times, respectively. The results show that LambdaFicator is useful: (i) it is widely applicable, (ii) it reduces the code bloat, (iii) it increases programmer productivity, and (iv) it is accurate. @InProceedings{ESEC/FSE13p543, author = {Alex Gyori and Lyle Franklin and Danny Dig and Jan Lahoda}, title = {Crossing the Gap from Imperative to Functional Programming through Refactoring}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {543--553}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Ganesh, Vijay |
ESEC/FSE '13: "Z3-str: A Z3-Based String ..."
Z3-str: A Z3-Based String Solver for Web Application Analysis
Yunhui Zheng, Xiangyu Zhang , and Vijay Ganesh (Purdue University, USA; University of Waterloo, Canada) Analyzing web applications requires reasoning about strings and non-strings cohesively. Existing string solvers either ignore non-string program behavior or support limited set of string operations. In this paper, we develop a general purpose string solver, called Z3-str, as an extension of the Z3 SMT solver through its plug-in interface. Z3-str treats strings as a primitive type, thus avoiding the inherent limitations observed in many existing solvers that encode strings in terms of other primitives. The logic of the plug-in has three sorts, namely, bool, int and string. The string-sorted terms include string constants and variables of arbitrary length, with functions such as concatenation, sub-string, and replace. The int-sorted terms are standard, with the exception of the length function over string terms. The atomic formulas are equations over string terms, and (in)-equalities over integer terms. Not only does our solver have features that enable whole program symbolic, static and dynamic analysis, but also it performs better than other solvers in our experiments. The application of Z3-str in remote code execution detection shows that its support of a wide spectrum of string operations is key to reducing false positives. @InProceedings{ESEC/FSE13p114, author = {Yunhui Zheng and Xiangyu Zhang and Vijay Ganesh}, title = {Z3-str: A Z3-Based String Solver for Web Application Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {114--124}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Garcia, Joshua |
ESEC/FSE '13: "Identifying Message Flow in ..."
Identifying Message Flow in Distributed Event-Based Systems
Joshua Garcia, Daniel Popescu, Gholamreza Safi, William G. J. Halfond , and Nenad Medvidovic (University of Southern California, USA) Distributed event-based (DEB) systems contain highly-decoupled components that interact by exchanging messages. This enables flexible system composition and adaptation, but also makes DEB systems difficult to maintain. Most existing program analysis techniques to support maintenance are not well suited to DEB systems, while those that are tend to suffer from inaccuracy or make assumptions that limit their applicability. This paper presents Eos, a static analysis technique that identifies message information useful for maintaining a DEB system, namely, message types and message flow within a system. Eos has been evaluated on six off-the-shelf DEB systems spanning five different middleware platforms, and has exhibited excellent accuracy and efficiency. Furthermore, a case study involving a range of maintenance activities undertaken on three existing DEB systems shows that, on average, Eos enables an engineer to identify the scope and impact of required changes more accurately than existing alternatives. @InProceedings{ESEC/FSE13p367, author = {Joshua Garcia and Daniel Popescu and Gholamreza Safi and William G. J. Halfond and Nenad Medvidovic}, title = {Identifying Message Flow in Distributed Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {367--377}, doi = {}, year = {2013}, } Info |
|
Gibbs, Simon |
ESEC/FSE '13: "Jalangi: A Selective Record-Replay ..."
Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript
Koushik Sen, Swaroop Kalasapur, Tasneem Brutch, and Simon Gibbs (UC Berkeley, USA; Samsung Research, USA) JavaScript is widely used for writing client-side web applications and is getting increasingly popular for writing mobile applications. However, unlike C, C++, and Java, there are not that many tools available for analysis and testing of JavaScript applications. In this paper, we present a simple yet powerful framework, called Jalangi, for writing heavy-weight dynamic analyses. Our framework incorporates two key techniques: 1) selective record-replay, a technique which enables to record and to faithfully replay a user-selected part of the program, and 2) shadow values and shadow execution, which enables easy implementation of heavy-weight dynamic analyses. Our implementation makes no special assumption about JavaScript, which makes it applicable to real-world JavaScript programs running on multiple platforms. We have implemented concolic testing, an analysis to track origins of nulls and undefined, a simple form of taint analysis, an analysis to detect likely type inconsistencies, and an object allocation profiler in Jalangi. Our evaluation of Jalangi on the SunSpider benchmark suite and on five web applications shows that Jalangi has an average slowdown of 26X during recording and 30X slowdown during replay and analysis. The slowdowns are comparable with slowdowns reported for similar tools, such as PIN and Valgrind for x86 binaries. We believe that the techniques proposed in this paper are applicable to other dynamic languages. @InProceedings{ESEC/FSE13p488, author = {Koushik Sen and Swaroop Kalasapur and Tasneem Brutch and Simon Gibbs}, title = {Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {488--498}, doi = {}, year = {2013}, } |
|
Gokbulut, Sedar |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } |
|
Gordon, Andrew D. |
ESEC/FSE '13: "Bayesian Inference using Data ..."
Bayesian Inference using Data Flow Analysis
Guillaume Claret, Sriram K. Rajamani , Aditya V. Nori, Andrew D. Gordon, and Johannes Borgström (INRIA, France; Microsoft Research, India; Microsoft Research, UK; Uppsala University, Sweden) We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform inference directly on probabilistic programs with loops. Even for loop-free programs, we show that data flow analysis offers better precision and better performance benefits over existing techniques. We also describe heuristics that are crucial for our inference to scale, and present an empirical evaluation of our algorithm over a range of benchmarks. @InProceedings{ESEC/FSE13p92, author = {Guillaume Claret and Sriram K. Rajamani and Aditya V. Nori and Andrew D. Gordon and Johannes Borgström}, title = {Bayesian Inference using Data Flow Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {92--102}, doi = {}, year = {2013}, } |
|
Grechanik, Mark |
ESEC/FSE '13: "Preventing Database Deadlocks ..."
Preventing Database Deadlocks in Applications
Mark Grechanik, B. M. Mainul Hossain, Ugo Buy, and Haisheng Wang (University of Illinois at Chicago, USA; Oracle, USA) Many organizations deploy applications that use databases by sending Structured Query Language (SQL) statements to them and obtaining data that result from the execution of these statements. Since applications often share the same databases concurrently, database deadlocks routinely occur in these databases resulting in major performance degradation in these applications. Database engines do not prevent database deadlocks for the same reason that the schedulers of operating system kernels do not preempt processes in a way to avoid race conditions and deadlocks - it is not feasible to find an optimal context switching schedule quickly for multiple processes (and SQL statements), and the overhead of doing it is prohibitive. We created a novel approach that combines run-time monitoring, which automatically prevents database deadlocks, with static analysis, which detects hold-and-wait cycles that specify how resources (e.g., database tables) are held in contention during executions of SQL statements. We rigorously evaluated our approach. For a realistic case of over 1,200 SQL statements, our algorithm detects all hold-and-wait cycles in less than two seconds. We built a toolset and experimented with three applications. Our tool prevented all existing database deadlocks in these applications and increased their throughputs by up to three orders of magnitude. @InProceedings{ESEC/FSE13p356, author = {Mark Grechanik and B. M. Mainul Hossain and Ugo Buy and Haisheng Wang}, title = {Preventing Database Deadlocks in Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {356--366}, doi = {}, year = {2013}, } Info |
|
Greenyer, Joel |
ESEC/FSE '13: "Incrementally Synthesizing ..."
Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications
Joel Greenyer, Christian Brenner, Maxime Cordy, Patrick Heymans, and Erika Gressi (Leibniz Universität Hannover, Germany; University of Paderborn, Germany; University of Namur, Belgium; Politecnico di Milano, Italy) Many software-intensive systems consist of components that interact to fulfill complex functionality. Moreover, often many variants of such systems have to be designed at once. This adds complexity to the design task. Recently, we proposed a scenario-based approach to design product lines, which combines feature diagrams and Modal Sequence Diagrams. We proposed a consistency-checking technique based on a dedicated product line model checker. One limitation of this technique is that it is incomplete, i.e., it may fail to show the consistency of some consistent specifications. In this paper we propose a new game-based approach that overcomes this incompleteness and, in addition, automatically synthesizes controllers for the consistent product specifications. We exploit the fact that many variants are similar and efficiently synthesize product controllers incrementally. We provide a prototype tool and evaluate the efficiency of the approach. @InProceedings{ESEC/FSE13p433, author = {Joel Greenyer and Christian Brenner and Maxime Cordy and Patrick Heymans and Erika Gressi}, title = {Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {433--443}, doi = {}, year = {2013}, } |
|
Gressi, Erika |
ESEC/FSE '13: "Incrementally Synthesizing ..."
Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications
Joel Greenyer, Christian Brenner, Maxime Cordy, Patrick Heymans, and Erika Gressi (Leibniz Universität Hannover, Germany; University of Paderborn, Germany; University of Namur, Belgium; Politecnico di Milano, Italy) Many software-intensive systems consist of components that interact to fulfill complex functionality. Moreover, often many variants of such systems have to be designed at once. This adds complexity to the design task. Recently, we proposed a scenario-based approach to design product lines, which combines feature diagrams and Modal Sequence Diagrams. We proposed a consistency-checking technique based on a dedicated product line model checker. One limitation of this technique is that it is incomplete, i.e., it may fail to show the consistency of some consistent specifications. In this paper we propose a new game-based approach that overcomes this incompleteness and, in addition, automatically synthesizes controllers for the consistent product specifications. We exploit the fact that many variants are similar and efficiently synthesize product controllers incrementally. We provide a prototype tool and evaluate the efficiency of the approach. @InProceedings{ESEC/FSE13p433, author = {Joel Greenyer and Christian Brenner and Maxime Cordy and Patrick Heymans and Erika Gressi}, title = {Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {433--443}, doi = {}, year = {2013}, } |
|
Gros, Charles-Henri |
ESEC/FSE '13: "Scalable and Incremental Software ..."
Scalable and Incremental Software Bug Detection
Scott McPeak, Charles-Henri Gros, and Murali Krishna Ramanathan (Coverity, USA; Indian Institute of Science, India) An important, but often neglected, goal of static analysis for detecting bugs is the ability to show defects to the programmer quickly. Unfortunately, existing static analysis tools scale very poorly, or are shallow and cannot find complex interprocedural defects. Previous attempts at reducing the analysis time by adding more resources (CPU, memory) or by splitting the analysis into multiple sub-analyses based on defect detection capabilities resulted in limited/negligible improvements. We present a technique for parallel and incremental static analysis using top-down, bottom-up and global specification inference based around the concept of a work unit, a self-contained atom of analysis input, that deterministically maps to its output. A work unit contains both abstract and concrete syntax to analyze, a supporting fragment of the class hierarchy, summarized interprocedural behavior, and defect reporting information, factored to ensure a high level of reuse when analyzing successive versions incrementally. Work units are created and consumed by an analysis master process that coordinates the multiple analysis passes, the flow of information among them, and incrementalizes the computation. Meanwhile, multiple analysis worker processes use abstract interpretation to compute work unit results. Process management and interprocess communication is done by a general-purpose computation distributor layer. We have implemented our approach and our experimental results show that using eight processor cores, we can perform complete analysis of code bases with millions of lines of code in a few hours, and even faster after incremental changes to that code. The analysis is thorough and accurate: it usually reports about one crash-causing defect per thousand lines of code, with a false positive rate of 10--20% @InProceedings{ESEC/FSE13p554, author = {Scott McPeak and Charles-Henri Gros and Murali Krishna Ramanathan}, title = {Scalable and Incremental Software Bug Detection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {554--564}, doi = {}, year = {2013}, } |
|
Gyori, Alex |
ESEC/FSE '13: "Crossing the Gap from Imperative ..."
Crossing the Gap from Imperative to Functional Programming through Refactoring
Alex Gyori, Lyle Franklin, Danny Dig, and Jan Lahoda (University of Illinois, USA; Ball State University, USA; Oregon State University, USA; Oracle, Czech Republic) Java 8 introduces two functional features: lambda expressions and functional operations like map or filter that apply a lambda expression over the elements of a Collection. Refactoring existing code to use these new features enables explicit but unobtrusive parallelism and makes the code more succinct. However, refactoring is tedious: it requires changing many lines of code. It is also error-prone: the programmer must reason about the control-, data-flow, and side-effects. Fortunately, refactorings can be automated. We designed and implemented LambdaFicator, a tool which automates two refactorings. The first refactoring converts anonymous inner classes to lambda expressions. The second refactoring converts for loops that iterate over Collections to functional operations that use lambda expressions. Using 9 open-source projects, we have applied these two refactorings 1263 and 1709 times, respectively. The results show that LambdaFicator is useful: (i) it is widely applicable, (ii) it reduces the code bloat, (iii) it increases programmer productivity, and (iv) it is accurate. @InProceedings{ESEC/FSE13p543, author = {Alex Gyori and Lyle Franklin and Danny Dig and Jan Lahoda}, title = {Crossing the Gap from Imperative to Functional Programming through Refactoring}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {543--553}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Halfond, William G. J. |
ESEC/FSE '13: "Identifying Message Flow in ..."
Identifying Message Flow in Distributed Event-Based Systems
Joshua Garcia, Daniel Popescu, Gholamreza Safi, William G. J. Halfond , and Nenad Medvidovic (University of Southern California, USA) Distributed event-based (DEB) systems contain highly-decoupled components that interact by exchanging messages. This enables flexible system composition and adaptation, but also makes DEB systems difficult to maintain. Most existing program analysis techniques to support maintenance are not well suited to DEB systems, while those that are tend to suffer from inaccuracy or make assumptions that limit their applicability. This paper presents Eos, a static analysis technique that identifies message information useful for maintaining a DEB system, namely, message types and message flow within a system. Eos has been evaluated on six off-the-shelf DEB systems spanning five different middleware platforms, and has exhibited excellent accuracy and efficiency. Furthermore, a case study involving a range of maintenance activities undertaken on three existing DEB systems shows that, on average, Eos enables an engineer to identify the scope and impact of required changes more accurately than existing alternatives. @InProceedings{ESEC/FSE13p367, author = {Joshua Garcia and Daniel Popescu and Gholamreza Safi and William G. J. Halfond and Nenad Medvidovic}, title = {Identifying Message Flow in Distributed Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {367--377}, doi = {}, year = {2013}, } Info |
|
Hariri, Negar |
ESEC/FSE '13: "Feature Model Extraction from ..."
Feature Model Extraction from Large Collections of Informal Product Descriptions
Jean-Marc Davril, Edouard Delfosse, Negar Hariri, Mathieu Acher, Jane Cleland-Huang, and Patrick Heymans (University of Namur, Belgium; DePaul University, USA; University of Rennes I, France; INRIA, France) Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia. @InProceedings{ESEC/FSE13p290, author = {Jean-Marc Davril and Edouard Delfosse and Negar Hariri and Mathieu Acher and Jane Cleland-Huang and Patrick Heymans}, title = {Feature Model Extraction from Large Collections of Informal Product Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {290--300}, doi = {}, year = {2013}, } |
|
Harman, Mark |
ESEC/FSE '13: "Efficiency and Early Fault ..."
Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing
Justyna Petke , Shin Yoo, Myra B. Cohen, and Mark Harman (University College London, UK; University of Nebraska-Lincoln, USA) Combinatorial Interaction Testing (CIT) is important because it tests the interactions between the many features and parameters that make up the configuration space of software systems. However, in order to be practically applicable, it must be able to cater for soft and hard real-world constraints and should, ideally, report a test priority order that maximises earliest fault detection. We show that we can achieve the highest strength CIT in 5.65 minutes on average. This was previously thought to be too computationally expensive to be feasible. Furthermore, we show that higher strength suites find more faults, while prioritisations using lower strengths are no worse at achieving early fault revelation. @InProceedings{ESEC/FSE13p26, author = {Justyna Petke and Shin Yoo and Myra B. Cohen and Mark Harman}, title = {Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {26--36}, doi = {}, year = {2013}, } Info ESEC/FSE '13: "Searching for Better Configurations: ..." Searching for Better Configurations: A Rigorous Approach to Clone Evaluation Tiantian Wang, Mark Harman, Yue Jia, and Jens Krinke (Harbin Institute of Technology, China; University College London, UK) Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our evaluation reports the results of 9.3 million total executions of a clone tool; the largest study yet reported. Our approach finds significantly better configurations (p < 0.05) than those currently used, providing evidence that our approach can ameliorate the confounding configuration choice problem. @InProceedings{ESEC/FSE13p455, author = {Tiantian Wang and Mark Harman and Yue Jia and Jens Krinke}, title = {Searching for Better Configurations: A Rigorous Approach to Clone Evaluation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {455--465}, doi = {}, year = {2013}, } Info |
|
Hashmi, Hammad |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } |
|
Hawblitzel, Chris |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } ESEC/FSE '13: "Differential Assertion Checking ..." Differential Assertion Checking Shuvendu K. Lahiri , Kenneth L. McMillan, Rahul Sharma, and Chris Hawblitzel (Microsoft Research, USA; Stanford University, USA) Previous version of a program can be a powerful enabler for program analysis by defining new relative specifications and making the results of current program analysis more relevant. In this paper, we describe the approach of differential assertion checking (DAC) for comparing different versions of a program with respect to a set of assertions. DAC provides a natural way to write relative specifications over two programs. We introduce a novel modular approach to DAC by reducing it to safety checking of a composed program, which can be accomplished by standard program verifiers. In particular, we leverage automatic invariant generation to synthesize relative specifications for pairs of loops and procedures. We provide a preliminary evaluation of a prototype implementation within the SymDiff tool along two directions (a) soundly verifying bug fixes in the presence of loops and (b) providing a knob for suppressing alarms when checking a new version of a program. @InProceedings{ESEC/FSE13p345, author = {Shuvendu K. Lahiri and Kenneth L. McMillan and Rahul Sharma and Chris Hawblitzel}, title = {Differential Assertion Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--355}, doi = {}, year = {2013}, } |
|
Herraiz, Israel |
ESEC/FSE '13: "Sample Size vs. Bias in Defect ..."
Sample Size vs. Bias in Defect Prediction
Foyzur Rahman, Daryl Posnett, Israel Herraiz, and Premkumar Devanbu (UC Davis, USA; Universidad Politécnica de Madrid, Spain) Most empirical disciplines promote the reuse and sharing of datasets, as it leads to greater possibility of replication. While this is increasingly the case in Empirical Software Engineering, some of the most popular bug-fix datasets are now known to be biased. This raises two significant concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of the studies based on biased datasets may be suspect. This issue has raised considerable consternation in the ESE literature in recent years. However, there is a confounding factor of these datasets that has not been examined carefully: size. Biased datasets are sampling only some of the data that could be sampled, and doing so in a biased fashion; but biased samples could be smaller, or larger. Smaller data sets in general provide less reliable bases for estimating models, and thus could lead to inferior model performance. In this setting, we ask the question, what affects performance more, bias, or size? We conduct a detailed, large-scale meta-analysis, using simulated datasets sampled with bias from a high-quality dataset which is relatively free of bias. Our results suggest that size always matters just as much bias direction, and in fact much more than bias direction when considering information-retrieval measures such as AUCROC and F-score. This indicates that at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue, and raises further issues to be explored in the future. @InProceedings{ESEC/FSE13p147, author = {Foyzur Rahman and Daryl Posnett and Israel Herraiz and Premkumar Devanbu}, title = {Sample Size vs. Bias in Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {147--157}, doi = {}, year = {2013}, } |
|
Heymans, Patrick |
ESEC/FSE '13: "Incrementally Synthesizing ..."
Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications
Joel Greenyer, Christian Brenner, Maxime Cordy, Patrick Heymans, and Erika Gressi (Leibniz Universität Hannover, Germany; University of Paderborn, Germany; University of Namur, Belgium; Politecnico di Milano, Italy) Many software-intensive systems consist of components that interact to fulfill complex functionality. Moreover, often many variants of such systems have to be designed at once. This adds complexity to the design task. Recently, we proposed a scenario-based approach to design product lines, which combines feature diagrams and Modal Sequence Diagrams. We proposed a consistency-checking technique based on a dedicated product line model checker. One limitation of this technique is that it is incomplete, i.e., it may fail to show the consistency of some consistent specifications. In this paper we propose a new game-based approach that overcomes this incompleteness and, in addition, automatically synthesizes controllers for the consistent product specifications. We exploit the fact that many variants are similar and efficiently synthesize product controllers incrementally. We provide a prototype tool and evaluate the efficiency of the approach. @InProceedings{ESEC/FSE13p433, author = {Joel Greenyer and Christian Brenner and Maxime Cordy and Patrick Heymans and Erika Gressi}, title = {Incrementally Synthesizing Controllers from Scenario-Based Product Line Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {433--443}, doi = {}, year = {2013}, } ESEC/FSE '13: "Feature Model Extraction from ..." Feature Model Extraction from Large Collections of Informal Product Descriptions Jean-Marc Davril, Edouard Delfosse, Negar Hariri, Mathieu Acher, Jane Cleland-Huang, and Patrick Heymans (University of Namur, Belgium; DePaul University, USA; University of Rennes I, France; INRIA, France) Feature Models (FMs) are used extensively in software product line engineering to help generate and validate individual product configurations and to provide support for domain analysis. As FM construction can be tedious and time-consuming, researchers have previously developed techniques for extracting FMs from sets of formally specified individual configurations, or from software requirements specifications for families of existing products. However, such artifacts are often not available. In this paper we present a novel, automated approach for constructing FMs from publicly available product descriptions found in online product repositories and marketing websites such as SoftPedia and CNET. While each individual product description provides only a partial view of features in the domain, a large set of descriptions can provide fairly comprehensive coverage. Our approach utilizes hundreds of partial product descriptions to construct an FM and is described and evaluated against antivirus product descriptions mined from SoftPedia. @InProceedings{ESEC/FSE13p290, author = {Jean-Marc Davril and Edouard Delfosse and Negar Hariri and Mathieu Acher and Jane Cleland-Huang and Patrick Heymans}, title = {Feature Model Extraction from Large Collections of Informal Product Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {290--300}, doi = {}, year = {2013}, } |
|
Holzer, Andreas |
ESEC/FSE '13: "Con2colic Testing ..."
Con2colic Testing
Azadeh Farzan , Andreas Holzer, Niloofar Razavi, and Helmut Veith (University of Toronto, Canada; Vienna University of Technology, Austria) In this paper, we describe (con)2colic testing - a systematic testing approach for concurrent software. Based on concrete and symbolic executions of a concurrent program, (con)2colic testing derives inputs and schedules such that the execution space of the program under investigation is systematically explored. We introduce interference scenarios as key concept in (con)2colic testing. Interference scenarios capture the flow of data among different threads and enable a unified representation of path and interference constraints. We have implemented a (con)2colic testing engine and demonstrate the effectiveness of our approach by experiments. @InProceedings{ESEC/FSE13p37, author = {Azadeh Farzan and Andreas Holzer and Niloofar Razavi and Helmut Veith}, title = {Con2colic Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {37--47}, doi = {}, year = {2013}, } |
|
Hossain, B. M. Mainul |
ESEC/FSE '13: "Preventing Database Deadlocks ..."
Preventing Database Deadlocks in Applications
Mark Grechanik, B. M. Mainul Hossain, Ugo Buy, and Haisheng Wang (University of Illinois at Chicago, USA; Oracle, USA) Many organizations deploy applications that use databases by sending Structured Query Language (SQL) statements to them and obtaining data that result from the execution of these statements. Since applications often share the same databases concurrently, database deadlocks routinely occur in these databases resulting in major performance degradation in these applications. Database engines do not prevent database deadlocks for the same reason that the schedulers of operating system kernels do not preempt processes in a way to avoid race conditions and deadlocks - it is not feasible to find an optimal context switching schedule quickly for multiple processes (and SQL statements), and the overhead of doing it is prohibitive. We created a novel approach that combines run-time monitoring, which automatically prevents database deadlocks, with static analysis, which detects hold-and-wait cycles that specify how resources (e.g., database tables) are held in contention during executions of SQL statements. We rigorously evaluated our approach. For a realistic case of over 1,200 SQL statements, our algorithm detects all hold-and-wait cycles in less than two seconds. We built a toolset and experimented with three applications. Our tool prevented all existing database deadlocks in these applications and increased their throughputs by up to three orders of magnitude. @InProceedings{ESEC/FSE13p356, author = {Mark Grechanik and B. M. Mainul Hossain and Ugo Buy and Haisheng Wang}, title = {Preventing Database Deadlocks in Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {356--366}, doi = {}, year = {2013}, } Info |
|
Hu, Gang |
ESEC/FSE '13: "Effective Dynamic Detection ..."
Effective Dynamic Detection of Alias Analysis Errors
Jingyue Wu, Gang Hu, Yang Tang, and Junfeng Yang (Columbia University, USA) Alias analysis is perhaps one of the most crucial and widely used analyses, and has attracted tremendous research efforts over the years. Yet, advanced alias analyses are extremely difficult to get right, and the bugs in these analyses are one key reason that they have not been adopted to production compilers. This paper presents NeonGoby, a system for effectively detecting errors in alias analysis implementations, improving their correctness and hopefully widening their adoption. NeonGoby detects the worst type of bugs where the alias analysis claims that two pointers never alias, but they actually alias at runtime. NeonGoby works by dynamically observing pointer addresses during the execution of a test program and then checking these addresses against an alias analysis for errors. It is explicitly designed to (1) be agnostic to the alias analysis it checks for maximum applicability and ease of use and (2) detect alias analysis errors that manifest on real-world programs and workloads. It emits no false positives as long as test programs do not have undefined behavior per ANSI C specification or call external functions that interfere with our detection algorithm. It reduces performance overhead using a practical selection of techniques. Evaluation on three popular alias analyses and real-world programs Apache and MySQL shows that NeonGoby effectively finds 29 alias analysis bugs with zero false positives and reasonable overhead; the most serious four bugs have been patched by the developers. To enable alias analysis builders to start using NeonGoby today, we have released it open-source at https://github.com/columbia/neongoby, along with our error detection results and proposed patches. @InProceedings{ESEC/FSE13p279, author = {Jingyue Wu and Gang Hu and Yang Tang and Junfeng Yang}, title = {Effective Dynamic Detection of Alias Analysis Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {279--289}, doi = {}, year = {2013}, } Info |
|
Jaffar, Joxan |
ESEC/FSE '13: "Boosting Concolic Testing ..."
Boosting Concolic Testing via Interpolation
Joxan Jaffar , Vijayaraghavan Murali, and Jorge A. Navas (National University of Singapore, Singapore; University of Melbourne, Australia) Concolic testing has been very successful in automatically generating test inputs for programs. However one of its major limitations is path-explosion that limits the generation of high coverage inputs. Since its inception several ideas have been proposed to attack this problem from various angles: defining search heuristics that increase coverage, caching of function summaries, pruning of paths using static/dynamic information etc. We propose a new and complementary method based on interpolation, that greatly mitigates path-explosion by subsuming paths that can be guaranteed to not hit a bug. We discuss new challenges in using interpolation that arise specifically in the context of concolic testing. We experimentally evaluate our method with different search heuristics using Crest, a publicly available concolic tester. @InProceedings{ESEC/FSE13p48, author = {Joxan Jaffar and Vijayaraghavan Murali and Jorge A. Navas}, title = {Boosting Concolic Testing via Interpolation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {48--58}, doi = {}, year = {2013}, } |
|
Jensen, Casper S. |
ESEC/FSE '13: "Server Interface Descriptions ..."
Server Interface Descriptions for Automated Testing of JavaScript Web Applications
Casper S. Jensen, Anders Møller , and Zhendong Su (Aarhus University, Denmark; UC Davis, USA) Automated testing of JavaScript web applications is complicated by the communication with servers. Specifically, it is difficult to test the JavaScript code in isolation from the server code and database contents. We present a practical solution to this problem. First, we demonstrate that formal server interface descriptions are useful in automated testing of JavaScript web applications for separating the concerns of the client and the server. Second, to support the construction of server interface descriptions for existing applications, we introduce an effective inference technique that learns communication patterns from sample data. By incorporating interface descriptions into the testing tool Artemis, our experimental results show that we increase the level of automation for high-coverage testing on a collection of JavaScript web applications that exchange JSON data between the clients and servers. Moreover, we demonstrate that the inference technique can quickly and accurately learn useful server interface descriptions. @InProceedings{ESEC/FSE13p510, author = {Casper S. Jensen and Anders Møller and Zhendong Su}, title = {Server Interface Descriptions for Automated Testing of JavaScript Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {510--520}, doi = {}, year = {2013}, } Info |
|
Jia, Yue |
ESEC/FSE '13: "Searching for Better Configurations: ..."
Searching for Better Configurations: A Rigorous Approach to Clone Evaluation
Tiantian Wang, Mark Harman, Yue Jia, and Jens Krinke (Harbin Institute of Technology, China; University College London, UK) Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our evaluation reports the results of 9.3 million total executions of a clone tool; the largest study yet reported. Our approach finds significantly better configurations (p < 0.05) than those currently used, providing evidence that our approach can ameliorate the confounding configuration choice problem. @InProceedings{ESEC/FSE13p455, author = {Tiantian Wang and Mark Harman and Yue Jia and Jens Krinke}, title = {Searching for Better Configurations: A Rigorous Approach to Clone Evaluation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {455--465}, doi = {}, year = {2013}, } Info |
|
Kalasapur, Swaroop |
ESEC/FSE '13: "Jalangi: A Selective Record-Replay ..."
Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript
Koushik Sen, Swaroop Kalasapur, Tasneem Brutch, and Simon Gibbs (UC Berkeley, USA; Samsung Research, USA) JavaScript is widely used for writing client-side web applications and is getting increasingly popular for writing mobile applications. However, unlike C, C++, and Java, there are not that many tools available for analysis and testing of JavaScript applications. In this paper, we present a simple yet powerful framework, called Jalangi, for writing heavy-weight dynamic analyses. Our framework incorporates two key techniques: 1) selective record-replay, a technique which enables to record and to faithfully replay a user-selected part of the program, and 2) shadow values and shadow execution, which enables easy implementation of heavy-weight dynamic analyses. Our implementation makes no special assumption about JavaScript, which makes it applicable to real-world JavaScript programs running on multiple platforms. We have implemented concolic testing, an analysis to track origins of nulls and undefined, a simple form of taint analysis, an analysis to detect likely type inconsistencies, and an object allocation profiler in Jalangi. Our evaluation of Jalangi on the SunSpider benchmark suite and on five web applications shows that Jalangi has an average slowdown of 26X during recording and 30X slowdown during replay and analysis. The slowdowns are comparable with slowdowns reported for similar tools, such as PIN and Valgrind for x86 binaries. We believe that the techniques proposed in this paper are applicable to other dynamic languages. @InProceedings{ESEC/FSE13p488, author = {Koushik Sen and Swaroop Kalasapur and Tasneem Brutch and Simon Gibbs}, title = {Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {488--498}, doi = {}, year = {2013}, } |
|
Karim, Rezwana |
ESEC/FSE '13: "Compiling Mockups to Flexible ..."
Compiling Mockups to Flexible UIs
Nishant Sinha and Rezwana Karim (IBM Research, India; Rutgers University, USA) As the web becomes ubiquitous, developers are obliged to develop web applications for a variety of desktop and mobile platforms. Re- designing the user interface for every such platform is clearly cumbersome. We propose a new framework based on model-based compilation to assist the designer in solving this problem. Starting from an under-specified visual design mockup drawn by the designer, we show how faithful and flexible web pages can be obtained with virtually no manual effort. Our framework, in sharp contrast to existing web design tools, overcomes the tough challenges involved in mockup compilation by (a) employing combinatorial search to infer hierarchical layouts and (b) mechanizing adhoc principles for CSS design into a modular, extensible rule-based architecture. We believe ours is the first disciplined effort to solve the problem and will inspire rapid, low-effort web design. @InProceedings{ESEC/FSE13p312, author = {Nishant Sinha and Rezwana Karim}, title = {Compiling Mockups to Flexible UIs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {312--322}, doi = {}, year = {2013}, } Video Info Artifact Accepted for Presentation |
|
Kästner, Christian |
ESEC/FSE '13: "Scalable Analysis of Variable ..."
Scalable Analysis of Variable Software
Jörg Liebig, Alexander von Rhein, Christian Kästner , Sven Apel, Jens Dörre, and Christian Lengauer (University of Passau, Germany; CMU, USA) The advent of variability management and generator technology enables users to derive individual variants from a variable code base based on a selection of desired configuration options. This approach gives rise to the generation of possibly billions of variants that, however, cannot be efficiently analyzed for errors with classic analysis techniques. To address this issue, researchers and practitioners usually apply sampling heuristics. While sampling reduces the analysis effort significantly, the information obtained is necessarily incomplete and it is unknown whether sampling heuristics scale to billions of variants. Recently, researchers have begun to develop variability-aware analyses that analyze the variable code base directly exploiting the similarities among individual variants to reduce analysis effort. However, while being promising, so far, variability-aware analyses have been applied mostly only to small academic systems. To learn about the mutual strengths and weaknesses of variability-aware and sampling-based analyses of software systems, we compared the two strategies by means of two concrete analysis implementations (type checking and liveness analysis), applied them to three subject systems: Busybox, the x86 Linux kernel, and OpenSSL. Our key finding is that variability-aware analysis outperforms most sampling heuristics with respect to analysis time while preserving completeness. @InProceedings{ESEC/FSE13p81, author = {Jörg Liebig and Alexander von Rhein and Christian Kästner and Sven Apel and Jens Dörre and Christian Lengauer}, title = {Scalable Analysis of Variable Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {81--91}, doi = {}, year = {2013}, } Info |
|
Khoo, Siau-Cheng |
ESEC/FSE '13: "Mining Succinct Predicated ..."
Mining Succinct Predicated Bug Signatures
Chengnian Sun and Siau-Cheng Khoo (National University of Singapore, Singapore) A bug signature is a set of program elements highlighting the cause or effect of a bug, and provides contextual information for debugging. In order to mine a signature for a buggy program, two sets of execution profiles of the program, one capturing the correct execution and the other capturing the faulty, are examined to identify the program elements contrasting faulty from correct. Signatures solely consisting of control flow transitions have been investigated via discriminative sequence and graph mining algorithms. These signatures might be handicapped in cases where the effect of a bug is not manifested by any deviation in control flow transitions. In this paper, we introduce the notion of predicated bug signature that aims to enhance the predictive power of bug signatures by utilizing both data predicates and control-flow information. We introduce a novel “discriminative itemset generator” mining technique to generate succinct signatures which do not contain redundant or irrelevant program elements. Our case studies demonstrate that predicated signatures can hint at more scenarios of bugs where traditional control-flow signatures fail. @InProceedings{ESEC/FSE13p576, author = {Chengnian Sun and Siau-Cheng Khoo}, title = {Mining Succinct Predicated Bug Signatures}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {576--586}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Khurshid, Sarfraz |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
Kim, Chang Hwan Peter |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
Krinke, Jens |
ESEC/FSE '13: "Searching for Better Configurations: ..."
Searching for Better Configurations: A Rigorous Approach to Clone Evaluation
Tiantian Wang, Mark Harman, Yue Jia, and Jens Krinke (Harbin Institute of Technology, China; University College London, UK) Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our evaluation reports the results of 9.3 million total executions of a clone tool; the largest study yet reported. Our approach finds significantly better configurations (p < 0.05) than those currently used, providing evidence that our approach can ameliorate the confounding configuration choice problem. @InProceedings{ESEC/FSE13p455, author = {Tiantian Wang and Mark Harman and Yue Jia and Jens Krinke}, title = {Searching for Better Configurations: A Rigorous Approach to Clone Evaluation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {455--465}, doi = {}, year = {2013}, } Info |
|
Lahiri, Shuvendu K. |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } ESEC/FSE '13: "Differential Assertion Checking ..." Differential Assertion Checking Shuvendu K. Lahiri , Kenneth L. McMillan, Rahul Sharma, and Chris Hawblitzel (Microsoft Research, USA; Stanford University, USA) Previous version of a program can be a powerful enabler for program analysis by defining new relative specifications and making the results of current program analysis more relevant. In this paper, we describe the approach of differential assertion checking (DAC) for comparing different versions of a program with respect to a set of assertions. DAC provides a natural way to write relative specifications over two programs. We introduce a novel modular approach to DAC by reducing it to safety checking of a composed program, which can be accomplished by standard program verifiers. In particular, we leverage automatic invariant generation to synthesize relative specifications for pairs of loops and procedures. We provide a preliminary evaluation of a prototype implementation within the SymDiff tool along two directions (a) soundly verifying bug fixes in the presence of loops and (b) providing a knob for suppressing alarms when checking a new version of a program. @InProceedings{ESEC/FSE13p345, author = {Shuvendu K. Lahiri and Kenneth L. McMillan and Rahul Sharma and Chris Hawblitzel}, title = {Differential Assertion Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--355}, doi = {}, year = {2013}, } |
|
Lahoda, Jan |
ESEC/FSE '13: "Crossing the Gap from Imperative ..."
Crossing the Gap from Imperative to Functional Programming through Refactoring
Alex Gyori, Lyle Franklin, Danny Dig, and Jan Lahoda (University of Illinois, USA; Ball State University, USA; Oregon State University, USA; Oracle, Czech Republic) Java 8 introduces two functional features: lambda expressions and functional operations like map or filter that apply a lambda expression over the elements of a Collection. Refactoring existing code to use these new features enables explicit but unobtrusive parallelism and makes the code more succinct. However, refactoring is tedious: it requires changing many lines of code. It is also error-prone: the programmer must reason about the control-, data-flow, and side-effects. Fortunately, refactorings can be automated. We designed and implemented LambdaFicator, a tool which automates two refactorings. The first refactoring converts anonymous inner classes to lambda expressions. The second refactoring converts for loops that iterate over Collections to functional operations that use lambda expressions. Using 9 open-source projects, we have applied these two refactorings 1263 and 1709 times, respectively. The results show that LambdaFicator is useful: (i) it is widely applicable, (ii) it reduces the code bloat, (iii) it increases programmer productivity, and (iv) it is accurate. @InProceedings{ESEC/FSE13p543, author = {Alex Gyori and Lyle Franklin and Danny Dig and Jan Lahoda}, title = {Crossing the Gap from Imperative to Functional Programming through Refactoring}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {543--553}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Lengauer, Christian |
ESEC/FSE '13: "Scalable Analysis of Variable ..."
Scalable Analysis of Variable Software
Jörg Liebig, Alexander von Rhein, Christian Kästner , Sven Apel, Jens Dörre, and Christian Lengauer (University of Passau, Germany; CMU, USA) The advent of variability management and generator technology enables users to derive individual variants from a variable code base based on a selection of desired configuration options. This approach gives rise to the generation of possibly billions of variants that, however, cannot be efficiently analyzed for errors with classic analysis techniques. To address this issue, researchers and practitioners usually apply sampling heuristics. While sampling reduces the analysis effort significantly, the information obtained is necessarily incomplete and it is unknown whether sampling heuristics scale to billions of variants. Recently, researchers have begun to develop variability-aware analyses that analyze the variable code base directly exploiting the similarities among individual variants to reduce analysis effort. However, while being promising, so far, variability-aware analyses have been applied mostly only to small academic systems. To learn about the mutual strengths and weaknesses of variability-aware and sampling-based analyses of software systems, we compared the two strategies by means of two concrete analysis implementations (type checking and liveness analysis), applied them to three subject systems: Busybox, the x86 Linux kernel, and OpenSSL. Our key finding is that variability-aware analysis outperforms most sampling heuristics with respect to analysis time while preserving completeness. @InProceedings{ESEC/FSE13p81, author = {Jörg Liebig and Alexander von Rhein and Christian Kästner and Sven Apel and Jens Dörre and Christian Lengauer}, title = {Scalable Analysis of Variable Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {81--91}, doi = {}, year = {2013}, } Info |
|
Li, Bixin |
ESEC/FSE '13: "An Empirical Analysis of the ..."
An Empirical Analysis of the Co-evolution of Schema and Code in Database Applications
Dong Qiu, Bixin Li, and Zhendong Su (Southeast University, China; UC Davis, USA) Modern database applications are among the most widely used and complex software systems. They constantly evolve, responding to changes to data, database schemas, and code. It is challenging to manage these changes and ensure that everything co-evolves consistently. For example, when a database schema is modified, all the code that interacts with the database must be changed accordingly. Although database evolution and software evolution have been extensively studied in isolation, the co-evolution of schema and code has largely been unexplored. This paper presents the first comprehensive empirical analysis of the co-evolution of database schemas and code in ten popular large open-source database applications, totaling over 160K revisions. Our major findings include: 1) Database schemas evolve frequently during the application lifecycle, exhibiting a variety of change types with similar distributions across the studied applications; 2) Overall, schema changes induce significant code-level modifications, while certain change types have more impact on code than others; and 3) Co-change analyses can be viable to automate or assist with database application evolution. We have also observed that: 1) 80% of the schema changes happened in 20-30% of the tables, while nearly 40% of the tables did not change; and 2) Referential integrity constraints and stored procedures are rarely used in our studied subjects. We believe that our study reveals new insights into how database applications evolve and useful guidelines for designing assistive tools to aid their evolution. @InProceedings{ESEC/FSE13p125, author = {Dong Qiu and Bixin Li and Zhendong Su}, title = {An Empirical Analysis of the Co-evolution of Schema and Code in Database Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {125--135}, doi = {}, year = {2013}, } Info |
|
Li, Kaituo |
ESEC/FSE '13: "Second-Order Constraints in ..."
Second-Order Constraints in Dynamic Invariant Inference
Kaituo Li, Christoph Reichenbach, Yannis Smaragdakis , and Michal Young (University of Massachusetts at Amherst, USA; Goethe University Frankfurt, Germany; University of Athens, Greece; University of Oregon, USA) The current generation of dynamic invariant detectors often produce invariants that are inconsistent with program semantics or programmer knowledge. We improve the consistency of dynamically discovered invariants by taking into account higher-level constraints. These constraints encode knowledge about invariants, even when the invariants themselves are unknown. For instance, even though the invariants describing the behavior of two functions f1 and f2 may be unknown, we may know that any valid input for f1 is also valid for f2, i.e., the precondition of f1 implies that of f2. We explore techniques for expressing and employing such consistency constraints to improve the quality of produced invariants. We further introduce techniques for dynamically discovering potential second-order constraints that the programmer can subsequently approve or reject. Our implementation builds on the Daikon tool, with a vocabulary of constraints that the programmer can use to enhance and constrain Daikon’s inference. We show that dynamic inference of second-order constraints together with minimal human effort can significantly influence the produced (first-order) invariants even in systems of substantial size, such as the Apache Commons Collections and the AspectJ compiler. We also find that 99% of the dynamically inferred second-order constraints we sampled are true. @InProceedings{ESEC/FSE13p103, author = {Kaituo Li and Christoph Reichenbach and Yannis Smaragdakis and Michal Young}, title = {Second-Order Constraints in Dynamic Invariant Inference}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {103--113}, doi = {}, year = {2013}, } |
|
Liang, Guangtai |
ESEC/FSE '13: "Inferring Project-Specific ..."
Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs
Guangtai Liang, Qianxiang Wang, Tao Xie, and Hong Mei (Peking University, China; University of Illinois at Urbana-Champaign, USA) Lightweight static bug-detection tools such as FindBugs, PMD, Jlint, and Lint4j detect bugs with the knowledge of generic bug patterns (e.g., objects of java.io.InputStream are not closed in time after used). Besides generic bug patterns, different projects under analysis may have some project-specific bug patterns. For example, in a revision of the Xerces project, the class field "fDTDHandler" is dereferenced without proper null-checks, while it could actually be null at runtime. We name such bug patterns directly related to objects instantiated in specific projects as Project-Specific Bug Patterns (PSBPs). Due to lack of such PSBP knowledge, existing tools usually fail in effectively detecting most of this kind of bugs. We name bugs belonging to the same project and sharing the same PSBP as sibling bugs. If some sibling bugs are fixed in a fix revision but some others remain, we treat such fix as an incomplete fix. To address such incomplete fixes, we propose a PSBP-based approach for detecting sibling bugs and implement a tool called Sibling-Bug Detector (SBD). Given a fix revision, SBD first infers the PSBPs implied by the fix revision. Then, based on the inferred PSBPs, SBD detects their related sibling bugs in the same project. To evaluate SBD, we apply it to seven popular open-source projects. Among the 108 warnings reported by SBD, 63 of them have been confirmed as real bugs by the project developers, while two existing popular static detectors (FindBugs and PMD) cannot report most of them. @InProceedings{ESEC/FSE13p565, author = {Guangtai Liang and Qianxiang Wang and Tao Xie and Hong Mei}, title = {Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {565--575}, doi = {}, year = {2013}, } Info |
|
Liebig, Jörg |
ESEC/FSE '13: "Scalable Analysis of Variable ..."
Scalable Analysis of Variable Software
Jörg Liebig, Alexander von Rhein, Christian Kästner , Sven Apel, Jens Dörre, and Christian Lengauer (University of Passau, Germany; CMU, USA) The advent of variability management and generator technology enables users to derive individual variants from a variable code base based on a selection of desired configuration options. This approach gives rise to the generation of possibly billions of variants that, however, cannot be efficiently analyzed for errors with classic analysis techniques. To address this issue, researchers and practitioners usually apply sampling heuristics. While sampling reduces the analysis effort significantly, the information obtained is necessarily incomplete and it is unknown whether sampling heuristics scale to billions of variants. Recently, researchers have begun to develop variability-aware analyses that analyze the variable code base directly exploiting the similarities among individual variants to reduce analysis effort. However, while being promising, so far, variability-aware analyses have been applied mostly only to small academic systems. To learn about the mutual strengths and weaknesses of variability-aware and sampling-based analyses of software systems, we compared the two strategies by means of two concrete analysis implementations (type checking and liveness analysis), applied them to three subject systems: Busybox, the x86 Linux kernel, and OpenSSL. Our key finding is that variability-aware analysis outperforms most sampling heuristics with respect to analysis time while preserving completeness. @InProceedings{ESEC/FSE13p81, author = {Jörg Liebig and Alexander von Rhein and Christian Kästner and Sven Apel and Jens Dörre and Christian Lengauer}, title = {Scalable Analysis of Variable Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {81--91}, doi = {}, year = {2013}, } Info |
|
Linares-Vásquez, Mario |
ESEC/FSE '13: "API Change and Fault Proneness: ..."
API Change and Fault Proneness: A Threat to the Success of Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto , and Denys Poshyvanyk (College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy) During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criteria: users easily get frustrated by repeated failures, crashes, and other bugs; hence, abandoning some apps in favor of others. This paper reports a study analyzing how the fault- and change-proneness of APIs used by 7,097 (free) Android apps relates to applications' lack of success, estimated from user ratings. Results of this study provide important insights into a crucial issue: making heavy use of fault- and change-prone APIs can negatively impact the success of these apps. @InProceedings{ESEC/FSE13p477, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Bernal-Cárdenas and Massimiliano Di Penta and Rocco Oliveto and Denys Poshyvanyk}, title = {API Change and Fault Proneness: A Threat to the Success of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {477--487}, doi = {}, year = {2013}, } Info |
|
Liu, Peng |
ESEC/FSE '13: "Finding Incorrect Compositions ..."
Finding Incorrect Compositions of Atomicity
Peng Liu, Julian Dolby , and Charles Zhang (Hong Kong University of Science and Technology, China; IBM Research, USA) In object-oriented code, atomicity is ideally isolated in a library which encapsulates shared program state and provides atomic APIs for access. The library provides a convenient way for programmers to reason about the needed synchronization. However, as the library exports a limited set of APIs, it cannot satisfy every unplanned atomicity demand; therefore, clients may have to compose invocations of the library APIs to obtain new atomic functionality. This process is error-prone due to the complexity of reasoning required, hence tool support for uncovering incorrect compositions (i.e., atomic compositions that are implemented incorrectly) would be very helpful. A key difficulty is how to determine the intended atomic compositions, which are rarely documented. Existing inference techniques cannot be used to infer the atomic compositions because they cannot recognize the library and the client, which requires understanding the related program state. Even if extended to support the library/client, they lead to many false positives or false negatives because they miss the key program logic which reflects programmers’ coding paradigms for atomic compositions. We define a new inference technique which identifies intended atomic compositions using two key symptoms based on program dependence. We then check dynamically whether these atomic compositions are implemented incorrectly as non-atomic. Evaluation on thirteen large applications shows that our approach finds around 50 previously unknown incorrect compositions. Further study on Tomcat shows that almost half (5 out of 12) of discovered incorrect compositions are confirmed as bugs by the developers. Given that Tomcat is heavily used in 250,000 sites including Linkedin.com and Ebay.com, we believe finding multiple new bugs in it automatically with relatively few false positives supports our heuristics for determining intended atomicity. @InProceedings{ESEC/FSE13p158, author = {Peng Liu and Julian Dolby and Charles Zhang}, title = {Finding Incorrect Compositions of Atomicity}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {158--168}, doi = {}, year = {2013}, } Artifact Accepted for Presentation |
|
Livshits, Benjamin |
ESEC/FSE '13: "Practical Static Analysis ..."
Practical Static Analysis of JavaScript Applications in the Presence of Frameworks and Libraries
Magnus Madsen , Benjamin Livshits, and Michael Fanning (Aarhus University, Denmark; Microsoft Research, USA; Microsoft, USA) JavaScript is a language that is widely-used for both web- based and standalone applications such as those in the upcoming Windows 8 operating system. Analysis of JavaScript has long been known to be challenging due to its dynamic nature. On top of that, most JavaScript applications rely on large and complex libraries and frameworks, often written in a combination of JavaScript and native code such as C and C++. Stubs have been commonly employed as a partial specification mechanism to address the library problem; however, they are tedious to write, incomplete, and occasionally incorrect. However, the manner in which library code is used within applications often sheds light on what library APIs return or consume as parameters. In this paper, we propose a technique which combines pointer analysis with use analysis to handle many challenges posed by large JavaScript libraries. Our approach enables a variety of applications, ranging from call graph discovery to auto-complete to supporting runtime optimizations. Our techniques have been implemented and empirically validated on a set of 25 Windows 8 JavaScript applications, averaging 1,587 lines of code, demonstrating a combination of scalability and precision. @InProceedings{ESEC/FSE13p499, author = {Magnus Madsen and Benjamin Livshits and Michael Fanning}, title = {Practical Static Analysis of JavaScript Applications in the Presence of Frameworks and Libraries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {499--509}, doi = {}, year = {2013}, } |
|
Lohar, Sugandha |
ESEC/FSE '13: "Improving Trace Accuracy through ..."
Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features
Sugandha Lohar, Sorawit Amornborvornwong, Andrea Zisman, and Jane Cleland-Huang (DePaul University, USA; Open University, UK) Software traceability is a sought-after, yet often elusive quality in large software-intensive systems primarily because the cost and effort of tracing can be overwhelming. State-of-the art solutions address this problem through utilizing trace retrieval techniques to automate the process of creating and maintaining trace links. However, there is no simple one- size-fits all solution to trace retrieval. As this paper will show, finding the right combination of tracing techniques can lead to significant improvements in the quality of generated links. We present a novel approach to trace retrieval in which the underlying infrastructure is configured at runtime to optimize trace quality. We utilize a machine-learning approach to search for the best configuration given an initial training set of validated trace links, a set of available tracing techniques specified in a feature model, and an architecture capable of instantiating all valid configurations of features. We evaluate our approach through a series of experiments using project data from the transportation, healthcare, and space exploration domains, and discuss its implementation in an industrial environment. Finally, we show how our approach can create a robust baseline against which new tracing techniques can be evaluated. @InProceedings{ESEC/FSE13p378, author = {Sugandha Lohar and Sorawit Amornborvornwong and Andrea Zisman and Jane Cleland-Huang}, title = {Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {378--388}, doi = {}, year = {2013}, } |
|
Löwe, Stefan |
ESEC/FSE '13: "Precision Reuse for Efficient ..."
Precision Reuse for Efficient Regression Verification
Dirk Beyer , Stefan Löwe, Evgeny Novikov, Andreas Stahlbauer, and Philipp Wendler (University of Passau, Germany; ISP RAS, Russia) Continuous testing during development is a well-established technique for software-quality assurance. Continuous model checking from revision to revision is not yet established as a standard practice, because the enormous resource consumption makes its application impractical. Model checkers compute a large number of verification facts that are necessary for verifying if a given specification holds. We have identified a category of such intermediate results that are easy to store and efficient to reuse: abstraction precisions. The precision of an abstract domain specifies the level of abstraction that the analysis works on. Precisions are thus a precious result of the verification effort and it is a waste of resources to throw them away after each verification run. In particular, precisions are reasonably small and thus easy to store; they are easy to process and have a large impact on resource consumption. We experimentally show the impact of precision reuse on industrial verification problems created from 62 Linux kernel device drivers with 1119 revisions. @InProceedings{ESEC/FSE13p389, author = {Dirk Beyer and Stefan Löwe and Evgeny Novikov and Andreas Stahlbauer and Philipp Wendler}, title = {Precision Reuse for Efficient Regression Verification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {389--399}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Machiry, Aravind |
ESEC/FSE '13: "Dynodroid: An Input Generation ..."
Dynodroid: An Input Generation System for Android Apps
Aravind Machiry, Rohan Tahiliani, and Mayur Naik (Georgia Tech, USA) We present a system Dynodroid for generating relevant inputs to unmodified Android apps. Dynodroid views an app as an event-driven program that interacts with its environment by means of a sequence of events through the Android framework. By instrumenting the framework once and for all, Dynodroid monitors the reaction of an app upon each event in a lightweight manner, using it to guide the generation of the next event to the app. Dynodroid also allows interleaving events from machines, which are better at generating a large number of simple inputs, with events from humans, who are better at providing intelligent inputs. We evaluated Dynodroid on 50 open-source Android apps, and compared it with two prevalent approaches: users manually exercising apps, and Monkey, a popular fuzzing tool. Dynodroid, humans, and Monkey covered 55%, 60%, and 53%, respectively, of each app's Java source code on average. Monkey took 20X more events on average than Dynodroid. Dynodroid also found 9 bugs in 7 of the 50 apps, and 6 bugs in 5 of the top 1,000 free apps on Google Play. @InProceedings{ESEC/FSE13p224, author = {Aravind Machiry and Rohan Tahiliani and Mayur Naik}, title = {Dynodroid: An Input Generation System for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {224--234}, doi = {}, year = {2013}, } Info Distinguished Artifact |
|
Madsen, Magnus |
ESEC/FSE '13: "Practical Static Analysis ..."
Practical Static Analysis of JavaScript Applications in the Presence of Frameworks and Libraries
Magnus Madsen , Benjamin Livshits, and Michael Fanning (Aarhus University, Denmark; Microsoft Research, USA; Microsoft, USA) JavaScript is a language that is widely-used for both web- based and standalone applications such as those in the upcoming Windows 8 operating system. Analysis of JavaScript has long been known to be challenging due to its dynamic nature. On top of that, most JavaScript applications rely on large and complex libraries and frameworks, often written in a combination of JavaScript and native code such as C and C++. Stubs have been commonly employed as a partial specification mechanism to address the library problem; however, they are tedious to write, incomplete, and occasionally incorrect. However, the manner in which library code is used within applications often sheds light on what library APIs return or consume as parameters. In this paper, we propose a technique which combines pointer analysis with use analysis to handle many challenges posed by large JavaScript libraries. Our approach enables a variety of applications, ranging from call graph discovery to auto-complete to supporting runtime optimizations. Our techniques have been implemented and empirically validated on a set of 25 Windows 8 JavaScript applications, averaging 1,587 lines of code, demonstrating a combination of scalability and precision. @InProceedings{ESEC/FSE13p499, author = {Magnus Madsen and Benjamin Livshits and Michael Fanning}, title = {Practical Static Analysis of JavaScript Applications in the Presence of Frameworks and Libraries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {499--509}, doi = {}, year = {2013}, } |
|
Maoz, Shahar |
ESEC/FSE '13: "Synthesis of Component and ..."
Synthesis of Component and Connector Models from Crosscutting Structural Views
Shahar Maoz , Jan Oliver Ringert, and Bernhard Rumpe (Tel Aviv University, Israel; RWTH Aachen University, Germany) We present component and connector (C&C) views, which specify structural properties of component and connector models in an expressive and intuitive way. C&C views provide means to abstract away direct hierarchy, direct connectivity, port names and types, and thus can crosscut the traditional boundaries of the implementation-oriented hierarchical decomposition of systems and sub-systems, and reflect the partial knowledge available to different stakeholders involved in a system's design. As a primary application for C&C views we investigate the synthesis problem: given a C&C views specification, consisting of mandatory, alternative, and negative views, construct a concrete satisfying C&C model, if one exists. We show that the problem is NP-hard and solve it, in a bounded scope, using a reduction to SAT, via Alloy. We further extend the basic problem with support for library components, specification patterns, and architectural styles. The result of synthesis can be used for further exploration, simulation, and refinement of the C&C model or, as the complete, final model itself, for direct code generation. A prototype tool and an evaluation over four example systems with multiple specifications show promising results and suggest interesting future research directions towards a comprehensive development environment for the structure of component and connector designs. @InProceedings{ESEC/FSE13p444, author = {Shahar Maoz and Jan Oliver Ringert and Bernhard Rumpe}, title = {Synthesis of Component and Connector Models from Crosscutting Structural Views}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {444--454}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Marchetti, Eda |
ESEC/FSE '13: "Adequate Monitoring of Service ..."
Adequate Monitoring of Service Compositions
Antonia Bertolino, Eda Marchetti, and Andrea Morichetta (ISTI-CNR, Italy) Monitoring is essential to validate the runtime behaviour of dynamic distributed systems. However, monitors can inform of relevant events as they occur, but by their very nature they will not report about all those events that are not happening. In service-oriented applications it would be desirable to have means to assess the thoroughness of the interactions among the services that are being monitored. In case some events or message sequences or interaction patterns have not been observed for a while, in fact, one could timely check whether this happens because something is going wrong. In this paper, we introduce the novel notion of monitoring adequacy, which is generic and can be defined on different entities. We then define two adequacy criteria for service compositions and implement a proof-of-concept adequate monitoring framework. We validate the approach on two case studies, the Travel Reservation System and the Future Market choreographies. @InProceedings{ESEC/FSE13p59, author = {Antonia Bertolino and Eda Marchetti and Andrea Morichetta}, title = {Adequate Monitoring of Service Compositions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {59--69}, doi = {}, year = {2013}, } |
|
Marchetto, Alessandro |
ESEC/FSE '13: "Automated Oracles: An Empirical ..."
Automated Oracles: An Empirical Study on Cost and Effectiveness
Cu D. Nguyen, Alessandro Marchetto, and Paolo Tonella (Fondazione Bruno Kessler, Italy) Software testing is an effective, yet expensive, method to improve software quality. Test automation, a potential way to reduce testing cost, has received enormous research attention recently, but the so-called “oracle problem” (how to decide the PASS/FAIL outcome of a test execution) is still a major obstacle to such cost reduction. We have extensively investigated state-of-the-art works that contribute to address this problem, from areas such as specification mining and model inference. In this paper, we compare three types of automated oracles: Data invariants, Temporal invariants, and Finite State Automata. More specifically, we study the training cost and the false positive rate; we evaluate also their fault detection capability. Seven medium to large, industrial application subjects and real faults have been used in our empirical investigation. @InProceedings{ESEC/FSE13p136, author = {Cu D. Nguyen and Alessandro Marchetto and Paolo Tonella}, title = {Automated Oracles: An Empirical Study on Cost and Effectiveness}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {136--146}, doi = {}, year = {2013}, } Info |
|
Marinescu, Paul Dan |
ESEC/FSE '13: "KATCH: High-Coverage Testing ..."
KATCH: High-Coverage Testing of Software Patches
Paul Dan Marinescu and Cristian Cadar (Imperial College London, UK) One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or affect its security. In this paper, we address this problem using a technique for automatically testing code patches. Our technique combines symbolic execution with several novel heuristics based on static and dynamic program analysis which allow it to quickly reach the code of the patch. We have implemented our approach in a tool called KATCH, which we have applied to all the patches written in a combined period of approximately six years for nineteen mature programs from the popular GNU diffutils, GNU binutils and GNU findutils utility suites, which are shipped with virtually all UNIX-based distributions. Our results show that KATCH can automatically synthesise inputs that significantly increase the patch coverage achieved by the existing manual test suites, and find bugs at the moment they are introduced. @InProceedings{ESEC/FSE13p235, author = {Paul Dan Marinescu and Cristian Cadar}, title = {KATCH: High-Coverage Testing of Software Patches}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {235--245}, doi = {}, year = {2013}, } Info Distinguished Artifact |
|
Marinov, Darko |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
McMillan, Kenneth L. |
ESEC/FSE '13: "Differential Assertion Checking ..."
Differential Assertion Checking
Shuvendu K. Lahiri , Kenneth L. McMillan, Rahul Sharma, and Chris Hawblitzel (Microsoft Research, USA; Stanford University, USA) Previous version of a program can be a powerful enabler for program analysis by defining new relative specifications and making the results of current program analysis more relevant. In this paper, we describe the approach of differential assertion checking (DAC) for comparing different versions of a program with respect to a set of assertions. DAC provides a natural way to write relative specifications over two programs. We introduce a novel modular approach to DAC by reducing it to safety checking of a composed program, which can be accomplished by standard program verifiers. In particular, we leverage automatic invariant generation to synthesize relative specifications for pairs of loops and procedures. We provide a preliminary evaluation of a prototype implementation within the SymDiff tool along two directions (a) soundly verifying bug fixes in the presence of loops and (b) providing a knob for suppressing alarms when checking a new version of a program. @InProceedings{ESEC/FSE13p345, author = {Shuvendu K. Lahiri and Kenneth L. McMillan and Rahul Sharma and Chris Hawblitzel}, title = {Differential Assertion Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--355}, doi = {}, year = {2013}, } |
|
McPeak, Scott |
ESEC/FSE '13: "Scalable and Incremental Software ..."
Scalable and Incremental Software Bug Detection
Scott McPeak, Charles-Henri Gros, and Murali Krishna Ramanathan (Coverity, USA; Indian Institute of Science, India) An important, but often neglected, goal of static analysis for detecting bugs is the ability to show defects to the programmer quickly. Unfortunately, existing static analysis tools scale very poorly, or are shallow and cannot find complex interprocedural defects. Previous attempts at reducing the analysis time by adding more resources (CPU, memory) or by splitting the analysis into multiple sub-analyses based on defect detection capabilities resulted in limited/negligible improvements. We present a technique for parallel and incremental static analysis using top-down, bottom-up and global specification inference based around the concept of a work unit, a self-contained atom of analysis input, that deterministically maps to its output. A work unit contains both abstract and concrete syntax to analyze, a supporting fragment of the class hierarchy, summarized interprocedural behavior, and defect reporting information, factored to ensure a high level of reuse when analyzing successive versions incrementally. Work units are created and consumed by an analysis master process that coordinates the multiple analysis passes, the flow of information among them, and incrementalizes the computation. Meanwhile, multiple analysis worker processes use abstract interpretation to compute work unit results. Process management and interprocess communication is done by a general-purpose computation distributor layer. We have implemented our approach and our experimental results show that using eight processor cores, we can perform complete analysis of code bases with millions of lines of code in a few hours, and even faster after incremental changes to that code. The analysis is thorough and accurate: it usually reports about one crash-causing defect per thousand lines of code, with a false positive rate of 10--20% @InProceedings{ESEC/FSE13p554, author = {Scott McPeak and Charles-Henri Gros and Murali Krishna Ramanathan}, title = {Scalable and Incremental Software Bug Detection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {554--564}, doi = {}, year = {2013}, } |
|
Medvidovic, Nenad |
ESEC/FSE '13: "Identifying Message Flow in ..."
Identifying Message Flow in Distributed Event-Based Systems
Joshua Garcia, Daniel Popescu, Gholamreza Safi, William G. J. Halfond , and Nenad Medvidovic (University of Southern California, USA) Distributed event-based (DEB) systems contain highly-decoupled components that interact by exchanging messages. This enables flexible system composition and adaptation, but also makes DEB systems difficult to maintain. Most existing program analysis techniques to support maintenance are not well suited to DEB systems, while those that are tend to suffer from inaccuracy or make assumptions that limit their applicability. This paper presents Eos, a static analysis technique that identifies message information useful for maintaining a DEB system, namely, message types and message flow within a system. Eos has been evaluated on six off-the-shelf DEB systems spanning five different middleware platforms, and has exhibited excellent accuracy and efficiency. Furthermore, a case study involving a range of maintenance activities undertaken on three existing DEB systems shows that, on average, Eos enables an engineer to identify the scope and impact of required changes more accurately than existing alternatives. @InProceedings{ESEC/FSE13p367, author = {Joshua Garcia and Daniel Popescu and Gholamreza Safi and William G. J. Halfond and Nenad Medvidovic}, title = {Identifying Message Flow in Distributed Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {367--377}, doi = {}, year = {2013}, } Info |
|
Mei, Hong |
ESEC/FSE '13: "Inferring Project-Specific ..."
Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs
Guangtai Liang, Qianxiang Wang, Tao Xie, and Hong Mei (Peking University, China; University of Illinois at Urbana-Champaign, USA) Lightweight static bug-detection tools such as FindBugs, PMD, Jlint, and Lint4j detect bugs with the knowledge of generic bug patterns (e.g., objects of java.io.InputStream are not closed in time after used). Besides generic bug patterns, different projects under analysis may have some project-specific bug patterns. For example, in a revision of the Xerces project, the class field "fDTDHandler" is dereferenced without proper null-checks, while it could actually be null at runtime. We name such bug patterns directly related to objects instantiated in specific projects as Project-Specific Bug Patterns (PSBPs). Due to lack of such PSBP knowledge, existing tools usually fail in effectively detecting most of this kind of bugs. We name bugs belonging to the same project and sharing the same PSBP as sibling bugs. If some sibling bugs are fixed in a fix revision but some others remain, we treat such fix as an incomplete fix. To address such incomplete fixes, we propose a PSBP-based approach for detecting sibling bugs and implement a tool called Sibling-Bug Detector (SBD). Given a fix revision, SBD first infers the PSBPs implied by the fix revision. Then, based on the inferred PSBPs, SBD detects their related sibling bugs in the same project. To evaluate SBD, we apply it to seven popular open-source projects. Among the 108 warnings reported by SBD, 63 of them have been confirmed as real bugs by the project developers, while two existing popular static detectors (FindBugs and PMD) cannot report most of them. @InProceedings{ESEC/FSE13p565, author = {Guangtai Liang and Qianxiang Wang and Tao Xie and Hong Mei}, title = {Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {565--575}, doi = {}, year = {2013}, } Info |
|
Møller, Anders |
ESEC/FSE '13: "Server Interface Descriptions ..."
Server Interface Descriptions for Automated Testing of JavaScript Web Applications
Casper S. Jensen, Anders Møller , and Zhendong Su (Aarhus University, Denmark; UC Davis, USA) Automated testing of JavaScript web applications is complicated by the communication with servers. Specifically, it is difficult to test the JavaScript code in isolation from the server code and database contents. We present a practical solution to this problem. First, we demonstrate that formal server interface descriptions are useful in automated testing of JavaScript web applications for separating the concerns of the client and the server. Second, to support the construction of server interface descriptions for existing applications, we introduce an effective inference technique that learns communication patterns from sample data. By incorporating interface descriptions into the testing tool Artemis, our experimental results show that we increase the level of automation for high-coverage testing on a collection of JavaScript web applications that exchange JSON data between the clients and servers. Moreover, we demonstrate that the inference technique can quickly and accurately learn useful server interface descriptions. @InProceedings{ESEC/FSE13p510, author = {Casper S. Jensen and Anders Møller and Zhendong Su}, title = {Server Interface Descriptions for Automated Testing of JavaScript Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {510--520}, doi = {}, year = {2013}, } Info |
|
Morichetta, Andrea |
ESEC/FSE '13: "Adequate Monitoring of Service ..."
Adequate Monitoring of Service Compositions
Antonia Bertolino, Eda Marchetti, and Andrea Morichetta (ISTI-CNR, Italy) Monitoring is essential to validate the runtime behaviour of dynamic distributed systems. However, monitors can inform of relevant events as they occur, but by their very nature they will not report about all those events that are not happening. In service-oriented applications it would be desirable to have means to assess the thoroughness of the interactions among the services that are being monitored. In case some events or message sequences or interaction patterns have not been observed for a while, in fact, one could timely check whether this happens because something is going wrong. In this paper, we introduce the novel notion of monitoring adequacy, which is generic and can be defined on different entities. We then define two adequacy criteria for service compositions and implement a proof-of-concept adequate monitoring framework. We validate the approach on two case studies, the Travel Reservation System and the Future Market choreographies. @InProceedings{ESEC/FSE13p59, author = {Antonia Bertolino and Eda Marchetti and Andrea Morichetta}, title = {Adequate Monitoring of Service Compositions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {59--69}, doi = {}, year = {2013}, } |
|
Murali, Vijayaraghavan |
ESEC/FSE '13: "Boosting Concolic Testing ..."
Boosting Concolic Testing via Interpolation
Joxan Jaffar , Vijayaraghavan Murali, and Jorge A. Navas (National University of Singapore, Singapore; University of Melbourne, Australia) Concolic testing has been very successful in automatically generating test inputs for programs. However one of its major limitations is path-explosion that limits the generation of high coverage inputs. Since its inception several ideas have been proposed to attack this problem from various angles: defining search heuristics that increase coverage, caching of function summaries, pruning of paths using static/dynamic information etc. We propose a new and complementary method based on interpolation, that greatly mitigates path-explosion by subsuming paths that can be guaranteed to not hit a bug. We discuss new challenges in using interpolation that arise specifically in the context of concolic testing. We experimentally evaluate our method with different search heuristics using Crest, a publicly available concolic tester. @InProceedings{ESEC/FSE13p48, author = {Joxan Jaffar and Vijayaraghavan Murali and Jorge A. Navas}, title = {Boosting Concolic Testing via Interpolation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {48--58}, doi = {}, year = {2013}, } |
|
Muşlu, Kıvanç |
ESEC/FSE '13: "Making Offline Analyses Continuous ..."
Making Offline Analyses Continuous
Kıvanç Muşlu, Yuriy Brun , Michael D. Ernst , and David Notkin (University of Washington, USA; University of Massachusetts at Amherst, USA) Developers use analysis tools to help write, debug, and understand software systems under development. A developer's change to the system source code may affect analysis results. Typically, to learn those effects, the developer must explicitly initiate the analysis. This may interrupt the developer's workflow and/or the delay until the developer learns the implications of the change. The situation is even worse for impure analyses — ones that modify the code on which it runs — because such analyses block the developer from working on the code. This paper presents Codebase Replication, a novel approach to easily convert an offline analysis — even an impure one — into a continuous analysis that informs the developer of the implications of recent changes as quickly as possible after the change is made. Codebase Replication copies the developer's codebase, incrementally keeps this copy codebase in sync with the developer's codebase, makes that copy codebase available for offline analyses to run without disturbing the developer and without the developer's changes disturbing the analyses, and makes analysis results available to be presented to the developer. We have implemented Codebase Replication in Solstice, an open-source, publicly-available Eclipse plug-in. We have used Solstice to convert three offline analyses — FindBugs, PMD, and unit testing — into continuous ones. Each conversion required on average 436 NCSL and took, on average, 18 hours. Solstice-based analyses experience no more than 2.5 milliseconds of runtime overhead per developer action. @InProceedings{ESEC/FSE13p323, author = {Kıvanç Muşlu and Yuriy Brun and Michael D. Ernst and David Notkin}, title = {Making Offline Analyses Continuous}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {323--333}, doi = {}, year = {2013}, } |
|
Nagappan, Meiyappan |
ESEC/FSE '13: "Diversity in Software Engineering ..."
Diversity in Software Engineering Research
Meiyappan Nagappan, Thomas Zimmermann , and Christian Bird (Queen’s University, Canada; Microsoft Research, USA) One of the goals of software engineering research is to achieve generality: Are the phenomena found in a few projects reflective of others? Will a technique perform as well on projects other than the projects it is evaluated on? While it is common sense to select a sample that is representative of a population, the importance of diversity is often overlooked, yet as important. In this paper, we combine ideas from representativeness and diversity and introduce a measure called sample coverage, defined as the percentage of projects in a population that are similar to the given sample. We introduce algorithms to compute the sample coverage for a given set of projects and to select the projects that increase the coverage the most. We demonstrate our technique on research presented over the span of two years at ICSE and FSE with respect to a population of 20,000 active open source projects monitored by Ohloh.net. Knowing the coverage of a sample enhances our ability to reason about the findings of a study. Furthermore, we propose reporting guidelines for research: in addition to coverage scores, papers should discuss the target population of the research (universe) and dimensions that potentially can influence the outcomes of a research (space). @InProceedings{ESEC/FSE13p466, author = {Meiyappan Nagappan and Thomas Zimmermann and Christian Bird}, title = {Diversity in Software Engineering Research}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {466--476}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Naik, Mayur |
ESEC/FSE '13: "Dynodroid: An Input Generation ..."
Dynodroid: An Input Generation System for Android Apps
Aravind Machiry, Rohan Tahiliani, and Mayur Naik (Georgia Tech, USA) We present a system Dynodroid for generating relevant inputs to unmodified Android apps. Dynodroid views an app as an event-driven program that interacts with its environment by means of a sequence of events through the Android framework. By instrumenting the framework once and for all, Dynodroid monitors the reaction of an app upon each event in a lightweight manner, using it to guide the generation of the next event to the app. Dynodroid also allows interleaving events from machines, which are better at generating a large number of simple inputs, with events from humans, who are better at providing intelligent inputs. We evaluated Dynodroid on 50 open-source Android apps, and compared it with two prevalent approaches: users manually exercising apps, and Monkey, a popular fuzzing tool. Dynodroid, humans, and Monkey covered 55%, 60%, and 53%, respectively, of each app's Java source code on average. Monkey took 20X more events on average than Dynodroid. Dynodroid also found 9 bugs in 7 of the 50 apps, and 6 bugs in 5 of the top 1,000 free apps on Google Play. @InProceedings{ESEC/FSE13p224, author = {Aravind Machiry and Rohan Tahiliani and Mayur Naik}, title = {Dynodroid: An Input Generation System for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {224--234}, doi = {}, year = {2013}, } Info Distinguished Artifact |
|
Navas, Jorge A. |
ESEC/FSE '13: "Boosting Concolic Testing ..."
Boosting Concolic Testing via Interpolation
Joxan Jaffar , Vijayaraghavan Murali, and Jorge A. Navas (National University of Singapore, Singapore; University of Melbourne, Australia) Concolic testing has been very successful in automatically generating test inputs for programs. However one of its major limitations is path-explosion that limits the generation of high coverage inputs. Since its inception several ideas have been proposed to attack this problem from various angles: defining search heuristics that increase coverage, caching of function summaries, pruning of paths using static/dynamic information etc. We propose a new and complementary method based on interpolation, that greatly mitigates path-explosion by subsuming paths that can be guaranteed to not hit a bug. We discuss new challenges in using interpolation that arise specifically in the context of concolic testing. We experimentally evaluate our method with different search heuristics using Crest, a publicly available concolic tester. @InProceedings{ESEC/FSE13p48, author = {Joxan Jaffar and Vijayaraghavan Murali and Jorge A. Navas}, title = {Boosting Concolic Testing via Interpolation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {48--58}, doi = {}, year = {2013}, } |
|
Nguyen, Anh Tuan |
ESEC/FSE '13: "A Statistical Semantic Language ..."
A Statistical Semantic Language Model for Source Code
Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen (Iowa State University, USA) Recent research has successfully applied the statistical n-gram language model to show that source code exhibits a good level of repetition. The n-gram model is shown to have good predictability in supporting code suggestion and completion. However, the state-of-the-art n-gram approach to capture source code regularities/patterns is based only on the lexical information in a local context of the code units. To improve predictability, we introduce SLAMC, a novel statistical semantic language model for source code. It incorporates semantic information into code tokens and models the regularities/patterns of such semantic annotations, called sememes, rather than their lexemes. It combines the local context in semantic n-grams with the global technical concerns/functionality into an n-gram topic model, together with pairwise associations of program elements. Based on SLAMC, we developed a new code suggestion method, which is empirically evaluated on several projects to have relatively 18-68% higher accuracy than the state-of-the-art approach. @InProceedings{ESEC/FSE13p532, author = {Tung Thanh Nguyen and Anh Tuan Nguyen and Hoan Anh Nguyen and Tien N. Nguyen}, title = {A Statistical Semantic Language Model for Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--542}, doi = {}, year = {2013}, } |
|
Nguyen, Cu D. |
ESEC/FSE '13: "Automated Oracles: An Empirical ..."
Automated Oracles: An Empirical Study on Cost and Effectiveness
Cu D. Nguyen, Alessandro Marchetto, and Paolo Tonella (Fondazione Bruno Kessler, Italy) Software testing is an effective, yet expensive, method to improve software quality. Test automation, a potential way to reduce testing cost, has received enormous research attention recently, but the so-called “oracle problem” (how to decide the PASS/FAIL outcome of a test execution) is still a major obstacle to such cost reduction. We have extensively investigated state-of-the-art works that contribute to address this problem, from areas such as specification mining and model inference. In this paper, we compare three types of automated oracles: Data invariants, Temporal invariants, and Finite State Automata. More specifically, we study the training cost and the false positive rate; we evaluate also their fault detection capability. Seven medium to large, industrial application subjects and real faults have been used in our empirical investigation. @InProceedings{ESEC/FSE13p136, author = {Cu D. Nguyen and Alessandro Marchetto and Paolo Tonella}, title = {Automated Oracles: An Empirical Study on Cost and Effectiveness}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {136--146}, doi = {}, year = {2013}, } Info |
|
Nguyen, Hoan Anh |
ESEC/FSE '13: "A Statistical Semantic Language ..."
A Statistical Semantic Language Model for Source Code
Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen (Iowa State University, USA) Recent research has successfully applied the statistical n-gram language model to show that source code exhibits a good level of repetition. The n-gram model is shown to have good predictability in supporting code suggestion and completion. However, the state-of-the-art n-gram approach to capture source code regularities/patterns is based only on the lexical information in a local context of the code units. To improve predictability, we introduce SLAMC, a novel statistical semantic language model for source code. It incorporates semantic information into code tokens and models the regularities/patterns of such semantic annotations, called sememes, rather than their lexemes. It combines the local context in semantic n-grams with the global technical concerns/functionality into an n-gram topic model, together with pairwise associations of program elements. Based on SLAMC, we developed a new code suggestion method, which is empirically evaluated on several projects to have relatively 18-68% higher accuracy than the state-of-the-art approach. @InProceedings{ESEC/FSE13p532, author = {Tung Thanh Nguyen and Anh Tuan Nguyen and Hoan Anh Nguyen and Tien N. Nguyen}, title = {A Statistical Semantic Language Model for Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--542}, doi = {}, year = {2013}, } |
|
Nguyen, Khanh |
ESEC/FSE '13: "Cachetor: Detecting Cacheable ..."
Cachetor: Detecting Cacheable Data to Remove Bloat
Khanh Nguyen and Guoqing Xu (UC Irvine, USA) Modern object-oriented software commonly suffers from runtime bloat that significantly affects its performance and scalability. Studies have shown that one important pattern of bloat is the work repeatedly done to compute the same data values. Very often the cost of computation is very high and it is thus beneficial to memoize the invariant data values for later use. While this is a common practice in real-world development, manually finding invariant data values is a daunting task during development and tuning. To help the developers quickly find such optimization opportunities for performance improvement, we propose a novel run-time profiling tool, called Cachetor, which uses a combination of dynamic dependence profiling and value profiling to identify and report operations that keep generating identical data values. The major challenge in the design of Cachetor is that both dependence and value profiling are extremely expensive techniques that cannot scale to large, real-world applications for which optimizations are important. To overcome this challenge, we propose a series of novel abstractions that are applied to run-time instruction instances during profiling, yielding significantly improved analysis time and scalability. We have implemented Cachetor in Jikes Research Virtual Machine and evaluated it on a set of 14 large Java applications. Our experimental results suggest that Cachetor is effective in exposing caching opportunities and substantial performance gains can be achieved by modifying a program to cache the reported data. @InProceedings{ESEC/FSE13p268, author = {Khanh Nguyen and Guoqing Xu}, title = {Cachetor: Detecting Cacheable Data to Remove Bloat}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {268--278}, doi = {}, year = {2013}, } |
|
Nguyen, Tien N. |
ESEC/FSE '13: "A Statistical Semantic Language ..."
A Statistical Semantic Language Model for Source Code
Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen (Iowa State University, USA) Recent research has successfully applied the statistical n-gram language model to show that source code exhibits a good level of repetition. The n-gram model is shown to have good predictability in supporting code suggestion and completion. However, the state-of-the-art n-gram approach to capture source code regularities/patterns is based only on the lexical information in a local context of the code units. To improve predictability, we introduce SLAMC, a novel statistical semantic language model for source code. It incorporates semantic information into code tokens and models the regularities/patterns of such semantic annotations, called sememes, rather than their lexemes. It combines the local context in semantic n-grams with the global technical concerns/functionality into an n-gram topic model, together with pairwise associations of program elements. Based on SLAMC, we developed a new code suggestion method, which is empirically evaluated on several projects to have relatively 18-68% higher accuracy than the state-of-the-art approach. @InProceedings{ESEC/FSE13p532, author = {Tung Thanh Nguyen and Anh Tuan Nguyen and Hoan Anh Nguyen and Tien N. Nguyen}, title = {A Statistical Semantic Language Model for Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--542}, doi = {}, year = {2013}, } |
|
Nguyen, Tung Thanh |
ESEC/FSE '13: "A Statistical Semantic Language ..."
A Statistical Semantic Language Model for Source Code
Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, and Tien N. Nguyen (Iowa State University, USA) Recent research has successfully applied the statistical n-gram language model to show that source code exhibits a good level of repetition. The n-gram model is shown to have good predictability in supporting code suggestion and completion. However, the state-of-the-art n-gram approach to capture source code regularities/patterns is based only on the lexical information in a local context of the code units. To improve predictability, we introduce SLAMC, a novel statistical semantic language model for source code. It incorporates semantic information into code tokens and models the regularities/patterns of such semantic annotations, called sememes, rather than their lexemes. It combines the local context in semantic n-grams with the global technical concerns/functionality into an n-gram topic model, together with pairwise associations of program elements. Based on SLAMC, we developed a new code suggestion method, which is empirically evaluated on several projects to have relatively 18-68% higher accuracy than the state-of-the-art approach. @InProceedings{ESEC/FSE13p532, author = {Tung Thanh Nguyen and Anh Tuan Nguyen and Hoan Anh Nguyen and Tien N. Nguyen}, title = {A Statistical Semantic Language Model for Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--542}, doi = {}, year = {2013}, } |
|
Nori, Aditya V. |
ESEC/FSE '13: "Bayesian Inference using Data ..."
Bayesian Inference using Data Flow Analysis
Guillaume Claret, Sriram K. Rajamani , Aditya V. Nori, Andrew D. Gordon, and Johannes Borgström (INRIA, France; Microsoft Research, India; Microsoft Research, UK; Uppsala University, Sweden) We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform inference directly on probabilistic programs with loops. Even for loop-free programs, we show that data flow analysis offers better precision and better performance benefits over existing techniques. We also describe heuristics that are crucial for our inference to scale, and present an empirical evaluation of our algorithm over a range of benchmarks. @InProceedings{ESEC/FSE13p92, author = {Guillaume Claret and Sriram K. Rajamani and Aditya V. Nori and Andrew D. Gordon and Johannes Borgström}, title = {Bayesian Inference using Data Flow Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {92--102}, doi = {}, year = {2013}, } ESEC/FSE '13: "Termination Proofs from Tests ..." Termination Proofs from Tests Aditya V. Nori and Rahul Sharma (Microsoft Research, India; Stanford University, USA) We show how a test suite for a sequential program can be profitably used to construct a termination proof. In particular, we describe an algorithm TpT for proving termination of a program based on information derived from testing it. TpT iteratively calls two phases: (a) an infer phase, and (b) a validate phase. In the infer phase, machine learning, in particular, linear regression is used to efficiently compute a candidate loop bound for every loop in the program. These loop bounds are verified for correctness by an off-the-shelf checker. If a loop bound is invalid, then the safety checker provides a test or a counterexample that is used to generate more data which is subsequently used by the next infer phase to compute better estimates for loop bounds. On the other hand, if all loop bounds are valid, then we have a proof of termination. We also describe a simple extension to our approach that allows us to infer polynomial loop bounds automatically. We have evaluated TpT on two benchmark sets, micro-benchmarks obtained from recent literature on program termination, and Windows device drivers. Our results are promising -- on the micro-benchmarks, we show that TpT is able to prove termination on 15% more benchmarks than any previously known technique, and our evaluation on Windows device drivers demonstrates TpT's ability to analyze and scale to real world applications. @InProceedings{ESEC/FSE13p246, author = {Aditya V. Nori and Rahul Sharma}, title = {Termination Proofs from Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {246--256}, doi = {}, year = {2013}, } |
|
Notkin, David |
ESEC/FSE '13: "Making Offline Analyses Continuous ..."
Making Offline Analyses Continuous
Kıvanç Muşlu, Yuriy Brun , Michael D. Ernst , and David Notkin (University of Washington, USA; University of Massachusetts at Amherst, USA) Developers use analysis tools to help write, debug, and understand software systems under development. A developer's change to the system source code may affect analysis results. Typically, to learn those effects, the developer must explicitly initiate the analysis. This may interrupt the developer's workflow and/or the delay until the developer learns the implications of the change. The situation is even worse for impure analyses — ones that modify the code on which it runs — because such analyses block the developer from working on the code. This paper presents Codebase Replication, a novel approach to easily convert an offline analysis — even an impure one — into a continuous analysis that informs the developer of the implications of recent changes as quickly as possible after the change is made. Codebase Replication copies the developer's codebase, incrementally keeps this copy codebase in sync with the developer's codebase, makes that copy codebase available for offline analyses to run without disturbing the developer and without the developer's changes disturbing the analyses, and makes analysis results available to be presented to the developer. We have implemented Codebase Replication in Solstice, an open-source, publicly-available Eclipse plug-in. We have used Solstice to convert three offline analyses — FindBugs, PMD, and unit testing — into continuous ones. Each conversion required on average 436 NCSL and took, on average, 18 hours. Solstice-based analyses experience no more than 2.5 milliseconds of runtime overhead per developer action. @InProceedings{ESEC/FSE13p323, author = {Kıvanç Muşlu and Yuriy Brun and Michael D. Ernst and David Notkin}, title = {Making Offline Analyses Continuous}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {323--333}, doi = {}, year = {2013}, } |
|
Novikov, Evgeny |
ESEC/FSE '13: "Precision Reuse for Efficient ..."
Precision Reuse for Efficient Regression Verification
Dirk Beyer , Stefan Löwe, Evgeny Novikov, Andreas Stahlbauer, and Philipp Wendler (University of Passau, Germany; ISP RAS, Russia) Continuous testing during development is a well-established technique for software-quality assurance. Continuous model checking from revision to revision is not yet established as a standard practice, because the enormous resource consumption makes its application impractical. Model checkers compute a large number of verification facts that are necessary for verifying if a given specification holds. We have identified a category of such intermediate results that are easy to store and efficient to reuse: abstraction precisions. The precision of an abstract domain specifies the level of abstraction that the analysis works on. Precisions are thus a precious result of the verification effort and it is a waste of resources to throw them away after each verification run. In particular, precisions are reasonably small and thus easy to store; they are easy to process and have a large impact on resource consumption. We experimentally show the impact of precision reuse on industrial verification problems created from 62 Linux kernel device drivers with 1119 revisions. @InProceedings{ESEC/FSE13p389, author = {Dirk Beyer and Stefan Löwe and Evgeny Novikov and Andreas Stahlbauer and Philipp Wendler}, title = {Precision Reuse for Efficient Regression Verification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {389--399}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Oliveira, Bruno C. d. S. |
ESEC/FSE '13: "Regression Tests to Expose ..."
Regression Tests to Expose Change Interaction Errors
Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury (National University of Singapore, Singapore) Changes often introduce program errors, and hence recent software testing literature has focused on generating tests which stress changes. In this paper, we argue that changes cannot be treated as isolated program artifacts which are stressed via testing. Instead, it is the complex dependency across multiple changes which introduce subtle errors. Furthermore, the complex dependence structures, that need to be exercised to expose such errors, ensure that they remain undiscovered even in well tested and deployed software. We motivate our work based on empirical evidence from a well tested and stable project - Linux GNU Coreutils - where we found that one third of the regressions take more than two (2) years to be fixed, and that two thirds of such long-standing regressions are introduced due to change interactions for the utilities we investigated. To combat change interaction errors, we first define a notion of change interaction where several program changes are found to affect the result of a program statement via program dependencies. Based on this notion, we propose a change sequence graph (CSG) to summarize the control-flow and dependencies across changes. The CSG is then used as a guide during program path exploration via symbolic execution - thereby efficiently producing test cases which witness change interaction errors. Our experimental infra-structure was deployed on various utilities of GNU Coreutils, which have been distributed with Linux for almost twenty years. Apart from finding five (5) previously unknown errors in the utilities, we found that only one in five generated test cases exercises a sequence that is critical to exposing a change-interaction error, while being an order of magnitude more likely to expose an error. On the other hand, stressing changes in isolation only exposed half of the change interaction errors. These results demonstrate the importance and difficulty of change dependence-aware regression testing. @InProceedings{ESEC/FSE13p334, author = {Marcel Böhme and Bruno C. d. S. Oliveira and Abhik Roychoudhury}, title = {Regression Tests to Expose Change Interaction Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {334--344}, doi = {}, year = {2013}, } |
|
Oliveto, Rocco |
ESEC/FSE '13: "API Change and Fault Proneness: ..."
API Change and Fault Proneness: A Threat to the Success of Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto , and Denys Poshyvanyk (College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy) During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criteria: users easily get frustrated by repeated failures, crashes, and other bugs; hence, abandoning some apps in favor of others. This paper reports a study analyzing how the fault- and change-proneness of APIs used by 7,097 (free) Android apps relates to applications' lack of success, estimated from user ratings. Results of this study provide important insights into a crucial issue: making heavy use of fault- and change-prone APIs can negatively impact the success of these apps. @InProceedings{ESEC/FSE13p477, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Bernal-Cárdenas and Massimiliano Di Penta and Rocco Oliveto and Denys Poshyvanyk}, title = {API Change and Fault Proneness: A Threat to the Success of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {477--487}, doi = {}, year = {2013}, } Info |
|
Paul, Gaurab |
ESEC/FSE '13: "Distributed Program Tracing ..."
Distributed Program Tracing
Diptikalyan Saha, Pankaj Dhoolia, and Gaurab Paul (IBM Research, India; IIT Kharagpur, India) Dynamic program analysis techniques depend on accurate program traces. Program instrumentation is commonly used to collect these traces, which causes overhead to the program execution. Various techniques have addressed this problem by minimizing the number of probes/witnesses used to collect traces. In this paper, we present a novel distributed trace collection framework wherein, a program is executed multiple times with the same input for different sets of witnesses. The partial traces such obtained are then merged to create the whole program trace. Such divide-and-conquer strategy enables parallel collection of partial traces, thereby reducing the total time of collection. The problem is particularly challenging as arbitrary distribution of witnesses cannot guarantee correct formation of traces. We provide and prove a necessary and sufficient condition for distributing the witnesses which ensures correct formation of trace. Moreover, we describe witness distribution strategies that are suitable for parallel collection. We use the framework to collect traces of field SAP-ABAP programs using breakpoints as witnesses as instrumentation cannot be performed due to practical constraints. To optimize such collection, we extend Ball-Larus' optimal edge-based profiling algorithm to an optimal node-based algorithm. We demonstrate the effectiveness of the framework for collecting traces of SAP-ABAP programs. @InProceedings{ESEC/FSE13p180, author = {Diptikalyan Saha and Pankaj Dhoolia and Gaurab Paul}, title = {Distributed Program Tracing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {180--190}, doi = {}, year = {2013}, } |
|
Pawar, Kshama |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } |
|
Petke, Justyna |
ESEC/FSE '13: "Efficiency and Early Fault ..."
Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing
Justyna Petke , Shin Yoo, Myra B. Cohen, and Mark Harman (University College London, UK; University of Nebraska-Lincoln, USA) Combinatorial Interaction Testing (CIT) is important because it tests the interactions between the many features and parameters that make up the configuration space of software systems. However, in order to be practically applicable, it must be able to cater for soft and hard real-world constraints and should, ideally, report a test priority order that maximises earliest fault detection. We show that we can achieve the highest strength CIT in 5.65 minutes on average. This was previously thought to be too computationally expensive to be feasible. Furthermore, we show that higher strength suites find more faults, while prioritisations using lower strengths are no worse at achieving early fault revelation. @InProceedings{ESEC/FSE13p26, author = {Justyna Petke and Shin Yoo and Myra B. Cohen and Mark Harman}, title = {Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {26--36}, doi = {}, year = {2013}, } Info |
|
Pezzè, Mauro |
ESEC/FSE '13: "Enhancing Symbolic Execution ..."
Enhancing Symbolic Execution with Built-In Term Rewriting and Constrained Lazy Initialization
Pietro Braione, Giovanni Denaro, and Mauro Pezzè (University of Milano-Bicocca, Italy; University of Lugano, Switzerland) Symbolic execution suffers from problems when analyzing programs that handle complex data structures as their inputs and take decisions over non-linear expressions. For these programs, symbolic execution may incur invalid inputs or unidentified infeasible traces, and may raise large amounts of false alarms. Some symbolic executors tackle these problems by introducing executable preconditions to exclude invalid inputs, and some solvers exploit rewrite rules to address non linear problems. In this paper, we discuss the core limitations of executable preconditions, and address these limitations by proposing invariants specifically designed to harmonize with the lazy initialization algorithm. We exploit rewrite rules applied within the symbolic executor, to address simplifications of inverse relationships fostered from either program-specific calculations or the logic of the verification tasks. We present a symbolic executor that integrates the two techniques, and validate our approach against the verification of a relevant set of properties of the Tactical Separation Assisted Flight Environment. The empirical data show that the integrated approach can improve the effectiveness of symbolic execution. @InProceedings{ESEC/FSE13p411, author = {Pietro Braione and Giovanni Denaro and Mauro Pezzè}, title = {Enhancing Symbolic Execution with Built-In Term Rewriting and Constrained Lazy Initialization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {411--421}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Popescu, Daniel |
ESEC/FSE '13: "Identifying Message Flow in ..."
Identifying Message Flow in Distributed Event-Based Systems
Joshua Garcia, Daniel Popescu, Gholamreza Safi, William G. J. Halfond , and Nenad Medvidovic (University of Southern California, USA) Distributed event-based (DEB) systems contain highly-decoupled components that interact by exchanging messages. This enables flexible system composition and adaptation, but also makes DEB systems difficult to maintain. Most existing program analysis techniques to support maintenance are not well suited to DEB systems, while those that are tend to suffer from inaccuracy or make assumptions that limit their applicability. This paper presents Eos, a static analysis technique that identifies message information useful for maintaining a DEB system, namely, message types and message flow within a system. Eos has been evaluated on six off-the-shelf DEB systems spanning five different middleware platforms, and has exhibited excellent accuracy and efficiency. Furthermore, a case study involving a range of maintenance activities undertaken on three existing DEB systems shows that, on average, Eos enables an engineer to identify the scope and impact of required changes more accurately than existing alternatives. @InProceedings{ESEC/FSE13p367, author = {Joshua Garcia and Daniel Popescu and Gholamreza Safi and William G. J. Halfond and Nenad Medvidovic}, title = {Identifying Message Flow in Distributed Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {367--377}, doi = {}, year = {2013}, } Info |
|
Poshyvanyk, Denys |
ESEC/FSE '13: "API Change and Fault Proneness: ..."
API Change and Fault Proneness: A Threat to the Success of Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Carlos Bernal-Cárdenas, Massimiliano Di Penta, Rocco Oliveto , and Denys Poshyvanyk (College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy) During the recent years, the market of mobile software applications (apps) has maintained an impressive upward trajectory. Many small and large software development companies invest considerable resources to target available opportunities. As of today, the markets for such devices feature over 850K+ apps for Android and 900K+ for iOS. Availability, cost, functionality, and usability are just some factors that determine the success or lack of success for a given app. Among the other factors, reliability is an important criteria: users easily get frustrated by repeated failures, crashes, and other bugs; hence, abandoning some apps in favor of others. This paper reports a study analyzing how the fault- and change-proneness of APIs used by 7,097 (free) Android apps relates to applications' lack of success, estimated from user ratings. Results of this study provide important insights into a crucial issue: making heavy use of fault- and change-prone APIs can negatively impact the success of these apps. @InProceedings{ESEC/FSE13p477, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Bernal-Cárdenas and Massimiliano Di Penta and Rocco Oliveto and Denys Poshyvanyk}, title = {API Change and Fault Proneness: A Threat to the Success of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {477--487}, doi = {}, year = {2013}, } Info |
|
Posnett, Daryl |
ESEC/FSE '13: "Sample Size vs. Bias in Defect ..."
Sample Size vs. Bias in Defect Prediction
Foyzur Rahman, Daryl Posnett, Israel Herraiz, and Premkumar Devanbu (UC Davis, USA; Universidad Politécnica de Madrid, Spain) Most empirical disciplines promote the reuse and sharing of datasets, as it leads to greater possibility of replication. While this is increasingly the case in Empirical Software Engineering, some of the most popular bug-fix datasets are now known to be biased. This raises two significant concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of the studies based on biased datasets may be suspect. This issue has raised considerable consternation in the ESE literature in recent years. However, there is a confounding factor of these datasets that has not been examined carefully: size. Biased datasets are sampling only some of the data that could be sampled, and doing so in a biased fashion; but biased samples could be smaller, or larger. Smaller data sets in general provide less reliable bases for estimating models, and thus could lead to inferior model performance. In this setting, we ask the question, what affects performance more, bias, or size? We conduct a detailed, large-scale meta-analysis, using simulated datasets sampled with bias from a high-quality dataset which is relatively free of bias. Our results suggest that size always matters just as much bias direction, and in fact much more than bias direction when considering information-retrieval measures such as AUCROC and F-score. This indicates that at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue, and raises further issues to be explored in the future. @InProceedings{ESEC/FSE13p147, author = {Foyzur Rahman and Daryl Posnett and Israel Herraiz and Premkumar Devanbu}, title = {Sample Size vs. Bias in Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {147--157}, doi = {}, year = {2013}, } |
|
Qiu, Dong |
ESEC/FSE '13: "An Empirical Analysis of the ..."
An Empirical Analysis of the Co-evolution of Schema and Code in Database Applications
Dong Qiu, Bixin Li, and Zhendong Su (Southeast University, China; UC Davis, USA) Modern database applications are among the most widely used and complex software systems. They constantly evolve, responding to changes to data, database schemas, and code. It is challenging to manage these changes and ensure that everything co-evolves consistently. For example, when a database schema is modified, all the code that interacts with the database must be changed accordingly. Although database evolution and software evolution have been extensively studied in isolation, the co-evolution of schema and code has largely been unexplored. This paper presents the first comprehensive empirical analysis of the co-evolution of database schemas and code in ten popular large open-source database applications, totaling over 160K revisions. Our major findings include: 1) Database schemas evolve frequently during the application lifecycle, exhibiting a variety of change types with similar distributions across the studied applications; 2) Overall, schema changes induce significant code-level modifications, while certain change types have more impact on code than others; and 3) Co-change analyses can be viable to automate or assist with database application evolution. We have also observed that: 1) 80% of the schema changes happened in 20-30% of the tables, while nearly 40% of the tables did not change; and 2) Referential integrity constraints and stored procedures are rarely used in our studied subjects. We believe that our study reveals new insights into how database applications evolve and useful guidelines for designing assistive tools to aid their evolution. @InProceedings{ESEC/FSE13p125, author = {Dong Qiu and Bixin Li and Zhendong Su}, title = {An Empirical Analysis of the Co-evolution of Schema and Code in Database Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {125--135}, doi = {}, year = {2013}, } Info |
|
Rahman, Foyzur |
ESEC/FSE '13: "Sample Size vs. Bias in Defect ..."
Sample Size vs. Bias in Defect Prediction
Foyzur Rahman, Daryl Posnett, Israel Herraiz, and Premkumar Devanbu (UC Davis, USA; Universidad Politécnica de Madrid, Spain) Most empirical disciplines promote the reuse and sharing of datasets, as it leads to greater possibility of replication. While this is increasingly the case in Empirical Software Engineering, some of the most popular bug-fix datasets are now known to be biased. This raises two significant concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of the studies based on biased datasets may be suspect. This issue has raised considerable consternation in the ESE literature in recent years. However, there is a confounding factor of these datasets that has not been examined carefully: size. Biased datasets are sampling only some of the data that could be sampled, and doing so in a biased fashion; but biased samples could be smaller, or larger. Smaller data sets in general provide less reliable bases for estimating models, and thus could lead to inferior model performance. In this setting, we ask the question, what affects performance more, bias, or size? We conduct a detailed, large-scale meta-analysis, using simulated datasets sampled with bias from a high-quality dataset which is relatively free of bias. Our results suggest that size always matters just as much bias direction, and in fact much more than bias direction when considering information-retrieval measures such as AUCROC and F-score. This indicates that at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue, and raises further issues to be explored in the future. @InProceedings{ESEC/FSE13p147, author = {Foyzur Rahman and Daryl Posnett and Israel Herraiz and Premkumar Devanbu}, title = {Sample Size vs. Bias in Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {147--157}, doi = {}, year = {2013}, } |
|
Rajamani, Sriram K. |
ESEC/FSE '13: "Bayesian Inference using Data ..."
Bayesian Inference using Data Flow Analysis
Guillaume Claret, Sriram K. Rajamani , Aditya V. Nori, Andrew D. Gordon, and Johannes Borgström (INRIA, France; Microsoft Research, India; Microsoft Research, UK; Uppsala University, Sweden) We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform inference directly on probabilistic programs with loops. Even for loop-free programs, we show that data flow analysis offers better precision and better performance benefits over existing techniques. We also describe heuristics that are crucial for our inference to scale, and present an empirical evaluation of our algorithm over a range of benchmarks. @InProceedings{ESEC/FSE13p92, author = {Guillaume Claret and Sriram K. Rajamani and Aditya V. Nori and Andrew D. Gordon and Johannes Borgström}, title = {Bayesian Inference using Data Flow Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {92--102}, doi = {}, year = {2013}, } |
|
Ramanathan, Murali Krishna |
ESEC/FSE '13: "Scalable and Incremental Software ..."
Scalable and Incremental Software Bug Detection
Scott McPeak, Charles-Henri Gros, and Murali Krishna Ramanathan (Coverity, USA; Indian Institute of Science, India) An important, but often neglected, goal of static analysis for detecting bugs is the ability to show defects to the programmer quickly. Unfortunately, existing static analysis tools scale very poorly, or are shallow and cannot find complex interprocedural defects. Previous attempts at reducing the analysis time by adding more resources (CPU, memory) or by splitting the analysis into multiple sub-analyses based on defect detection capabilities resulted in limited/negligible improvements. We present a technique for parallel and incremental static analysis using top-down, bottom-up and global specification inference based around the concept of a work unit, a self-contained atom of analysis input, that deterministically maps to its output. A work unit contains both abstract and concrete syntax to analyze, a supporting fragment of the class hierarchy, summarized interprocedural behavior, and defect reporting information, factored to ensure a high level of reuse when analyzing successive versions incrementally. Work units are created and consumed by an analysis master process that coordinates the multiple analysis passes, the flow of information among them, and incrementalizes the computation. Meanwhile, multiple analysis worker processes use abstract interpretation to compute work unit results. Process management and interprocess communication is done by a general-purpose computation distributor layer. We have implemented our approach and our experimental results show that using eight processor cores, we can perform complete analysis of code bases with millions of lines of code in a few hours, and even faster after incremental changes to that code. The analysis is thorough and accurate: it usually reports about one crash-causing defect per thousand lines of code, with a false positive rate of 10--20% @InProceedings{ESEC/FSE13p554, author = {Scott McPeak and Charles-Henri Gros and Murali Krishna Ramanathan}, title = {Scalable and Incremental Software Bug Detection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {554--564}, doi = {}, year = {2013}, } |
|
Razavi, Niloofar |
ESEC/FSE '13: "Con2colic Testing ..."
Con2colic Testing
Azadeh Farzan , Andreas Holzer, Niloofar Razavi, and Helmut Veith (University of Toronto, Canada; Vienna University of Technology, Austria) In this paper, we describe (con)2colic testing - a systematic testing approach for concurrent software. Based on concrete and symbolic executions of a concurrent program, (con)2colic testing derives inputs and schedules such that the execution space of the program under investigation is systematically explored. We introduce interference scenarios as key concept in (con)2colic testing. Interference scenarios capture the flow of data among different threads and enable a unified representation of path and interference constraints. We have implemented a (con)2colic testing engine and demonstrate the effectiveness of our approach by experiments. @InProceedings{ESEC/FSE13p37, author = {Azadeh Farzan and Andreas Holzer and Niloofar Razavi and Helmut Veith}, title = {Con2colic Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {37--47}, doi = {}, year = {2013}, } |
|
Reichenbach, Christoph |
ESEC/FSE '13: "Second-Order Constraints in ..."
Second-Order Constraints in Dynamic Invariant Inference
Kaituo Li, Christoph Reichenbach, Yannis Smaragdakis , and Michal Young (University of Massachusetts at Amherst, USA; Goethe University Frankfurt, Germany; University of Athens, Greece; University of Oregon, USA) The current generation of dynamic invariant detectors often produce invariants that are inconsistent with program semantics or programmer knowledge. We improve the consistency of dynamically discovered invariants by taking into account higher-level constraints. These constraints encode knowledge about invariants, even when the invariants themselves are unknown. For instance, even though the invariants describing the behavior of two functions f1 and f2 may be unknown, we may know that any valid input for f1 is also valid for f2, i.e., the precondition of f1 implies that of f2. We explore techniques for expressing and employing such consistency constraints to improve the quality of produced invariants. We further introduce techniques for dynamically discovering potential second-order constraints that the programmer can subsequently approve or reject. Our implementation builds on the Daikon tool, with a vocabulary of constraints that the programmer can use to enhance and constrain Daikon’s inference. We show that dynamic inference of second-order constraints together with minimal human effort can significantly influence the produced (first-order) invariants even in systems of substantial size, such as the Apache Commons Collections and the AspectJ compiler. We also find that 99% of the dynamically inferred second-order constraints we sampled are true. @InProceedings{ESEC/FSE13p103, author = {Kaituo Li and Christoph Reichenbach and Yannis Smaragdakis and Michal Young}, title = {Second-Order Constraints in Dynamic Invariant Inference}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {103--113}, doi = {}, year = {2013}, } |
|
Rigby, Peter C. |
ESEC/FSE '13: "Convergent Contemporary Software ..."
Convergent Contemporary Software Peer Review Practices
Peter C. Rigby and Christian Bird (Concordia University, Canada; Microsoft Research, USA) Software peer review is practiced on a diverse set of software projects that have drastically different settings, cultures, incentive systems, and time pressures. In an effort to characterize and understand these differences we examine two Google-led projects, Android and Chromium OS, three Microsoft projects, Bing, Office, and MS SQL, and projects internal to AMD. We contrast our findings with data taken from traditional software inspection conducted on a Lucent project and from open source software peer review on six projects, including Apache, Linux, and KDE. Our measures of interest include the review interval, the number of developers involved in review, and proxy measures for the number of defects found during review. We find that despite differences among projects, many of the characteristics of the review process have independently converged to similar values which we think indicate general principles of code review practice. We also introduce a measure of the degree to which knowledge is shared during review. This is an aspect of review practice that has traditionally only had experiential support. Our knowledge sharing measure shows that conducting peer review increases the number of distinct files a developer knows about by 66% to 150% depending on the project. This paper is one of the first studies of contemporary review in software firms and the most diverse study of peer review to date. @InProceedings{ESEC/FSE13p202, author = {Peter C. Rigby and Christian Bird}, title = {Convergent Contemporary Software Peer Review Practices}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--212}, doi = {}, year = {2013}, } |
|
Rinetzky, Noam |
ESEC/FSE '13: "Tightfit: Adaptive Parallelization ..."
Tightfit: Adaptive Parallelization with Foresight
Omer Tripp and Noam Rinetzky (Tel Aviv University, Israel) Irregular applications often exhibit data-dependent parallelism: Different inputs, and sometimes also different execution phases, enable different levels of parallelism. These changes in available parallelism have motivated work on adaptive concurrency control mechanisms. Existing adaptation techniques mostly learn about available parallelism indirectly, through runtime monitors that detect pathologies (e.g. excessive retries in speculation or high lock contention in mutual exclusion). We present a novel approach to adaptive parallelization, whereby the effective level of parallelism is predicted directly based on input features, rather than through circumstantial indicators over the execution environment (such as retry rate). This enables adaptation with foresight, based on the input data and not the run prefix. For this, the user specifies input features, which our system then correlates with the amount of available parallelism through offline learning. The resulting prediction rule serves in deployment runs to foresee the available parallelism for a given workload and tune the parallelization system accordingly. We have implemented our approach in TIGHTFIT, a general framework for input-centric offline adaptation. Our experimental evaluation of TIGHTFIT over two adaptive runtime systems and eight benchmarks provides positive evidence regarding TIGHTFIT's efficacy and accuracy. @InProceedings{ESEC/FSE13p169, author = {Omer Tripp and Noam Rinetzky}, title = {Tightfit: Adaptive Parallelization with Foresight}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {169--179}, doi = {}, year = {2013}, } |
|
Ringert, Jan Oliver |
ESEC/FSE '13: "Synthesis of Component and ..."
Synthesis of Component and Connector Models from Crosscutting Structural Views
Shahar Maoz , Jan Oliver Ringert, and Bernhard Rumpe (Tel Aviv University, Israel; RWTH Aachen University, Germany) We present component and connector (C&C) views, which specify structural properties of component and connector models in an expressive and intuitive way. C&C views provide means to abstract away direct hierarchy, direct connectivity, port names and types, and thus can crosscut the traditional boundaries of the implementation-oriented hierarchical decomposition of systems and sub-systems, and reflect the partial knowledge available to different stakeholders involved in a system's design. As a primary application for C&C views we investigate the synthesis problem: given a C&C views specification, consisting of mandatory, alternative, and negative views, construct a concrete satisfying C&C model, if one exists. We show that the problem is NP-hard and solve it, in a bounded scope, using a reduction to SAT, via Alloy. We further extend the basic problem with support for library components, specification patterns, and architectural styles. The result of synthesis can be used for further exploration, simulation, and refinement of the C&C model or, as the complete, final model itself, for direct code generation. A prototype tool and an evaluation over four example systems with multiple specifications show promising results and suggest interesting future research directions towards a comprehensive development environment for the structure of component and connector designs. @InProceedings{ESEC/FSE13p444, author = {Shahar Maoz and Jan Oliver Ringert and Bernhard Rumpe}, title = {Synthesis of Component and Connector Models from Crosscutting Structural Views}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {444--454}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Rosenblum, David S. |
ESEC/FSE '13: "Cascading Verification: An ..."
Cascading Verification: An Integrated Method for Domain-Specific Model Checking
Fokion Zervoudakis, David S. Rosenblum, Sebastian Elbaum, and Anthony Finkelstein (University College London, UK; National University of Singapore, Singapore; University of Nebraska-Lincoln, USA) Model checking is an established method for verifying behavioral properties of system models. But model checkers tend to support low-level modeling languages that require intricate models to represent even the simplest systems. Modeling complexity arises in part from the need to encode domain knowledge at relatively low levels of abstraction. In this paper, we demonstrate that formalized domain knowledge can be reused to raise the abstraction level of model and property specifications, and hence the effectiveness of model checking. We describe a novel method for domain-specific model checking called cascading verification that uses composite reasoning over high-level system specifications and formalized domain knowledge to synthesize both low-level system models and their behavioral properties for verification. In particular, model builders use a high-level domain-specific language (DSL) based on YAML to express system specifications that can be verified with probabilistic model checking. Domain knowledge is encoded in the Web Ontology Language (OWL), the Semantic Web Rule Language (SWRL) and Prolog, which are used in combination to overcome their individual limitations. A compiler then synthesizes models and properties for verification by the probabilistic model checker PRISM. We illustrate cascading verification for the domain of uninhabited aerial vehicles (UAVs), for which we have constructed a prototype implementation. An evaluation of this prototype reveals nontrivial reductions in the size and complexity of input specifications compared to the artifacts synthesized for PRISM. @InProceedings{ESEC/FSE13p400, author = {Fokion Zervoudakis and David S. Rosenblum and Sebastian Elbaum and Anthony Finkelstein}, title = {Cascading Verification: An Integrated Method for Domain-Specific Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {400--410}, doi = {}, year = {2013}, } |
|
Roth, Andreas |
ESEC/FSE '13: "Mining Behavior Models from ..."
Mining Behavior Models from Enterprise Web Applications
Matthias Schur, Andreas Roth, and Andreas Zeller (SAP, Germany; Saarland University, Germany) Today's enterprise web applications demand very high release cycles---and consequently, frequent tests. Automating these tests typically requires a behavior model: A description of the states the application can be in, the transitions between these states, and the expected results. Furthermore one needs scripts to make the abstract actions (transitions) in the model executable. As specifying such behavior models and writing the necessary scripts manually is a hard task, a possible alternative could be to extract them from existing applications. However, mining such models can be a challenge, in particular because one needs to know when two states are equivalent, as well as how to reach that state. We present ProCrawl (PROcess CRAWLer), a generic approach to mine behavior models from (multi-user) enterprise web applications. ProCrawl observes the behavior of the application through its user interface, generates and executes tests to explore unobserved behavior. In our evaluation of three non-trivial web applications (an open-source shop system, an SAP product compliance application, and an open-source conference manager), ProCrawl produces models that precisely abstract application behavior and which can be directly used for effective model-based regression testing. @InProceedings{ESEC/FSE13p422, author = {Matthias Schur and Andreas Roth and Andreas Zeller}, title = {Mining Behavior Models from Enterprise Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {422--432}, doi = {}, year = {2013}, } |
|
Roychoudhury, Abhik |
ESEC/FSE '13: "Regression Tests to Expose ..."
Regression Tests to Expose Change Interaction Errors
Marcel Böhme, Bruno C. d. S. Oliveira, and Abhik Roychoudhury (National University of Singapore, Singapore) Changes often introduce program errors, and hence recent software testing literature has focused on generating tests which stress changes. In this paper, we argue that changes cannot be treated as isolated program artifacts which are stressed via testing. Instead, it is the complex dependency across multiple changes which introduce subtle errors. Furthermore, the complex dependence structures, that need to be exercised to expose such errors, ensure that they remain undiscovered even in well tested and deployed software. We motivate our work based on empirical evidence from a well tested and stable project - Linux GNU Coreutils - where we found that one third of the regressions take more than two (2) years to be fixed, and that two thirds of such long-standing regressions are introduced due to change interactions for the utilities we investigated. To combat change interaction errors, we first define a notion of change interaction where several program changes are found to affect the result of a program statement via program dependencies. Based on this notion, we propose a change sequence graph (CSG) to summarize the control-flow and dependencies across changes. The CSG is then used as a guide during program path exploration via symbolic execution - thereby efficiently producing test cases which witness change interaction errors. Our experimental infra-structure was deployed on various utilities of GNU Coreutils, which have been distributed with Linux for almost twenty years. Apart from finding five (5) previously unknown errors in the utilities, we found that only one in five generated test cases exercises a sequence that is critical to exposing a change-interaction error, while being an order of magnitude more likely to expose an error. On the other hand, stressing changes in isolation only exposed half of the change interaction errors. These results demonstrate the importance and difficulty of change dependence-aware regression testing. @InProceedings{ESEC/FSE13p334, author = {Marcel Böhme and Bruno C. d. S. Oliveira and Abhik Roychoudhury}, title = {Regression Tests to Expose Change Interaction Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {334--344}, doi = {}, year = {2013}, } |
|
Rubin, Julia |
ESEC/FSE '13: "N-Way Model Merging ..."
N-Way Model Merging
Julia Rubin and Marsha Chechik (IBM Research, Israel; University of Toronto, Canada) Model merging is widely recognized as an essential step in a variety of software development activities. During the process of combining a set of related products into a product line or consolidating model views of multiple stakeholders, we need to merge multiple input models into one; yet, most of the existing approaches are applicable to merging only two models. In this paper, we define the n-way merge problem. We show that it can be reduced to the known and widely studied NP-hard problem of weighted set packing. Yet, the approximation solutions for that problem do not scale for real-sized software models. We thus evaluate alternative approaches of merging models that incrementally process input models in small subsets and propose our own algorithm that considerably improves precision over such approaches without sacrificing performance. @InProceedings{ESEC/FSE13p301, author = {Julia Rubin and Marsha Chechik}, title = {N-Way Model Merging}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {301--311}, doi = {}, year = {2013}, } |
|
Rumpe, Bernhard |
ESEC/FSE '13: "Synthesis of Component and ..."
Synthesis of Component and Connector Models from Crosscutting Structural Views
Shahar Maoz , Jan Oliver Ringert, and Bernhard Rumpe (Tel Aviv University, Israel; RWTH Aachen University, Germany) We present component and connector (C&C) views, which specify structural properties of component and connector models in an expressive and intuitive way. C&C views provide means to abstract away direct hierarchy, direct connectivity, port names and types, and thus can crosscut the traditional boundaries of the implementation-oriented hierarchical decomposition of systems and sub-systems, and reflect the partial knowledge available to different stakeholders involved in a system's design. As a primary application for C&C views we investigate the synthesis problem: given a C&C views specification, consisting of mandatory, alternative, and negative views, construct a concrete satisfying C&C model, if one exists. We show that the problem is NP-hard and solve it, in a bounded scope, using a reduction to SAT, via Alloy. We further extend the basic problem with support for library components, specification patterns, and architectural styles. The result of synthesis can be used for further exploration, simulation, and refinement of the C&C model or, as the complete, final model itself, for direct code generation. A prototype tool and an evaluation over four example systems with multiple specifications show promising results and suggest interesting future research directions towards a comprehensive development environment for the structure of component and connector designs. @InProceedings{ESEC/FSE13p444, author = {Shahar Maoz and Jan Oliver Ringert and Bernhard Rumpe}, title = {Synthesis of Component and Connector Models from Crosscutting Structural Views}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {444--454}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Safi, Gholamreza |
ESEC/FSE '13: "Identifying Message Flow in ..."
Identifying Message Flow in Distributed Event-Based Systems
Joshua Garcia, Daniel Popescu, Gholamreza Safi, William G. J. Halfond , and Nenad Medvidovic (University of Southern California, USA) Distributed event-based (DEB) systems contain highly-decoupled components that interact by exchanging messages. This enables flexible system composition and adaptation, but also makes DEB systems difficult to maintain. Most existing program analysis techniques to support maintenance are not well suited to DEB systems, while those that are tend to suffer from inaccuracy or make assumptions that limit their applicability. This paper presents Eos, a static analysis technique that identifies message information useful for maintaining a DEB system, namely, message types and message flow within a system. Eos has been evaluated on six off-the-shelf DEB systems spanning five different middleware platforms, and has exhibited excellent accuracy and efficiency. Furthermore, a case study involving a range of maintenance activities undertaken on three existing DEB systems shows that, on average, Eos enables an engineer to identify the scope and impact of required changes more accurately than existing alternatives. @InProceedings{ESEC/FSE13p367, author = {Joshua Garcia and Daniel Popescu and Gholamreza Safi and William G. J. Halfond and Nenad Medvidovic}, title = {Identifying Message Flow in Distributed Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {367--377}, doi = {}, year = {2013}, } Info |
|
Saha, Diptikalyan |
ESEC/FSE '13: "Distributed Program Tracing ..."
Distributed Program Tracing
Diptikalyan Saha, Pankaj Dhoolia, and Gaurab Paul (IBM Research, India; IIT Kharagpur, India) Dynamic program analysis techniques depend on accurate program traces. Program instrumentation is commonly used to collect these traces, which causes overhead to the program execution. Various techniques have addressed this problem by minimizing the number of probes/witnesses used to collect traces. In this paper, we present a novel distributed trace collection framework wherein, a program is executed multiple times with the same input for different sets of witnesses. The partial traces such obtained are then merged to create the whole program trace. Such divide-and-conquer strategy enables parallel collection of partial traces, thereby reducing the total time of collection. The problem is particularly challenging as arbitrary distribution of witnesses cannot guarantee correct formation of traces. We provide and prove a necessary and sufficient condition for distributing the witnesses which ensures correct formation of trace. Moreover, we describe witness distribution strategies that are suitable for parallel collection. We use the framework to collect traces of field SAP-ABAP programs using breakpoints as witnesses as instrumentation cannot be performed due to practical constraints. To optimize such collection, we extend Ball-Larus' optimal edge-based profiling algorithm to an optimal node-based algorithm. We demonstrate the effectiveness of the framework for collecting traces of SAP-ABAP programs. @InProceedings{ESEC/FSE13p180, author = {Diptikalyan Saha and Pankaj Dhoolia and Gaurab Paul}, title = {Distributed Program Tracing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {180--190}, doi = {}, year = {2013}, } |
|
Schäf, Martin |
ESEC/FSE '13: "Explaining Inconsistent Code ..."
Explaining Inconsistent Code
Martin Schäf, Daniel Schwartz-Narbonne, and Thomas Wies (United Nations University, China; New York University, USA) A code fragment is inconsistent if it is not part of any normally terminating execution. Examples of such inconsistencies include code that is unreachable, code that always fails due to a run-time error, and code that makes conflicting assumptions about the program state. In this paper, we consider the problem of automatically explaining inconsistent code. This problem is difficult because traditional fault localization techniques do not apply. Our solution relies on a novel algorithm that takes an infeasible code fragment as input and generates a so-called error invariant automaton. The error invariant automaton is an abstraction of the input code fragment that only mentions program statements and facts that are relevant for understanding the cause of the inconsistency. We conducted a preliminary usability study which demonstrated that error invariant automata can help programmers better understand inconsistencies in code taken from real-world programs. In particular, access to an error invariant automata tripled the speed at which programmers could diagnose the cause of a code inconsistency. @InProceedings{ESEC/FSE13p521, author = {Martin Schäf and Daniel Schwartz-Narbonne and Thomas Wies}, title = {Explaining Inconsistent Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {521--531}, doi = {}, year = {2013}, } |
|
Schur, Matthias |
ESEC/FSE '13: "Mining Behavior Models from ..."
Mining Behavior Models from Enterprise Web Applications
Matthias Schur, Andreas Roth, and Andreas Zeller (SAP, Germany; Saarland University, Germany) Today's enterprise web applications demand very high release cycles---and consequently, frequent tests. Automating these tests typically requires a behavior model: A description of the states the application can be in, the transitions between these states, and the expected results. Furthermore one needs scripts to make the abstract actions (transitions) in the model executable. As specifying such behavior models and writing the necessary scripts manually is a hard task, a possible alternative could be to extract them from existing applications. However, mining such models can be a challenge, in particular because one needs to know when two states are equivalent, as well as how to reach that state. We present ProCrawl (PROcess CRAWLer), a generic approach to mine behavior models from (multi-user) enterprise web applications. ProCrawl observes the behavior of the application through its user interface, generates and executes tests to explore unobserved behavior. In our evaluation of three non-trivial web applications (an open-source shop system, an SAP product compliance application, and an open-source conference manager), ProCrawl produces models that precisely abstract application behavior and which can be directly used for effective model-based regression testing. @InProceedings{ESEC/FSE13p422, author = {Matthias Schur and Andreas Roth and Andreas Zeller}, title = {Mining Behavior Models from Enterprise Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {422--432}, doi = {}, year = {2013}, } |
|
Schwartz-Narbonne, Daniel |
ESEC/FSE '13: "Explaining Inconsistent Code ..."
Explaining Inconsistent Code
Martin Schäf, Daniel Schwartz-Narbonne, and Thomas Wies (United Nations University, China; New York University, USA) A code fragment is inconsistent if it is not part of any normally terminating execution. Examples of such inconsistencies include code that is unreachable, code that always fails due to a run-time error, and code that makes conflicting assumptions about the program state. In this paper, we consider the problem of automatically explaining inconsistent code. This problem is difficult because traditional fault localization techniques do not apply. Our solution relies on a novel algorithm that takes an infeasible code fragment as input and generates a so-called error invariant automaton. The error invariant automaton is an abstraction of the input code fragment that only mentions program statements and facts that are relevant for understanding the cause of the inconsistency. We conducted a preliminary usability study which demonstrated that error invariant automata can help programmers better understand inconsistencies in code taken from real-world programs. In particular, access to an error invariant automata tripled the speed at which programmers could diagnose the cause of a code inconsistency. @InProceedings{ESEC/FSE13p521, author = {Martin Schäf and Daniel Schwartz-Narbonne and Thomas Wies}, title = {Explaining Inconsistent Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {521--531}, doi = {}, year = {2013}, } |
|
Sen, Koushik |
ESEC/FSE '13: "Jalangi: A Selective Record-Replay ..."
Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript
Koushik Sen, Swaroop Kalasapur, Tasneem Brutch, and Simon Gibbs (UC Berkeley, USA; Samsung Research, USA) JavaScript is widely used for writing client-side web applications and is getting increasingly popular for writing mobile applications. However, unlike C, C++, and Java, there are not that many tools available for analysis and testing of JavaScript applications. In this paper, we present a simple yet powerful framework, called Jalangi, for writing heavy-weight dynamic analyses. Our framework incorporates two key techniques: 1) selective record-replay, a technique which enables to record and to faithfully replay a user-selected part of the program, and 2) shadow values and shadow execution, which enables easy implementation of heavy-weight dynamic analyses. Our implementation makes no special assumption about JavaScript, which makes it applicable to real-world JavaScript programs running on multiple platforms. We have implemented concolic testing, an analysis to track origins of nulls and undefined, a simple form of taint analysis, an analysis to detect likely type inconsistencies, and an object allocation profiler in Jalangi. Our evaluation of Jalangi on the SunSpider benchmark suite and on five web applications shows that Jalangi has an average slowdown of 26X during recording and 30X slowdown during replay and analysis. The slowdowns are comparable with slowdowns reported for similar tools, such as PIN and Valgrind for x86 binaries. We believe that the techniques proposed in this paper are applicable to other dynamic languages. @InProceedings{ESEC/FSE13p488, author = {Koushik Sen and Swaroop Kalasapur and Tasneem Brutch and Simon Gibbs}, title = {Jalangi: A Selective Record-Replay and Dynamic Analysis Framework for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {488--498}, doi = {}, year = {2013}, } |
|
Sharma, Rahul |
ESEC/FSE '13: "Termination Proofs from Tests ..."
Termination Proofs from Tests
Aditya V. Nori and Rahul Sharma (Microsoft Research, India; Stanford University, USA) We show how a test suite for a sequential program can be profitably used to construct a termination proof. In particular, we describe an algorithm TpT for proving termination of a program based on information derived from testing it. TpT iteratively calls two phases: (a) an infer phase, and (b) a validate phase. In the infer phase, machine learning, in particular, linear regression is used to efficiently compute a candidate loop bound for every loop in the program. These loop bounds are verified for correctness by an off-the-shelf checker. If a loop bound is invalid, then the safety checker provides a test or a counterexample that is used to generate more data which is subsequently used by the next infer phase to compute better estimates for loop bounds. On the other hand, if all loop bounds are valid, then we have a proof of termination. We also describe a simple extension to our approach that allows us to infer polynomial loop bounds automatically. We have evaluated TpT on two benchmark sets, micro-benchmarks obtained from recent literature on program termination, and Windows device drivers. Our results are promising -- on the micro-benchmarks, we show that TpT is able to prove termination on 15% more benchmarks than any previously known technique, and our evaluation on Windows device drivers demonstrates TpT's ability to analyze and scale to real world applications. @InProceedings{ESEC/FSE13p246, author = {Aditya V. Nori and Rahul Sharma}, title = {Termination Proofs from Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {246--256}, doi = {}, year = {2013}, } ESEC/FSE '13: "Differential Assertion Checking ..." Differential Assertion Checking Shuvendu K. Lahiri , Kenneth L. McMillan, Rahul Sharma, and Chris Hawblitzel (Microsoft Research, USA; Stanford University, USA) Previous version of a program can be a powerful enabler for program analysis by defining new relative specifications and making the results of current program analysis more relevant. In this paper, we describe the approach of differential assertion checking (DAC) for comparing different versions of a program with respect to a set of assertions. DAC provides a natural way to write relative specifications over two programs. We introduce a novel modular approach to DAC by reducing it to safety checking of a composed program, which can be accomplished by standard program verifiers. In particular, we leverage automatic invariant generation to synthesize relative specifications for pairs of loops and procedures. We provide a preliminary evaluation of a prototype implementation within the SymDiff tool along two directions (a) soundly verifying bug fixes in the presence of loops and (b) providing a knob for suppressing alarms when checking a new version of a program. @InProceedings{ESEC/FSE13p345, author = {Shuvendu K. Lahiri and Kenneth L. McMillan and Rahul Sharma and Chris Hawblitzel}, title = {Differential Assertion Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--355}, doi = {}, year = {2013}, } |
|
Silic, Marin |
ESEC/FSE '13: "Prediction of Atomic Web Services ..."
Prediction of Atomic Web Services Reliability Based on K-Means Clustering
Marin Silic, Goran Delac, and Sinisa Srbljic (University of Zagreb, Croatia) Contemporary web applications are often designed as composite services built by coordinating atomic services with the aim of providing the appropriate functionality. Although functional properties of each atomic service assure correct functionality of the entire application, nonfunctional properties such as availability, reliability, or security might significantly influence the user-perceived quality of the application. In this paper, we present CLUS, a model for reliability prediction of atomic web services that improves state-of-the-art approaches used in modern recommendation systems. CLUS predicts the reliability for the ongoing service invocation using the data collected from previous invocations. We improve the accuracy of the current state-of-the-art prediction models by considering user-, service- and environment-specific parameters of the invocation context. To address the computational performance related to scalability issues, we aggregate the available previous invocation data using K-means clustering algorithm. We evaluated our model by conducting experiments on services deployed in different regions of the Amazon cloud. The evaluation results suggest that our model improves both performance and accuracy of the prediction when compared to the current state-of-the-art models. @InProceedings{ESEC/FSE13p70, author = {Marin Silic and Goran Delac and Sinisa Srbljic}, title = {Prediction of Atomic Web Services Reliability Based on K-Means Clustering}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {70--80}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Sinha, Nishant |
ESEC/FSE '13: "Compiling Mockups to Flexible ..."
Compiling Mockups to Flexible UIs
Nishant Sinha and Rezwana Karim (IBM Research, India; Rutgers University, USA) As the web becomes ubiquitous, developers are obliged to develop web applications for a variety of desktop and mobile platforms. Re- designing the user interface for every such platform is clearly cumbersome. We propose a new framework based on model-based compilation to assist the designer in solving this problem. Starting from an under-specified visual design mockup drawn by the designer, we show how faithful and flexible web pages can be obtained with virtually no manual effort. Our framework, in sharp contrast to existing web design tools, overcomes the tough challenges involved in mockup compilation by (a) employing combinatorial search to infer hierarchical layouts and (b) mechanizing adhoc principles for CSS design into a modular, extensible rule-based architecture. We believe ours is the first disciplined effort to solve the problem and will inspire rapid, low-effort web design. @InProceedings{ESEC/FSE13p312, author = {Nishant Sinha and Rezwana Karim}, title = {Compiling Mockups to Flexible UIs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {312--322}, doi = {}, year = {2013}, } Video Info Artifact Accepted for Presentation |
|
Smaragdakis, Yannis |
ESEC/FSE '13: "Second-Order Constraints in ..."
Second-Order Constraints in Dynamic Invariant Inference
Kaituo Li, Christoph Reichenbach, Yannis Smaragdakis , and Michal Young (University of Massachusetts at Amherst, USA; Goethe University Frankfurt, Germany; University of Athens, Greece; University of Oregon, USA) The current generation of dynamic invariant detectors often produce invariants that are inconsistent with program semantics or programmer knowledge. We improve the consistency of dynamically discovered invariants by taking into account higher-level constraints. These constraints encode knowledge about invariants, even when the invariants themselves are unknown. For instance, even though the invariants describing the behavior of two functions f1 and f2 may be unknown, we may know that any valid input for f1 is also valid for f2, i.e., the precondition of f1 implies that of f2. We explore techniques for expressing and employing such consistency constraints to improve the quality of produced invariants. We further introduce techniques for dynamically discovering potential second-order constraints that the programmer can subsequently approve or reject. Our implementation builds on the Daikon tool, with a vocabulary of constraints that the programmer can use to enhance and constrain Daikon’s inference. We show that dynamic inference of second-order constraints together with minimal human effort can significantly influence the produced (first-order) invariants even in systems of substantial size, such as the Apache Commons Collections and the AspectJ compiler. We also find that 99% of the dynamically inferred second-order constraints we sampled are true. @InProceedings{ESEC/FSE13p103, author = {Kaituo Li and Christoph Reichenbach and Yannis Smaragdakis and Michal Young}, title = {Second-Order Constraints in Dynamic Invariant Inference}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {103--113}, doi = {}, year = {2013}, } |
|
Souto, Sabrina |
ESEC/FSE '13: "SPLat: Lightweight Dynamic ..."
SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems
Chang Hwan Peter Kim, Darko Marinov, Sarfraz Khurshid, Don Batory, Sabrina Souto, Paulo Barros, and Marcelo d'Amorim (University of Texas at Austin, USA; University of Illinois at Urbana-Champaign, USA; Groupon, USA; Federal University of Pernambuco, Brazil) Many programs can be configured through dynamic and/or static selection of configuration variables. A software product line (SPL), for example, specifies a family of programs where each program is defined by a unique combination of features. Systematically testing SPL programs is expensive as it can require running each test against a combinatorial number of configurations. Fortunately, a test is often independent of many configuration variables and need not be run against every combination. Configurations that are not required for a test can be pruned from execution. This paper presents SPLat, a new way to dynamically prune irrelevant configurations: the configurations to run for a test can be determined during test execution by monitoring accesses to configuration variables. SPLat achieves an optimal reduction in the number of configurations and is lightweight compared to prior work that used static analysis and heavyweight dynamic execution. Experimental results on 10 SPLs written in Java show that SPLat substantially reduces the total test execution time in many cases. Moreover, we demonstrate the scalability of SPLat by applying it to a large industrial code base written in Ruby on Rails. @InProceedings{ESEC/FSE13p257, author = {Chang Hwan Peter Kim and Darko Marinov and Sarfraz Khurshid and Don Batory and Sabrina Souto and Paulo Barros and Marcelo d'Amorim}, title = {SPLat: Lightweight Dynamic Analysis for Reducing Combinatorics in Testing Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2013}, } Info |
|
Srbljic, Sinisa |
ESEC/FSE '13: "Prediction of Atomic Web Services ..."
Prediction of Atomic Web Services Reliability Based on K-Means Clustering
Marin Silic, Goran Delac, and Sinisa Srbljic (University of Zagreb, Croatia) Contemporary web applications are often designed as composite services built by coordinating atomic services with the aim of providing the appropriate functionality. Although functional properties of each atomic service assure correct functionality of the entire application, nonfunctional properties such as availability, reliability, or security might significantly influence the user-perceived quality of the application. In this paper, we present CLUS, a model for reliability prediction of atomic web services that improves state-of-the-art approaches used in modern recommendation systems. CLUS predicts the reliability for the ongoing service invocation using the data collected from previous invocations. We improve the accuracy of the current state-of-the-art prediction models by considering user-, service- and environment-specific parameters of the invocation context. To address the computational performance related to scalability issues, we aggregate the available previous invocation data using K-means clustering algorithm. We evaluated our model by conducting experiments on services deployed in different regions of the Amazon cloud. The evaluation results suggest that our model improves both performance and accuracy of the prediction when compared to the current state-of-the-art models. @InProceedings{ESEC/FSE13p70, author = {Marin Silic and Goran Delac and Sinisa Srbljic}, title = {Prediction of Atomic Web Services Reliability Based on K-Means Clustering}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {70--80}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Stahlbauer, Andreas |
ESEC/FSE '13: "Precision Reuse for Efficient ..."
Precision Reuse for Efficient Regression Verification
Dirk Beyer , Stefan Löwe, Evgeny Novikov, Andreas Stahlbauer, and Philipp Wendler (University of Passau, Germany; ISP RAS, Russia) Continuous testing during development is a well-established technique for software-quality assurance. Continuous model checking from revision to revision is not yet established as a standard practice, because the enormous resource consumption makes its application impractical. Model checkers compute a large number of verification facts that are necessary for verifying if a given specification holds. We have identified a category of such intermediate results that are easy to store and efficient to reuse: abstraction precisions. The precision of an abstract domain specifies the level of abstraction that the analysis works on. Precisions are thus a precious result of the verification effort and it is a waste of resources to throw them away after each verification run. In particular, precisions are reasonably small and thus easy to store; they are easy to process and have a large impact on resource consumption. We experimentally show the impact of precision reuse on industrial verification problems created from 62 Linux kernel device drivers with 1119 revisions. @InProceedings{ESEC/FSE13p389, author = {Dirk Beyer and Stefan Löwe and Evgeny Novikov and Andreas Stahlbauer and Philipp Wendler}, title = {Precision Reuse for Efficient Regression Verification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {389--399}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Su, Zhendong |
ESEC/FSE '13: "An Empirical Analysis of the ..."
An Empirical Analysis of the Co-evolution of Schema and Code in Database Applications
Dong Qiu, Bixin Li, and Zhendong Su (Southeast University, China; UC Davis, USA) Modern database applications are among the most widely used and complex software systems. They constantly evolve, responding to changes to data, database schemas, and code. It is challenging to manage these changes and ensure that everything co-evolves consistently. For example, when a database schema is modified, all the code that interacts with the database must be changed accordingly. Although database evolution and software evolution have been extensively studied in isolation, the co-evolution of schema and code has largely been unexplored. This paper presents the first comprehensive empirical analysis of the co-evolution of database schemas and code in ten popular large open-source database applications, totaling over 160K revisions. Our major findings include: 1) Database schemas evolve frequently during the application lifecycle, exhibiting a variety of change types with similar distributions across the studied applications; 2) Overall, schema changes induce significant code-level modifications, while certain change types have more impact on code than others; and 3) Co-change analyses can be viable to automate or assist with database application evolution. We have also observed that: 1) 80% of the schema changes happened in 20-30% of the tables, while nearly 40% of the tables did not change; and 2) Referential integrity constraints and stored procedures are rarely used in our studied subjects. We believe that our study reveals new insights into how database applications evolve and useful guidelines for designing assistive tools to aid their evolution. @InProceedings{ESEC/FSE13p125, author = {Dong Qiu and Bixin Li and Zhendong Su}, title = {An Empirical Analysis of the Co-evolution of Schema and Code in Database Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {125--135}, doi = {}, year = {2013}, } Info ESEC/FSE '13: "Server Interface Descriptions ..." Server Interface Descriptions for Automated Testing of JavaScript Web Applications Casper S. Jensen, Anders Møller , and Zhendong Su (Aarhus University, Denmark; UC Davis, USA) Automated testing of JavaScript web applications is complicated by the communication with servers. Specifically, it is difficult to test the JavaScript code in isolation from the server code and database contents. We present a practical solution to this problem. First, we demonstrate that formal server interface descriptions are useful in automated testing of JavaScript web applications for separating the concerns of the client and the server. Second, to support the construction of server interface descriptions for existing applications, we introduce an effective inference technique that learns communication patterns from sample data. By incorporating interface descriptions into the testing tool Artemis, our experimental results show that we increase the level of automation for high-coverage testing on a collection of JavaScript web applications that exchange JSON data between the clients and servers. Moreover, we demonstrate that the inference technique can quickly and accurately learn useful server interface descriptions. @InProceedings{ESEC/FSE13p510, author = {Casper S. Jensen and Anders Møller and Zhendong Su}, title = {Server Interface Descriptions for Automated Testing of JavaScript Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {510--520}, doi = {}, year = {2013}, } Info |
|
Sun, Chengnian |
ESEC/FSE '13: "Mining Succinct Predicated ..."
Mining Succinct Predicated Bug Signatures
Chengnian Sun and Siau-Cheng Khoo (National University of Singapore, Singapore) A bug signature is a set of program elements highlighting the cause or effect of a bug, and provides contextual information for debugging. In order to mine a signature for a buggy program, two sets of execution profiles of the program, one capturing the correct execution and the other capturing the faulty, are examined to identify the program elements contrasting faulty from correct. Signatures solely consisting of control flow transitions have been investigated via discriminative sequence and graph mining algorithms. These signatures might be handicapped in cases where the effect of a bug is not manifested by any deviation in control flow transitions. In this paper, we introduce the notion of predicated bug signature that aims to enhance the predictive power of bug signatures by utilizing both data predicates and control-flow information. We introduce a novel “discriminative itemset generator” mining technique to generate succinct signatures which do not contain redundant or irrelevant program elements. Our case studies demonstrate that predicated signatures can hint at more scenarios of bugs where traditional control-flow signatures fail. @InProceedings{ESEC/FSE13p576, author = {Chengnian Sun and Siau-Cheng Khoo}, title = {Mining Succinct Predicated Bug Signatures}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {576--586}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Tahiliani, Rohan |
ESEC/FSE '13: "Dynodroid: An Input Generation ..."
Dynodroid: An Input Generation System for Android Apps
Aravind Machiry, Rohan Tahiliani, and Mayur Naik (Georgia Tech, USA) We present a system Dynodroid for generating relevant inputs to unmodified Android apps. Dynodroid views an app as an event-driven program that interacts with its environment by means of a sequence of events through the Android framework. By instrumenting the framework once and for all, Dynodroid monitors the reaction of an app upon each event in a lightweight manner, using it to guide the generation of the next event to the app. Dynodroid also allows interleaving events from machines, which are better at generating a large number of simple inputs, with events from humans, who are better at providing intelligent inputs. We evaluated Dynodroid on 50 open-source Android apps, and compared it with two prevalent approaches: users manually exercising apps, and Monkey, a popular fuzzing tool. Dynodroid, humans, and Monkey covered 55%, 60%, and 53%, respectively, of each app's Java source code on average. Monkey took 20X more events on average than Dynodroid. Dynodroid also found 9 bugs in 7 of the 50 apps, and 6 bugs in 5 of the top 1,000 free apps on Google Play. @InProceedings{ESEC/FSE13p224, author = {Aravind Machiry and Rohan Tahiliani and Mayur Naik}, title = {Dynodroid: An Input Generation System for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {224--234}, doi = {}, year = {2013}, } Info Distinguished Artifact |
|
Tang, Yang |
ESEC/FSE '13: "Effective Dynamic Detection ..."
Effective Dynamic Detection of Alias Analysis Errors
Jingyue Wu, Gang Hu, Yang Tang, and Junfeng Yang (Columbia University, USA) Alias analysis is perhaps one of the most crucial and widely used analyses, and has attracted tremendous research efforts over the years. Yet, advanced alias analyses are extremely difficult to get right, and the bugs in these analyses are one key reason that they have not been adopted to production compilers. This paper presents NeonGoby, a system for effectively detecting errors in alias analysis implementations, improving their correctness and hopefully widening their adoption. NeonGoby detects the worst type of bugs where the alias analysis claims that two pointers never alias, but they actually alias at runtime. NeonGoby works by dynamically observing pointer addresses during the execution of a test program and then checking these addresses against an alias analysis for errors. It is explicitly designed to (1) be agnostic to the alias analysis it checks for maximum applicability and ease of use and (2) detect alias analysis errors that manifest on real-world programs and workloads. It emits no false positives as long as test programs do not have undefined behavior per ANSI C specification or call external functions that interfere with our detection algorithm. It reduces performance overhead using a practical selection of techniques. Evaluation on three popular alias analyses and real-world programs Apache and MySQL shows that NeonGoby effectively finds 29 alias analysis bugs with zero false positives and reasonable overhead; the most serious four bugs have been patched by the developers. To enable alias analysis builders to start using NeonGoby today, we have released it open-source at https://github.com/columbia/neongoby, along with our error detection results and proposed patches. @InProceedings{ESEC/FSE13p279, author = {Jingyue Wu and Gang Hu and Yang Tang and Junfeng Yang}, title = {Effective Dynamic Detection of Alias Analysis Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {279--289}, doi = {}, year = {2013}, } Info |
|
Tonella, Paolo |
ESEC/FSE '13: "Automated Oracles: An Empirical ..."
Automated Oracles: An Empirical Study on Cost and Effectiveness
Cu D. Nguyen, Alessandro Marchetto, and Paolo Tonella (Fondazione Bruno Kessler, Italy) Software testing is an effective, yet expensive, method to improve software quality. Test automation, a potential way to reduce testing cost, has received enormous research attention recently, but the so-called “oracle problem” (how to decide the PASS/FAIL outcome of a test execution) is still a major obstacle to such cost reduction. We have extensively investigated state-of-the-art works that contribute to address this problem, from areas such as specification mining and model inference. In this paper, we compare three types of automated oracles: Data invariants, Temporal invariants, and Finite State Automata. More specifically, we study the training cost and the false positive rate; we evaluate also their fault detection capability. Seven medium to large, industrial application subjects and real faults have been used in our empirical investigation. @InProceedings{ESEC/FSE13p136, author = {Cu D. Nguyen and Alessandro Marchetto and Paolo Tonella}, title = {Automated Oracles: An Empirical Study on Cost and Effectiveness}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {136--146}, doi = {}, year = {2013}, } Info |
|
Tripp, Omer |
ESEC/FSE '13: "Tightfit: Adaptive Parallelization ..."
Tightfit: Adaptive Parallelization with Foresight
Omer Tripp and Noam Rinetzky (Tel Aviv University, Israel) Irregular applications often exhibit data-dependent parallelism: Different inputs, and sometimes also different execution phases, enable different levels of parallelism. These changes in available parallelism have motivated work on adaptive concurrency control mechanisms. Existing adaptation techniques mostly learn about available parallelism indirectly, through runtime monitors that detect pathologies (e.g. excessive retries in speculation or high lock contention in mutual exclusion). We present a novel approach to adaptive parallelization, whereby the effective level of parallelism is predicted directly based on input features, rather than through circumstantial indicators over the execution environment (such as retry rate). This enables adaptation with foresight, based on the input data and not the run prefix. For this, the user specifies input features, which our system then correlates with the amount of available parallelism through offline learning. The resulting prediction rule serves in deployment runs to foresee the available parallelism for a given workload and tune the parallelization system accordingly. We have implemented our approach in TIGHTFIT, a general framework for input-centric offline adaptation. Our experimental evaluation of TIGHTFIT over two adaptive runtime systems and eight benchmarks provides positive evidence regarding TIGHTFIT's efficacy and accuracy. @InProceedings{ESEC/FSE13p169, author = {Omer Tripp and Noam Rinetzky}, title = {Tightfit: Adaptive Parallelization with Foresight}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {169--179}, doi = {}, year = {2013}, } |
|
Valetto, Giuseppe |
ESEC/FSE '13: "Do All Task Dependencies Require ..."
Do All Task Dependencies Require Coordination? The Role of Task Properties in Identifying Critical Coordination Needs in Software Projects
Kelly Blincoe, Giuseppe Valetto, and Daniela Damian (Drexel University, USA; University of Victoria, Canada) Several methods exist to detect the coordination needs within software teams. Evidence exists that developers’ awareness about coordination needs improves work performance. Distinguishing with certainty between critical and trivial coordination needs and identifying and prioritizing which specific tasks a pair of developers should coordinate about remains an open problem. We investigate what work dependencies should be considered when establishing coordination needs within a development team. We use our conceptualization of work dependencies named Proximity and leverage machine learning techniques to analyze what additional task properties are indicative of coordination needs. In a case study of the Mylyn project, we were able to identify from all potential coordination requirements a subset of 17% that are most critical. We define critical coordination requirements as those that can cause the most disruption to task duration when left unmanaged. These results imply that coordination awareness tools could be enhanced to make developers aware of only the coordination needs that can bring about the highest performance benefit. @InProceedings{ESEC/FSE13p213, author = {Kelly Blincoe and Giuseppe Valetto and Daniela Damian}, title = {Do All Task Dependencies Require Coordination? The Role of Task Properties in Identifying Critical Coordination Needs in Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {213--223}, doi = {}, year = {2013}, } |
|
Veith, Helmut |
ESEC/FSE '13: "Con2colic Testing ..."
Con2colic Testing
Azadeh Farzan , Andreas Holzer, Niloofar Razavi, and Helmut Veith (University of Toronto, Canada; Vienna University of Technology, Austria) In this paper, we describe (con)2colic testing - a systematic testing approach for concurrent software. Based on concrete and symbolic executions of a concurrent program, (con)2colic testing derives inputs and schedules such that the execution space of the program under investigation is systematically explored. We introduce interference scenarios as key concept in (con)2colic testing. Interference scenarios capture the flow of data among different threads and enable a unified representation of path and interference constraints. We have implemented a (con)2colic testing engine and demonstrate the effectiveness of our approach by experiments. @InProceedings{ESEC/FSE13p37, author = {Azadeh Farzan and Andreas Holzer and Niloofar Razavi and Helmut Veith}, title = {Con2colic Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {37--47}, doi = {}, year = {2013}, } |
|
Von Rhein, Alexander |
ESEC/FSE '13: "Scalable Analysis of Variable ..."
Scalable Analysis of Variable Software
Jörg Liebig, Alexander von Rhein, Christian Kästner , Sven Apel, Jens Dörre, and Christian Lengauer (University of Passau, Germany; CMU, USA) The advent of variability management and generator technology enables users to derive individual variants from a variable code base based on a selection of desired configuration options. This approach gives rise to the generation of possibly billions of variants that, however, cannot be efficiently analyzed for errors with classic analysis techniques. To address this issue, researchers and practitioners usually apply sampling heuristics. While sampling reduces the analysis effort significantly, the information obtained is necessarily incomplete and it is unknown whether sampling heuristics scale to billions of variants. Recently, researchers have begun to develop variability-aware analyses that analyze the variable code base directly exploiting the similarities among individual variants to reduce analysis effort. However, while being promising, so far, variability-aware analyses have been applied mostly only to small academic systems. To learn about the mutual strengths and weaknesses of variability-aware and sampling-based analyses of software systems, we compared the two strategies by means of two concrete analysis implementations (type checking and liveness analysis), applied them to three subject systems: Busybox, the x86 Linux kernel, and OpenSSL. Our key finding is that variability-aware analysis outperforms most sampling heuristics with respect to analysis time while preserving completeness. @InProceedings{ESEC/FSE13p81, author = {Jörg Liebig and Alexander von Rhein and Christian Kästner and Sven Apel and Jens Dörre and Christian Lengauer}, title = {Scalable Analysis of Variable Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {81--91}, doi = {}, year = {2013}, } Info |
|
Wadsworth, Scott |
ESEC/FSE '13: "Will You Still Compile Me ..."
Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel , Shuvendu K. Lahiri , Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, and Scott Wadsworth (Microsoft, USA) This paper describes a cross-version compiler validator and measures its effectiveness on the CLR JIT compiler. The validator checks for semantically equivalent assembly language output from various versions of the compiler, including versions across a seven-month time period, across two architectures (x86 and ARM), across two compilation scenarios (JIT and MDIL), and across optimizations levels. For month-to-month comparisons, the validator achieves a false alarm rate of just 2.2%. To help understand reported semantic differences, the validator performs a root-cause analysis on the counterexample traces generated by the underlying automated theorem proving tools. This root-cause analysis groups most of the counterexamples into a small number of buckets, reducing the number of counterexamples analyzed by hand by anywhere from 53% to 96%. The validator ran on over 500,000 methods across a large suite of test programs, finding 12 previously unknown correctness and performance bugs in the CLR compiler. @InProceedings{ESEC/FSE13p191, author = {Chris Hawblitzel and Shuvendu K. Lahiri and Kshama Pawar and Hammad Hashmi and Sedar Gokbulut and Lakshan Fernando and Dave Detlefs and Scott Wadsworth}, title = {Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2013}, } |
|
Wang, Haisheng |
ESEC/FSE '13: "Preventing Database Deadlocks ..."
Preventing Database Deadlocks in Applications
Mark Grechanik, B. M. Mainul Hossain, Ugo Buy, and Haisheng Wang (University of Illinois at Chicago, USA; Oracle, USA) Many organizations deploy applications that use databases by sending Structured Query Language (SQL) statements to them and obtaining data that result from the execution of these statements. Since applications often share the same databases concurrently, database deadlocks routinely occur in these databases resulting in major performance degradation in these applications. Database engines do not prevent database deadlocks for the same reason that the schedulers of operating system kernels do not preempt processes in a way to avoid race conditions and deadlocks - it is not feasible to find an optimal context switching schedule quickly for multiple processes (and SQL statements), and the overhead of doing it is prohibitive. We created a novel approach that combines run-time monitoring, which automatically prevents database deadlocks, with static analysis, which detects hold-and-wait cycles that specify how resources (e.g., database tables) are held in contention during executions of SQL statements. We rigorously evaluated our approach. For a realistic case of over 1,200 SQL statements, our algorithm detects all hold-and-wait cycles in less than two seconds. We built a toolset and experimented with three applications. Our tool prevented all existing database deadlocks in these applications and increased their throughputs by up to three orders of magnitude. @InProceedings{ESEC/FSE13p356, author = {Mark Grechanik and B. M. Mainul Hossain and Ugo Buy and Haisheng Wang}, title = {Preventing Database Deadlocks in Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {356--366}, doi = {}, year = {2013}, } Info |
|
Wang, Qianxiang |
ESEC/FSE '13: "Inferring Project-Specific ..."
Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs
Guangtai Liang, Qianxiang Wang, Tao Xie, and Hong Mei (Peking University, China; University of Illinois at Urbana-Champaign, USA) Lightweight static bug-detection tools such as FindBugs, PMD, Jlint, and Lint4j detect bugs with the knowledge of generic bug patterns (e.g., objects of java.io.InputStream are not closed in time after used). Besides generic bug patterns, different projects under analysis may have some project-specific bug patterns. For example, in a revision of the Xerces project, the class field "fDTDHandler" is dereferenced without proper null-checks, while it could actually be null at runtime. We name such bug patterns directly related to objects instantiated in specific projects as Project-Specific Bug Patterns (PSBPs). Due to lack of such PSBP knowledge, existing tools usually fail in effectively detecting most of this kind of bugs. We name bugs belonging to the same project and sharing the same PSBP as sibling bugs. If some sibling bugs are fixed in a fix revision but some others remain, we treat such fix as an incomplete fix. To address such incomplete fixes, we propose a PSBP-based approach for detecting sibling bugs and implement a tool called Sibling-Bug Detector (SBD). Given a fix revision, SBD first infers the PSBPs implied by the fix revision. Then, based on the inferred PSBPs, SBD detects their related sibling bugs in the same project. To evaluate SBD, we apply it to seven popular open-source projects. Among the 108 warnings reported by SBD, 63 of them have been confirmed as real bugs by the project developers, while two existing popular static detectors (FindBugs and PMD) cannot report most of them. @InProceedings{ESEC/FSE13p565, author = {Guangtai Liang and Qianxiang Wang and Tao Xie and Hong Mei}, title = {Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {565--575}, doi = {}, year = {2013}, } Info |
|
Wang, Tiantian |
ESEC/FSE '13: "Searching for Better Configurations: ..."
Searching for Better Configurations: A Rigorous Approach to Clone Evaluation
Tiantian Wang, Mark Harman, Yue Jia, and Jens Krinke (Harbin Institute of Technology, China; University College London, UK) Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our evaluation reports the results of 9.3 million total executions of a clone tool; the largest study yet reported. Our approach finds significantly better configurations (p < 0.05) than those currently used, providing evidence that our approach can ameliorate the confounding configuration choice problem. @InProceedings{ESEC/FSE13p455, author = {Tiantian Wang and Mark Harman and Yue Jia and Jens Krinke}, title = {Searching for Better Configurations: A Rigorous Approach to Clone Evaluation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {455--465}, doi = {}, year = {2013}, } Info |
|
Wendler, Philipp |
ESEC/FSE '13: "Precision Reuse for Efficient ..."
Precision Reuse for Efficient Regression Verification
Dirk Beyer , Stefan Löwe, Evgeny Novikov, Andreas Stahlbauer, and Philipp Wendler (University of Passau, Germany; ISP RAS, Russia) Continuous testing during development is a well-established technique for software-quality assurance. Continuous model checking from revision to revision is not yet established as a standard practice, because the enormous resource consumption makes its application impractical. Model checkers compute a large number of verification facts that are necessary for verifying if a given specification holds. We have identified a category of such intermediate results that are easy to store and efficient to reuse: abstraction precisions. The precision of an abstract domain specifies the level of abstraction that the analysis works on. Precisions are thus a precious result of the verification effort and it is a waste of resources to throw them away after each verification run. In particular, precisions are reasonably small and thus easy to store; they are easy to process and have a large impact on resource consumption. We experimentally show the impact of precision reuse on industrial verification problems created from 62 Linux kernel device drivers with 1119 revisions. @InProceedings{ESEC/FSE13p389, author = {Dirk Beyer and Stefan Löwe and Evgeny Novikov and Andreas Stahlbauer and Philipp Wendler}, title = {Precision Reuse for Efficient Regression Verification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {389--399}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Wies, Thomas |
ESEC/FSE '13: "Explaining Inconsistent Code ..."
Explaining Inconsistent Code
Martin Schäf, Daniel Schwartz-Narbonne, and Thomas Wies (United Nations University, China; New York University, USA) A code fragment is inconsistent if it is not part of any normally terminating execution. Examples of such inconsistencies include code that is unreachable, code that always fails due to a run-time error, and code that makes conflicting assumptions about the program state. In this paper, we consider the problem of automatically explaining inconsistent code. This problem is difficult because traditional fault localization techniques do not apply. Our solution relies on a novel algorithm that takes an infeasible code fragment as input and generates a so-called error invariant automaton. The error invariant automaton is an abstraction of the input code fragment that only mentions program statements and facts that are relevant for understanding the cause of the inconsistency. We conducted a preliminary usability study which demonstrated that error invariant automata can help programmers better understand inconsistencies in code taken from real-world programs. In particular, access to an error invariant automata tripled the speed at which programmers could diagnose the cause of a code inconsistency. @InProceedings{ESEC/FSE13p521, author = {Martin Schäf and Daniel Schwartz-Narbonne and Thomas Wies}, title = {Explaining Inconsistent Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {521--531}, doi = {}, year = {2013}, } |
|
Wu, Jingyue |
ESEC/FSE '13: "Effective Dynamic Detection ..."
Effective Dynamic Detection of Alias Analysis Errors
Jingyue Wu, Gang Hu, Yang Tang, and Junfeng Yang (Columbia University, USA) Alias analysis is perhaps one of the most crucial and widely used analyses, and has attracted tremendous research efforts over the years. Yet, advanced alias analyses are extremely difficult to get right, and the bugs in these analyses are one key reason that they have not been adopted to production compilers. This paper presents NeonGoby, a system for effectively detecting errors in alias analysis implementations, improving their correctness and hopefully widening their adoption. NeonGoby detects the worst type of bugs where the alias analysis claims that two pointers never alias, but they actually alias at runtime. NeonGoby works by dynamically observing pointer addresses during the execution of a test program and then checking these addresses against an alias analysis for errors. It is explicitly designed to (1) be agnostic to the alias analysis it checks for maximum applicability and ease of use and (2) detect alias analysis errors that manifest on real-world programs and workloads. It emits no false positives as long as test programs do not have undefined behavior per ANSI C specification or call external functions that interfere with our detection algorithm. It reduces performance overhead using a practical selection of techniques. Evaluation on three popular alias analyses and real-world programs Apache and MySQL shows that NeonGoby effectively finds 29 alias analysis bugs with zero false positives and reasonable overhead; the most serious four bugs have been patched by the developers. To enable alias analysis builders to start using NeonGoby today, we have released it open-source at https://github.com/columbia/neongoby, along with our error detection results and proposed patches. @InProceedings{ESEC/FSE13p279, author = {Jingyue Wu and Gang Hu and Yang Tang and Junfeng Yang}, title = {Effective Dynamic Detection of Alias Analysis Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {279--289}, doi = {}, year = {2013}, } Info |
|
Xie, Tao |
ESEC/FSE '13: "Inferring Project-Specific ..."
Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs
Guangtai Liang, Qianxiang Wang, Tao Xie, and Hong Mei (Peking University, China; University of Illinois at Urbana-Champaign, USA) Lightweight static bug-detection tools such as FindBugs, PMD, Jlint, and Lint4j detect bugs with the knowledge of generic bug patterns (e.g., objects of java.io.InputStream are not closed in time after used). Besides generic bug patterns, different projects under analysis may have some project-specific bug patterns. For example, in a revision of the Xerces project, the class field "fDTDHandler" is dereferenced without proper null-checks, while it could actually be null at runtime. We name such bug patterns directly related to objects instantiated in specific projects as Project-Specific Bug Patterns (PSBPs). Due to lack of such PSBP knowledge, existing tools usually fail in effectively detecting most of this kind of bugs. We name bugs belonging to the same project and sharing the same PSBP as sibling bugs. If some sibling bugs are fixed in a fix revision but some others remain, we treat such fix as an incomplete fix. To address such incomplete fixes, we propose a PSBP-based approach for detecting sibling bugs and implement a tool called Sibling-Bug Detector (SBD). Given a fix revision, SBD first infers the PSBPs implied by the fix revision. Then, based on the inferred PSBPs, SBD detects their related sibling bugs in the same project. To evaluate SBD, we apply it to seven popular open-source projects. Among the 108 warnings reported by SBD, 63 of them have been confirmed as real bugs by the project developers, while two existing popular static detectors (FindBugs and PMD) cannot report most of them. @InProceedings{ESEC/FSE13p565, author = {Guangtai Liang and Qianxiang Wang and Tao Xie and Hong Mei}, title = {Inferring Project-Specific Bug Patterns for Detecting Sibling Bugs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {565--575}, doi = {}, year = {2013}, } Info |
|
Xu, Guoqing |
ESEC/FSE '13: "Cachetor: Detecting Cacheable ..."
Cachetor: Detecting Cacheable Data to Remove Bloat
Khanh Nguyen and Guoqing Xu (UC Irvine, USA) Modern object-oriented software commonly suffers from runtime bloat that significantly affects its performance and scalability. Studies have shown that one important pattern of bloat is the work repeatedly done to compute the same data values. Very often the cost of computation is very high and it is thus beneficial to memoize the invariant data values for later use. While this is a common practice in real-world development, manually finding invariant data values is a daunting task during development and tuning. To help the developers quickly find such optimization opportunities for performance improvement, we propose a novel run-time profiling tool, called Cachetor, which uses a combination of dynamic dependence profiling and value profiling to identify and report operations that keep generating identical data values. The major challenge in the design of Cachetor is that both dependence and value profiling are extremely expensive techniques that cannot scale to large, real-world applications for which optimizations are important. To overcome this challenge, we propose a series of novel abstractions that are applied to run-time instruction instances during profiling, yielding significantly improved analysis time and scalability. We have implemented Cachetor in Jikes Research Virtual Machine and evaluated it on a set of 14 large Java applications. Our experimental results suggest that Cachetor is effective in exposing caching opportunities and substantial performance gains can be achieved by modifying a program to cache the reported data. @InProceedings{ESEC/FSE13p268, author = {Khanh Nguyen and Guoqing Xu}, title = {Cachetor: Detecting Cacheable Data to Remove Bloat}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {268--278}, doi = {}, year = {2013}, } |
|
Yang, Junfeng |
ESEC/FSE '13: "Effective Dynamic Detection ..."
Effective Dynamic Detection of Alias Analysis Errors
Jingyue Wu, Gang Hu, Yang Tang, and Junfeng Yang (Columbia University, USA) Alias analysis is perhaps one of the most crucial and widely used analyses, and has attracted tremendous research efforts over the years. Yet, advanced alias analyses are extremely difficult to get right, and the bugs in these analyses are one key reason that they have not been adopted to production compilers. This paper presents NeonGoby, a system for effectively detecting errors in alias analysis implementations, improving their correctness and hopefully widening their adoption. NeonGoby detects the worst type of bugs where the alias analysis claims that two pointers never alias, but they actually alias at runtime. NeonGoby works by dynamically observing pointer addresses during the execution of a test program and then checking these addresses against an alias analysis for errors. It is explicitly designed to (1) be agnostic to the alias analysis it checks for maximum applicability and ease of use and (2) detect alias analysis errors that manifest on real-world programs and workloads. It emits no false positives as long as test programs do not have undefined behavior per ANSI C specification or call external functions that interfere with our detection algorithm. It reduces performance overhead using a practical selection of techniques. Evaluation on three popular alias analyses and real-world programs Apache and MySQL shows that NeonGoby effectively finds 29 alias analysis bugs with zero false positives and reasonable overhead; the most serious four bugs have been patched by the developers. To enable alias analysis builders to start using NeonGoby today, we have released it open-source at https://github.com/columbia/neongoby, along with our error detection results and proposed patches. @InProceedings{ESEC/FSE13p279, author = {Jingyue Wu and Gang Hu and Yang Tang and Junfeng Yang}, title = {Effective Dynamic Detection of Alias Analysis Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {279--289}, doi = {}, year = {2013}, } Info |
|
Yoo, Shin |
ESEC/FSE '13: "Efficiency and Early Fault ..."
Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing
Justyna Petke , Shin Yoo, Myra B. Cohen, and Mark Harman (University College London, UK; University of Nebraska-Lincoln, USA) Combinatorial Interaction Testing (CIT) is important because it tests the interactions between the many features and parameters that make up the configuration space of software systems. However, in order to be practically applicable, it must be able to cater for soft and hard real-world constraints and should, ideally, report a test priority order that maximises earliest fault detection. We show that we can achieve the highest strength CIT in 5.65 minutes on average. This was previously thought to be too computationally expensive to be feasible. Furthermore, we show that higher strength suites find more faults, while prioritisations using lower strengths are no worse at achieving early fault revelation. @InProceedings{ESEC/FSE13p26, author = {Justyna Petke and Shin Yoo and Myra B. Cohen and Mark Harman}, title = {Efficiency and Early Fault Detection with Lower and Higher Strength Combinatorial Interaction Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {26--36}, doi = {}, year = {2013}, } Info |
|
Young, Michal |
ESEC/FSE '13: "Second-Order Constraints in ..."
Second-Order Constraints in Dynamic Invariant Inference
Kaituo Li, Christoph Reichenbach, Yannis Smaragdakis , and Michal Young (University of Massachusetts at Amherst, USA; Goethe University Frankfurt, Germany; University of Athens, Greece; University of Oregon, USA) The current generation of dynamic invariant detectors often produce invariants that are inconsistent with program semantics or programmer knowledge. We improve the consistency of dynamically discovered invariants by taking into account higher-level constraints. These constraints encode knowledge about invariants, even when the invariants themselves are unknown. For instance, even though the invariants describing the behavior of two functions f1 and f2 may be unknown, we may know that any valid input for f1 is also valid for f2, i.e., the precondition of f1 implies that of f2. We explore techniques for expressing and employing such consistency constraints to improve the quality of produced invariants. We further introduce techniques for dynamically discovering potential second-order constraints that the programmer can subsequently approve or reject. Our implementation builds on the Daikon tool, with a vocabulary of constraints that the programmer can use to enhance and constrain Daikon’s inference. We show that dynamic inference of second-order constraints together with minimal human effort can significantly influence the produced (first-order) invariants even in systems of substantial size, such as the Apache Commons Collections and the AspectJ compiler. We also find that 99% of the dynamically inferred second-order constraints we sampled are true. @InProceedings{ESEC/FSE13p103, author = {Kaituo Li and Christoph Reichenbach and Yannis Smaragdakis and Michal Young}, title = {Second-Order Constraints in Dynamic Invariant Inference}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {103--113}, doi = {}, year = {2013}, } |
|
Zeller, Andreas |
ESEC/FSE '13: "Mining Behavior Models from ..."
Mining Behavior Models from Enterprise Web Applications
Matthias Schur, Andreas Roth, and Andreas Zeller (SAP, Germany; Saarland University, Germany) Today's enterprise web applications demand very high release cycles---and consequently, frequent tests. Automating these tests typically requires a behavior model: A description of the states the application can be in, the transitions between these states, and the expected results. Furthermore one needs scripts to make the abstract actions (transitions) in the model executable. As specifying such behavior models and writing the necessary scripts manually is a hard task, a possible alternative could be to extract them from existing applications. However, mining such models can be a challenge, in particular because one needs to know when two states are equivalent, as well as how to reach that state. We present ProCrawl (PROcess CRAWLer), a generic approach to mine behavior models from (multi-user) enterprise web applications. ProCrawl observes the behavior of the application through its user interface, generates and executes tests to explore unobserved behavior. In our evaluation of three non-trivial web applications (an open-source shop system, an SAP product compliance application, and an open-source conference manager), ProCrawl produces models that precisely abstract application behavior and which can be directly used for effective model-based regression testing. @InProceedings{ESEC/FSE13p422, author = {Matthias Schur and Andreas Roth and Andreas Zeller}, title = {Mining Behavior Models from Enterprise Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {422--432}, doi = {}, year = {2013}, } |
|
Zervoudakis, Fokion |
ESEC/FSE '13: "Cascading Verification: An ..."
Cascading Verification: An Integrated Method for Domain-Specific Model Checking
Fokion Zervoudakis, David S. Rosenblum, Sebastian Elbaum, and Anthony Finkelstein (University College London, UK; National University of Singapore, Singapore; University of Nebraska-Lincoln, USA) Model checking is an established method for verifying behavioral properties of system models. But model checkers tend to support low-level modeling languages that require intricate models to represent even the simplest systems. Modeling complexity arises in part from the need to encode domain knowledge at relatively low levels of abstraction. In this paper, we demonstrate that formalized domain knowledge can be reused to raise the abstraction level of model and property specifications, and hence the effectiveness of model checking. We describe a novel method for domain-specific model checking called cascading verification that uses composite reasoning over high-level system specifications and formalized domain knowledge to synthesize both low-level system models and their behavioral properties for verification. In particular, model builders use a high-level domain-specific language (DSL) based on YAML to express system specifications that can be verified with probabilistic model checking. Domain knowledge is encoded in the Web Ontology Language (OWL), the Semantic Web Rule Language (SWRL) and Prolog, which are used in combination to overcome their individual limitations. A compiler then synthesizes models and properties for verification by the probabilistic model checker PRISM. We illustrate cascading verification for the domain of uninhabited aerial vehicles (UAVs), for which we have constructed a prototype implementation. An evaluation of this prototype reveals nontrivial reductions in the size and complexity of input specifications compared to the artifacts synthesized for PRISM. @InProceedings{ESEC/FSE13p400, author = {Fokion Zervoudakis and David S. Rosenblum and Sebastian Elbaum and Anthony Finkelstein}, title = {Cascading Verification: An Integrated Method for Domain-Specific Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {400--410}, doi = {}, year = {2013}, } |
|
Zhang, Charles |
ESEC/FSE '13: "Finding Incorrect Compositions ..."
Finding Incorrect Compositions of Atomicity
Peng Liu, Julian Dolby , and Charles Zhang (Hong Kong University of Science and Technology, China; IBM Research, USA) In object-oriented code, atomicity is ideally isolated in a library which encapsulates shared program state and provides atomic APIs for access. The library provides a convenient way for programmers to reason about the needed synchronization. However, as the library exports a limited set of APIs, it cannot satisfy every unplanned atomicity demand; therefore, clients may have to compose invocations of the library APIs to obtain new atomic functionality. This process is error-prone due to the complexity of reasoning required, hence tool support for uncovering incorrect compositions (i.e., atomic compositions that are implemented incorrectly) would be very helpful. A key difficulty is how to determine the intended atomic compositions, which are rarely documented. Existing inference techniques cannot be used to infer the atomic compositions because they cannot recognize the library and the client, which requires understanding the related program state. Even if extended to support the library/client, they lead to many false positives or false negatives because they miss the key program logic which reflects programmers’ coding paradigms for atomic compositions. We define a new inference technique which identifies intended atomic compositions using two key symptoms based on program dependence. We then check dynamically whether these atomic compositions are implemented incorrectly as non-atomic. Evaluation on thirteen large applications shows that our approach finds around 50 previously unknown incorrect compositions. Further study on Tomcat shows that almost half (5 out of 12) of discovered incorrect compositions are confirmed as bugs by the developers. Given that Tomcat is heavily used in 250,000 sites including Linkedin.com and Ebay.com, we believe finding multiple new bugs in it automatically with relatively few false positives supports our heuristics for determining intended atomicity. @InProceedings{ESEC/FSE13p158, author = {Peng Liu and Julian Dolby and Charles Zhang}, title = {Finding Incorrect Compositions of Atomicity}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {158--168}, doi = {}, year = {2013}, } Artifact Accepted for Presentation |
|
Zhang, Xiangyu |
ESEC/FSE '13: "Z3-str: A Z3-Based String ..."
Z3-str: A Z3-Based String Solver for Web Application Analysis
Yunhui Zheng, Xiangyu Zhang , and Vijay Ganesh (Purdue University, USA; University of Waterloo, Canada) Analyzing web applications requires reasoning about strings and non-strings cohesively. Existing string solvers either ignore non-string program behavior or support limited set of string operations. In this paper, we develop a general purpose string solver, called Z3-str, as an extension of the Z3 SMT solver through its plug-in interface. Z3-str treats strings as a primitive type, thus avoiding the inherent limitations observed in many existing solvers that encode strings in terms of other primitives. The logic of the plug-in has three sorts, namely, bool, int and string. The string-sorted terms include string constants and variables of arbitrary length, with functions such as concatenation, sub-string, and replace. The int-sorted terms are standard, with the exception of the length function over string terms. The atomic formulas are equations over string terms, and (in)-equalities over integer terms. Not only does our solver have features that enable whole program symbolic, static and dynamic analysis, but also it performs better than other solvers in our experiments. The application of Z3-str in remote code execution detection shows that its support of a wide spectrum of string operations is key to reducing false positives. @InProceedings{ESEC/FSE13p114, author = {Yunhui Zheng and Xiangyu Zhang and Vijay Ganesh}, title = {Z3-str: A Z3-Based String Solver for Web Application Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {114--124}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Zheng, Yunhui |
ESEC/FSE '13: "Z3-str: A Z3-Based String ..."
Z3-str: A Z3-Based String Solver for Web Application Analysis
Yunhui Zheng, Xiangyu Zhang , and Vijay Ganesh (Purdue University, USA; University of Waterloo, Canada) Analyzing web applications requires reasoning about strings and non-strings cohesively. Existing string solvers either ignore non-string program behavior or support limited set of string operations. In this paper, we develop a general purpose string solver, called Z3-str, as an extension of the Z3 SMT solver through its plug-in interface. Z3-str treats strings as a primitive type, thus avoiding the inherent limitations observed in many existing solvers that encode strings in terms of other primitives. The logic of the plug-in has three sorts, namely, bool, int and string. The string-sorted terms include string constants and variables of arbitrary length, with functions such as concatenation, sub-string, and replace. The int-sorted terms are standard, with the exception of the length function over string terms. The atomic formulas are equations over string terms, and (in)-equalities over integer terms. Not only does our solver have features that enable whole program symbolic, static and dynamic analysis, but also it performs better than other solvers in our experiments. The application of Z3-str in remote code execution detection shows that its support of a wide spectrum of string operations is key to reducing false positives. @InProceedings{ESEC/FSE13p114, author = {Yunhui Zheng and Xiangyu Zhang and Vijay Ganesh}, title = {Z3-str: A Z3-Based String Solver for Web Application Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {114--124}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Zimmermann, Thomas |
ESEC/FSE '13: "Diversity in Software Engineering ..."
Diversity in Software Engineering Research
Meiyappan Nagappan, Thomas Zimmermann , and Christian Bird (Queen’s University, Canada; Microsoft Research, USA) One of the goals of software engineering research is to achieve generality: Are the phenomena found in a few projects reflective of others? Will a technique perform as well on projects other than the projects it is evaluated on? While it is common sense to select a sample that is representative of a population, the importance of diversity is often overlooked, yet as important. In this paper, we combine ideas from representativeness and diversity and introduce a measure called sample coverage, defined as the percentage of projects in a population that are similar to the given sample. We introduce algorithms to compute the sample coverage for a given set of projects and to select the projects that increase the coverage the most. We demonstrate our technique on research presented over the span of two years at ICSE and FSE with respect to a population of 20,000 active open source projects monitored by Ohloh.net. Knowing the coverage of a sample enhances our ability to reason about the findings of a study. Furthermore, we propose reporting guidelines for research: in addition to coverage scores, papers should discuss the target population of the research (universe) and dimensions that potentially can influence the outcomes of a research (space). @InProceedings{ESEC/FSE13p466, author = {Meiyappan Nagappan and Thomas Zimmermann and Christian Bird}, title = {Diversity in Software Engineering Research}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {466--476}, doi = {}, year = {2013}, } Info Artifact Accepted for Presentation |
|
Zisman, Andrea |
ESEC/FSE '13: "Improving Trace Accuracy through ..."
Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features
Sugandha Lohar, Sorawit Amornborvornwong, Andrea Zisman, and Jane Cleland-Huang (DePaul University, USA; Open University, UK) Software traceability is a sought-after, yet often elusive quality in large software-intensive systems primarily because the cost and effort of tracing can be overwhelming. State-of-the art solutions address this problem through utilizing trace retrieval techniques to automate the process of creating and maintaining trace links. However, there is no simple one- size-fits all solution to trace retrieval. As this paper will show, finding the right combination of tracing techniques can lead to significant improvements in the quality of generated links. We present a novel approach to trace retrieval in which the underlying infrastructure is configured at runtime to optimize trace quality. We utilize a machine-learning approach to search for the best configuration given an initial training set of validated trace links, a set of available tracing techniques specified in a feature model, and an architecture capable of instantiating all valid configurations of features. We evaluate our approach through a series of experiments using project data from the transportation, healthcare, and space exploration domains, and discuss its implementation in an industrial environment. Finally, we show how our approach can create a robust baseline against which new tracing techniques can be evaluated. @InProceedings{ESEC/FSE13p378, author = {Sugandha Lohar and Sorawit Amornborvornwong and Andrea Zisman and Jane Cleland-Huang}, title = {Improving Trace Accuracy through Data-Driven Configuration and Composition of Tracing Features}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {378--388}, doi = {}, year = {2013}, } |
177 authors
proc time: 1.49