ESEC/FSE 2015 – Author Index |
Afraz, Mohammed |
ESEC/FSE '15: "P3: Partitioned Path Profiling ..."
P3: Partitioned Path Profiling
Mohammed Afraz, Diptikalyan Saha, and Aditya Kanade (Indian Institute of Science, India; IBM Research, India) Acyclic path profile is an abstraction of dynamic control flow paths of procedures and has been found to be useful in a wide spectrum of activities. Unfortunately, the runtime overhead of obtaining such a profile can be high, limiting its use in practice. In this paper, we present partitioned path profiling (P3) which runs K copies of the program in parallel, each with the same input but on a separate core, and collects the profile only for a subset of intra-procedural paths in each copy, thereby, distributing the overhead of profiling. P3 identifies “profitable” procedures and assigns disjoint subsets of paths of a profitable procedure to different copies for profiling. To obtain exact execution frequencies of a subset of paths, we design a new algorithm, called PSPP. All paths of an unprofitable procedure are assigned to the same copy. P3 uses the classic Ball-Larus algorithm for profiling unprofitable procedures. Further, P3 attempts to evenly distribute the profiling overhead across the copies. To the best of our knowledge, P3 is the first algorithm for parallel path profiling. We have applied P3 to profile several programs in the SPEC 2006 benchmark. Compared to sequential profiling, P3 substantially reduced the runtime overhead on these programs averaged across all benchmarks. The reduction was 23%, 43% and 56% on average for 2, 4 and 8 cores respectively. P3 also performed better than a coarse-grained approach that treats all procedures as unprofitable and distributes them across available cores. For 2 cores, the profiling overhead of P3 was on average 5% less compared to the coarse-grained approach across these programs. For 4 and 8 cores, it was respectively 18% and 25% less. 
@InProceedings{ESEC/FSE15p485, author = {Mohammed Afraz and Diptikalyan Saha and Aditya Kanade}, title = {P3: Partitioned Path Profiling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {485--495}, doi = {}, year = {2015}, } |
Allamanis, Miltiadis |
ESEC/FSE '15: "Suggesting Accurate Method ..."
Suggesting Accurate Method and Class Names
Miltiadis Allamanis, Earl T. Barr, Christian Bird, and Charles Sutton (University of Edinburgh, UK; University College London, UK; Microsoft Research, USA) Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state-of-the-art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering. @InProceedings{ESEC/FSE15p38, author = {Miltiadis Allamanis and Earl T. Barr and Christian Bird and Charles Sutton}, title = {Suggesting Accurate Method and Class Names}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {38--49}, doi = {}, year = {2015}, } Info |
Almorsy, Mohamed |
ESEC/FSE '15: "Rule-Based Extraction of Goal-Use ..."
Rule-Based Extraction of Goal-Use Case Models from Text
Tuong Huan Nguyen, John Grundy, and Mohamed Almorsy (Swinburne University of Technology, Australia) Goal and use case modeling has been recognized as a key approach for understanding and analyzing requirements. However, in practice, goals and use cases are often buried among other content in requirements specifications documents and written in unstructured styles. It is thus a time-consuming and error-prone process to identify such goals and use cases. In addition, having them embedded in natural language documents greatly limits the possibility of formally analyzing the requirements for problems. To address these issues, we have developed a novel rule-based approach to automatically extract goal and use case models from natural language requirements documents. Our approach is able to automatically categorize goals and ensure they are properly specified. We also provide automated semantic parameterization of artifact textual specifications to promote further analysis on the extracted goal-use case models. Our approach achieves 85% precision and 82% recall rates on average for model extraction and 88% accuracy for the automated parameterization. @InProceedings{ESEC/FSE15p591, author = {Tuong Huan Nguyen and John Grundy and Mohamed Almorsy}, title = {Rule-Based Extraction of Goal-Use Case Models from Text}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {591--601}, doi = {}, year = {2015}, } Info |
Apel, Sven |
ESEC/FSE '15: "Performance-Influence Models ..."
Performance-Influence Models for Highly Configurable Systems
Norbert Siegmund, Alexander Grebhahn, Sven Apel, and Christian Kästner (University of Passau, Germany; Carnegie Mellon University, USA) Almost every complex software system today is configurable. While configurability has many benefits, it challenges performance prediction, optimization, and debugging. Often, the influences of individual configuration options on performance are unknown. Worse, configuration options may interact, giving rise to a configuration space of possibly exponential size. Addressing this challenge, we propose an approach that derives a performance-influence model for a given configurable system, describing all relevant influences of configuration options and their interactions. Our approach combines machine-learning and sampling heuristics in a novel way. It improves over standard techniques in that it (1) represents influences of options and their interactions explicitly (which eases debugging), (2) smoothly integrates binary and numeric configuration options for the first time, (3) incorporates domain knowledge, if available (which eases learning and increases accuracy), (4) considers complex constraints among options, and (5) systematically reduces the solution space to a tractable size. A series of experiments demonstrates the feasibility of our approach in terms of the accuracy of the models learned as well as the accuracy of the performance predictions one can make with them. @InProceedings{ESEC/FSE15p284, author = {Norbert Siegmund and Alexander Grebhahn and Sven Apel and Christian Kästner}, title = {Performance-Influence Models for Highly Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {284--294}, doi = {}, year = {2015}, } Info |
Arcaini, Paolo |
ESEC/FSE '15: "Improving Model-Based Test ..."
Improving Model-Based Test Generation by Model Decomposition
Paolo Arcaini, Angelo Gargantini, and Elvinia Riccobene (Charles University in Prague, Czech Republic; University of Bergamo, Italy; University of Milan, Italy) One of the well-known techniques for model-based test generation exploits the capability of model checkers to return counterexamples upon property violations. However, this approach is not always optimal in practice due to the required time and memory, or even not feasible due to the state explosion problem of model checking. A way to mitigate these limitations consists in decomposing a system model into suitable subsystem models that can be analyzed separately. In this paper, we show a technique to decompose a system model into subsystems by exploiting the dependencies among model variables, and then we propose a test generation approach which builds tests for the single subsystems and combines them later in order to obtain tests for the system as a whole. Such an approach mitigates the exponential increase of the test generation time and memory consumption, and, compared with the same model-based test generation technique applied to the whole system, proves to be more efficient. We prove that, although not complete, the approach is sound. @InProceedings{ESEC/FSE15p119, author = {Paolo Arcaini and Angelo Gargantini and Elvinia Riccobene}, title = {Improving Model-Based Test Generation by Model Decomposition}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {119--130}, doi = {}, year = {2015}, } |
Arcuri, Andrea |
ESEC/FSE '15: "Generating TCP/UDP Network ..."
Generating TCP/UDP Network Data for Automated Unit Test Generation
Andrea Arcuri, Gordon Fraser, and Juan Pablo Galeotti (Scienta, Norway; University of Luxembourg, Luxembourg; University of Sheffield, UK; Saarland University, Germany) Although automated unit test generation techniques can in principle generate test suites that achieve high code coverage, in practice this is often inhibited by the dependence of the code under test on external resources. In particular, a common problem in modern programming languages is posed by code that involves networking (e.g., opening a TCP listening port). In order to generate tests for such code, we describe an approach where we mock (simulate) the networking interfaces of the Java standard library, such that a search-based test generator can treat the network as part of the test input space. This not only has the benefit that it overcomes many limitations of testing networking code (e.g., different tests binding to the same local ports, and deterministic resolution of hostnames and ephemeral ports), it also substantially increases code coverage. An evaluation on 23,886 classes from 110 open source projects, totalling more than 6.6 million lines of Java code, reveals that network access happens in 2,642 classes (11%). Our implementation of the proposed technique as part of the EVOSUITE testing tool addresses the networking code contained in 1,672 (63%) of these classes, and leads to an increase of the average line coverage from 29.1% to 50.8%. On a manual selection of 42 Java classes heavily depending on networking, line coverage with EVOSUITE more than doubled with the use of network mocking, increasing from 31.8% to 76.6%. @InProceedings{ESEC/FSE15p155, author = {Andrea Arcuri and Gordon Fraser and Juan Pablo Galeotti}, title = {Generating TCP/UDP Network Data for Automated Unit Test Generation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {155--165}, doi = {}, year = {2015}, } |
Aydin, Abdulbaki |
ESEC/FSE '15: "Automatically Computing Path ..."
Automatically Computing Path Complexity of Programs
Lucas Bang, Abdulbaki Aydin, and Tevfik Bultan (University of California at Santa Barbara, USA) Recent automated software testing techniques concentrate on achieving path coverage. We present a complexity measure that provides an upper bound for the number of paths in a program, and hence, can be used for assessing the difficulty of achieving path coverage for a given method. We define the path complexity of a program as a function that takes a depth bound as input and returns the number of paths in the control flow graph that are within that bound. We show how to automatically compute the path complexity function in closed form, and the asymptotic path complexity which identifies the dominant term in the path complexity function. Our results demonstrate that path complexity can be computed efficiently, and it is a better complexity measure for path coverage compared to cyclomatic complexity and NPATH complexity. @InProceedings{ESEC/FSE15p61, author = {Lucas Bang and Abdulbaki Aydin and Tevfik Bultan}, title = {Automatically Computing Path Complexity of Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {61--72}, doi = {}, year = {2015}, } Info |
Bae, Gigon |
ESEC/FSE '15: "On the Use of Delta Debugging ..."
On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications
Mouna Hammoudi, Brian Burg, Gigon Bae, and Gregg Rothermel (University of Nebraska-Lincoln, USA; University of Washington, USA) Recording the sequence of events that lead to a failure of a web application can be an effective aid for debugging. Nevertheless, a recording of an event sequence may include many events that are not related to a failure, and this may render debugging more difficult. To address this problem, we have adapted Delta Debugging to function on recordings of web applications, in a manner that lets it identify and discard portions of those recordings that do not influence the occurrence of a failure. We present the results of three empirical studies that show that (1) recording reduction can achieve significant reductions in recording size and replay time on actual web applications obtained from developer forums, (2) reduced recordings do in fact help programmers locate faults significantly more efficiently than, and no less effectively than, non-reduced recordings, and (3) recording reduction produces even greater reductions on larger, more complex applications. @InProceedings{ESEC/FSE15p333, author = {Mouna Hammoudi and Brian Burg and Gigon Bae and Gregg Rothermel}, title = {On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {333--344}, doi = {}, year = {2015}, } Info |
Bang, Lucas |
ESEC/FSE '15: "Automatically Computing Path ..."
Automatically Computing Path Complexity of Programs
Lucas Bang, Abdulbaki Aydin, and Tevfik Bultan (University of California at Santa Barbara, USA) Recent automated software testing techniques concentrate on achieving path coverage. We present a complexity measure that provides an upper bound for the number of paths in a program, and hence, can be used for assessing the difficulty of achieving path coverage for a given method. We define the path complexity of a program as a function that takes a depth bound as input and returns the number of paths in the control flow graph that are within that bound. We show how to automatically compute the path complexity function in closed form, and the asymptotic path complexity which identifies the dominant term in the path complexity function. Our results demonstrate that path complexity can be computed efficiently, and it is a better complexity measure for path coverage compared to cyclomatic complexity and NPATH complexity. @InProceedings{ESEC/FSE15p61, author = {Lucas Bang and Abdulbaki Aydin and Tevfik Bultan}, title = {Automatically Computing Path Complexity of Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {61--72}, doi = {}, year = {2015}, } Info |
Barr, Earl T. |
ESEC/FSE '15: "Suggesting Accurate Method ..."
Suggesting Accurate Method and Class Names
Miltiadis Allamanis, Earl T. Barr, Christian Bird, and Charles Sutton (University of Edinburgh, UK; University College London, UK; Microsoft Research, USA) Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state-of-the-art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering. @InProceedings{ESEC/FSE15p38, author = {Miltiadis Allamanis and Earl T. Barr and Christian Bird and Charles Sutton}, title = {Suggesting Accurate Method and Class Names}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {38--49}, doi = {}, year = {2015}, } Info
ESEC/FSE '15: "Is the Cure Worse Than the ..."
Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair
Edward K. Smith, Earl T. Barr, Claire Le Goues, and Yuriy Brun (University of Massachusetts at Amherst, USA; University College London, UK; Carnegie Mellon University, USA; University of Massachusetts, USA) Automated program repair has shown promise for reducing the significant manual effort debugging requires. This paper addresses a deficit of earlier evaluations of automated repair techniques caused by repairing programs and evaluating generated patches' correctness using the same set of tests. Since tests are an imperfect metric of program correctness, evaluations of this type do not discriminate between correct patches and patches that overfit the available tests and break untested but desired functionality. This paper evaluates two well-studied repair tools, GenProg and TrpAutoRepair, on a publicly available benchmark of bugs, each with a human-written patch. By evaluating patches using tests independent from those used during repair, we find that the tools are unlikely to improve the proportion of independent tests passed, and that the quality of the patches is proportional to the coverage of the test suite used during repair. For programs that pass most tests, the tools are as likely to break tests as to fix them. However, novice developers also overfit, and automated repair performs no worse than these developers. In addition to overfitting, we measure the effects of test suite coverage, test suite provenance, and starting program quality, as well as the difference in quality between novice-developer-written and tool-generated patches when quality is assessed with a test suite independent from the one used for patch generation. @InProceedings{ESEC/FSE15p532, author = {Edward K. Smith and Earl T. Barr and Claire Le Goues and Yuriy Brun}, title = {Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--543}, doi = {}, year = {2015}, } |
Bavota, Gabriele |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. 
@InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info
ESEC/FSE '15: "Optimizing Energy Consumption ..."
Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach
Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk (College of William and Mary, USA; Free University of Bolzano, Italy; University of Molise, Italy; University of Sannio, Italy) The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems—including apps—targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirms that in most cases users still perceived the choices of colors as attractive. Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements. @InProceedings{ESEC/FSE15p143, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Eduardo Bernal Cárdenas and Rocco Oliveto and Massimiliano Di Penta and Denys Poshyvanyk}, title = {Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {143--154}, doi = {}, year = {2015}, } Info Best-Paper Award |
Behrang, Farnaz |
ESEC/FSE '15: "Users Beware: Preference Inconsistencies ..."
Users Beware: Preference Inconsistencies Ahead
Farnaz Behrang, Myra B. Cohen, and Alessandro Orso (Georgia Tech, USA; University of Nebraska-Lincoln, USA) The structure of preferences for modern highly-configurable software systems has become extremely complex, usually consisting of multiple layers of access that go from the user interface down to the lowest levels of the source code. This complexity can lead to inconsistencies between layers, especially during software evolution. For example, there may be preferences that users can change through the GUI, but that have no effect on the actual behavior of the system because the related source code is not present or has been removed going from one version to the next. These inconsistencies may result in unexpected program behaviors, which range in severity from mild annoyances to more critical security or performance problems. To address this problem, we present SCIC (Software Configuration Inconsistency Checker), a static analysis technique that can automatically detect these kinds of inconsistencies. Unlike other configuration analysis tools, SCIC can handle software that (1) is written in multiple programming languages and (2) has a complex preference structure. In an empirical evaluation that we performed on 10 years' worth of versions of both the widely used Mozilla Core and Firefox, SCIC was able to find 40 real inconsistencies (some determined as severe), whose lifetime spanned multiple versions, and whose detection required the analysis of code written in multiple languages. @InProceedings{ESEC/FSE15p295, author = {Farnaz Behrang and Myra B. Cohen and Alessandro Orso}, title = {Users Beware: Preference Inconsistencies Ahead}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {295--306}, doi = {}, year = {2015}, } Best-Paper Award |
Bell, Jonathan |
ESEC/FSE '15: "Efficient Dependency Detection ..."
Efficient Dependency Detection for Safe Java Test Acceleration
Jonathan Bell, Gail Kaiser, Eric Melski, and Mohan Dattatreya (Columbia University, USA; Electric Cloud, USA) Slow builds remain a plague for software developers. The frequency with which code can be built (compiled, tested and packaged) directly impacts the productivity of developers: longer build times mean a longer wait before determining if a change to the application being built was successful. We have discovered that in the case of some languages, such as Java, the majority of build time is spent running tests, where dependencies between individual tests are complicated to discover, making many existing test acceleration techniques unsound to deploy in practice. Without knowledge of which tests are dependent on others, we cannot safely parallelize the execution of the tests, nor can we perform incremental testing (i.e., execute only a subset of an application's tests for each build). The previous techniques for detecting these dependencies did not scale to large test suites: given a test suite that normally ran in two hours, the best-case running scenario for the previous tool would have taken over 422 CPU days to find dependencies between all test methods (and would not soundly find all dependencies) — on the same project the exhaustive technique (to find all dependencies) would have taken over 1e300 years. We present a novel approach to detecting all dependencies between test cases in large projects that can enable safe exploitation of parallelism and test selection with a modest analysis cost. @InProceedings{ESEC/FSE15p770, author = {Jonathan Bell and Gail Kaiser and Eric Melski and Mohan Dattatreya}, title = {Efficient Dependency Detection for Safe Java Test Acceleration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {770--781}, doi = {}, year = {2015}, } |
Beller, Moritz |
ESEC/FSE '15: "When, How, and Why Developers ..."
When, How, and Why Developers (Do Not) Test in Their IDEs
Moritz Beller, Georgios Gousios, Annibale Panichella, and Andy Zaidman (Delft University of Technology, Netherlands; Radboud University Nijmegen, Netherlands) The research community in Software Engineering and Software Testing in particular builds many of its contributions on a set of mutually shared expectations. Despite the fact that they form the basis of many publications as well as open-source and commercial testing applications, these common expectations and beliefs are rarely ever questioned. For example, Frederick Brooks’ statement that testing takes half of the development time seems to have manifested itself within the community since he first made it in the “Mythical Man Month” in 1975. With this paper, we report on the surprising results of a large-scale field study with 416 software engineers whose development activity we closely monitored over the course of five months, resulting in over 13 years of recorded work time in their integrated development environments (IDEs). Our findings question several commonly shared assumptions and beliefs about testing and might be contributing factors to the observed bug proneness of software in practice: the majority of developers in our study does not test; developers rarely run their tests in the IDE; Test-Driven Development (TDD) is not widely practiced; and, last but not least, software developers only spend a quarter of their work time engineering tests, whereas they think they test half of their time. @InProceedings{ESEC/FSE15p179, author = {Moritz Beller and Georgios Gousios and Annibale Panichella and Andy Zaidman}, title = {When, How, and Why Developers (Do Not) Test in Their IDEs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {179--190}, doi = {}, year = {2015}, } |
Bellomo, Stephany |
ESEC/FSE '15: "Measure It? Manage It? Ignore ..."
Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt
Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton (SEI, USA) The technical debt metaphor is widely used to encapsulate numerous software quality problems. The metaphor is attractive to practitioners as it communicates to both technical and nontechnical audiences that if quality problems are not addressed, things may get worse. However, it is unclear whether there are practices that move this metaphor beyond a mere communication mechanism. Existing studies of technical debt have largely focused on code metrics and small surveys of developers. In this paper, we report on our survey of 1,831 participants, primarily software engineers and architects working in long-lived, software-intensive projects from three large organizations, and follow-up interviews of seven software engineers. We analyzed our data using both nonparametric statistics and qualitative text analysis. We found that architectural decisions are the most important source of technical debt. Furthermore, while respondents believe the metaphor is itself important for communication, existing tools are not currently helpful in managing the details. We use our results to motivate a technical debt timeline to focus management and tooling approaches. @InProceedings{ESEC/FSE15p50, author = {Neil A. Ernst and Stephany Bellomo and Ipek Ozkaya and Robert L. Nord and Ian Gorton}, title = {Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {50--60}, doi = {}, year = {2015}, } Info Best-Paper Award |
Bernal-Cárdenas, Carlos |
ESEC/FSE '15: "Auto-completing Bug Reports ..."
Auto-completing Bug Reports for Android Applications
Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (College of William and Mary, USA) The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these “apps” has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that directly lead to a bug’s resolution. To address the need for an improved reporting system, we introduce a novel solution, called FUSION, that helps users auto-complete reproduction steps in bug reports for mobile apps. FUSION links user-provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. The approach that FUSION employs is generalizable to other current mobile software platforms, and constitutes a new method by which off-device bug reporting can be conducted for mobile software projects. In a study involving 28 participants, we applied FUSION to support the maintenance tasks of reporting and reproducing defects from 15 real-world bugs found in 14 open source Android apps while quantitatively and qualitatively measuring the user experience of the system. Our results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information. @InProceedings{ESEC/FSE15p673, author = {Kevin Moran and Mario Linares-Vásquez and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Auto-completing Bug Reports for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {673--686}, doi = {}, year = {2015}, } Video Info |
|
Beyer, Dirk |
ESEC/FSE '15: "Witness Validation and Stepwise ..."
Witness Validation and Stepwise Testification across Software Verifiers
Dirk Beyer , Matthias Dangl, Daniel Dietsch, Matthias Heizmann, and Andreas Stahlbauer (University of Passau, Germany; University of Freiburg, Germany) It is commonly understood that a verification tool should provide a counterexample to witness a specification violation. Until recently, software verifiers dumped error witnesses in proprietary formats, which are often neither human- nor machine-readable, and an exchange of witnesses between different verifiers was impossible. To close this gap in software-verification technology, we have defined an exchange format for error witnesses that is easy to write and read by verification tools (for further processing, e.g., witness validation) and that is easy to convert into visualizations that conveniently let developers inspect an error path. To eliminate manual inspection of false alarms, we develop the notion of stepwise testification: in a first step, a verifier finds a problematic program path and, in addition to the verification result FALSE, constructs a witness for this path; in the next step, another verifier re-verifies that the witness indeed violates the specification. This process can have more than two steps, each reducing the state space around the error path, making it easier to validate the witness in a later step. An obvious application for testification is the setting where we have two verifiers: one that is efficient but imprecise and another one that is precise but expensive. We have implemented the technique of error-witness-driven program analysis in two state-of-the-art verification tools, CPAchecker and Ultimate Automizer, and show by experimental evaluation that the approach is applicable to a large set of verification tasks. 
@InProceedings{ESEC/FSE15p721, author = {Dirk Beyer and Matthias Dangl and Daniel Dietsch and Matthias Heizmann and Andreas Stahlbauer}, title = {Witness Validation and Stepwise Testification across Software Verifiers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {721--733}, doi = {}, year = {2015}, } Info |
|
Bianchi, Antonio |
ESEC/FSE '15: "CLAPP: Characterizing Loops ..."
CLAPP: Characterizing Loops in Android Applications
Yanick Fratantonio, Aravind Machiry, Antonio Bianchi, Christopher Kruegel, and Giovanni Vigna (University of California at Santa Barbara, USA) When performing program analysis, loops are one of the most important aspects that need to be taken into account. In the past, many approaches have been proposed to analyze loops to perform different tasks, ranging from compiler optimizations to Worst-Case Execution Time (WCET) analysis. While these approaches are powerful, they focus on tackling very specific categories of loops and known loop patterns, such as the ones for which the number of iterations can be statically determined. In this work, we developed a static analysis framework to characterize and analyze generic loops, without relying on techniques based on pattern matching. For this work, we focus on the Android platform, and we implemented a prototype, called CLAPP, that we used to perform the first large-scale empirical study of the usage of loops in Android applications. In particular, we used our tool to analyze a total of 4,110,510 loops found in 11,823 Android applications. As part of our evaluation, we provide the detailed results of our empirical study, we show how our analysis was able to determine that the execution of 63.28% of the loops is bounded, and we discuss several interesting insights related to the performance issues and security aspects associated with loops. @InProceedings{ESEC/FSE15p687, author = {Yanick Fratantonio and Aravind Machiry and Antonio Bianchi and Christopher Kruegel and Giovanni Vigna}, title = {CLAPP: Characterizing Loops in Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {687--697}, doi = {}, year = {2015}, } Info |
|
Bird, Christian |
ESEC/FSE '15: "Suggesting Accurate Method ..."
Suggesting Accurate Method and Class Names
Miltiadis Allamanis, Earl T. Barr , Christian Bird, and Charles Sutton (University of Edinburgh, UK; University College London, UK; Microsoft Research, USA) Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state of the art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering. @InProceedings{ESEC/FSE15p38, author = {Miltiadis Allamanis and Earl T. Barr and Christian Bird and Charles Sutton}, title = {Suggesting Accurate Method and Class Names}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {38--49}, doi = {}, year = {2015}, } Info |
|
Blanc, Xavier |
ESEC/FSE '15: "Impact of Developer Turnover ..."
Impact of Developer Turnover on Quality in Open-Source Software
Matthieu Foucault, Marc Palyart, Xavier Blanc, Gail C. Murphy, and Jean-Rémy Falleri (University of Bordeaux, France; University of British Columbia, Canada) Turnover is the phenomenon of continuous influx and retreat of human resources in a team. Despite being well-studied in many settings, turnover has not been characterized for open-source software projects. We study the source code repositories of five open-source projects to characterize patterns of turnover and to determine the effects of turnover on software quality. We define the base concepts of both external and internal turnover, which are the mobility of developers in and out of a project, and the mobility of developers inside a project, respectively. We provide a qualitative analysis of turnover patterns. We also find, in a quantitative analysis, that the activity of external newcomers negatively impacts software quality. @InProceedings{ESEC/FSE15p829, author = {Matthieu Foucault and Marc Palyart and Xavier Blanc and Gail C. Murphy and Jean-Rémy Falleri}, title = {Impact of Developer Turnover on Quality in Open-Source Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {829--841}, doi = {}, year = {2015}, } Info |
|
Borges, Mateus |
ESEC/FSE '15: "Iterative Distribution-Aware ..."
Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution
Mateus Borges, Antonio Filieri, Marcelo d'Amorim, and Corina S. Păsăreanu (University of Stuttgart, Germany; Federal University of Pernambuco, Brazil; Carnegie Mellon University, USA; NASA Ames Research Center, USA) Probabilistic symbolic execution aims at quantifying the probability of reaching program events of interest assuming that program inputs follow given probabilistic distributions. The technique collects constraints on the inputs that lead to the target events and analyzes them to quantify how likely it is for an input to satisfy the constraints. Current techniques either handle only linear constraints or only support continuous distributions using a “discretization” of the input domain, leading to imprecise and costly results. We propose an iterative distribution-aware sampling approach to support probabilistic symbolic execution for arbitrarily complex mathematical constraints and continuous input distributions. We follow a compositional approach, where the symbolic constraints are decomposed into sub-problems whose solutions can be computed independently. At each iteration the convergence rate of the computation is increased by automatically refocusing the analysis on estimating the sub-problems that mostly affect the accuracy of the results, as guided by three different ranking strategies. Experiments on publicly available benchmarks show that the proposed technique improves on previous approaches in terms of scalability and accuracy of the results. @InProceedings{ESEC/FSE15p866, author = {Mateus Borges and Antonio Filieri and Marcelo d'Amorim and Corina S. Păsăreanu}, title = {Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--877}, doi = {}, year = {2015}, } Info |
|
Braione, Pietro |
ESEC/FSE '15: "Symbolic Execution of Programs ..."
Symbolic Execution of Programs with Heap Inputs
Pietro Braione, Giovanni Denaro, and Mauro Pezzè (University of Milano-Bicocca, Italy; University of Lugano, Switzerland) Symbolic analysis is a core component of many automatic test generation and program verification approaches. To verify complex software systems, test and analysis techniques must deal with the many aspects of the target systems at different granularity levels. In particular, testing software programs that make extensive use of heap data structures at unit and integration levels requires generating suitable input data structures in the heap. This is a major challenge for symbolic testing and analysis techniques that work well when dealing with numeric inputs, but do not satisfactorily cope with heap data structures yet. In this paper we propose a language, HEX, to specify invariants of partially initialized data structures, and a decision procedure that supports the incremental evaluation of structural properties in HEX. Used in combination with the symbolic execution of heap manipulating programs, HEX prevents the exploration of invalid states, thus improving the efficiency of program testing and analysis, and avoiding false alarms that negatively impact verification activities. The experimental data confirm that HEX is an effective and efficient solution to the problem of testing and analyzing heap manipulating programs, and outperforms the alternative approaches that have been proposed so far. @InProceedings{ESEC/FSE15p602, author = {Pietro Braione and Giovanni Denaro and Mauro Pezzè}, title = {Symbolic Execution of Programs with Heap Inputs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {602--613}, doi = {}, year = {2015}, } |
|
Briand, Lionel C. |
ESEC/FSE '15: "Effective Test Suites for ..."
Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers
Reza Matinnejad, Shiva Nejati, Lionel C. Briand , and Thomas Bruckmann (University of Luxembourg, Luxembourg; Delphi Automotive Systems, Luxembourg) Modeling mixed discrete-continuous controllers using Stateflow is common practice and has a long tradition in the embedded software system industry. Testing Stateflow models is complicated by expensive and manual test oracles that are not amenable to full automation due to the complex continuous behaviors of such models. In this paper, we reduce the cost of manual test oracles by providing test case selection algorithms that help engineers develop small test suites with high fault revealing power for Stateflow models. We present six test selection algorithms for discrete-continuous Stateflows: An adaptive random test selection algorithm that diversifies test inputs, two white-box coverage-based algorithms, a black-box algorithm that diversifies test outputs, and two search-based black-box algorithms that aim to maximize the likelihood of presence of continuous output failure patterns. We evaluate and compare our test selection algorithms, and find that our three output-based algorithms consistently outperform the coverage- and input-based algorithms in revealing faults in discrete-continuous Stateflow models. Further, we show that our output-based algorithms are complementary as the two search-based algorithms perform best in revealing specific failures with small test suites, while the output diversity algorithm is able to identify different failure types better than other algorithms when test suites are above a certain size. @InProceedings{ESEC/FSE15p84, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand and Thomas Bruckmann}, title = {Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--95}, doi = {}, year = {2015}, } Best-Paper Award |
|
Bruckmann, Thomas |
ESEC/FSE '15: "Effective Test Suites for ..."
Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers
Reza Matinnejad, Shiva Nejati, Lionel C. Briand , and Thomas Bruckmann (University of Luxembourg, Luxembourg; Delphi Automotive Systems, Luxembourg) Modeling mixed discrete-continuous controllers using Stateflow is common practice and has a long tradition in the embedded software system industry. Testing Stateflow models is complicated by expensive and manual test oracles that are not amenable to full automation due to the complex continuous behaviors of such models. In this paper, we reduce the cost of manual test oracles by providing test case selection algorithms that help engineers develop small test suites with high fault revealing power for Stateflow models. We present six test selection algorithms for discrete-continuous Stateflows: An adaptive random test selection algorithm that diversifies test inputs, two white-box coverage-based algorithms, a black-box algorithm that diversifies test outputs, and two search-based black-box algorithms that aim to maximize the likelihood of presence of continuous output failure patterns. We evaluate and compare our test selection algorithms, and find that our three output-based algorithms consistently outperform the coverage- and input-based algorithms in revealing faults in discrete-continuous Stateflow models. Further, we show that our output-based algorithms are complementary as the two search-based algorithms perform best in revealing specific failures with small test suites, while the output diversity algorithm is able to identify different failure types better than other algorithms when test suites are above a certain size. @InProceedings{ESEC/FSE15p84, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand and Thomas Bruckmann}, title = {Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--95}, doi = {}, year = {2015}, } Best-Paper Award |
|
Brumley, David |
ESEC/FSE '15: "Automatically Deriving Pointer ..."
Automatically Deriving Pointer Reference Expressions from Binary Code for Memory Dump Analysis
Yangchun Fu, Zhiqiang Lin, and David Brumley (University of Texas at Dallas, USA; Carnegie Mellon University, USA) Given a crash dump or a kernel memory snapshot, it is often desirable to have a capability that can traverse its pointers to locate the root cause of the crash, or check their integrity to detect control flow hijacks. To achieve this, one key challenge lies in how to locate where the pointers are. While locating a pointer usually requires the data structure knowledge of the corresponding program, an important advance made by this work is that we show a technique for extracting address-independent data reference expressions for pointers through dynamic binary analysis. This novel pointer reference expression encodes how a pointer is accessed through the combination of a base address (usually a global variable) with a certain offset and further pointer dereferences. We have applied our techniques to OS kernels, and our experimental results with a number of real-world kernel malware show that we can correctly identify the hijacked kernel function pointers by locating them using the extracted pointer reference expressions when only given a memory snapshot. @InProceedings{ESEC/FSE15p614, author = {Yangchun Fu and Zhiqiang Lin and David Brumley}, title = {Automatically Deriving Pointer Reference Expressions from Binary Code for Memory Dump Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {614--624}, doi = {}, year = {2015}, } |
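The core idea of the abstract above — a pointer reference expression that starts at a base address, then alternates adding an offset with dereferencing — can be illustrated with a minimal sketch. The snapshot layout, addresses, and function names below are hypothetical, not the paper's implementation:

```python
# Toy "memory snapshot": maps an address to the word stored there.
# Layout is invented for illustration: a global at 0x1000 points to a
# struct at 0x2000 whose field at offset 8 holds a function pointer.
SNAPSHOT = {
    0x1000: 0x2000,
    0x2008: 0x4000,
}

def resolve(snapshot, base, offsets):
    """Evaluate a pointer reference expression: start from a base address,
    then repeatedly add an offset and dereference the resulting address."""
    addr = base
    for off in offsets:
        addr = snapshot[addr + off]  # one (offset, dereference) step
    return addr

def pointer_intact(snapshot, base, offsets, legitimate_targets):
    """Integrity check in the spirit of the abstract: the located function
    pointer should still point at a known-legitimate target."""
    return resolve(snapshot, base, offsets) in legitimate_targets
```

With this layout, `resolve(SNAPSHOT, 0x1000, [0, 8])` walks `0x1000 → 0x2000 → 0x4000`, and `pointer_intact` would flag a hijack if `0x4000` were not in the expected set. The real system derives the expressions from binary analysis rather than taking them as input.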
|
Brun, Yuriy |
ESEC/FSE '15: "Is the Cure Worse Than the ..."
Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair
Edward K. Smith, Earl T. Barr , Claire Le Goues , and Yuriy Brun (University of Massachusetts at Amherst, USA; University College London, UK; Carnegie Mellon University, USA; University of Massachusetts, USA) Automated program repair has shown promise for reducing the significant manual effort debugging requires. This paper addresses a deficit of earlier evaluations of automated repair techniques caused by repairing programs and evaluating generated patches' correctness using the same set of tests. Since tests are an imperfect metric of program correctness, evaluations of this type do not discriminate between correct patches and patches that overfit the available tests and break untested but desired functionality. This paper evaluates two well-studied repair tools, GenProg and TrpAutoRepair, on a publicly available benchmark of bugs, each with a human-written patch. By evaluating patches using tests independent from those used during repair, we find that the tools are unlikely to improve the proportion of independent tests passed, and that the quality of the patches is proportional to the coverage of the test suite used during repair. For programs that pass most tests, the tools are as likely to break tests as to fix them. However, novice developers also overfit, and automated repair performs no worse than these developers. In addition to overfitting, we measure the effects of test suite coverage, test suite provenance, and starting program quality, as well as the difference in quality between novice-developer-written and tool-generated patches when quality is assessed with a test suite independent from the one used for patch generation. @InProceedings{ESEC/FSE15p532, author = {Edward K. Smith and Earl T. Barr and Claire Le Goues and Yuriy Brun}, title = {Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--543}, doi = {}, year = {2015}, } |
|
Budianto, Enrico |
ESEC/FSE '15: "Auto-patching DOM-Based XSS ..."
Auto-patching DOM-Based XSS at Scale
Inian Parameshwaran, Enrico Budianto, Shweta Shinde, Hung Dang, Atul Sadhu, and Prateek Saxena (National University of Singapore, Singapore) DOM-based cross-site scripting (XSS) is a client-side code injection vulnerability that results from unsafe dynamic code generation in JavaScript applications, and has few known practical defenses. We study dynamic code evaluation practices on nearly a quarter million URLs crawled starting from the Alexa Top 1000 websites. Of 777,082 cases of dynamic HTML/JS code generation we observe, 13.3% use unsafe string interpolation for dynamic code generation — a well-known dangerous coding practice. To remedy this, we propose a technique to generate secure patches that replace unsafe string interpolation with safer code that utilizes programmatic DOM construction techniques. Our system transparently auto-patches the vulnerable site while incurring only 5.2–8.07% overhead. The patching mechanism requires no access to server-side code or modification to browsers, and thus is practical as a turnkey defense. @InProceedings{ESEC/FSE15p272, author = {Inian Parameshwaran and Enrico Budianto and Shweta Shinde and Hung Dang and Atul Sadhu and Prateek Saxena}, title = {Auto-patching DOM-Based XSS at Scale}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {272--283}, doi = {}, year = {2015}, } Info |
|
Bultan, Tevfik |
ESEC/FSE '15: "Automatically Computing Path ..."
Automatically Computing Path Complexity of Programs
Lucas Bang, Abdulbaki Aydin, and Tevfik Bultan (University of California at Santa Barbara, USA) Recent automated software testing techniques concentrate on achieving path coverage. We present a complexity measure that provides an upper bound for the number of paths in a program, and hence, can be used for assessing the difficulty of achieving path coverage for a given method. We define the path complexity of a program as a function that takes a depth bound as input and returns the number of paths in the control flow graph that are within that bound. We show how to automatically compute the path complexity function in closed form, and the asymptotic path complexity which identifies the dominant term in the path complexity function. Our results demonstrate that path complexity can be computed efficiently, and it is a better complexity measure for path coverage compared to cyclomatic complexity and NPATH complexity. @InProceedings{ESEC/FSE15p61, author = {Lucas Bang and Abdulbaki Aydin and Tevfik Bultan}, title = {Automatically Computing Path Complexity of Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {61--72}, doi = {}, year = {2015}, } Info |
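The path complexity function described above — the number of control-flow paths within a given depth bound — can be sketched with a small dynamic program over a CFG. The graph below is a hypothetical example (a single self-loop), and the code counts paths by brute-force recursion rather than deriving the closed form the paper computes:

```python
from functools import lru_cache

# Hypothetical CFG as an adjacency list: entry -> a; a loops on itself
# or exits. The self-loop on "a" models a program loop.
CFG = {
    "entry": ["a"],
    "a": ["a", "exit"],
    "exit": [],
}

def path_complexity(cfg, entry, exit_node, depth):
    """Count control-flow paths from entry to exit using at most `depth` edges
    (the role played by the depth bound in the path complexity function)."""
    @lru_cache(maxsize=None)
    def count(node, remaining):
        # A path may terminate here only if this is the exit node.
        total = 1 if node == exit_node else 0
        if remaining == 0:
            return total
        # Otherwise, extend the path along each successor edge.
        return total + sum(count(succ, remaining - 1) for succ in cfg[node])
    return count(entry, depth)
```

On this CFG the function grows linearly in the depth bound (one extra path per additional loop iteration), illustrating why an asymptotic path complexity with a dominant term is meaningful; programs with nested or branching loops yield polynomial or exponential growth instead.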
|
Burg, Brian |
ESEC/FSE '15: "On the Use of Delta Debugging ..."
On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications
Mouna Hammoudi, Brian Burg, Gigon Bae, and Gregg Rothermel (University of Nebraska-Lincoln, USA; University of Washington, USA) Recording the sequence of events that lead to a failure of a web application can be an effective aid for debugging. Nevertheless, a recording of an event sequence may include many events that are not related to a failure, and this may render debugging more difficult. To address this problem, we have adapted Delta Debugging to function on recordings of web applications, in a manner that lets it identify and discard portions of those recordings that do not influence the occurrence of a failure. We present the results of three empirical studies that show that (1) recording reduction can achieve significant reductions in recording size and replay time on actual web applications obtained from developer forums, (2) reduced recordings do in fact help programmers locate faults significantly more efficiently than, and no less effectively than, non-reduced recordings, and (3) recording reduction produces even greater reductions on larger, more complex applications. @InProceedings{ESEC/FSE15p333, author = {Mouna Hammoudi and Brian Burg and Gigon Bae and Gregg Rothermel}, title = {On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {333--344}, doi = {}, year = {2015}, } Info |
|
Cai, Yan |
ESEC/FSE '15: "Effective and Precise Dynamic ..."
Effective and Precise Dynamic Detection of Hidden Races for Java Programs
Yan Cai and Lingwei Cao (Institute of Software at Chinese Academy of Sciences, China) The happens-before relation is widely used to detect data races dynamically. However, it could easily hide many data races, as it is interleaving-sensitive. Existing techniques based on randomized scheduling are ineffective at detecting these hidden races. In this paper, we propose DrFinder, an effective and precise dynamic technique to detect hidden races. Given an execution, DrFinder first analyzes the lock acquisitions in it and collects a set of "may-trigger" relations. Each may-trigger relation consists of a method and a type of a Java object. It indicates that, during execution, the method may directly or indirectly acquire a lock of the type. In the subsequent executions of the same program, DrFinder actively schedules the execution according to the set of collected may-trigger relations. It aims to reverse the happens-before relations that may exist in the previous executions so as to expose those hidden races. To effectively detect hidden races in each execution, DrFinder also collects a new set of may-trigger relations during its scheduling, which is used in its next scheduling. Our experiment on a suite of real-world Java multithreaded programs shows that DrFinder is effective, detecting 89 new data races in 10 runs. Many of these races could not be detected by existing techniques (i.e., FastTrack, ConTest, and PCT) even in 100 runs. @InProceedings{ESEC/FSE15p450, author = {Yan Cai and Lingwei Cao}, title = {Effective and Precise Dynamic Detection of Hidden Races for Java Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {450--461}, doi = {}, year = {2015}, } |
|
Cámara, Javier |
ESEC/FSE '15: "Proactive Self-Adaptation ..."
Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach
Gabriel A. Moreno, Javier Cámara, David Garlan, and Bradley Schmerl (SEI, USA; Carnegie Mellon University, USA) Self-adaptive systems tend to be reactive and myopic, adapting in response to changes without anticipating what the subsequent adaptation needs will be. Adapting reactively can result in inefficiencies due to the system performing a suboptimal sequence of adaptations. Furthermore, when adaptations have latency, and take some time to produce their effect, they have to be started with sufficient lead time so that they complete by the time their effect is needed. Proactive latency-aware adaptation addresses these issues by making adaptation decisions with a look-ahead horizon and taking adaptation latency into account. In this paper we present an approach for proactive latency-aware adaptation under uncertainty that uses probabilistic model checking for adaptation decisions. The key idea is to use a formal model of the adaptive system in which the adaptation decision is left underspecified through nondeterminism, and have the model checker resolve the nondeterministic choices so that the accumulated utility over the horizon is maximized. The adaptation decision is optimal over the horizon, and takes into account the inherent uncertainty of the environment predictions needed for looking ahead. Our results show that the decision based on a look-ahead horizon, and the factoring of both tactic latency and environment uncertainty, considerably improve the effectiveness of adaptation decisions. @InProceedings{ESEC/FSE15p1, author = {Gabriel A. Moreno and Javier Cámara and David Garlan and Bradley Schmerl}, title = {Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1--12}, doi = {}, year = {2015}, } |
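The decision problem sketched in the abstract above — resolve the adapt/stay choices so that accumulated utility over a look-ahead horizon is maximized, while accounting for adaptation latency — can be illustrated with a brute-force search in place of the probabilistic model checker. The utilities, latency, and configuration names below are invented for illustration and ignore environment uncertainty:

```python
from itertools import product

# Hypothetical predicted utility per step for each configuration over a
# 4-step horizon: adapting pays off only after the tactic's latency elapses.
UTILITY = {"base": [1, 1, 1, 1], "scaled": [0, 3, 3, 3]}
LATENCY = 1  # steps between starting the adaptation and it taking effect

def best_plan(horizon):
    """Exhaustively resolve the nondeterministic choices (which the model
    checker does symbolically): try every 'start adapting at step t' plan
    and keep the one with the highest accumulated utility."""
    best = None
    for plan in product([False, True], repeat=horizon):
        config, started, total = "base", None, 0
        for t in range(horizon):
            if plan[t] and started is None and config == "base":
                started = t                      # tactic launched, not yet effective
            if started is not None and t - started >= LATENCY:
                config = "scaled"                # latency elapsed: effect kicks in
            total += UTILITY[config][t]
        if best is None or total > best[0]:
            best = (total, plan)
    return best
```

With these numbers, adapting proactively at step 0 is optimal: the one-step latency is absorbed while the utility gap is small, which is exactly the lead-time argument the abstract makes against purely reactive adaptation.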
|
Campos, José |
ESEC/FSE '15: "Modeling Readability to Improve ..."
Modeling Readability to Improve Unit Tests
Ermira Daka, José Campos, Gordon Fraser, Jonathan Dorn, and Westley Weimer (University of Sheffield, UK; University of Virginia, USA) Writing good unit tests can be tedious and error prone, but even once they are written, the job is not done: Developers need to reason about unit tests throughout software development and evolution, in order to diagnose test failures, maintain the tests, and to understand code written by other developers. Unreadable tests are more difficult to maintain and lose some of their value to developers. To overcome this problem, we propose a domain-specific model of unit test readability based on human judgements, and use this model to augment automated unit test generation. The resulting approach can automatically generate test suites with both high coverage and also improved readability. In human studies users prefer our improved tests and are able to answer maintenance questions about them 14% more quickly at the same level of accuracy. @InProceedings{ESEC/FSE15p107, author = {Ermira Daka and José Campos and Gordon Fraser and Jonathan Dorn and Westley Weimer}, title = {Modeling Readability to Improve Unit Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {107--118}, doi = {}, year = {2015}, } Best-Paper Award |
|
Cao, Lingwei |
ESEC/FSE '15: "Effective and Precise Dynamic ..."
Effective and Precise Dynamic Detection of Hidden Races for Java Programs
Yan Cai and Lingwei Cao (Institute of Software at Chinese Academy of Sciences, China) The happens-before relation is widely used to detect data races dynamically. However, it could easily hide many data races, as it is interleaving-sensitive. Existing techniques based on randomized scheduling are ineffective at detecting these hidden races. In this paper, we propose DrFinder, an effective and precise dynamic technique to detect hidden races. Given an execution, DrFinder first analyzes the lock acquisitions in it and collects a set of "may-trigger" relations. Each may-trigger relation consists of a method and a type of a Java object. It indicates that, during execution, the method may directly or indirectly acquire a lock of the type. In the subsequent executions of the same program, DrFinder actively schedules the execution according to the set of collected may-trigger relations. It aims to reverse the happens-before relations that may exist in the previous executions so as to expose those hidden races. To effectively detect hidden races in each execution, DrFinder also collects a new set of may-trigger relations during its scheduling, which is used in its next scheduling. Our experiment on a suite of real-world Java multithreaded programs shows that DrFinder is effective, detecting 89 new data races in 10 runs. Many of these races could not be detected by existing techniques (i.e., FastTrack, ConTest, and PCT) even in 100 runs. @InProceedings{ESEC/FSE15p450, author = {Yan Cai and Lingwei Cao}, title = {Effective and Precise Dynamic Detection of Hidden Races for Java Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {450--461}, doi = {}, year = {2015}, } |
|
Cárdenas, Carlos Eduardo Bernal |
ESEC/FSE '15: "Optimizing Energy Consumption ..."
Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach
Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk (College of William and Mary, USA; Free University of Bolzano, Italy; University of Molise, Italy; University of Sannio, Italy) The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems—including apps—targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirms that in most cases users still perceived the choices of colors as attractive. Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements. @InProceedings{ESEC/FSE15p143, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Eduardo Bernal Cárdenas and Rocco Oliveto and Massimiliano Di Penta and Denys Poshyvanyk}, title = {Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {143--154}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
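The GEMMA abstract above balances three objectives: energy, contrast, and consistency with the original palette. A minimal sketch of what such a multi-objective evaluation can look like follows; the energy proxy, the WCAG-style contrast ratio, and all names here are illustrative assumptions, not GEMMA's actual implementation.

```python
# Toy multi-objective evaluation for a two-color palette (foreground, background)
# on an OLED display. Assumptions: energy grows with summed channel intensity
# (darker OLED pixels draw less power); contrast is a relative-luminance ratio;
# consistency is Euclidean distance from the original palette.

def luminance(rgb):
    r, g, b = (c / 255.0 for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def energy(rgb):
    # Proxy only: normalized summed intensity of the three channels.
    return sum(rgb) / (3 * 255.0)

def contrast(fg, bg):
    lo, hi = sorted([luminance(fg), luminance(bg)])
    return (hi + 0.05) / (lo + 0.05)

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def objectives(palette, original):
    (fg, bg), (ofg, obg) = palette, original
    return (energy(bg),                             # minimize
            -contrast(fg, bg),                      # maximize (negated)
            distance(fg, ofg) + distance(bg, obg))  # minimize

# A light-on-dark palette consumes less (proxy) energy than the white
# background it replaces, at comparable contrast.
white_bg = ((0, 0, 0), (255, 255, 255))
dark_bg = ((230, 230, 230), (10, 10, 10))
assert objectives(dark_bg, white_bg)[0] < objectives(white_bg, white_bg)[0]
```

A real optimizer (the paper uses a multi-objective search) would explore palettes along the trade-off front among these three scores rather than collapsing them into one number.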
Casalnuovo, Casey |
ESEC/FSE '15: "Developer Onboarding in GitHub: ..."
Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience
Casey Casalnuovo, Bogdan Vasilescu, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA) The team aspects of software engineering have been a subject of great interest since early work by Fred Brooks and others: how well do people work together in teams? why do people join teams? what happens if teams are distributed? Recently, the emergence of project ecosystems such as GitHub has created an entirely new, higher level of organization. GitHub supports numerous teams; they share a common technical platform (for work activities) and a common social platform (via following, commenting, etc.). We explore the GitHub evidence for socialization as a precursor to joining a project, and how the technical factors of past experience and social factors of past connections to team members of a project affect productivity both initially and in the long run. We find developers preferentially join projects in GitHub where they have pre-existing relationships; furthermore, we find that the presence of past social connections combined with prior experience in languages dominant in the project leads to higher productivity both initially and cumulatively. Interestingly, we also find that stronger social connections are associated with slightly less productivity initially, but slightly more productivity in the long run. @InProceedings{ESEC/FSE15p817, author = {Casey Casalnuovo and Bogdan Vasilescu and Premkumar Devanbu and Vladimir Filkov}, title = {Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {817--828}, doi = {}, year = {2015}, } |
|
Chandra, Satish |
ESEC/FSE '15: "Mimic: Computing Models for ..."
Mimic: Computing Models for Opaque Code
Stefan Heule, Manu Sridharan, and Satish Chandra (Stanford University, USA; Samsung Research, USA) Opaque code, which is executable but whose source is unavailable or hard to process, can be problematic in a number of scenarios, such as program analysis. Manual construction of models is often used to handle opaque code, but this process is tedious and error-prone. (In this paper, we use model to mean a representation of a piece of code suitable for program analysis.) We present a novel technique for automatic generation of models for opaque code, based on program synthesis. The technique intercepts memory accesses from the opaque code to client objects, and uses this information to construct partial execution traces. Then, it performs a heuristic search inspired by Markov Chain Monte Carlo techniques to discover an executable code model whose behavior matches the opaque code. Native execution, parallelization, and a carefully-designed fitness function are leveraged to increase the effectiveness of the search. We have implemented our technique in a tool Mimic for discovering models of opaque JavaScript functions, and used Mimic to synthesize correct models for a variety of array-manipulating routines. @InProceedings{ESEC/FSE15p710, author = {Stefan Heule and Manu Sridharan and Satish Chandra}, title = {Mimic: Computing Models for Opaque Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {710--720}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "MemInsight: Platform-Independent ..." MemInsight: Platform-Independent Memory Debugging for JavaScript Simon Holm Jensen, Manu Sridharan, Koushik Sen, and Satish Chandra (Snowflake Computing, USA; Samsung Research, USA; University of California at Berkeley, USA) JavaScript programs often suffer from memory issues that can either hurt performance or eventually cause memory exhaustion. 
While existing snapshot-based profiling tools can be helpful, the information provided is limited to the coarse granularity at which snapshots can be taken. We present MemInsight, a tool that provides detailed, time-varying analysis of the memory behavior of JavaScript applications, including web applications. MemInsight is platform independent and runs on unmodified JavaScript engines. It employs tuned source-code instrumentation to generate a trace of memory allocations and accesses, and it leverages modern browser features to track precise information for DOM (document object model) objects. It also computes exact object lifetimes without any garbage collector assistance, and exposes this information in an easily-consumable manner for further analysis. We describe several client analyses built into MemInsight, including detection of possible memory leaks and opportunities for stack allocation and object inlining. An experimental evaluation showed that with no modifications to the runtime, MemInsight was able to expose memory issues in several real-world applications. @InProceedings{ESEC/FSE15p345, author = {Simon Holm Jensen and Manu Sridharan and Koushik Sen and Satish Chandra}, title = {MemInsight: Platform-Independent Memory Debugging for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--356}, doi = {}, year = {2015}, } |
|
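The Mimic abstract above describes an MCMC-inspired search for an executable model whose behavior matches recorded traces of opaque code. A heavily simplified sketch of that search loop follows; the candidate space (linear models `a*x + b`), the fitness function, and all names are illustrative assumptions, far simpler than Mimic's synthesis of JavaScript code.

```python
# Toy Metropolis-style search for a model of an opaque unary function.
# Candidates are (a, b) pairs denoting f(x) = a*x + b; fitness is the total
# error against recorded input/output traces of the opaque code.
import random

def fitness(model, traces):
    a, b = model
    return sum(abs(a * x + b - y) for x, y in traces)

def search(traces, steps=50000, seed=0):
    rng = random.Random(seed)
    cur = (0, 0)
    cur_fit = fitness(cur, traces)
    for _ in range(steps):
        a, b = cur
        cand = (a + rng.choice([-1, 0, 1]), b + rng.choice([-1, 0, 1]))
        cand_fit = fitness(cand, traces)
        # Always accept improvements (and ties); occasionally accept
        # regressions, MCMC-style, to escape local minima.
        if cand_fit <= cur_fit or rng.random() < 0.1:
            cur, cur_fit = cand, cand_fit
        if cur_fit == 0:   # perfect match with the observed traces
            break
    return cur

opaque = lambda x: 3 * x + 7                       # stands in for unavailable code
traces = [(x, opaque(x)) for x in range(-5, 6)]    # partial execution traces
assert fitness(search(traces), traces) == 0
```

The essential structure matches the abstract: intercept behavior as traces, then search candidate programs guided by a fitness function until one reproduces the traces.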
Chen, Fuxiang |
ESEC/FSE '15: "Crowd Debugging ..."
Crowd Debugging
Fuxiang Chen and Sunghun Kim (Hong Kong University of Science and Technology, China) Research shows that, in general, many people turn to QA sites to solicit answers to their problems. We observe in Stack Overflow a huge number of recurring questions, 1,632,590, despite mechanisms having been put into place to prevent these recurring questions. Recurring questions imply developers are facing similar issues in their source code. However, limitations exist in the QA sites. Developers need to visit them frequently and/or should be familiar with all the content to take advantage of the crowd's knowledge. Due to the large and rapid growth of QA data, it is difficult, if not impossible for developers to catch up. To address these limitations, we propose mining the QA site, Stack Overflow, to leverage the huge mass of crowd knowledge to help developers debug their code. Our approach reveals 189 warnings and 171 (90.5%) of them are confirmed by developers from eight high-quality and well-maintained projects. Developers appreciate these findings because the crowd provides solutions and comprehensive explanations to the issues. We compared the confirmed bugs with three popular static analysis tools (FindBugs, JLint and PMD). Of the 171 bugs identified by our approach, only FindBugs detected six of them whereas JLint and PMD detected none. @InProceedings{ESEC/FSE15p320, author = {Fuxiang Chen and Sunghun Kim}, title = {Crowd Debugging}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {320--332}, doi = {}, year = {2015}, } |
|
Chen, Yuting |
ESEC/FSE '15: "Guided Differential Testing ..."
Guided Differential Testing of Certificate Validation in SSL/TLS Implementations
Yuting Chen and Zhendong Su (Shanghai Jiao Tong University, China; University of California at Davis, USA) Certificate validation in SSL/TLS implementations is critical for Internet security. A notable recent effort, frankencert, automatically synthesizes certificates for stress-testing certificate validation. Despite its early promise, it remains a significant challenge to generate effective test certificates as they are structurally complex with intricate syntactic and semantic constraints. This paper tackles this challenge by introducing mucert, a novel, guided technique to much more effectively test real-world certificate validation code. Our core insight is to (1) leverage easily accessible Internet certificates as seed certificates, and (2) diversify them by adapting Markov Chain Monte Carlo (MCMC) sampling. The diversified certificates are then used to reveal discrepancies, thus potential flaws, among different certificate validation implementations. We have implemented mucert and extensively evaluated it against frankencert. Our experimental results show that mucert is significantly more cost-effective than frankencert. Indeed, 1K mucerts (i.e., mucert-mutated certificates) yield three times as many distinct discrepancies as 8M frankencerts (i.e., frankencert-synthesized certificates), and 200 mucerts can achieve higher code coverage than 100,000 frankencerts. This improvement is significant because testing each generated certificate incurs substantial cost. We have analyzed and reported 20+ latent discrepancies (presumably missed by frankencert), and reported an additional 357 discrepancy-triggering certificates to SSL/TLS developers, who have already confirmed some of our reported issues and are investigating causes of all the reported discrepancies. In particular, our reports have led to bug fixes, active discussions in the community, and proposed changes to relevant IETF RFCs. 
We believe that mucert is practical and effective for helping improve the robustness of SSL/TLS implementations. @InProceedings{ESEC/FSE15p793, author = {Yuting Chen and Zhendong Su}, title = {Guided Differential Testing of Certificate Validation in SSL/TLS Implementations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {793--804}, doi = {}, year = {2015}, } Info |
|
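The mucert abstract above rests on differential testing: feed the same mutated input to several implementations and flag any disagreement. A minimal harness illustrating that loop follows; the dictionary "certificates", the two toy validators, and the mutation operator are all invented stand-ins, nothing like real X.509 processing.

```python
# Minimal differential-testing harness: mutate a seed input and report any
# input on which the implementations under test return different verdicts.
import random

def validator_a(cert):
    # Accepts any unexpired certificate.
    return cert["not_after"] > cert["now"]

def validator_b(cert):
    # Stricter (deliberately divergent): also requires version == 3.
    return cert["not_after"] > cert["now"] and cert.get("version") == 3

def mutate(seed, rng):
    cert = dict(seed)
    field = rng.choice(["not_after", "version"])
    cert[field] = rng.randint(0, 10)
    return cert

def differential_test(seed, validators, trials=200, rng_seed=1):
    rng = random.Random(rng_seed)
    discrepancies = []
    for _ in range(trials):
        cert = mutate(seed, rng)
        verdicts = {v.__name__: v(cert) for v in validators}
        if len(set(verdicts.values())) > 1:   # implementations disagree
            discrepancies.append((cert, verdicts))
    return discrepancies

seed_cert = {"now": 5, "not_after": 9, "version": 3}
found = differential_test(seed_cert, [validator_a, validator_b])
# Here a disagreement can only arise on an unexpired cert whose version != 3.
assert found
assert all(c["version"] != 3 and c["not_after"] > 5 for c, _ in found)
```

mucert's contribution sits in the `mutate` step: MCMC-guided diversification of real seed certificates rather than blind random mutation.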
Chen, Zhenyu |
ESEC/FSE '15: "Test Report Prioritization ..."
Test Report Prioritization to Assist Crowdsourced Testing
Yang Feng, Zhenyu Chen, James A. Jones, Chunrong Fang, and Baowen Xu (Nanjing University, China; University of California at Irvine, USA) In crowdsourced testing, users can be incentivized to perform testing tasks and report their results, and because crowdsourced workers are often paid per task, there is a financial incentive to complete tasks quickly rather than well. These reports of the crowdsourced testing tasks are called "test reports" and are composed of simple natural language and screenshots. Back at the software-development organization, developers must manually inspect the test reports to judge their value for revealing faults. Due to the nature of crowdsourced work, the sheer number of test reports often makes comprehensive inspection and processing difficult. In order to help with this daunting task, we created the first technique of its kind, to the best of our knowledge, to prioritize test reports for manual inspection. Our technique utilizes two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these strategies form our DivRisk strategy to prioritize test reports in crowdsourced testing. Three industrial projects have been used to evaluate the effectiveness of test report prioritization methods. The results of the empirical study show that: (1) DivRisk can significantly outperform random prioritization; (2) DivRisk can approximate the best theoretical result for a real-world industrial mobile application. In addition, we provide some practical guidelines for test report prioritization in crowdsourced testing based on the empirical study and our experiences. @InProceedings{ESEC/FSE15p225, author = {Yang Feng and Zhenyu Chen and James A. Jones and Chunrong Fang and Baowen Xu}, title = {Test Report Prioritization to Assist Crowdsourced Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {225--236}, doi = {}, year = {2015}, } |
|
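The abstract above combines a diversity strategy with a risk strategy. One simple way to realize that combination is greedy selection: always pick next the report that is most unlike those already chosen, weighted by risk. The sketch below uses word-set Jaccard distance and a keyword-based risk score as toy stand-ins; the paper's DivRisk strategy is more involved.

```python
# Greedy diversity-times-risk prioritization of natural-language test reports.

def jaccard_distance(a, b):
    a, b = set(a.split()), set(b.split())
    return 1 - len(a & b) / len(a | b)

def prioritize(reports, risk):
    # reports: list of report texts; risk: text -> score in [0, 1]
    remaining = list(reports)
    ordered = [max(remaining, key=risk)]          # start from the riskiest report
    remaining.remove(ordered[0])
    while remaining:
        def score(r):
            # Distance to the nearest already-chosen report, scaled by risk.
            diversity = min(jaccard_distance(r, s) for s in ordered)
            return diversity * risk(r)
        nxt = max(remaining, key=score)
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered

reports = [
    "app crashes when rotating screen",
    "app crashes when rotating the screen quickly",
    "login button does nothing on tap",
]
risk = lambda r: 0.9 if "crash" in r else 0.5
order = prioritize(reports, risk)
# The near-duplicate crash report is pushed behind the unrelated login report.
assert order[0] == reports[0] and order[1] == reports[2]
```

This captures the abstract's intent: developers see a wide variety of likely-fault-revealing reports early, and near-duplicates sink toward the end.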
Choi, Wontae |
ESEC/FSE '15: "MultiSE: Multi-path Symbolic ..."
MultiSE: Multi-path Symbolic Execution using Value Summaries
Koushik Sen, George Necula, Liang Gong, and Wontae Choi (University of California at Berkeley, USA) Dynamic symbolic execution (DSE) has been proposed to effectively generate test inputs for real-world programs. Unfortunately, DSE techniques do not scale well for large realistic programs, because often the number of feasible execution paths of a program increases exponentially with the increase in the length of an execution path. In this paper, we propose MultiSE, a new technique for merging states incrementally during symbolic execution, without using auxiliary variables. The key idea of MultiSE is based on an alternative representation of the state, where we map each variable, including the program counter, to a set of guarded symbolic expressions called a value summary. MultiSE has several advantages over conventional DSE and conventional state merging techniques: value summaries enable sharing of symbolic expressions and path constraints along multiple paths and thus avoid redundant execution. MultiSE does not introduce auxiliary symbolic variables, which enables it to 1) make progress even when merging values not supported by the constraint solver, 2) avoid expensive constraint solver calls when resolving function calls and jumps, and 3) carry out most operations concretely. Moreover, MultiSE updates value summaries incrementally at every assignment instruction, which makes it unnecessary to identify the join points and to keep track of variables to merge at join points. We have implemented MultiSE for JavaScript programs in a publicly available open-source tool. Our evaluation of MultiSE on several programs shows that 1) value summaries are an effective technique to take advantage of the sharing of values along multiple execution paths, 2) MultiSE can run significantly faster than traditional dynamic symbolic execution, and 3) MultiSE saves a substantial number of state merges compared to conventional state-merging techniques. 
@InProceedings{ESEC/FSE15p842, author = {Koushik Sen and George Necula and Liang Gong and Wontae Choi}, title = {MultiSE: Multi-path Symbolic Execution using Value Summaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {842--853}, doi = {}, year = {2015}, } Best-Paper Award |
|
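The value-summary representation described above (each variable maps to a set of guarded values, updated at every assignment rather than merged at join points) can be sketched very compactly. In the toy below, guards are sets of branch literals instead of real symbolic constraints, and the two helper functions are invented for illustration, not MultiSE's API.

```python
# Toy value summaries: variable -> list of (path-guard, value) pairs.
# Guards are frozensets of branch literals; MultiSE uses symbolic constraints.

def assign(summary, var, fn):
    # x = fn(x): apply fn to every guarded value; no join point is needed,
    # because the summary already carries all paths.
    summary[var] = [(g, fn(v)) for g, v in summary[var]]

def branch_assign(summary, var, cond, val_true, val_false):
    # if (cond) var = val_true else var = val_false:
    # split every guarded value by the fresh branch condition.
    new = []
    for g, _ in summary[var]:
        new.append((g | {cond}, val_true))
        new.append((g | {"!" + cond}, val_false))
    summary[var] = new

# x starts at 0 under the empty path condition.
state = {"x": [(frozenset(), 0)]}
branch_assign(state, "x", "c", 1, 2)   # if (c) x = 1 else x = 2
assign(state, "x", lambda v: v * 10)   # x = x * 10, applied under both guards
assert sorted(v for _, v in state["x"]) == [10, 20]
assert {"c"} in [set(g) for g, _ in state["x"]]
```

The point the abstract makes is visible even here: after the branch, execution proceeds once over the summary instead of once per path, and no explicit merge at the join point is required.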
Chu, Bill |
ESEC/FSE '15: "Questions Developers Ask While ..."
Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis
Justin Smith, Brittany Johnson, Emerson Murphy-Hill, Bill Chu, and Heather Richter Lipford (North Carolina State University, USA; University of North Carolina at Charlotte, USA) Security tools can help developers answer questions about potential vulnerabilities in their code. A better understanding of the types of questions asked by developers may help toolsmiths design more effective tools. In this paper, we describe how we collected and categorized these questions by conducting an exploratory study with novice and experienced software developers. We equipped them with Find Security Bugs, a security-oriented static analysis tool, and observed their interactions with security vulnerabilities in an open-source system that they had previously contributed to. We found that they asked questions not only about security vulnerabilities, associated attacks, and fixes, but also questions about the software itself, the social ecosystem that built the software, and related resources and tools. For example, when participants asked questions about the source of tainted data, their tools forced them to make imperfect tradeoffs between systematic and ad hoc program navigation strategies. @InProceedings{ESEC/FSE15p248, author = {Justin Smith and Brittany Johnson and Emerson Murphy-Hill and Bill Chu and Heather Richter Lipford}, title = {Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {248--259}, doi = {}, year = {2015}, } Info |
|
Cito, Jürgen |
ESEC/FSE '15: "The Making of Cloud Applications: ..."
The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud
Jürgen Cito, Philipp Leitner, Thomas Fritz, and Harald C. Gall (University of Zurich, Switzerland) Cloud computing is gaining more and more traction as a deployment and provisioning model for software. While a large body of research already covers how to optimally operate a cloud system, we still lack insights into how professional software engineers actually use clouds, and how the cloud impacts development practices. This paper reports on the first systematic study on how software developers build applications for the cloud. We conducted a mixed-method study, consisting of qualitative interviews of 25 professional developers and a quantitative survey with 294 responses. Our results show that adopting the cloud has a profound impact throughout the software development process, as well as on how developers utilize tools and data in their daily work. Among other things, we found that (1) developers need better means to anticipate runtime problems and rigorously define metrics for improved fault localization and (2) the cloud offers an abundance of operational data, however, developers still often rely on their experience and intuition rather than utilizing metrics. From our findings, we extracted a set of guidelines for cloud development and identified challenges for researchers and tool vendors. @InProceedings{ESEC/FSE15p393, author = {Jürgen Cito and Philipp Leitner and Thomas Fritz and Harald C. Gall}, title = {The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {393--403}, doi = {}, year = {2015}, } |
|
Cohen, Myra B. |
ESEC/FSE '15: "Users Beware: Preference Inconsistencies ..."
Users Beware: Preference Inconsistencies Ahead
Farnaz Behrang, Myra B. Cohen, and Alessandro Orso (Georgia Tech, USA; University of Nebraska-Lincoln, USA) The structure of preferences for modern highly-configurable software systems has become extremely complex, usually consisting of multiple layers of access that go from the user interface down to the lowest levels of the source code. This complexity can lead to inconsistencies between layers, especially during software evolution. For example, there may be preferences that users can change through the GUI, but that have no effect on the actual behavior of the system because the related source code is not present or has been removed going from one version to the next. These inconsistencies may result in unexpected program behaviors, which range in severity from mild annoyances to more critical security or performance problems. To address this problem, we present SCIC (Software Configuration Inconsistency Checker), a static analysis technique that can automatically detect these kinds of inconsistencies. Unlike other configuration analysis tools, SCIC can handle software that (1) is written in multiple programming languages and (2) has a complex preference structure. In an empirical evaluation that we performed on 10 years' worth of versions of both the widely used Mozilla Core and Firefox, SCIC was able to find 40 real inconsistencies (some determined as severe), whose lifetime spanned multiple versions, and whose detection required the analysis of code written in multiple languages. @InProceedings{ESEC/FSE15p295, author = {Farnaz Behrang and Myra B. Cohen and Alessandro Orso}, title = {Users Beware: Preference Inconsistencies Ahead}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {295--306}, doi = {}, year = {2015}, } Best-Paper Award |
|
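The core cross-layer check the SCIC abstract describes, comparing the preference keys one layer exposes against the keys another layer actually uses, can be illustrated with a tiny detector. The file formats, key names, and regexes below are invented assumptions; SCIC is a full static analysis over multiple languages.

```python
# Toy preference-consistency check: keys exposed by a GUI layer vs. keys the
# source code reads. Mismatches in either direction are flagged.
import re

def keys_in_gui(xml):
    # Assumed GUI format, e.g. <preference key="cache.enabled"/>
    return set(re.findall(r'key="([\w.]+)"', xml))

def keys_in_code(source):
    # Assumed accessor pattern, e.g. getPref("cache.enabled")
    return set(re.findall(r'getPref\("([\w.]+)"\)', source))

def inconsistencies(xml, source):
    gui, code = keys_in_gui(xml), keys_in_code(source)
    return {
        "dead_gui_options": gui - code,     # user-visible but ignored by the code
        "hidden_code_options": code - gui,  # read by the code but not exposed
    }

gui_xml = '<preference key="cache.enabled"/><preference key="old.removed.flag"/>'
src = 'if (getPref("cache.enabled")) enableCache(); if (getPref("debug.trace")) trace();'
report = inconsistencies(gui_xml, src)
assert report["dead_gui_options"] == {"old.removed.flag"}
assert report["hidden_code_options"] == {"debug.trace"}
```

The "dead GUI option" case is exactly the evolution hazard the abstract highlights: a preference survives in the UI after the code that honored it was removed.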
Daka, Ermira |
ESEC/FSE '15: "Modeling Readability to Improve ..."
Modeling Readability to Improve Unit Tests
Ermira Daka, José Campos, Gordon Fraser, Jonathan Dorn, and Westley Weimer (University of Sheffield, UK; University of Virginia, USA) Writing good unit tests can be tedious and error prone, but even once they are written, the job is not done: Developers need to reason about unit tests throughout software development and evolution, in order to diagnose test failures, maintain the tests, and to understand code written by other developers. Unreadable tests are more difficult to maintain and lose some of their value to developers. To overcome this problem, we propose a domain-specific model of unit test readability based on human judgements, and use this model to augment automated unit test generation. The resulting approach can automatically generate test suites with both high coverage and also improved readability. In human studies users prefer our improved tests and are able to answer maintenance questions about them 14% more quickly at the same level of accuracy. @InProceedings{ESEC/FSE15p107, author = {Ermira Daka and José Campos and Gordon Fraser and Jonathan Dorn and Westley Weimer}, title = {Modeling Readability to Improve Unit Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {107--118}, doi = {}, year = {2015}, } Best-Paper Award |
|
D'Amorim, Marcelo |
ESEC/FSE '15: "Iterative Distribution-Aware ..."
Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution
Mateus Borges, Antonio Filieri, Marcelo d'Amorim, and Corina S. Păsăreanu (University of Stuttgart, Germany; Federal University of Pernambuco, Brazil; Carnegie Mellon University, USA; NASA Ames Research Center, USA) Probabilistic symbolic execution aims at quantifying the probability of reaching program events of interest assuming that program inputs follow given probabilistic distributions. The technique collects constraints on the inputs that lead to the target events and analyzes them to quantify how likely it is for an input to satisfy the constraints. Current techniques either handle only linear constraints or only support continuous distributions using a “discretization” of the input domain, leading to imprecise and costly results. We propose an iterative distribution-aware sampling approach to support probabilistic symbolic execution for arbitrarily complex mathematical constraints and continuous input distributions. We follow a compositional approach, where the symbolic constraints are decomposed into sub-problems which can be solved independently. At each iteration the convergence rate of the computation is increased by automatically refocusing the analysis on estimating the sub-problems that mostly affect the accuracy of the results, as guided by three different ranking strategies. Experiments on publicly available benchmarks show that the proposed technique improves on previous approaches in terms of scalability and accuracy of the results. @InProceedings{ESEC/FSE15p866, author = {Mateus Borges and Antonio Filieri and Marcelo d'Amorim and Corina S. Păsăreanu}, title = {Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--877}, doi = {}, year = {2015}, } Info |
|
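The iterative idea above, decomposing a path constraint into sub-problems and spending each new sampling round on the sub-problem whose estimate is least certain, can be sketched with plain Monte Carlo. The constraints, the binomial-variance ranking, and the budget policy below are toy stand-ins for the paper's three ranking strategies.

```python
# Iterative sampling: estimate P(constraint) for each disjoint sub-constraint,
# then repeatedly refine the estimate with the highest variance.
import random

def estimate(constraint, sampler, n, rng):
    hits = sum(constraint(sampler(rng)) for _ in range(n))
    p = hits / n
    return p, p * (1 - p) / n        # estimate and its binomial variance

def iterative_estimate(subproblems, sampler, rounds=20, batch=1000, seed=0):
    rng = random.Random(seed)
    stats = [estimate(c, sampler, batch, rng) for c in subproblems]
    counts = [batch] * len(subproblems)
    for _ in range(rounds):
        i = max(range(len(stats)), key=lambda k: stats[k][1])  # most uncertain
        p, _ = estimate(subproblems[i], sampler, batch, rng)
        total = counts[i] + batch
        p_new = (stats[i][0] * counts[i] + p * batch) / total  # running mean
        stats[i] = (p_new, p_new * (1 - p_new) / total)
        counts[i] = total
    return sum(p for p, _ in stats)  # disjoint sub-paths: probabilities add

# Disjoint path constraints over x ~ Uniform(0, 1): x < 0.3 and 0.3 <= x < 0.5.
subs = [lambda x: x < 0.3, lambda x: 0.3 <= x < 0.5]
p = iterative_estimate(subs, lambda rng: rng.random())
assert abs(p - 0.5) < 0.05
```

Unlike discretizing the input domain, sampling directly from the continuous distribution keeps the estimator unbiased for arbitrarily complex constraints.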
Dang, Hung |
ESEC/FSE '15: "Auto-patching DOM-Based XSS ..."
Auto-patching DOM-Based XSS at Scale
Inian Parameshwaran, Enrico Budianto, Shweta Shinde, Hung Dang, Atul Sadhu, and Prateek Saxena (National University of Singapore, Singapore) DOM-based cross-site scripting (XSS) is a client-side code injection vulnerability that results from unsafe dynamic code generation in JavaScript applications, and has few known practical defenses. We study dynamic code evaluation practices on nearly a quarter million URLs crawled starting from the Alexa Top 1000 websites. Of 777,082 cases of dynamic HTML/JS code generation we observe, 13.3% use unsafe string interpolation for dynamic code generation — a well-known dangerous coding practice. To remedy this, we propose a technique to generate secure patches that replace unsafe string interpolation with safer code that utilizes programmatic DOM construction techniques. Our system transparently auto-patches the vulnerable site while incurring only 5.2–8.07% overhead. The patching mechanism requires no access to server-side code or modification to browsers, and thus is practical as a turnkey defense. @InProceedings{ESEC/FSE15p272, author = {Inian Parameshwaran and Enrico Budianto and Shweta Shinde and Hung Dang and Atul Sadhu and Prateek Saxena}, title = {Auto-patching DOM-Based XSS at Scale}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {272--283}, doi = {}, year = {2015}, } Info |
|
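The "unsafe string interpolation" pattern the abstract targets is concatenating data into an HTML sink such as `innerHTML`, as opposed to programmatic DOM construction (`createElement`/`textContent`). A toy regex-based flagger for that pattern follows; the regex and examples are illustrative only, and the paper's actual contribution, generating patches that rewrite such sites, is far more involved.

```python
# Toy detector: flag JavaScript lines that concatenate strings into an
# innerHTML sink (a classic DOM-based XSS hazard).
import re

UNSAFE_SINK = re.compile(r'\.innerHTML\s*=\s*[^;]*\+')  # concatenation into a sink

def flag_unsafe_sinks(js_source):
    return [line.strip() for line in js_source.splitlines()
            if UNSAFE_SINK.search(line)]

unsafe_js = '''
  div.innerHTML = "<b>Hello " + location.hash + "</b>";  // attacker-controlled
  div.textContent = greeting;                            // safe sink
'''
flagged = flag_unsafe_sinks(unsafe_js)
assert len(flagged) == 1
assert "location.hash" in flagged[0]
```

A safe patch in the spirit the abstract describes would build the same markup programmatically: create the `<b>` element, then assign the untrusted value via `textContent`, so it can never be parsed as HTML.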
Dangl, Matthias |
ESEC/FSE '15: "Witness Validation and Stepwise ..."
Witness Validation and Stepwise Testification across Software Verifiers
Dirk Beyer, Matthias Dangl, Daniel Dietsch, Matthias Heizmann, and Andreas Stahlbauer (University of Passau, Germany; University of Freiburg, Germany) It is commonly understood that a verification tool should provide a counterexample to witness a specification violation. Until recently, software verifiers dumped error witnesses in proprietary formats, which are often neither human- nor machine-readable, and an exchange of witnesses between different verifiers was impossible. To close this gap in software-verification technology, we have defined an exchange format for error witnesses that is easy to write and read by verification tools (for further processing, e.g., witness validation) and that is easy to convert into visualizations that conveniently let developers inspect an error path. To eliminate manual inspection of false alarms, we develop the notion of stepwise testification: in a first step, a verifier finds a problematic program path and, in addition to the verification result FALSE, constructs a witness for this path; in the next step, another verifier re-verifies that the witness indeed violates the specification. This process can have more than two steps, each reducing the state space around the error path, making it easier to validate the witness in a later step. An obvious application for testification is the setting where we have two verifiers: one that is efficient but imprecise and another one that is precise but expensive. We have implemented the technique of error-witness-driven program analysis in two state-of-the-art verification tools, CPAchecker and Ultimate Automizer, and show by experimental evaluation that the approach is applicable to a large set of verification tasks. 
@InProceedings{ESEC/FSE15p721, author = {Dirk Beyer and Matthias Dangl and Daniel Dietsch and Matthias Heizmann and Andreas Stahlbauer}, title = {Witness Validation and Stepwise Testification across Software Verifiers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {721--733}, doi = {}, year = {2015}, } Info |
|
Dattatreya, Mohan |
ESEC/FSE '15: "Efficient Dependency Detection ..."
Efficient Dependency Detection for Safe Java Test Acceleration
Jonathan Bell, Gail Kaiser, Eric Melski, and Mohan Dattatreya (Columbia University, USA; Electric Cloud, USA) Slow builds remain a plague for software developers. The frequency with which code can be built (compiled, tested and packaged) directly impacts the productivity of developers: longer build times mean a longer wait before determining if a change to the application being built was successful. We have discovered that in the case of some languages, such as Java, the majority of build time is spent running tests, where dependencies between individual tests are complicated to discover, making many existing test acceleration techniques unsound to deploy in practice. Without knowledge of which tests are dependent on others, we cannot safely parallelize the execution of the tests, nor can we perform incremental testing (i.e., execute only a subset of an application's tests for each build). The previous techniques for detecting these dependencies did not scale to large test suites: given a test suite that normally ran in two hours, the best-case running scenario for the previous tool would have taken over 422 CPU days to find dependencies between all test methods (and would not soundly find all dependencies) — on the same project the exhaustive technique (to find all dependencies) would have taken over 10^300 years. We present a novel approach to detecting all dependencies between test cases in large projects that can enable safe exploitation of parallelism and test selection with a modest analysis cost. @InProceedings{ESEC/FSE15p770, author = {Jonathan Bell and Gail Kaiser and Eric Melski and Mohan Dattatreya}, title = {Efficient Dependency Detection for Safe Java Test Acceleration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {770--781}, doi = {}, year = {2015}, } |
|
Denaro, Giovanni |
ESEC/FSE '15: "Symbolic Execution of Programs ..."
Symbolic Execution of Programs with Heap Inputs
Pietro Braione, Giovanni Denaro, and Mauro Pezzè (University of Milano-Bicocca, Italy; University of Lugano, Switzerland) Symbolic analysis is a core component of many automatic test generation and program verification approaches. To verify complex software systems, test and analysis techniques must deal with the many aspects of the target systems at different granularity levels. In particular, testing software programs that make extensive use of heap data structures at unit and integration levels requires generating suitable input data structures in the heap. This is a main challenge for symbolic testing and analysis techniques that work well when dealing with numeric inputs, but do not satisfactorily cope with heap data structures yet. In this paper we propose a language HEX to specify invariants of partially initialized data structures, and a decision procedure that supports the incremental evaluation of structural properties in HEX. Used in combination with the symbolic execution of heap manipulating programs, HEX prevents the exploration of invalid states, thus improving the efficiency of program testing and analysis, and avoiding false alarms that negatively impact verification activities. The experimental data confirm that HEX is an effective and efficient solution to the problem of testing and analyzing heap manipulating programs, and outperforms the alternative approaches that have been proposed so far. @InProceedings{ESEC/FSE15p602, author = {Pietro Braione and Giovanni Denaro and Mauro Pezzè}, title = {Symbolic Execution of Programs with Heap Inputs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {602--613}, doi = {}, year = {2015}, } |
|
Desai, Ankush |
ESEC/FSE '15: "Systematic Testing of Asynchronous ..."
Systematic Testing of Asynchronous Reactive Systems
Ankush Desai, Shaz Qadeer, and Sanjit A. Seshia (University of California at Berkeley, USA; Microsoft Research, USA) We introduce the concept of a delaying explorer with the goal of performing prioritized exploration of the behaviors of an asynchronous reactive program. A delaying explorer stratifies the search space using a custom strategy, and a delay operation that allows deviation from that strategy. We show that prioritized search with a delaying explorer performs significantly better than existing prioritization techniques. We also demonstrate empirically the need for writing different delaying explorers for scalable systematic testing and hence, present a flexible delaying explorer interface. We introduce two new techniques to improve the scalability of search based on delaying explorers. First, we present an algorithm for stratified exhaustive search and use efficient state caching to avoid redundant exploration of schedules. We provide soundness and termination guarantees for our algorithm. Second, for the cases where the state of the system cannot be captured or there are resource constraints, we present an algorithm to randomly sample any execution from the stratified search space. This algorithm guarantees that any such execution that requires d delay operations is sampled with probability at least 1/L^d, where L is the maximum number of program steps. We have implemented our algorithms and evaluated them on a collection of real-world fault-tolerant distributed protocols. @InProceedings{ESEC/FSE15p73, author = {Ankush Desai and Shaz Qadeer and Sanjit A. Seshia}, title = {Systematic Testing of Asynchronous Reactive Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {73--83}, doi = {}, year = {2015}, } Info |
|
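The stratification the abstract above describes can be made concrete with a toy delay-bounded scheduler: a deterministic default strategy always runs the first pending task, and each delay operation instead rotates that task to the back. Bounding the number of delays stratifies the schedule space. The task model below is an invented simplification of a delaying explorer.

```python
# Delay-bounded exploration of task schedules: enumerate every task order
# reachable from `pending` using at most `delays` delay operations.

def schedules(pending, delays):
    if not pending:
        yield []
        return
    # Default strategy: run the first pending task.
    for rest in schedules(pending[1:], delays):
        yield [pending[0]] + rest
    # Deviation: delay the first task, rotating it to the back (costs one delay).
    if delays > 0 and len(pending) > 1:
        yield from schedules(pending[1:] + pending[:1], delays - 1)

tasks = ["A", "B", "C"]
zero = list(schedules(tasks, 0))
one = list(schedules(tasks, 1))
assert zero == [["A", "B", "C"]]           # zero delays: one deterministic schedule
assert ["B", "C", "A"] in one
assert len(one) > len(zero)                # each extra delay widens the stratum
```

Exploring strata in order of increasing delay count is what gives the prioritized search: the schedules "closest" to the default strategy are tried first.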
Devanbu, Premkumar |
ESEC/FSE '15: "Developer Onboarding in GitHub: ..."
Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience
Casey Casalnuovo, Bogdan Vasilescu, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA) The team aspects of software engineering have been a subject of great interest since early work by Fred Brooks and others: How well do people work together in teams? Why do people join teams? What happens if teams are distributed? Recently, the emergence of project ecosystems such as GitHub has created an entirely new, higher level of organization. GitHub supports numerous teams; they share a common technical platform (for work activities) and a common social platform (via following, commenting, etc.). We explore the GitHub evidence for socialization as a precursor to joining a project, and how the technical factors of past experience and social factors of past connections to team members of a project affect productivity both initially and in the long run. We find developers preferentially join projects in GitHub where they have pre-existing relationships; furthermore, we find that the presence of past social connections combined with prior experience in languages dominant in the project leads to higher productivity both initially and cumulatively. Interestingly, we also find that stronger social connections are associated with slightly less productivity initially, but slightly more productivity in the long run. @InProceedings{ESEC/FSE15p817, author = {Casey Casalnuovo and Bogdan Vasilescu and Premkumar Devanbu and Vladimir Filkov}, title = {Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {817--828}, doi = {}, year = {2015}, } ESEC/FSE '15: "Quality and Productivity Outcomes ..." 
Quality and Productivity Outcomes Relating to Continuous Integration in GitHub Bogdan Vasilescu, Yue Yu, Huaimin Wang, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA; National University of Defense Technology, China) Software processes comprise many steps; coding is followed by building, integration testing, system testing, deployment, and operations, among others. Software process integration and automation have been areas of key concern in software engineering, ever since the pioneering work of Osterweil; market pressures for Agility, and open, decentralized, software development have provided additional pressures for progress in this area. But do these innovations actually help projects? Given the numerous confounding factors that can influence project performance, it can be a challenge to discern the effects of process integration and automation. Software project ecosystems such as GitHub provide a new opportunity in this regard: one can readily find large numbers of projects in various stages of process integration and automation, and gather data on various influencing factors as well as productivity and quality outcomes. In this paper we use large, historical data on process metrics and outcomes in GitHub projects to discern the effects of one specific innovation in process automation: continuous integration. Our main finding is that continuous integration improves the productivity of project teams, who can integrate more outside contributions, without an observable diminishment in code quality. @InProceedings{ESEC/FSE15p805, author = {Bogdan Vasilescu and Yue Yu and Huaimin Wang and Premkumar Devanbu and Vladimir Filkov}, title = {Quality and Productivity Outcomes Relating to Continuous Integration in GitHub}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {805--816}, doi = {}, year = {2015}, } |
|
Dhar, Aritra |
ESEC/FSE '15: "CLOTHO: Saving Programs from ..."
CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling
Aritra Dhar, Rahul Purandare, Mohan Dhawan, and Suresh Rangaswamy (Xerox Research Center, India; IIIT Delhi, India; IBM Research, India) Software is susceptible to malformed data originating from untrusted sources. Occasionally the programming logic or constructs used are inappropriate to handle the varied constraints imposed by legal and well-formed data. Consequently, software may produce unexpected results or even crash. In this paper, we present CLOTHO, a novel hybrid approach that saves such software from crashing when failures originate from malformed strings or inappropriate handling of strings. CLOTHO statically analyses a program to identify statements that are vulnerable to failures related to associated string data. CLOTHO then generates patches that are likely to satisfy constraints on the data, and, in case of failures, produces program behavior close to what is expected. The precision of the patches is improved with the help of a dynamic analysis. We have implemented CLOTHO for the Java String API, and our evaluation based on several popular open-source libraries shows that CLOTHO generates patches that are semantically similar to the patches generated by the programmers in later versions. Additionally, these patches are activated only when a failure is detected, and thus CLOTHO incurs no runtime overhead during normal execution, and negligible overhead in case of failures. @InProceedings{ESEC/FSE15p555, author = {Aritra Dhar and Rahul Purandare and Mohan Dhawan and Suresh Rangaswamy}, title = {CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {555--566}, doi = {}, year = {2015}, } Info |
|
Dhawan, Mohan |
ESEC/FSE '15: "CLOTHO: Saving Programs from ..."
CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling
Aritra Dhar, Rahul Purandare, Mohan Dhawan, and Suresh Rangaswamy (Xerox Research Center, India; IIIT Delhi, India; IBM Research, India) Software is susceptible to malformed data originating from untrusted sources. Occasionally the programming logic or constructs used are inappropriate to handle the varied constraints imposed by legal and well-formed data. Consequently, software may produce unexpected results or even crash. In this paper, we present CLOTHO, a novel hybrid approach that saves such software from crashing when failures originate from malformed strings or inappropriate handling of strings. CLOTHO statically analyses a program to identify statements that are vulnerable to failures related to associated string data. CLOTHO then generates patches that are likely to satisfy constraints on the data, and, in case of failures, produces program behavior close to what is expected. The precision of the patches is improved with the help of a dynamic analysis. We have implemented CLOTHO for the Java String API, and our evaluation based on several popular open-source libraries shows that CLOTHO generates patches that are semantically similar to the patches generated by the programmers in later versions. Additionally, these patches are activated only when a failure is detected, and thus CLOTHO incurs no runtime overhead during normal execution, and negligible overhead in case of failures. @InProceedings{ESEC/FSE15p555, author = {Aritra Dhar and Rahul Purandare and Mohan Dhawan and Suresh Rangaswamy}, title = {CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {555--566}, doi = {}, year = {2015}, } Info |
|
Dietsch, Daniel |
ESEC/FSE '15: "Witness Validation and Stepwise ..."
Witness Validation and Stepwise Testification across Software Verifiers
Dirk Beyer, Matthias Dangl, Daniel Dietsch, Matthias Heizmann, and Andreas Stahlbauer (University of Passau, Germany; University of Freiburg, Germany) It is commonly understood that a verification tool should provide a counterexample to witness a specification violation. Until recently, software verifiers dumped error witnesses in proprietary formats, which are often neither human- nor machine-readable, and an exchange of witnesses between different verifiers was impossible. To close this gap in software-verification technology, we have defined an exchange format for error witnesses that is easy to write and read by verification tools (for further processing, e.g., witness validation) and that is easy to convert into visualizations that conveniently let developers inspect an error path. To eliminate manual inspection of false alarms, we develop the notion of stepwise testification: in a first step, a verifier finds a problematic program path and, in addition to the verification result FALSE, constructs a witness for this path; in the next step, another verifier re-verifies that the witness indeed violates the specification. This process can have more than two steps, each reducing the state space around the error path, making it easier to validate the witness in a later step. An obvious application for testification is the setting where we have two verifiers: one that is efficient but imprecise and another one that is precise but expensive. We have implemented the technique of error-witness-driven program analysis in two state-of-the-art verification tools, CPAchecker and Ultimate Automizer, and show by experimental evaluation that the approach is applicable to a large set of verification tasks. 
@InProceedings{ESEC/FSE15p721, author = {Dirk Beyer and Matthias Dangl and Daniel Dietsch and Matthias Heizmann and Andreas Stahlbauer}, title = {Witness Validation and Stepwise Testification across Software Verifiers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {721--733}, doi = {}, year = {2015}, } Info |
|
Di Penta, Massimiliano |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. 
@InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "Optimizing Energy Consumption ..." Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk (College of William and Mary, USA; Free University of Bolzano, Italy; University of Molise, Italy; University of Sannio, Italy) The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems, including apps, targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirms that in most cases users still perceived the choices of colors as attractive. 
Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements. @InProceedings{ESEC/FSE15p143, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Eduardo Bernal Cárdenas and Rocco Oliveto and Massimiliano Di Penta and Denys Poshyvanyk}, title = {Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {143--154}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Dong, Xiwei |
ESEC/FSE '15: "Heterogeneous Cross-Company ..."
Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning
Xiaoyuan Jing, Fei Wu, Xiwei Dong, Fumin Qi, and Baowen Xu (Wuhan University, China; Nanjing University of Posts and Telecommunications, China; Nanjing University, China) Cross-company defect prediction (CCDP) learns a prediction model by using training data from one or multiple projects of a source company and then applies the model to the target company data. Existing CCDP methods are based on the assumption that the data of source and target companies should have the same software metrics. However, for CCDP, the source and target company data is usually heterogeneous, namely, the metrics used and the size of the metric set differ between the data of the two companies. We call CCDP in this scenario the heterogeneous CCDP (HCCDP) task. In this paper, we aim to provide an effective solution for HCCDP. We propose a unified metric representation (UMR) for the data of source and target companies. The UMR consists of three types of metrics, i.e., the common metrics of the source and target companies, source-company-specific metrics, and target-company-specific metrics. To construct the UMR for source company data, the target-company-specific metrics are set to zeros, while for the UMR of the target company data, the source-company-specific metrics are set to zeros. Based on the unified metric representation, we for the first time introduce canonical correlation analysis (CCA), an effective transfer learning method, into CCDP to make the data distributions of source and target companies similar. Experiments on 14 public heterogeneous datasets from four companies indicate that: 1) for HCCDP with partially different metrics, our approach significantly outperforms state-of-the-art CCDP methods; 2) for HCCDP with totally different metrics, our approach obtains prediction performance comparable to within-project prediction results. The proposed approach is effective for HCCDP. 
@InProceedings{ESEC/FSE15p496, author = {Xiaoyuan Jing and Fei Wu and Xiwei Dong and Fumin Qi and Baowen Xu}, title = {Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {496--507}, doi = {}, year = {2015}, } |
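The unified metric representation described above can be made concrete: each instance is mapped into the space [common | source-specific | target-specific], zero-padding the metrics its company does not collect. A minimal sketch (metric names and the function itself are illustrative, not the paper's code):

```python
def unified_representation(instance, common, specific_src, specific_tgt, side):
    """Map one project instance (a dict of metric -> value) into the
    unified metric space [common | source-specific | target-specific].
    Metrics the instance's company does not collect are set to zero,
    as the UMR construction above prescribes."""
    vec = [instance[m] for m in common]
    if side == "source":
        vec += [instance[m] for m in specific_src]
        vec += [0.0] * len(specific_tgt)   # target-only metrics zeroed
    elif side == "target":
        vec += [0.0] * len(specific_src)   # source-only metrics zeroed
        vec += [instance[m] for m in specific_tgt]
    else:
        raise ValueError("side must be 'source' or 'target'")
    return vec
```

After this mapping, both companies' data live in the same vector space, which is what allows the CCA-based transfer step to align their distributions.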
|
Dorn, Jonathan |
ESEC/FSE '15: "Modeling Readability to Improve ..."
Modeling Readability to Improve Unit Tests
Ermira Daka, José Campos, Gordon Fraser, Jonathan Dorn, and Westley Weimer (University of Sheffield, UK; University of Virginia, USA) Writing good unit tests can be tedious and error prone, but even once they are written, the job is not done: Developers need to reason about unit tests throughout software development and evolution, in order to diagnose test failures, maintain the tests, and to understand code written by other developers. Unreadable tests are more difficult to maintain and lose some of their value to developers. To overcome this problem, we propose a domain-specific model of unit test readability based on human judgements, and use this model to augment automated unit test generation. The resulting approach can automatically generate test suites with both high coverage and also improved readability. In human studies users prefer our improved tests and are able to answer maintenance questions about them 14% more quickly at the same level of accuracy. @InProceedings{ESEC/FSE15p107, author = {Ermira Daka and José Campos and Gordon Fraser and Jonathan Dorn and Westley Weimer}, title = {Modeling Readability to Improve Unit Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {107--118}, doi = {}, year = {2015}, } Best-Paper Award |
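A readability model of the kind described above can be pictured as a scoring function over surface features of a test, used to pick among behaviorally equivalent generated tests. A hedged sketch, assuming a linear model; the feature set and weights here are illustrative placeholders, not the paper's learned model:

```python
def readability(test_source, weights):
    """Score a unit test with a linear model over simple surface
    features. Higher scores mean more readable under `weights`."""
    lines = test_source.splitlines()
    features = {
        "length": len(lines),                                   # shorter is usually better
        "max_line_width": max((len(l) for l in lines), default=0),
        "identifiers": sum(tok.isidentifier() for tok in test_source.split()),
    }
    return sum(weights[name] * value for name, value in features.items())

def pick_most_readable(candidates, weights):
    """Among equally-covering generated tests, keep the highest-scoring one."""
    return max(candidates, key=lambda t: readability(t, weights))
```

In the paper's setting the model is fit to human judgements; here the weights would simply be the fitted coefficients.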
|
Eichberg, Michael |
ESEC/FSE '15: "Getting to Know You: Towards ..."
Getting to Know You: Towards a Capability Model for Java
Ben Hermann, Michael Reif, Michael Eichberg, and Mira Mezini (TU Darmstadt, Germany) Developing software from reusable libraries confronts developers with a security dilemma: either be efficient and reuse libraries as they are, or inspect them to learn about their resource usage but possibly miss deadlines, as reviews are a time-consuming process. In this paper, we propose a novel capability inference mechanism for libraries written in Java. It uses a coarse-grained capability model for system resources that can be presented to developers. We found that the inferred capabilities agree with 86.81% of the capability expectations that can be derived from project documentation. Moreover, our approach can find capabilities that cannot be discovered using project documentation. It is thus a helpful tool for developers mitigating the aforementioned dilemma. @InProceedings{ESEC/FSE15p758, author = {Ben Hermann and Michael Reif and Michael Eichberg and Mira Mezini}, title = {Getting to Know You: Towards a Capability Model for Java}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {758--769}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "Hidden Truths in Dead Software ..." Hidden Truths in Dead Software Paths Michael Eichberg, Ben Hermann, Mira Mezini, and Leonid Glanz (TU Darmstadt, Germany) Approaches and techniques for statically finding a multitude of issues in source code have been developed in the past. A core property of these approaches is that they are usually targeted towards finding only a very specific kind of issue and that the effort to develop such an analysis is significant. This strictly limits the number of kinds of issues that can be detected. In this paper, we discuss a generic approach based on the detection of infeasible paths in code that can discover a wide range of code smells ranging from useless code that hinders comprehension to real bugs. 
Code issues are identified by calculating the difference between the control-flow graph that contains all technically possible edges and the corresponding graph recorded while performing a more precise analysis using abstract interpretation. We have evaluated the approach using the Java Development Kit as well as the Qualitas Corpus (a curated collection of over 100 Java Applications) and were able to find thousands of issues across a wide range of categories. @InProceedings{ESEC/FSE15p474, author = {Michael Eichberg and Ben Hermann and Mira Mezini and Leonid Glanz}, title = {Hidden Truths in Dead Software Paths}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {474--484}, doi = {}, year = {2015}, } Info |
|
Elbaum, Sebastian |
ESEC/FSE '15: "How Developers Search for ..."
How Developers Search for Code: A Case Study
Caitlin Sadowski, Kathryn T. Stolee, and Sebastian Elbaum (Google, USA; Iowa State University, USA; University of Nebraska-Lincoln, USA) With the advent of large code repositories and sophisticated search capabilities, code search is increasingly becoming a key software development activity. In this work we shed some light on how developers search for code through a case study performed at Google, using a combination of survey and log-analysis methodologies. Our study provides insights into what developers are doing and trying to learn when performing a search, search scope, query properties, and what a search session under different contexts usually entails. Our results indicate that programmers search for code very frequently, conducting an average of five search sessions with 12 total queries each workday. The search queries are often targeted at a particular code location and programmers are typically looking for code with which they are somewhat familiar. Further, programmers are generally seeking answers to questions about how to use an API, what code does, why something is failing, or where code is located. @InProceedings{ESEC/FSE15p191, author = {Caitlin Sadowski and Kathryn T. Stolee and Sebastian Elbaum}, title = {How Developers Search for Code: A Case Study}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2015}, } |
|
Ernst, Neil A. |
ESEC/FSE '15: "Measure It? Manage It? Ignore ..."
Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt
Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton (SEI, USA) The technical debt metaphor is widely used to encapsulate numerous software quality problems. The metaphor is attractive to practitioners as it communicates to both technical and nontechnical audiences that if quality problems are not addressed, things may get worse. However, it is unclear whether there are practices that move this metaphor beyond a mere communication mechanism. Existing studies of technical debt have largely focused on code metrics and small surveys of developers. In this paper, we report on our survey of 1,831 participants, primarily software engineers and architects working in long-lived, software-intensive projects from three large organizations, and follow-up interviews of seven software engineers. We analyzed our data using both nonparametric statistics and qualitative text analysis. We found that architectural decisions are the most important source of technical debt. Furthermore, while respondents believe the metaphor is itself important for communication, existing tools are not currently helpful in managing the details. We use our results to motivate a technical debt timeline to focus management and tooling approaches. @InProceedings{ESEC/FSE15p50, author = {Neil A. Ernst and Stephany Bellomo and Ipek Ozkaya and Robert L. Nord and Ian Gorton}, title = {Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {50--60}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Esmaeilzadeh, Hadi |
ESEC/FSE '15: "FlexJava: Language Support ..."
FlexJava: Language Support for Safe and Modular Approximate Programming
Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, and William Harris (Georgia Tech, USA) Energy efficiency is a primary constraint in modern systems. Approximate computing is a promising approach that trades quality of result for gains in efficiency and performance. State-of-the-art approximate programming models require extensive manual annotations on program data and operations to guarantee safe execution of approximate programs. The need for extensive manual annotations hinders the practical use of approximation techniques. This paper describes FlexJava, a small set of language extensions that significantly reduces the annotation effort, paving the way for practical approximate programming. These extensions enable programmers to annotate approximation-tolerant method outputs. The FlexJava compiler, which is equipped with an approximation safety analysis, automatically infers the operations and data that affect these outputs and selectively marks them approximable while giving safety guarantees. The automation and the language–compiler codesign relieve programmers from manually and explicitly annotating data declarations or operations as safe to approximate. FlexJava is designed to support safety, modularity, generality, and scalability in software development. We have implemented FlexJava annotations as a Java library and we demonstrate its practicality using a wide range of Java applications and by conducting a user study. Compared to EnerJ, a recent approximate programming system, FlexJava provides the same energy savings with significant reduction (from 2× to 17×) in the number of annotations. In our user study, programmers spend 6× to 12× less time annotating programs using FlexJava than when using EnerJ. 
@InProceedings{ESEC/FSE15p745, author = {Jongse Park and Hadi Esmaeilzadeh and Xin Zhang and Mayur Naik and William Harris}, title = {FlexJava: Language Support for Safe and Modular Approximate Programming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {745--757}, doi = {}, year = {2015}, } |
|
Falleri, Jean-Rémy |
ESEC/FSE '15: "Impact of Developer Turnover ..."
Impact of Developer Turnover on Quality in Open-Source Software
Matthieu Foucault, Marc Palyart, Xavier Blanc, Gail C. Murphy, and Jean-Rémy Falleri (University of Bordeaux, France; University of British Columbia, Canada) Turnover is the phenomenon of continuous influx and retreat of human resources in a team. Despite being well-studied in many settings, turnover has not been characterized for open-source software projects. We study the source code repositories of five open-source projects to characterize patterns of turnover and to determine the effects of turnover on software quality. We define the base concepts of both external and internal turnover, which are the mobility of developers in and out of a project, and the mobility of developers inside a project, respectively. We provide a qualitative analysis of turnover patterns. We also found, in a quantitative analysis, that the activity of external newcomers negatively impacts software quality. @InProceedings{ESEC/FSE15p829, author = {Matthieu Foucault and Marc Palyart and Xavier Blanc and Gail C. Murphy and Jean-Rémy Falleri}, title = {Impact of Developer Turnover on Quality in Open-Source Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {829--841}, doi = {}, year = {2015}, } Info |
|
Fan, Xuepeng |
ESEC/FSE '15: "Hey, You Have Given Me Too ..."
Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker (University of California at San Diego, USA; Huazhong University of Science and Technology, China; NetApp, USA) Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software. 
@InProceedings{ESEC/FSE15p307, author = {Tianyin Xu and Long Jin and Xuepeng Fan and Yuanyuan Zhou and Shankar Pasupathy and Rukma Talwadker}, title = {Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {307--319}, doi = {}, year = {2015}, } Video Info |
|
Fang, Chunrong |
ESEC/FSE '15: "Test Report Prioritization ..."
Test Report Prioritization to Assist Crowdsourced Testing
Yang Feng, Zhenyu Chen, James A. Jones, Chunrong Fang, and Baowen Xu (Nanjing University, China; University of California at Irvine, USA) In crowdsourced testing, users can be incentivized to perform testing tasks and report their results, and because crowdsourced workers are often paid per task, there is a financial incentive to complete tasks quickly rather than well. These reports of the crowdsourced testing tasks are called "test reports" and are composed of simple natural language and screenshots. Back at the software-development organization, developers must manually inspect the test reports to judge their value for revealing faults. Due to the nature of crowdsourced work, the test reports are often too numerous to comprehensively inspect and process. In order to help with this daunting task, we created the first technique of its kind, to the best of our knowledge, to prioritize test reports for manual inspection. Our technique utilizes two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these strategies form our DivRisk strategy to prioritize test reports in crowdsourced testing. Three industrial projects have been used to evaluate the effectiveness of test report prioritization methods. The results of the empirical study show that: (1) DivRisk can significantly outperform random prioritization; (2) DivRisk can approximate the best theoretical result for a real-world industrial mobile application. In addition, we provide some practical guidelines of test report prioritization for crowdsourced testing based on the empirical study and our experiences. @InProceedings{ESEC/FSE15p225, author = {Yang Feng and Zhenyu Chen and James A. 
Jones and Chunrong Fang and Baowen Xu}, title = {Test Report Prioritization to Assist Crowdsourced Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {225--236}, doi = {}, year = {2015}, } |
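The diversity-plus-risk combination described above can be approximated with a simple greedy loop: repeatedly pick the unselected report that maximizes a weighted sum of its risk score and its distance from the reports already chosen. A hedged sketch, where `risk` and `distance` are placeholder callables standing in for the paper's learned risk model and report similarity, not its actual implementation:

```python
def prioritize(reports, risk, distance, alpha=0.5):
    """Order test reports by greedily maximizing
    alpha * risk + (1 - alpha) * diversity, where diversity is the
    minimum distance from the candidate to any already-selected report
    (1.0 when nothing has been selected yet)."""
    remaining = list(reports)
    ordered = []
    while remaining:
        def score(r):
            div = min((distance(r, s) for s in ordered), default=1.0)
            return alpha * risk(r) + (1 - alpha) * div
        best = max(remaining, key=score)
        remaining.remove(best)
        ordered.append(best)
    return ordered
```

With `alpha = 1.0` this degenerates to pure risk ranking, and with `alpha = 0.0` to pure diversity, so the weight controls the trade-off between the two strategies.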
|
Feng, Yang |
ESEC/FSE '15: "Test Report Prioritization ..."
Test Report Prioritization to Assist Crowdsourced Testing
Yang Feng, Zhenyu Chen, James A. Jones, Chunrong Fang, and Baowen Xu (Nanjing University, China; University of California at Irvine, USA) In crowdsourced testing, users can be incentivized to perform testing tasks and report their results, and because crowdsourced workers are often paid per task, there is a financial incentive to complete tasks quickly rather than well. These reports of the crowdsourced testing tasks are called "test reports" and are composed of simple natural language and screenshots. Back at the software-development organization, developers must manually inspect the test reports to judge their value for revealing faults. Due to the nature of crowdsourced work, the test reports are often too numerous to comprehensively inspect and process. In order to help with this daunting task, we created the first technique of its kind, to the best of our knowledge, to prioritize test reports for manual inspection. Our technique utilizes two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these strategies form our DivRisk strategy to prioritize test reports in crowdsourced testing. Three industrial projects have been used to evaluate the effectiveness of test report prioritization methods. The results of the empirical study show that: (1) DivRisk can significantly outperform random prioritization; (2) DivRisk can approximate the best theoretical result for a real-world industrial mobile application. In addition, we provide some practical guidelines of test report prioritization for crowdsourced testing based on the empirical study and our experiences. @InProceedings{ESEC/FSE15p225, author = {Yang Feng and Zhenyu Chen and James A. 
Jones and Chunrong Fang and Baowen Xu}, title = {Test Report Prioritization to Assist Crowdsourced Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {225--236}, doi = {}, year = {2015}, } |
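The interplay of the two strategies can be shown with a toy greedy prioritizer. This is an illustrative sketch only: the keyword sets, risk scores, and risk-times-novelty scoring below are assumptions, not the authors' DivRisk implementation.

```python
def prioritize(reports):
    """Greedily order test reports: repeatedly pick the report whose risk
    score, weighted by novelty (1 - keyword overlap with reports already
    selected), is highest."""
    remaining = list(reports)
    covered = set()  # keywords seen in already-selected reports
    order = []
    while remaining:
        def score(report):
            _, keywords, risk = report
            union = keywords | covered
            overlap = len(keywords & covered) / len(union) if union else 0.0
            return risk * (1.0 - overlap)  # prefer risky, non-duplicate reports
        best = max(remaining, key=score)
        remaining.remove(best)
        covered |= best[1]
        order.append(best[0])
    return order

# (report id, observed keywords, risk score) -- all values invented
reports = [
    ("r1", {"crash", "login"}, 0.9),
    ("r2", {"crash", "login"}, 0.8),  # near-duplicate of r1
    ("r3", {"layout"}, 0.4),
]
print(prioritize(reports))  # ['r1', 'r3', 'r2']
```

The near-duplicate r2 sinks below r3 despite its higher raw risk, which is exactly the effect the diversity strategy is after.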
|
Figueira Filho, Fernando |
ESEC/FSE '15: "Summarizing and Measuring ..."
Summarizing and Measuring Development Activity
Christoph Treude, Fernando Figueira Filho, and Uirá Kulesza (Federal University of Rio Grande do Norte, Brazil) Software developers pursue a wide range of activities as part of their work, and making sense of what they did in a given time frame is far from trivial as evidenced by the large number of awareness and coordination tools that have been developed in recent years. To inform tool design for making sense of the information available about a developer's activity, we conducted an empirical study with 156 GitHub users to investigate what information they would expect in a summary of development activity, how they would measure development activity, and what factors influence how such activity can be condensed into textual summaries or numbers. We found that unexpected events are as important as expected events in summaries of what a developer did, and that many developers do not believe in measuring development activity. Among the factors that influence summarization and measurement of development activity, we identified development experience and programming languages. @InProceedings{ESEC/FSE15p625, author = {Christoph Treude and Fernando Figueira Filho and Uirá Kulesza}, title = {Summarizing and Measuring Development Activity}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {625--636}, doi = {}, year = {2015}, } Info |
|
Filieri, Antonio |
ESEC/FSE '15: "Automated Multi-objective ..."
Automated Multi-objective Control for Self-Adaptive Software Design
Antonio Filieri, Henry Hoffmann, and Martina Maggio (University of Stuttgart, Germany; University of Chicago, USA; Lund University, Sweden) While software is becoming more complex every day, the requirements on its behavior are not getting any easier to satisfy. An application should offer a certain quality of service, adapt to the current environmental conditions, and withstand runtime variations that were simply unpredictable during the design phase. To tackle this complexity, control theory has been proposed as a technique for managing software's dynamic behavior, obviating the need for human intervention. Control-theoretical solutions, however, are either tailored for the specific application or do not handle the complexity of multiple interacting components and multiple goals. In this paper, we develop an automated control synthesis methodology that takes, as input, the configurable software components (or knobs) and the goals to be achieved. Our approach automatically constructs a control system that manages the specified knobs and guarantees the goals are met. These claims are backed up by experimental studies on three different software applications, where we show how the proposed automated approach handles the complexity of multiple knobs and objectives. @InProceedings{ESEC/FSE15p13, author = {Antonio Filieri and Henry Hoffmann and Martina Maggio}, title = {Automated Multi-objective Control for Self-Adaptive Software Design}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {13--24}, doi = {}, year = {2015}, } ESEC/FSE '15: "Iterative Distribution-Aware ..." Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution Mateus Borges, Antonio Filieri, Marcelo d'Amorim, and Corina S. 
Păsăreanu (University of Stuttgart, Germany; Federal University of Pernambuco, Brazil; Carnegie Mellon University, USA; NASA Ames Research Center, USA) Probabilistic symbolic execution aims at quantifying the probability of reaching program events of interest assuming that program inputs follow given probabilistic distributions. The technique collects constraints on the inputs that lead to the target events and analyzes them to quantify how likely it is for an input to satisfy the constraints. Current techniques either handle only linear constraints or only support continuous distributions using a “discretization” of the input domain, leading to imprecise and costly results. We propose an iterative distribution-aware sampling approach to support probabilistic symbolic execution for arbitrarily complex mathematical constraints and continuous input distributions. We follow a compositional approach, where the symbolic constraints are decomposed into sub-problems that can be solved independently. At each iteration the convergence rate of the computation is increased by automatically refocusing the analysis on estimating the sub-problems that most affect the accuracy of the results, as guided by three different ranking strategies. Experiments on publicly available benchmarks show that the proposed technique improves on previous approaches in terms of scalability and accuracy of the results. @InProceedings{ESEC/FSE15p866, author = {Mateus Borges and Antonio Filieri and Marcelo d'Amorim and Corina S. Păsăreanu}, title = {Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--877}, doi = {}, year = {2015}, } Info |
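For intuition, the baseline that such an estimator refines is plain hit-or-miss Monte Carlo: sample inputs from their distribution and count how often the path constraint holds. The sketch below assumes uniform inputs and an invented constraint; the paper's compositional decomposition and iterative refocusing are not shown.

```python
import random

def estimate_probability(constraint, sample, n=100_000, seed=0):
    """Hit-or-miss Monte Carlo: the fraction of sampled inputs that
    satisfy the (possibly nonlinear) path constraint."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if constraint(sample(rng)))
    return hits / n

# Example: P(x^2 + y^2 <= 1) for x, y uniform on [0, 1] is pi/4 (about 0.785).
p = estimate_probability(
    lambda xy: xy[0] ** 2 + xy[1] ** 2 <= 1.0,
    lambda rng: (rng.random(), rng.random()),
)
print(p)
```

The estimator's variance is what the distribution-aware refinements attack: naive sampling wastes effort on sub-problems that barely influence the final probability.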
|
Filkov, Vladimir |
ESEC/FSE '15: "Developer Onboarding in GitHub: ..."
Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience
Casey Casalnuovo, Bogdan Vasilescu, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA) The team aspects of software engineering have been a subject of great interest since early work by Fred Brooks and others: How well do people work together in teams? Why do people join teams? What happens if teams are distributed? Recently, the emergence of project ecosystems such as GitHub has created an entirely new, higher level of organization. GitHub supports numerous teams; they share a common technical platform (for work activities) and a common social platform (via following, commenting, etc.). We explore the GitHub evidence for socialization as a precursor to joining a project, and how the technical factors of past experience and social factors of past connections to team members of a project affect productivity both initially and in the long run. We find developers preferentially join projects in GitHub where they have pre-existing relationships; furthermore, we find that the presence of past social connections combined with prior experience in languages dominant in the project leads to higher productivity both initially and cumulatively. Interestingly, we also find that stronger social connections are associated with slightly less productivity initially, but slightly more productivity in the long run. @InProceedings{ESEC/FSE15p817, author = {Casey Casalnuovo and Bogdan Vasilescu and Premkumar Devanbu and Vladimir Filkov}, title = {Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {817--828}, doi = {}, year = {2015}, } ESEC/FSE '15: "Quality and Productivity Outcomes ..." 
Quality and Productivity Outcomes Relating to Continuous Integration in GitHub Bogdan Vasilescu, Yue Yu, Huaimin Wang, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA; National University of Defense Technology, China) Software processes comprise many steps; coding is followed by building, integration testing, system testing, deployment, and operations, among others. Software process integration and automation have been areas of key concern in software engineering ever since the pioneering work of Osterweil; market pressures for agility and open, decentralized software development have added further impetus for progress in this area. But do these innovations actually help projects? Given the numerous confounding factors that can influence project performance, it can be a challenge to discern the effects of process integration and automation. Software project ecosystems such as GitHub provide a new opportunity in this regard: one can readily find large numbers of projects in various stages of process integration and automation, and gather data on various influencing factors as well as productivity and quality outcomes. In this paper we use large, historical data on process metrics and outcomes in GitHub projects to discern the effects of one specific innovation in process automation: continuous integration. Our main finding is that continuous integration improves the productivity of project teams, who can integrate more outside contributions, without an observable diminishment in code quality. @InProceedings{ESEC/FSE15p805, author = {Bogdan Vasilescu and Yue Yu and Huaimin Wang and Premkumar Devanbu and Vladimir Filkov}, title = {Quality and Productivity Outcomes Relating to Continuous Integration in GitHub}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {805--816}, doi = {}, year = {2015}, } |
|
Foucault, Matthieu |
ESEC/FSE '15: "Impact of Developer Turnover ..."
Impact of Developer Turnover on Quality in Open-Source Software
Matthieu Foucault, Marc Palyart, Xavier Blanc, Gail C. Murphy, and Jean-Rémy Falleri (University of Bordeaux, France; University of British Columbia, Canada) Turnover is the phenomenon of continuous influx and retreat of human resources in a team. Despite being well-studied in many settings, turnover has not been characterized for open-source software projects. We study the source code repositories of five open-source projects to characterize patterns of turnover and to determine the effects of turnover on software quality. We define the base concepts of both external and internal turnover, which are the mobility of developers in and out of a project, and the mobility of developers inside a project, respectively. We provide a qualitative analysis of turnover patterns. We also found, in a quantitative analysis, that the activity of external newcomers negatively impacts software quality. @InProceedings{ESEC/FSE15p829, author = {Matthieu Foucault and Marc Palyart and Xavier Blanc and Gail C. Murphy and Jean-Rémy Falleri}, title = {Impact of Developer Turnover on Quality in Open-Source Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {829--841}, doi = {}, year = {2015}, } Info |
|
Fraser, Gordon |
ESEC/FSE '15: "Generating TCP/UDP Network ..."
Generating TCP/UDP Network Data for Automated Unit Test Generation
Andrea Arcuri, Gordon Fraser, and Juan Pablo Galeotti (Scienta, Norway; University of Luxembourg, Luxembourg; University of Sheffield, UK; Saarland University, Germany) Although automated unit test generation techniques can in principle generate test suites that achieve high code coverage, in practice this is often inhibited by the dependence of the code under test on external resources. In particular, a common problem in modern programming languages is posed by code that involves networking (e.g., opening a TCP listening port). In order to generate tests for such code, we describe an approach where we mock (simulate) the networking interfaces of the Java standard library, such that a search-based test generator can treat the network as part of the test input space. This not only overcomes many limitations of testing networking code (e.g., different tests binding to the same local ports, and deterministic resolution of hostnames and ephemeral ports), but also substantially increases code coverage. An evaluation on 23,886 classes from 110 open source projects, totalling more than 6.6 million lines of Java code, reveals that network access happens in 2,642 classes (11%). Our implementation of the proposed technique as part of the EVOSUITE testing tool addresses the networking code contained in 1,672 (63%) of these classes, and leads to an increase of the average line coverage from 29.1% to 50.8%. On a manual selection of 42 Java classes heavily depending on networking, line coverage with EVOSUITE more than doubled with the use of network mocking, increasing from 31.8% to 76.6%. @InProceedings{ESEC/FSE15p155, author = {Andrea Arcuri and Gordon Fraser and Juan Pablo Galeotti}, title = {Generating TCP/UDP Network Data for Automated Unit Test Generation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {155--165}, doi = {}, year = {2015}, } ESEC/FSE '15: "Modeling Readability to Improve ..." 
Modeling Readability to Improve Unit Tests Ermira Daka, José Campos, Gordon Fraser, Jonathan Dorn, and Westley Weimer (University of Sheffield, UK; University of Virginia, USA) Writing good unit tests can be tedious and error-prone, but even once they are written, the job is not done: Developers need to reason about unit tests throughout software development and evolution, in order to diagnose test failures, maintain the tests, and understand code written by other developers. Unreadable tests are more difficult to maintain and lose some of their value to developers. To overcome this problem, we propose a domain-specific model of unit test readability based on human judgements, and use this model to augment automated unit test generation. The resulting approach can automatically generate test suites with both high coverage and also improved readability. In human studies, users prefer our improved tests and are able to answer maintenance questions about them 14% more quickly at the same level of accuracy. @InProceedings{ESEC/FSE15p107, author = {Ermira Daka and José Campos and Gordon Fraser and Jonathan Dorn and Westley Weimer}, title = {Modeling Readability to Improve Unit Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {107--118}, doi = {}, year = {2015}, } Best-Paper Award |
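The network-mocking idea from the TCP/UDP paper can be illustrated language-agnostically: replace the socket with a test double so that the bytes "received from the network" become just another test input the generator can search over. `MockSocket` and `handle_client` below are hypothetical names, and this Python sketch merely stands in for EvoSuite's Java-level interface mocking.

```python
class MockSocket:
    """Test double: the 'network' is whatever bytes the test chooses."""
    def __init__(self, incoming):
        self.incoming = incoming  # bytes the code under test will 'receive'
        self.sent = b""           # bytes the code under test 'sends'

    def recv(self, n):
        chunk, self.incoming = self.incoming[:n], self.incoming[n:]
        return chunk

    def send(self, data):
        self.sent += data
        return len(data)

def handle_client(sock):
    """Code under test: echoes the request upper-cased."""
    request = sock.recv(1024)
    sock.send(request.upper())

# A search-based generator can now vary `incoming` like any other input,
# with no real ports, hostnames, or nondeterminism involved.
sock = MockSocket(b"ping")
handle_client(sock)
print(sock.sent)  # b'PING'
```

Because the double is deterministic, two tests can "bind" the same port and hostname resolution never flakes, which is precisely the limitation the mocking layer removes.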
|
Fratantonio, Yanick |
ESEC/FSE '15: "CLAPP: Characterizing Loops ..."
CLAPP: Characterizing Loops in Android Applications
Yanick Fratantonio, Aravind Machiry, Antonio Bianchi, Christopher Kruegel, and Giovanni Vigna (University of California at Santa Barbara, USA) When performing program analysis, loops are one of the most important aspects that need to be taken into account. In the past, many approaches have been proposed to analyze loops to perform different tasks, ranging from compiler optimizations to Worst-Case Execution Time (WCET) analysis. While these approaches are powerful, they focus on tackling very specific categories of loops and known loop patterns, such as the ones for which the number of iterations can be statically determined. In this work, we developed a static analysis framework to characterize and analyze generic loops, without relying on techniques based on pattern matching. For this work, we focus on the Android platform, and we implemented a prototype, called CLAPP, that we used to perform the first large-scale empirical study of the usage of loops in Android applications. In particular, we used our tool to analyze a total of 4,110,510 loops found in 11,823 Android applications. As part of our evaluation, we provide the detailed results of our empirical study, we show how our analysis was able to determine that the execution of 63.28% of the loops is bounded, and we discuss several interesting insights related to the performance issues and security aspects associated with loops. @InProceedings{ESEC/FSE15p687, author = {Yanick Fratantonio and Aravind Machiry and Antonio Bianchi and Christopher Kruegel and Giovanni Vigna}, title = {CLAPP: Characterizing Loops in Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {687--697}, doi = {}, year = {2015}, } Info |
|
Fritz, Thomas |
ESEC/FSE '15: "Tracing Software Developers' ..."
Tracing Software Developers' Eyes and Interactions for Change Tasks
Katja Kevic, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz (University of Zurich, Switzerland; Youngstown State University, USA; ABB Research, USA) What are software developers doing during a change task? While an answer to this question opens countless opportunities to support developers in their work, little is known about developers' detailed navigation behavior for realistic change tasks. Most empirical studies on developers performing change tasks are limited to very small code snippets or are limited by the granularity or the detail of the data collected for the study. In our research, we try to overcome these limitations by combining user interaction monitoring with very fine-grained eye-tracking data that is automatically linked to the underlying source code entities in the IDE. In a study with 12 professional and 10 student developers working on three change tasks from an open source system, we used our approach to investigate the detailed navigation of developers for realistic change tasks. The results of our study show, among other things, that the eye-tracking data does indeed capture different aspects than user interaction data and that developers focus on only small parts of methods that are often related by data flow. We discuss our findings and their implications for better developer tool support. @InProceedings{ESEC/FSE15p202, author = {Katja Kevic and Braden M. Walters and Timothy R. Shaffer and Bonita Sharif and David C. Shepherd and Thomas Fritz}, title = {Tracing Software Developers' Eyes and Interactions for Change Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--213}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "The Making of Cloud Applications: ..." The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud Jürgen Cito, Philipp Leitner, Thomas Fritz, and Harald C. 
Gall (University of Zurich, Switzerland) Cloud computing is gaining more and more traction as a deployment and provisioning model for software. While a large body of research already covers how to optimally operate a cloud system, we still lack insights into how professional software engineers actually use clouds, and how the cloud impacts development practices. This paper reports on the first systematic study on how software developers build applications for the cloud. We conducted a mixed-method study, consisting of qualitative interviews of 25 professional developers and a quantitative survey with 294 responses. Our results show that adopting the cloud has a profound impact throughout the software development process, as well as on how developers utilize tools and data in their daily work. Among other things, we found that (1) developers need better means to anticipate runtime problems and rigorously define metrics for improved fault localization and (2) the cloud offers an abundance of operational data; however, developers still often rely on their experience and intuition rather than utilizing metrics. From our findings, we extracted a set of guidelines for cloud development and identified challenges for researchers and tool vendors. @InProceedings{ESEC/FSE15p393, author = {Jürgen Cito and Philipp Leitner and Thomas Fritz and Harald C. Gall}, title = {The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {393--403}, doi = {}, year = {2015}, } |
|
Fu, Yangchun |
ESEC/FSE '15: "Automatically Deriving Pointer ..."
Automatically Deriving Pointer Reference Expressions from Binary Code for Memory Dump Analysis
Yangchun Fu, Zhiqiang Lin, and David Brumley (University of Texas at Dallas, USA; Carnegie Mellon University, USA) Given a crash dump or a kernel memory snapshot, it is often desirable to have a capability that can traverse its pointers to locate the root cause of the crash, or check their integrity to detect control-flow hijacks. To achieve this, one key challenge lies in how to locate where the pointers are. While locating a pointer usually requires the data structure knowledge of the corresponding program, an important advance made by this work is that we show a technique of extracting address-independent data reference expressions for pointers through dynamic binary analysis. This novel pointer reference expression encodes how a pointer is accessed through the combination of a base address (usually a global variable) with a certain offset and further pointer dereferences. We have applied our techniques to OS kernels, and our experimental results with a number of real-world kernel malware samples show that we can correctly identify the hijacked kernel function pointers by locating them using the extracted pointer reference expressions when only given a memory snapshot. @InProceedings{ESEC/FSE15p614, author = {Yangchun Fu and Zhiqiang Lin and David Brumley}, title = {Automatically Deriving Pointer Reference Expressions from Binary Code for Memory Dump Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {614--624}, doi = {}, year = {2015}, } |
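A reference expression of the shape base->off1->off2 can be replayed against any snapshot without knowing absolute addresses in advance. The sketch below models a snapshot as a dictionary from address to stored word; all names, offsets, and addresses are invented for illustration, not taken from the paper.

```python
def resolve(snapshot, base, offsets):
    """Follow a pointer reference expression: starting from a global base
    address, repeatedly add an offset and dereference the result."""
    addr = base
    for offset in offsets:
        addr = snapshot[addr + offset]  # one '->' step
    return addr

# Global at 0x100 points to a struct at 0x200; the struct's field at
# offset 8 holds a function pointer whose integrity we want to check.
snapshot = {0x100: 0x200, 0x208: 0xDEADBEEF}
fp = resolve(snapshot, 0x100, [0, 8])
print(hex(fp))  # 0xdeadbeef
```

An integrity check would then compare the resolved value against the set of legitimate kernel function addresses; a value outside that set signals a hijacked pointer.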
|
Galeotti, Juan Pablo |
ESEC/FSE '15: "Generating TCP/UDP Network ..."
Generating TCP/UDP Network Data for Automated Unit Test Generation
Andrea Arcuri, Gordon Fraser, and Juan Pablo Galeotti (Scienta, Norway; University of Luxembourg, Luxembourg; University of Sheffield, UK; Saarland University, Germany) Although automated unit test generation techniques can in principle generate test suites that achieve high code coverage, in practice this is often inhibited by the dependence of the code under test on external resources. In particular, a common problem in modern programming languages is posed by code that involves networking (e.g., opening a TCP listening port). In order to generate tests for such code, we describe an approach where we mock (simulate) the networking interfaces of the Java standard library, such that a search-based test generator can treat the network as part of the test input space. This not only overcomes many limitations of testing networking code (e.g., different tests binding to the same local ports, and deterministic resolution of hostnames and ephemeral ports), but also substantially increases code coverage. An evaluation on 23,886 classes from 110 open source projects, totalling more than 6.6 million lines of Java code, reveals that network access happens in 2,642 classes (11%). Our implementation of the proposed technique as part of the EVOSUITE testing tool addresses the networking code contained in 1,672 (63%) of these classes, and leads to an increase of the average line coverage from 29.1% to 50.8%. On a manual selection of 42 Java classes heavily depending on networking, line coverage with EVOSUITE more than doubled with the use of network mocking, increasing from 31.8% to 76.6%. @InProceedings{ESEC/FSE15p155, author = {Andrea Arcuri and Gordon Fraser and Juan Pablo Galeotti}, title = {Generating TCP/UDP Network Data for Automated Unit Test Generation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {155--165}, doi = {}, year = {2015}, } |
|
Gall, Harald C. |
ESEC/FSE '15: "The Making of Cloud Applications: ..."
The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud
Jürgen Cito, Philipp Leitner, Thomas Fritz, and Harald C. Gall (University of Zurich, Switzerland) Cloud computing is gaining more and more traction as a deployment and provisioning model for software. While a large body of research already covers how to optimally operate a cloud system, we still lack insights into how professional software engineers actually use clouds, and how the cloud impacts development practices. This paper reports on the first systematic study on how software developers build applications for the cloud. We conducted a mixed-method study, consisting of qualitative interviews of 25 professional developers and a quantitative survey with 294 responses. Our results show that adopting the cloud has a profound impact throughout the software development process, as well as on how developers utilize tools and data in their daily work. Among other things, we found that (1) developers need better means to anticipate runtime problems and rigorously define metrics for improved fault localization and (2) the cloud offers an abundance of operational data; however, developers still often rely on their experience and intuition rather than utilizing metrics. From our findings, we extracted a set of guidelines for cloud development and identified challenges for researchers and tool vendors. @InProceedings{ESEC/FSE15p393, author = {Jürgen Cito and Philipp Leitner and Thomas Fritz and Harald C. Gall}, title = {The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {393--403}, doi = {}, year = {2015}, } |
|
Gargantini, Angelo |
ESEC/FSE '15: "Improving Model-Based Test ..."
Improving Model-Based Test Generation by Model Decomposition
Paolo Arcaini, Angelo Gargantini, and Elvinia Riccobene (Charles University in Prague, Czech Republic; University of Bergamo, Italy; University of Milan, Italy) One of the well-known techniques for model-based test generation exploits the capability of model checkers to return counterexamples upon property violations. However, this approach is not always optimal in practice due to the required time and memory, or even not feasible due to the state explosion problem of model checking. A way to mitigate these limitations consists of decomposing a system model into suitable subsystem models that are separately analyzable. In this paper, we show a technique to decompose a system model into subsystems by exploiting the dependency among model variables, and then we propose a test generation approach which builds tests for the single subsystems and combines them later in order to obtain tests for the system as a whole. Such an approach mitigates the exponential increase of the test generation time and memory consumption, and, compared with the same model-based test generation technique applied to the whole system, proves to be more efficient. We prove that, although not complete, the approach is sound. @InProceedings{ESEC/FSE15p119, author = {Paolo Arcaini and Angelo Gargantini and Elvinia Riccobene}, title = {Improving Model-Based Test Generation by Model Decomposition}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {119--130}, doi = {}, year = {2015}, } |
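The decomposition step can be sketched as finding connected components of the variable-dependency graph; each component then yields a subsystem that can be analyzed on its own. The variable names and dependencies below are invented, and the real technique must additionally recombine the subsystem tests, which this sketch omits.

```python
from collections import defaultdict

def components(variables, dependencies):
    """Connected components of the (undirected) variable-dependency graph;
    each component is a candidate subsystem for separate test generation."""
    graph = defaultdict(set)
    for a, b in dependencies:
        graph[a].add(b)
        graph[b].add(a)
    seen, comps = set(), []
    for v in variables:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:  # depth-first flood fill from v
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(graph[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Invented example: the speed controller and the lamp never interact,
# so the model splits into two independently checkable subsystems.
deps = [("mode", "speed"), ("speed", "brake"), ("lamp", "switch")]
cs = components(["mode", "speed", "brake", "lamp", "switch"], deps)
print(cs)
```

Model checking each component separately sidesteps the product of their state spaces, which is where the exponential blow-up comes from.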
|
Garlan, David |
ESEC/FSE '15: "Proactive Self-Adaptation ..."
Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach
Gabriel A. Moreno, Javier Cámara, David Garlan, and Bradley Schmerl (SEI, USA; Carnegie Mellon University, USA) Self-adaptive systems tend to be reactive and myopic, adapting in response to changes without anticipating what the subsequent adaptation needs will be. Adapting reactively can result in inefficiencies due to the system performing a suboptimal sequence of adaptations. Furthermore, when adaptations have latency, and take some time to produce their effect, they have to be started with sufficient lead time so that they complete by the time their effect is needed. Proactive latency-aware adaptation addresses these issues by making adaptation decisions with a look-ahead horizon and taking adaptation latency into account. In this paper we present an approach for proactive latency-aware adaptation under uncertainty that uses probabilistic model checking for adaptation decisions. The key idea is to use a formal model of the adaptive system in which the adaptation decision is left underspecified through nondeterminism, and have the model checker resolve the nondeterministic choices so that the accumulated utility over the horizon is maximized. The adaptation decision is optimal over the horizon, and takes into account the inherent uncertainty of the environment predictions needed for looking ahead. Our results show that the decision based on a look-ahead horizon, and the factoring of both tactic latency and environment uncertainty, considerably improve the effectiveness of adaptation decisions. @InProceedings{ESEC/FSE15p1, author = {Gabriel A. Moreno and Javier Cámara and David Garlan and Bradley Schmerl}, title = {Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1--12}, doi = {}, year = {2015}, } |
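What the probabilistic model checker resolves can be approximated by brute force on a toy model: enumerate every tactic sequence over the look-ahead horizon and keep the one with the highest accumulated utility. The cost and utility numbers below are assumptions chosen so that paying an adaptation's latency up front wins over the horizon; they are not from the paper's evaluation.

```python
import itertools

def best_plan(horizon, tactics, step, state):
    """Enumerate all tactic sequences of length `horizon`, simulate each,
    and return the sequence maximizing accumulated utility (the model
    checker resolves the same nondeterministic choice symbolically)."""
    best, best_utility = None, float("-inf")
    for plan in itertools.product(tactics, repeat=horizon):
        s, total = state, 0.0
        for tactic in plan:
            s, utility = step(s, tactic)
            total += utility
        if total > best_utility:
            best, best_utility = plan, total
    return best, best_utility

# Toy model: state = number of servers. Adding one costs utility now
# (a crude stand-in for tactic latency) but raises capacity, and hence
# the utility of every later "none" step.
def step(servers, tactic):
    if tactic == "add":
        return servers + 1, -1.0
    return servers, 1.5 * servers  # "none": serve load with current capacity

plan, utility = best_plan(3, ["add", "none"], step, 1)
print(plan, utility)  # ('add', 'none', 'none') 5.0
```

A myopic policy would never choose "add" (its immediate utility is negative); the look-ahead plan accepts the up-front cost because it pays off within the horizon, which is the paper's core argument for proactive, latency-aware adaptation.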
|
Glanz, Leonid |
ESEC/FSE '15: "Hidden Truths in Dead Software ..."
Hidden Truths in Dead Software Paths
Michael Eichberg, Ben Hermann, Mira Mezini, and Leonid Glanz (TU Darmstadt, Germany) Approaches and techniques for statically finding a multitude of issues in source code have been developed in the past. A core property of these approaches is that they are usually targeted towards finding only a very specific kind of issue and that the effort to develop such an analysis is significant. This strictly limits the number of kinds of issues that can be detected. In this paper, we discuss a generic approach based on the detection of infeasible paths in code that can discover a wide range of code smells ranging from useless code that hinders comprehension to real bugs. Code issues are identified by calculating the difference between the control-flow graph that contains all technically possible edges and the corresponding graph recorded while performing a more precise analysis using abstract interpretation. We have evaluated the approach using the Java Development Kit as well as the Qualitas Corpus (a curated collection of over 100 Java applications) and were able to find thousands of issues across a wide range of categories. @InProceedings{ESEC/FSE15p474, author = {Michael Eichberg and Ben Hermann and Mira Mezini and Leonid Glanz}, title = {Hidden Truths in Dead Software Paths}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {474--484}, doi = {}, year = {2015}, } Info |
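The core comparison reduces to a set difference over CFG edges: whatever the syntactic graph contains but the abstract-interpretation-based graph never reaches is an infeasible, and therefore suspicious, path segment. The graphs below are hand-written toys; producing the precise graph is where the actual analysis effort lies.

```python
def dead_edges(syntactic_cfg, precise_cfg):
    """Control-flow edges that exist syntactically but were never found
    feasible by the more precise analysis -- dead-path candidates."""
    return sorted(set(syntactic_cfg) - set(precise_cfg))

# Toy: a branch condition that is always true makes the else-branch
# infeasible, so its edges survive only in the syntactic graph.
syntactic = [("if", "then"), ("if", "else"), ("then", "exit"), ("else", "exit")]
precise = [("if", "then"), ("then", "exit")]
issues = dead_edges(syntactic, precise)
print(issues)  # [('else', 'exit'), ('if', 'else')]
```

Each reported edge is then a starting point for triage: it may be harmless defensive code, a comprehension-hindering smell, or a genuine bug in the branch condition.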
|
Gong, Liang |
ESEC/FSE '15: "MultiSE: Multi-path Symbolic ..."
MultiSE: Multi-path Symbolic Execution using Value Summaries
Koushik Sen, George Necula, Liang Gong, and Wontae Choi (University of California at Berkeley, USA) Dynamic symbolic execution (DSE) has been proposed to effectively generate test inputs for real-world programs. Unfortunately, DSE techniques do not scale well for large realistic programs, because often the number of feasible execution paths of a program increases exponentially with the increase in the length of an execution path. In this paper, we propose MultiSE, a new technique for merging states incrementally during symbolic execution, without using auxiliary variables. The key idea of MultiSE is based on an alternative representation of the state, where we map each variable, including the program counter, to a set of guarded symbolic expressions called a value summary. MultiSE has several advantages over conventional DSE and conventional state merging techniques: value summaries enable sharing of symbolic expressions and path constraints along multiple paths and thus avoid redundant execution. MultiSE does not introduce auxiliary symbolic variables, which enables it to 1) make progress even when merging values not supported by the constraint solver, 2) avoid expensive constraint solver calls when resolving function calls and jumps, and 3) carry out most operations concretely. Moreover, MultiSE updates value summaries incrementally at every assignment instruction, which makes it unnecessary to identify the join points and to keep track of variables to merge at join points. We have implemented MultiSE for JavaScript programs in a publicly available open-source tool. Our evaluation of MultiSE on several programs shows that 1) value summaries are an effective technique to take advantage of the sharing of values along multiple execution paths, 2) MultiSE can run significantly faster than traditional dynamic symbolic execution, and 3) MultiSE saves a substantial number of state merges compared to conventional state-merging techniques. 
@InProceedings{ESEC/FSE15p842, author = {Koushik Sen and George Necula and Liang Gong and Wontae Choi}, title = {MultiSE: Multi-path Symbolic Execution using Value Summaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {842--853}, doi = {}, year = {2015}, } Best-Paper Award ESEC/FSE '15: "JITProf: Pinpointing JIT-Unfriendly ..." JITProf: Pinpointing JIT-Unfriendly JavaScript Code Liang Gong, Michael Pradel , and Koushik Sen (University of California at Berkeley, USA; TU Darmstadt, Germany) Most modern JavaScript engines use just-in-time (JIT) compilation to translate parts of JavaScript code into efficient machine code at runtime. Despite the overall success of JIT compilers, programmers may still write code that uses the dynamic features of JavaScript in a way that prohibits profitable optimizations. Unfortunately, there currently is no way to measure how prevalent such JIT-unfriendly code is and to help developers detect such code locations. This paper presents JITProf, a profiling framework to dynamically identify code locations that prohibit profitable JIT optimizations. The key idea is to associate meta-information with JavaScript objects and code locations, to update this information whenever particular runtime events occur, and to use the meta-information to identify JIT-unfriendly operations. We use JITProf to analyze widely used JavaScript web applications and show that JIT-unfriendly code is prevalent in practice. Furthermore, we show how to use the approach as a profiling technique that finds optimization opportunities in a program. Applying the profiler to popular benchmark programs shows that refactoring these programs to avoid performance problems identified by JITProf leads to statistically significant performance improvements of up to 26.3% in 15 benchmarks. 
@InProceedings{ESEC/FSE15p357, author = {Liang Gong and Michael Pradel and Koushik Sen}, title = {JITProf: Pinpointing JIT-Unfriendly JavaScript Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {357--368}, doi = {}, year = {2015}, } Info |
|
Gorton, Ian |
ESEC/FSE '15: "Measure It? Manage It? Ignore ..."
Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt
Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton (SEI, USA) The technical debt metaphor is widely used to encapsulate numerous software quality problems. The metaphor is attractive to practitioners as it communicates to both technical and nontechnical audiences that if quality problems are not addressed, things may get worse. However, it is unclear whether there are practices that move this metaphor beyond a mere communication mechanism. Existing studies of technical debt have largely focused on code metrics and small surveys of developers. In this paper, we report on our survey of 1,831 participants, primarily software engineers and architects working in long-lived, software-intensive projects from three large organizations, and follow-up interviews of seven software engineers. We analyzed our data using both nonparametric statistics and qualitative text analysis. We found that architectural decisions are the most important source of technical debt. Furthermore, while respondents believe the metaphor is itself important for communication, existing tools are not currently helpful in managing the details. We use our results to motivate a technical debt timeline to focus management and tooling approaches. @InProceedings{ESEC/FSE15p50, author = {Neil A. Ernst and Stephany Bellomo and Ipek Ozkaya and Robert L. Nord and Ian Gorton}, title = {Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {50--60}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Goues, Claire Le |
ESEC/FSE '15: "Is the Cure Worse Than the ..."
Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair
Edward K. Smith, Earl T. Barr, Claire Le Goues, and Yuriy Brun (University of Massachusetts at Amherst, USA; University College London, UK; Carnegie Mellon University, USA; University of Massachusetts, USA) Automated program repair has shown promise for reducing the significant manual effort debugging requires. This paper addresses a deficit of earlier evaluations of automated repair techniques caused by repairing programs and evaluating generated patches' correctness using the same set of tests. Since tests are an imperfect metric of program correctness, evaluations of this type do not discriminate between correct patches and patches that overfit the available tests and break untested but desired functionality. This paper evaluates two well-studied repair tools, GenProg and TrpAutoRepair, on a publicly available benchmark of bugs, each with a human-written patch. By evaluating patches using tests independent from those used during repair, we find that the tools are unlikely to improve the proportion of independent tests passed, and that the quality of the patches is proportional to the coverage of the test suite used during repair. For programs that pass most tests, the tools are as likely to break tests as to fix them. However, novice developers also overfit, and automated repair performs no worse than these developers. In addition to overfitting, we measure the effects of test suite coverage, test suite provenance, and starting program quality, as well as the difference in quality between novice-developer-written and tool-generated patches when quality is assessed with a test suite independent from the one used for patch generation. @InProceedings{ESEC/FSE15p532, author = {Edward K. Smith and Earl T. Barr and Claire Le Goues and Yuriy Brun}, title = {Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--543}, doi = {}, year = {2015}, } |
|
Gousios, Georgios |
ESEC/FSE '15: "When, How, and Why Developers ..."
When, How, and Why Developers (Do Not) Test in Their IDEs
Moritz Beller, Georgios Gousios, Annibale Panichella, and Andy Zaidman (Delft University of Technology, Netherlands; Radboud University Nijmegen, Netherlands) The research community in Software Engineering and Software Testing in particular builds many of its contributions on a set of mutually shared expectations. Despite the fact that they form the basis of many publications as well as open-source and commercial testing applications, these common expectations and beliefs are rarely ever questioned. For example, Frederick Brooks’ statement that testing takes half of the development time seems to have manifested itself within the community since he first made it in “The Mythical Man-Month” in 1975. With this paper, we report on the surprising results of a large-scale field study with 416 software engineers whose development activity we closely monitored over the course of five months, resulting in over 13 years of recorded work time in their integrated development environments (IDEs). Our findings question several commonly shared assumptions and beliefs about testing and might be contributing factors to the observed bug proneness of software in practice: the majority of developers in our study do not test; developers rarely run their tests in the IDE; Test-Driven Development (TDD) is not widely practiced; and, last but not least, software developers only spend a quarter of their work time engineering tests, whereas they think they test half of their time. @InProceedings{ESEC/FSE15p179, author = {Moritz Beller and Georgios Gousios and Annibale Panichella and Andy Zaidman}, title = {When, How, and Why Developers (Do Not) Test in Their IDEs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {179--190}, doi = {}, year = {2015}, } |
|
Grebhahn, Alexander |
ESEC/FSE '15: "Performance-Influence Models ..."
Performance-Influence Models for Highly Configurable Systems
Norbert Siegmund, Alexander Grebhahn, Sven Apel, and Christian Kästner (University of Passau, Germany; Carnegie Mellon University, USA) Almost every complex software system today is configurable. While configurability has many benefits, it challenges performance prediction, optimization, and debugging. Often, the influences of individual configuration options on performance are unknown. Worse, configuration options may interact, giving rise to a configuration space of possibly exponential size. Addressing this challenge, we propose an approach that derives a performance-influence model for a given configurable system, describing all relevant influences of configuration options and their interactions. Our approach combines machine-learning and sampling heuristics in a novel way. It improves over standard techniques in that it (1) represents influences of options and their interactions explicitly (which eases debugging), (2) smoothly integrates binary and numeric configuration options for the first time, (3) incorporates domain knowledge, if available (which eases learning and increases accuracy), (4) considers complex constraints among options, and (5) systematically reduces the solution space to a tractable size. A series of experiments demonstrates the feasibility of our approach in terms of the accuracy of the models learned as well as the accuracy of the performance predictions one can make with them. @InProceedings{ESEC/FSE15p284, author = {Norbert Siegmund and Alexander Grebhahn and Sven Apel and Christian Kästner}, title = {Performance-Influence Models for Highly Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {284--294}, doi = {}, year = {2015}, } Info |
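The model form described in this abstract can be made concrete with a small sketch. This is an illustration with invented toy coefficients, not one of the paper's learned models: a performance-influence model attaches a coefficient to each set of options (singletons for individual influences, larger sets for interactions), and a configuration's predicted performance is the sum of the coefficients whose options are all enabled.

```python
# Toy performance-influence model (assumed numbers, for illustration only):
# coefficients are keyed by the set of options they depend on.
model = {
    frozenset(): 100.0,                     # base performance
    frozenset({"cache"}): -20.0,            # enabling cache saves 20
    frozenset({"compress"}): 15.0,          # compression costs 15
    frozenset({"cache", "compress"}): 5.0,  # interaction term
}

def predict(model, config):
    """Sum every coefficient whose option set is enabled in `config`."""
    return sum(coef for opts, coef in model.items() if opts <= config)

print(predict(model, {"cache"}))              # 80.0
print(predict(model, {"cache", "compress"}))  # 100.0
```

Because influences and interactions are explicit terms, one can read off why a configuration is slow, which is the debugging benefit the abstract highlights.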
|
Grundy, John |
ESEC/FSE '15: "Rule-Based Extraction of Goal-Use ..."
Rule-Based Extraction of Goal-Use Case Models from Text
Tuong Huan Nguyen, John Grundy, and Mohamed Almorsy (Swinburne University of Technology, Australia) Goal and use case modeling has been recognized as a key approach for understanding and analyzing requirements. However, in practice, goals and use cases are often buried among other content in requirements specifications documents and written in unstructured styles. It is thus a time-consuming and error-prone process to identify such goals and use cases. In addition, having them embedded in natural language documents greatly limits the possibility of formally analyzing the requirements for problems. To address these issues, we have developed a novel rule-based approach to automatically extract goal and use case models from natural language requirements documents. Our approach is able to automatically categorize goals and ensure they are properly specified. We also provide automated semantic parameterization of artifact textual specifications to promote further analysis on the extracted goal-use case models. Our approach achieves 85% precision and 82% recall rates on average for model extraction and 88% accuracy for the automated parameterization. @InProceedings{ESEC/FSE15p591, author = {Tuong Huan Nguyen and John Grundy and Mohamed Almorsy}, title = {Rule-Based Extraction of Goal-Use Case Models from Text}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {591--601}, doi = {}, year = {2015}, } Info |
|
Gu, Rui |
ESEC/FSE '15: "What Change History Tells ..."
What Change History Tells Us about Thread Synchronization
Rui Gu, Guoliang Jin, Linhai Song, Linjie Zhu, and Shan Lu (Columbia University, USA; North Carolina State University, USA; University of Wisconsin-Madison, USA; University of Chicago, USA) Multi-threaded programs are pervasive, yet difficult to write. Missing proper synchronization leads to correctness bugs and over-synchronization leads to performance problems. To improve the correctness and efficiency of multi-threaded software, we need a better understanding of synchronization challenges faced by real-world developers. This paper studies the code repositories of open-source multi-threaded software projects to obtain a broad and in-depth view of how developers handle synchronizations. We first examine how critical sections are changed when software evolves by checking over 250,000 revisions of four representative open-source software projects. The findings help us answer questions like how often synchronization is an afterthought for developers; whether it is difficult for developers to decide critical section boundaries and lock variables; and what are real-world over-synchronization problems. We then conduct case studies to better understand (1) how critical sections are changed to solve performance problems (i.e., over-synchronization issues) and (2) how software changes lead to synchronization-related correctness problems (i.e., concurrency bugs). This in-depth study shows that tool support is needed to help developers tackle over-synchronization problems; it also shows that concurrency bug avoidance, detection, and testing can be improved through better awareness of code revision history. @InProceedings{ESEC/FSE15p426, author = {Rui Gu and Guoliang Jin and Linhai Song and Linjie Zhu and Shan Lu}, title = {What Change History Tells Us about Thread Synchronization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {426--438}, doi = {}, year = {2015}, } |
|
Guo, Shengjian |
ESEC/FSE '15: "Assertion Guided Symbolic ..."
Assertion Guided Symbolic Execution of Multithreaded Programs
Shengjian Guo, Markus Kusano, Chao Wang, Zijiang Yang, and Aarti Gupta (Virginia Tech, USA; Western Michigan University, USA; Princeton University, USA) Symbolic execution is a powerful technique for systematic testing of sequential and multithreaded programs. However, its application is limited by the high cost of covering all feasible intra-thread paths and inter-thread interleavings. We propose a new assertion guided pruning framework that identifies executions guaranteed not to lead to an error and removes them during symbolic execution. By summarizing the reasons why previously explored executions cannot reach an error and using the information to prune redundant executions in the future, we can soundly reduce the search space. We also use static concurrent program slicing and heuristic minimization of symbolic constraints to further reduce the computational overhead. We have implemented our method in the Cloud9 symbolic execution tool and evaluated it on a large set of multithreaded C/C++ programs. Our experiments show that the new method can reduce the overall computational cost significantly. @InProceedings{ESEC/FSE15p854, author = {Shengjian Guo and Markus Kusano and Chao Wang and Zijiang Yang and Aarti Gupta}, title = {Assertion Guided Symbolic Execution of Multithreaded Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--865}, doi = {}, year = {2015}, } |
|
Gupta, Aarti |
ESEC/FSE '15: "Assertion Guided Symbolic ..."
Assertion Guided Symbolic Execution of Multithreaded Programs
Shengjian Guo, Markus Kusano, Chao Wang, Zijiang Yang, and Aarti Gupta (Virginia Tech, USA; Western Michigan University, USA; Princeton University, USA) Symbolic execution is a powerful technique for systematic testing of sequential and multithreaded programs. However, its application is limited by the high cost of covering all feasible intra-thread paths and inter-thread interleavings. We propose a new assertion guided pruning framework that identifies executions guaranteed not to lead to an error and removes them during symbolic execution. By summarizing the reasons why previously explored executions cannot reach an error and using the information to prune redundant executions in the future, we can soundly reduce the search space. We also use static concurrent program slicing and heuristic minimization of symbolic constraints to further reduce the computational overhead. We have implemented our method in the Cloud9 symbolic execution tool and evaluated it on a large set of multithreaded C/C++ programs. Our experiments show that the new method can reduce the overall computational cost significantly. @InProceedings{ESEC/FSE15p854, author = {Shengjian Guo and Markus Kusano and Chao Wang and Zijiang Yang and Aarti Gupta}, title = {Assertion Guided Symbolic Execution of Multithreaded Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--865}, doi = {}, year = {2015}, } |
|
Gyori, Alex |
ESEC/FSE '15: "Comparing and Combining Test-Suite ..."
Comparing and Combining Test-Suite Reduction and Regression Test Selection
August Shi, Tifany Yung, Alex Gyori, and Darko Marinov (University of Illinois at Urbana-Champaign, USA) Regression testing is widely used to check that changes made to software do not break existing functionality, but regression test suites grow, and running them fully can become costly. Researchers have proposed test-suite reduction and regression test selection as two approaches to reduce this cost by not running some of the tests from the test suite. However, previous research has not empirically evaluated how the two approaches compare to each other, and how well a combination of these approaches performs. We present the first extensive study that compares test-suite reduction and regression test selection approaches individually, and also evaluates a combination of the two approaches. We also propose a new criterion to measure the quality of tests with respect to software changes. Our experiments on 4,793 commits from 17 open-source projects show that regression test selection runs on average fewer tests (by 40.15pp) than test-suite reduction. However, test-suite reduction can have a high loss in fault-detection capability with respect to the changes, whereas a (safe) regression test selection has no loss. The experiments also show that a combination of the two approaches runs even fewer tests (on average 5.34pp) than regression test selection, but these tests still have a loss in fault-detection capability with respect to the changes. @InProceedings{ESEC/FSE15p237, author = {August Shi and Tifany Yung and Alex Gyori and Darko Marinov}, title = {Comparing and Combining Test-Suite Reduction and Regression Test Selection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {237--247}, doi = {}, year = {2015}, } |
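For readers unfamiliar with test-suite reduction, the classic greedy set-cover heuristic it builds on can be sketched as follows. This is a textbook illustration under a toy coverage matrix, not the paper's exact procedure:

```python
# Greedy test-suite reduction: repeatedly keep the test that covers the most
# not-yet-covered requirements until everything coverable is covered.

def reduce_suite(coverage):
    """coverage: dict mapping test name -> set of covered requirements."""
    remaining = set().union(*coverage.values())
    kept = []
    while remaining:
        best = max(coverage, key=lambda t: len(coverage[t] & remaining))
        if not coverage[best] & remaining:
            break  # nothing left that any test can cover
        kept.append(best)
        remaining -= coverage[best]
    return kept

suite = {"t1": {"r1", "r2"}, "t2": {"r2", "r3"}, "t3": {"r3"}}
print(reduce_suite(suite))  # ['t1', 't2']
```

Here `t3` is dropped because `t2` already covers `r3`; the reduced suite keeps coverage but, as the study shows, may lose fault-detection capability that a (safe) regression test selection would not.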
|
Haiduc, Sonia |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. @InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info |
|
Halfond, William G. J. |
ESEC/FSE '15: "Detecting Event Anomalies ..."
Detecting Event Anomalies in Event-Based Systems
Gholamreza Safi, Arman Shahbazian, William G. J. Halfond, and Nenad Medvidovic (University of Southern California, USA) Event-based interaction is an attractive paradigm because its use can lead to highly flexible and adaptable systems. One problem in this paradigm is that events are sent, received, and processed nondeterministically, due to the systems’ reliance on implicit invocation and implicit concurrency. This nondeterminism can lead to event anomalies, which occur when an event-based system receives multiple events that lead to the write of a shared field or memory location. Event anomalies can lead to unreliable, error-prone, and hard to debug behavior in an event-based system. To detect these anomalies, this paper presents a new static analysis technique, DEvA, for automatically detecting event anomalies. DEvA has been evaluated on a set of open-source event-based systems against a state-of-the-art technique for detecting data races in multithreaded systems, and a recent technique for solving a similar problem with event processing in Android applications. DEvA exhibited high precision with respect to manually constructed ground truths, and was able to locate event anomalies that had not been detected by the existing solutions. @InProceedings{ESEC/FSE15p25, author = {Gholamreza Safi and Arman Shahbazian and William G. J. Halfond and Nenad Medvidovic}, title = {Detecting Event Anomalies in Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {25--37}, doi = {}, year = {2015}, } Video Info ESEC/FSE '15: "String Analysis for Java and ..." String Analysis for Java and Android Applications Ding Li, Yingjun Lyu, Mian Wan, and William G. J. Halfond (University of Southern California, USA) String analysis is critical for many verification techniques. However, accurately modeling string variables is a challenging problem. 
Current approaches are generally customized for certain problem domains or have critical limitations in handling loops, providing context-sensitive inter-procedural analysis, and performing efficient analysis on complicated apps. To address these limitations, we propose a general framework, Violist, for string analysis that allows researchers to more flexibly choose how they will address each of these challenges by separating the representation and interpretation of string operations. In our evaluation, we show that our approach can achieve high accuracy on both Java and Android apps in a reasonable amount of time. We also compared our approach with a popular and widely used string analyzer and found that our approach has higher precision and shorter execution time while maintaining the same level of recall. @InProceedings{ESEC/FSE15p661, author = {Ding Li and Yingjun Lyu and Mian Wan and William G. J. Halfond}, title = {String Analysis for Java and Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--672}, doi = {}, year = {2015}, } |
|
Hammoudi, Mouna |
ESEC/FSE '15: "On the Use of Delta Debugging ..."
On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications
Mouna Hammoudi, Brian Burg, Gigon Bae, and Gregg Rothermel (University of Nebraska-Lincoln, USA; University of Washington, USA) Recording the sequence of events that lead to a failure of a web application can be an effective aid for debugging. Nevertheless, a recording of an event sequence may include many events that are not related to a failure, and this may render debugging more difficult. To address this problem, we have adapted Delta Debugging to function on recordings of web applications, in a manner that lets it identify and discard portions of those recordings that do not influence the occurrence of a failure. We present the results of three empirical studies that show that (1) recording reduction can achieve significant reductions in recording size and replay time on actual web applications obtained from developer forums, (2) reduced recordings do in fact help programmers locate faults significantly more efficiently than, and no less effectively than, non-reduced recordings, and (3) recording reduction produces even greater reductions on larger, more complex applications. @InProceedings{ESEC/FSE15p333, author = {Mouna Hammoudi and Brian Burg and Gigon Bae and Gregg Rothermel}, title = {On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {333--344}, doi = {}, year = {2015}, } Info |
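The Delta Debugging idea this paper adapts can be sketched with the textbook ddmin algorithm: shrink a recorded event sequence while a given predicate still reproduces the failure. This is a simplified, hypothetical illustration (toy events and failure predicate), not the authors' tool:

```python
# Simplified ddmin: drop chunks of the event sequence whenever the complement
# still triggers the failure, refining the chunk granularity as needed.

def ddmin(events, fails):
    n = 2
    while len(events) >= 2:
        chunk = len(events) // n
        subsets = [events[i:i + chunk] for i in range(0, len(events), chunk)]
        reduced = False
        for subset in subsets:
            complement = [e for e in events if e not in subset]
            if complement and fails(complement):
                events, n = complement, max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(events):
                break
            n = min(n * 2, len(events))
    return events

# Toy failure: the bug triggers whenever both 'click' and 'submit' occur.
fails = lambda seq: "click" in seq and "submit" in seq
print(ddmin(["load", "click", "scroll", "submit", "hover"], fails))  # ['click', 'submit']
```

The result is a 1-minimal recording: removing any single remaining event no longer reproduces the failure, which is exactly the property that makes reduced recordings easier to debug.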
|
Harris, William |
ESEC/FSE '15: "FlexJava: Language Support ..."
FlexJava: Language Support for Safe and Modular Approximate Programming
Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, and William Harris (Georgia Tech, USA) Energy efficiency is a primary constraint in modern systems. Approximate computing is a promising approach that trades quality of result for gains in efficiency and performance. State-of-the-art approximate programming models require extensive manual annotations on program data and operations to guarantee safe execution of approximate programs. The need for extensive manual annotations hinders the practical use of approximation techniques. This paper describes FlexJava, a small set of language extensions that significantly reduces the annotation effort, paving the way for practical approximate programming. These extensions enable programmers to annotate approximation-tolerant method outputs. The FlexJava compiler, which is equipped with an approximation safety analysis, automatically infers the operations and data that affect these outputs and selectively marks them approximable while giving safety guarantees. The automation and the language–compiler codesign relieve programmers from manually and explicitly annotating data declarations or operations as safe to approximate. FlexJava is designed to support safety, modularity, generality, and scalability in software development. We have implemented FlexJava annotations as a Java library and we demonstrate its practicality using a wide range of Java applications and by conducting a user study. Compared to EnerJ, a recent approximate programming system, FlexJava provides the same energy savings with significant reduction (from 2× to 17×) in the number of annotations. In our user study, programmers spend 6× to 12× less time annotating programs using FlexJava than when using EnerJ. 
@InProceedings{ESEC/FSE15p745, author = {Jongse Park and Hadi Esmaeilzadeh and Xin Zhang and Mayur Naik and William Harris}, title = {FlexJava: Language Support for Safe and Modular Approximate Programming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {745--757}, doi = {}, year = {2015}, } |
|
Hassan, Ahmed E. |
ESEC/FSE '15: "An Empirical Study of Goto ..."
An Empirical Study of Goto in C Code from GitHub Repositories
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study - (1) qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Heizmann, Matthias |
ESEC/FSE '15: "Witness Validation and Stepwise ..."
Witness Validation and Stepwise Testification across Software Verifiers
Dirk Beyer, Matthias Dangl, Daniel Dietsch, Matthias Heizmann, and Andreas Stahlbauer (University of Passau, Germany; University of Freiburg, Germany) It is commonly understood that a verification tool should provide a counterexample to witness a specification violation. Until recently, software verifiers dumped error witnesses in proprietary formats, which are often neither human- nor machine-readable, and an exchange of witnesses between different verifiers was impossible. To close this gap in software-verification technology, we have defined an exchange format for error witnesses that is easy to write and read by verification tools (for further processing, e.g., witness validation) and that is easy to convert into visualizations that conveniently let developers inspect an error path. To eliminate manual inspection of false alarms, we develop the notion of stepwise testification: in a first step, a verifier finds a problematic program path and, in addition to the verification result FALSE, constructs a witness for this path; in the next step, another verifier re-verifies that the witness indeed violates the specification. This process can have more than two steps, each reducing the state space around the error path, making it easier to validate the witness in a later step. An obvious application for testification is the setting where we have two verifiers: one that is efficient but imprecise and another one that is precise but expensive. We have implemented the technique of error-witness-driven program analysis in two state-of-the-art verification tools, CPAchecker and Ultimate Automizer, and show by experimental evaluation that the approach is applicable to a large set of verification tasks. 
@InProceedings{ESEC/FSE15p721, author = {Dirk Beyer and Matthias Dangl and Daniel Dietsch and Matthias Heizmann and Andreas Stahlbauer}, title = {Witness Validation and Stepwise Testification across Software Verifiers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {721--733}, doi = {}, year = {2015}, } Info |
|
Hermann, Ben |
ESEC/FSE '15: "Getting to Know You: Towards ..."
Getting to Know You: Towards a Capability Model for Java
Ben Hermann, Michael Reif, Michael Eichberg, and Mira Mezini (TU Darmstadt, Germany) Developing software from reusable libraries lets developers face a security dilemma: Either be efficient and reuse libraries as they are or inspect them, know about their resource usage, but possibly miss deadlines as reviews are a time-consuming process. In this paper, we propose a novel capability inference mechanism for libraries written in Java. It uses a coarse-grained capability model for system resources that can be presented to developers. We found that the capability inference agrees by 86.81% on expectations towards capabilities that can be derived from project documentation. Moreover, our approach can find capabilities that cannot be discovered using project documentation. It is thus a helpful tool for developers mitigating the aforementioned dilemma. @InProceedings{ESEC/FSE15p758, author = {Ben Hermann and Michael Reif and Michael Eichberg and Mira Mezini}, title = {Getting to Know You: Towards a Capability Model for Java}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {758--769}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "Hidden Truths in Dead Software ..." Hidden Truths in Dead Software Paths Michael Eichberg, Ben Hermann, Mira Mezini, and Leonid Glanz (TU Darmstadt, Germany) Approaches and techniques for statically finding a multitude of issues in source code have been developed in the past. A core property of these approaches is that they are usually targeted towards finding only a very specific kind of issue and that the effort to develop such an analysis is significant. This strictly limits the number of kinds of issues that can be detected. In this paper, we discuss a generic approach based on the detection of infeasible paths in code that can discover a wide range of code smells ranging from useless code that hinders comprehension to real bugs. 
Code issues are identified by calculating the difference between the control-flow graph that contains all technically possible edges and the corresponding graph recorded while performing a more precise analysis using abstract interpretation. We have evaluated the approach using the Java Development Kit as well as the Qualitas Corpus (a curated collection of over 100 Java Applications) and were able to find thousands of issues across a wide range of categories. @InProceedings{ESEC/FSE15p474, author = {Michael Eichberg and Ben Hermann and Mira Mezini and Leonid Glanz}, title = {Hidden Truths in Dead Software Paths}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {474--484}, doi = {}, year = {2015}, } Info |
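The CFG-difference idea in the abstract above can be illustrated in miniature. The graphs here are invented toy examples, not the authors' analysis: edges present in the naive, purely syntactic CFG but absent from the graph recorded under abstract interpretation are infeasible, and the code they reach is dead.

```python
# Toy CFGs as edge sets: the naive graph over-approximates control flow;
# the precise analysis found the else-branch to be unreachable.
naive_cfg = {
    ("entry", "check"), ("check", "then"), ("check", "else"),
    ("then", "exit"), ("else", "exit"),
}
precise_cfg = {("entry", "check"), ("check", "then"), ("then", "exit")}

dead_edges = naive_cfg - precise_cfg   # edges never taken under any input
print(sorted(dead_edges))  # [('check', 'else'), ('else', 'exit')]
```

In the real system the precise graph comes from abstract interpretation of the bytecode; the set difference is the same operation, just over much larger graphs.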
|
Heule, Stefan |
ESEC/FSE '15: "Mimic: Computing Models for ..."
Mimic: Computing Models for Opaque Code
Stefan Heule, Manu Sridharan, and Satish Chandra (Stanford University, USA; Samsung Research, USA) Opaque code, which is executable but whose source is unavailable or hard to process, can be problematic in a number of scenarios, such as program analysis. Manual construction of models is often used to handle opaque code, but this process is tedious and error-prone. (In this paper, we use model to mean a representation of a piece of code suitable for program analysis.) We present a novel technique for automatic generation of models for opaque code, based on program synthesis. The technique intercepts memory accesses from the opaque code to client objects, and uses this information to construct partial execution traces. Then, it performs a heuristic search inspired by Markov Chain Monte Carlo techniques to discover an executable code model whose behavior matches the opaque code. Native execution, parallelization, and a carefully-designed fitness function are leveraged to increase the effectiveness of the search. We have implemented our technique in a tool Mimic for discovering models of opaque JavaScript functions, and used Mimic to synthesize correct models for a variety of array-manipulating routines. @InProceedings{ESEC/FSE15p710, author = {Stefan Heule and Manu Sridharan and Satish Chandra}, title = {Mimic: Computing Models for Opaque Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {710--720}, doi = {}, year = {2015}, } Info |
|
Hoffmann, Henry |
ESEC/FSE '15: "Automated Multi-objective ..."
Automated Multi-objective Control for Self-Adaptive Software Design
Antonio Filieri, Henry Hoffmann, and Martina Maggio (University of Stuttgart, Germany; University of Chicago, USA; Lund University, Sweden) While software is becoming more complex every day, the requirements on its behavior are not getting any easier to satisfy. An application should offer a certain quality of service, adapt to the current environmental conditions and withstand runtime variations that were simply unpredictable during the design phase. To tackle this complexity, control theory has been proposed as a technique for managing software's dynamic behavior, obviating the need for human intervention. Control-theoretical solutions, however, are either tailored for the specific application or do not handle the complexity of multiple interacting components and multiple goals. In this paper, we develop an automated control synthesis methodology that takes, as input, the configurable software components (or knobs) and the goals to be achieved. Our approach automatically constructs a control system that manages the specified knobs and guarantees the goals are met. These claims are backed up by experimental studies on three different software applications, where we show how the proposed automated approach handles the complexity of multiple knobs and objectives. @InProceedings{ESEC/FSE15p13, author = {Antonio Filieri and Henry Hoffmann and Martina Maggio}, title = {Automated Multi-objective Control for Self-Adaptive Software Design}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {13--24}, doi = {}, year = {2015}, } |
|
Huang, Jeff |
ESEC/FSE '15: "Finding Schedule-Sensitive ..."
Finding Schedule-Sensitive Branches
Jeff Huang and Lawrence Rauchwerger (Texas A&M University, USA) This paper presents an automated, precise technique, TAME, for identifying schedule-sensitive branches (SSBs) in concurrent programs, i.e., branches whose decision may vary depending on the actual scheduling of concurrent threads. The technique consists of 1) tracing events at a fine-grained level; 2) deriving the constraints for each branch; and 3) invoking an SMT solver to find possible SSBs, by trying to solve the negated branch condition. To handle the infeasibly huge number of computations that would be generated by the fine-grained tracing, TAME leverages concolic execution and implements several sound approximations to delimit the number of traces to analyse, yet without sacrificing precision. In addition, TAME implements a novel distributed trace-partitioning approach that splits the analysis into smaller chunks. Evaluation on both popular benchmarks and real applications shows that TAME is effective in finding SSBs and has good scalability. TAME found a total of 34 SSBs, among which 17 are related to concurrency errors, and 9 are ad hoc synchronizations. @InProceedings{ESEC/FSE15p439, author = {Jeff Huang and Lawrence Rauchwerger}, title = {Finding Schedule-Sensitive Branches}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {439--449}, doi = {}, year = {2015}, } |
|
Jarke, Matthias |
ESEC/FSE '15: "Gamification for Enforcing ..."
Gamification for Enforcing Coding Conventions
Christian R. Prause and Matthias Jarke (DLR, Germany; RWTH Aachen University, Germany) Software is a knowledge intensive product, which can only evolve if there is effective and efficient information exchange between developers. Complying with coding conventions improves information exchange by improving the readability of source code. However, without some form of enforcement, compliance with coding conventions is limited. We look at the problem of information exchange in code and propose gamification as a way to motivate developers to invest in compliance. Our concept consists of a technical prototype and its integration into a Scrum environment. By means of two experiments with agile software teams and subsequent surveys, we show that gamification can effectively improve adherence to coding conventions. @InProceedings{ESEC/FSE15p649, author = {Christian R. Prause and Matthias Jarke}, title = {Gamification for Enforcing Coding Conventions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {649--660}, doi = {}, year = {2015}, } |
|
Jensen, Simon Holm |
ESEC/FSE '15: "MemInsight: Platform-Independent ..."
MemInsight: Platform-Independent Memory Debugging for JavaScript
Simon Holm Jensen, Manu Sridharan, Koushik Sen, and Satish Chandra (Snowflake Computing, USA; Samsung Research, USA; University of California at Berkeley, USA) JavaScript programs often suffer from memory issues that can either hurt performance or eventually cause memory exhaustion. While existing snapshot-based profiling tools can be helpful, the information provided is limited to the coarse granularity at which snapshots can be taken. We present MemInsight, a tool that provides detailed, time-varying analysis of the memory behavior of JavaScript applications, including web applications. MemInsight is platform independent and runs on unmodified JavaScript engines. It employs tuned source-code instrumentation to generate a trace of memory allocations and accesses, and it leverages modern browser features to track precise information for DOM (document object model) objects. It also computes exact object lifetimes without any garbage collector assistance, and exposes this information in an easily-consumable manner for further analysis. We describe several client analyses built into MemInsight, including detection of possible memory leaks and opportunities for stack allocation and object inlining. An experimental evaluation showed that with no modifications to the runtime, MemInsight was able to expose memory issues in several real-world applications. @InProceedings{ESEC/FSE15p345, author = {Simon Holm Jensen and Manu Sridharan and Koushik Sen and Satish Chandra}, title = {MemInsight: Platform-Independent Memory Debugging for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--356}, doi = {}, year = {2015}, } |
|
Jin, Guoliang |
ESEC/FSE '15: "What Change History Tells ..."
What Change History Tells Us about Thread Synchronization
Rui Gu, Guoliang Jin, Linhai Song, Linjie Zhu, and Shan Lu (Columbia University, USA; North Carolina State University, USA; University of Wisconsin-Madison, USA; University of Chicago, USA) Multi-threaded programs are pervasive, yet difficult to write. Missing proper synchronization leads to correctness bugs and over-synchronization leads to performance problems. To improve the correctness and efficiency of multi-threaded software, we need a better understanding of synchronization challenges faced by real-world developers. This paper studies the code repositories of open-source multi-threaded software projects to obtain a broad and in-depth view of how developers handle synchronizations. We first examine how critical sections are changed when software evolves by checking over 250,000 revisions of four representative open-source software projects. The findings help us answer questions like how often synchronization is an afterthought for developers; whether it is difficult for developers to decide critical section boundaries and lock variables; and what are real-world over-synchronization problems. We then conduct case studies to better understand (1) how critical sections are changed to solve performance problems (i.e. over-synchronization issues) and (2) how software changes lead to synchronization-related correctness problems (i.e. concurrency bugs). This in-depth study shows that tool support is needed to help developers tackle over-synchronization problems; it also shows that concurrency bug avoidance, detection, and testing can be improved through better awareness of code revision history. @InProceedings{ESEC/FSE15p426, author = {Rui Gu and Guoliang Jin and Linhai Song and Linjie Zhu and Shan Lu}, title = {What Change History Tells Us about Thread Synchronization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {426--438}, doi = {}, year = {2015}, } |
|
Jin, Long |
ESEC/FSE '15: "Hey, You Have Given Me Too ..."
Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker (University of California at San Diego, USA; Huazhong University of Science and Technology, China; NetApp, USA) Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software.
@InProceedings{ESEC/FSE15p307, author = {Tianyin Xu and Long Jin and Xuepeng Fan and Yuanyuan Zhou and Shankar Pasupathy and Rukma Talwadker}, title = {Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {307--319}, doi = {}, year = {2015}, } Video Info |
|
Jing, Xiaoyuan |
ESEC/FSE '15: "Heterogeneous Cross-Company ..."
Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning
Xiaoyuan Jing, Fei Wu, Xiwei Dong, Fumin Qi, and Baowen Xu (Wuhan University, China; Nanjing University of Posts and Telecommunications, China; Nanjing University, China) Cross-company defect prediction (CCDP) learns a prediction model by using training data from one or multiple projects of a source company and then applies the model to the target company data. Existing CCDP methods are based on the assumption that the data of source and target companies should have the same software metrics. However, for CCDP, the source and target company data is usually heterogeneous, namely the metrics used and the size of the metric set are different in the data of two companies. We call CCDP in this scenario heterogeneous CCDP (HCCDP). In this paper, we aim to provide an effective solution for HCCDP. We propose a unified metric representation (UMR) for the data of source and target companies. The UMR consists of three types of metrics, i.e., the common metrics of the source and target companies, source-company specific metrics and target-company specific metrics. To construct UMR for source company data, the target-company specific metrics are set as zeros, while for UMR of the target company data, the source-company specific metrics are set as zeros. Based on the unified metric representation, we for the first time introduce canonical correlation analysis (CCA), an effective transfer learning method, into CCDP to make the data distributions of source and target companies similar. Experiments on 14 public heterogeneous datasets from four companies indicate that: 1) for HCCDP with partially different metrics, our approach significantly outperforms state-of-the-art CCDP methods; 2) for HCCDP with totally different metrics, our approach obtains comparable prediction performances in contrast with within-project prediction results. The proposed approach is effective for HCCDP.
@InProceedings{ESEC/FSE15p496, author = {Xiaoyuan Jing and Fei Wu and Xiwei Dong and Fumin Qi and Baowen Xu}, title = {Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {496--507}, doi = {}, year = {2015}, } |
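The unified metric representation can be illustrated with a small sketch. The following Python fragment is our own toy reconstruction, not the authors' code; the metric names and values are invented for illustration. Each instance is mapped onto a fixed layout of common, source-specific, and target-specific metrics, with zeros filling the slots a company does not collect.

```python
# Toy sketch of the unified metric representation (UMR): a fixed vector
# layout [common | source-specific | target-specific], where metrics a
# company does not collect are zero-filled.

def to_umr(row, common, src_metrics, tgt_metrics):
    """Map a metric dict onto the UMR layout, zero-filling absent metrics."""
    return [row.get(m, 0.0) for m in common + src_metrics + tgt_metrics]

common = ["loc", "complexity"]   # metrics both companies collect (invented)
src_only = ["fan_in"]            # source-company specific (invented)
tgt_only = ["churn"]             # target-company specific (invented)

src_row = {"loc": 120, "complexity": 7, "fan_in": 3}
tgt_row = {"loc": 80, "complexity": 4, "churn": 12}

print(to_umr(src_row, common, src_only, tgt_only))  # [120, 7, 3, 0.0]
print(to_umr(tgt_row, common, src_only, tgt_only))  # [80, 4, 0.0, 12]
```

With both companies' data in one aligned vector space, a transfer method such as CCA can then be applied to bring the two distributions closer, as the abstract describes.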
|
Johnson, Brittany |
ESEC/FSE '15: "Questions Developers Ask While ..."
Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis
Justin Smith, Brittany Johnson, Emerson Murphy-Hill, Bill Chu, and Heather Richter Lipford (North Carolina State University, USA; University of North Carolina at Charlotte, USA) Security tools can help developers answer questions about potential vulnerabilities in their code. A better understanding of the types of questions asked by developers may help toolsmiths design more effective tools. In this paper, we describe how we collected and categorized these questions by conducting an exploratory study with novice and experienced software developers. We equipped them with Find Security Bugs, a security-oriented static analysis tool, and observed their interactions with security vulnerabilities in an open-source system that they had previously contributed to. We found that they asked questions not only about security vulnerabilities, associated attacks, and fixes, but also questions about the software itself, the social ecosystem that built the software, and related resources and tools. For example, when participants asked questions about the source of tainted data, their tools forced them to make imperfect tradeoffs between systematic and ad hoc program navigation strategies. @InProceedings{ESEC/FSE15p248, author = {Justin Smith and Brittany Johnson and Emerson Murphy-Hill and Bill Chu and Heather Richter Lipford}, title = {Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {248--259}, doi = {}, year = {2015}, } Info |
|
Jones, James A. |
ESEC/FSE '15: "Test Report Prioritization ..."
Test Report Prioritization to Assist Crowdsourced Testing
Yang Feng, Zhenyu Chen, James A. Jones, Chunrong Fang, and Baowen Xu (Nanjing University, China; University of California at Irvine, USA) In crowdsourced testing, users can be incentivized to perform testing tasks and report their results, and because crowdsourced workers are often paid per task, there is a financial incentive to complete tasks quickly rather than well. These reports of the crowdsourced testing tasks are called "test reports" and are composed of simple natural language and screenshots. Back at the software-development organization, developers must manually inspect the test reports to judge their value for revealing faults. Due to the nature of crowdsourced work, the test reports are often too numerous to comprehensively inspect and process. In order to help with this daunting task, we created the first technique of its kind, to the best of our knowledge, to prioritize test reports for manual inspection. Our technique utilizes two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these strategies form our DivRisk strategy to prioritize test reports in crowdsourced testing. Three industrial projects have been used to evaluate the effectiveness of test report prioritization methods. The results of the empirical study show that: (1) DivRisk can significantly outperform random prioritization; (2) DivRisk can approximate the best theoretical result for a real-world industrial mobile application. In addition, we provide some practical guidelines of test report prioritization for crowdsourced testing based on the empirical study and our experiences. @InProceedings{ESEC/FSE15p225, author = {Yang Feng and Zhenyu Chen and James A. Jones and Chunrong Fang and Baowen Xu}, title = {Test Report Prioritization to Assist Crowdsourced Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {225--236}, doi = {}, year = {2015}, } |
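The interplay of the diversity and risk strategies can be sketched roughly as follows. This Python fragment is a hedged approximation of the idea, not the paper's DivRisk implementation; the scoring weights, the Jaccard-based diversity measure, and the report data are our own assumptions. It greedily selects the report maximizing a weighted sum of (a) the minimum Jaccard distance to already-selected reports and (b) a given risk score.

```python
# Hedged sketch of diversity+risk prioritization: greedily pick the
# report with the best combined score, where diversity is the minimum
# Jaccard distance to the reports selected so far.

def jaccard_distance(a, b):
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def prioritize(reports, risk, alpha=0.5):
    """reports: {id: keyword set}; risk: {id: score in [0, 1]}."""
    remaining, order = dict(reports), []
    while remaining:
        def score(rid):
            div = min((jaccard_distance(remaining[rid], reports[s])
                       for s in order), default=1.0)
            return alpha * div + (1 - alpha) * risk[rid]
        best = max(remaining, key=score)
        order.append(best)
        del remaining[best]
    return order

# r1 and r2 describe the same failure; r3 is different but low-risk.
reports = {"r1": {"crash", "login"}, "r2": {"crash", "login"},
           "r3": {"ui", "layout"}}
risk = {"r1": 0.9, "r2": 0.8, "r3": 0.3}
print(prioritize(reports, risk))  # -> ['r1', 'r3', 'r2']
```

Note how the near-duplicate r2 is demoted below the dissimilar r3 despite its higher risk score, which is exactly the duplicate-avoidance behavior the diversity strategy targets.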
|
Kaiser, Gail |
ESEC/FSE '15: "Efficient Dependency Detection ..."
Efficient Dependency Detection for Safe Java Test Acceleration
Jonathan Bell, Gail Kaiser, Eric Melski, and Mohan Dattatreya (Columbia University, USA; Electric Cloud, USA) Slow builds remain a plague for software developers. The frequency with which code can be built (compiled, tested and packaged) directly impacts the productivity of developers: longer build times mean a longer wait before determining if a change to the application being built was successful. We have discovered that in the case of some languages, such as Java, the majority of build time is spent running tests, where dependencies between individual tests are complicated to discover, making many existing test acceleration techniques unsound to deploy in practice. Without knowledge of which tests are dependent on others, we cannot safely parallelize the execution of the tests, nor can we perform incremental testing (i.e., execute only a subset of an application's tests for each build). The previous techniques for detecting these dependencies did not scale to large test suites: given a test suite that normally ran in two hours, the best-case running scenario for the previous tool would have taken over 422 CPU days to find dependencies between all test methods (and would not soundly find all dependencies) — on the same project the exhaustive technique (to find all dependencies) would have taken over 10^300 years. We present a novel approach to detecting all dependencies between test cases in large projects that can enable safe exploitation of parallelism and test selection with a modest analysis cost. @InProceedings{ESEC/FSE15p770, author = {Jonathan Bell and Gail Kaiser and Eric Melski and Mohan Dattatreya}, title = {Efficient Dependency Detection for Safe Java Test Acceleration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {770--781}, doi = {}, year = {2015}, } |
|
Kamei, Yasutaka |
ESEC/FSE '15: "An Empirical Study of Goto ..."
An Empirical Study of Goto in C Code from GitHub Repositories
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study - (1) qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Kanade, Aditya |
ESEC/FSE '15: "P3: Partitioned Path Profiling ..."
P3: Partitioned Path Profiling
Mohammed Afraz, Diptikalyan Saha, and Aditya Kanade (Indian Institute of Science, India; IBM Research, India) An acyclic path profile is an abstraction of dynamic control flow paths of procedures and has been found to be useful in a wide spectrum of activities. Unfortunately, the runtime overhead of obtaining such a profile can be high, limiting its use in practice. In this paper, we present partitioned path profiling (P3) which runs K copies of the program in parallel, each with the same input but on a separate core, and collects the profile only for a subset of intra-procedural paths in each copy, thereby distributing the overhead of profiling. P3 identifies “profitable” procedures and assigns disjoint subsets of paths of a profitable procedure to different copies for profiling. To obtain exact execution frequencies of a subset of paths, we design a new algorithm, called PSPP. All paths of an unprofitable procedure are assigned to the same copy. P3 uses the classic Ball-Larus algorithm for profiling unprofitable procedures. Further, P3 attempts to evenly distribute the profiling overhead across the copies. To the best of our knowledge, P3 is the first algorithm for parallel path profiling. We have applied P3 to profile several programs in the SPEC 2006 benchmark. Compared to sequential profiling, P3 substantially reduced the runtime overhead on these programs averaged across all benchmarks. The reduction was 23%, 43% and 56% on average for 2, 4 and 8 cores respectively. P3 also performed better than a coarse-grained approach that treats all procedures as unprofitable and distributes them across available cores. For 2 cores, the profiling overhead of P3 was on average 5% less compared to the coarse-grained approach across these programs. For 4 and 8 cores, it was respectively 18% and 25% less.
@InProceedings{ESEC/FSE15p485, author = {Mohammed Afraz and Diptikalyan Saha and Aditya Kanade}, title = {P3: Partitioned Path Profiling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {485--495}, doi = {}, year = {2015}, } |
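Since P3 builds on the classic Ball-Larus algorithm, a compact sketch of that edge numbering may help. This Python fragment is an illustrative reimplementation on a small DAG, not the paper's code, and the node names are invented: NumPaths(v) counts acyclic paths from v to the exit, and the increment placed on each outgoing edge equals the number of paths through that node's earlier successors, so every entry-to-exit path sums to a distinct ID in [0, NumPaths(entry)).

```python
# Illustrative sketch of Ball-Larus edge numbering on a DAG: each
# acyclic entry-to-exit path accumulates a unique integer ID by summing
# the increments on its edges.

def ball_larus(succ, exit_node):
    num_paths, incr = {exit_node: 1}, {}

    def count(v):
        if v not in num_paths:
            total = 0
            for w in succ.get(v, []):
                incr[(v, w)] = total  # paths via earlier successors of v
                total += count(w)
            num_paths[v] = total
        return num_paths[v]

    for v in succ:
        count(v)
    return num_paths, incr

# Diamond CFG: entry -> {A, B} -> exit (two acyclic paths)
succ = {"entry": ["A", "B"], "A": ["exit"], "B": ["exit"]}
num_paths, incr = ball_larus(succ, "exit")
print(num_paths["entry"])    # 2 paths in total
print(incr[("entry", "B")])  # taking B adds 1, so the paths get IDs 0 and 1
```

P3's PSPP variant then profiles only a chosen subset of these numbered paths in each parallel copy; the partitioning itself is beyond this sketch.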
|
Kargén, Ulf |
ESEC/FSE '15: "Turning Programs against Each ..."
Turning Programs against Each Other: High Coverage Fuzz-Testing using Binary-Code Mutation and Dynamic Slicing
Ulf Kargén and Nahid Shahmehri (Linköping University, Sweden) Mutation-based fuzzing is a popular and widely employed black-box testing technique for finding security and robustness bugs in software. It owes much of its success to its simplicity; a well-formed seed input is mutated, e.g. through random bit-flipping, to produce test inputs. While reducing the need for human effort, and enabling security testing even of closed-source programs with undocumented input formats, the simplicity of mutation-based fuzzing comes at the cost of poor code coverage. Often millions of iterations are needed, and the results are highly dependent on configuration parameters and the choice of seed inputs. In this paper we propose a novel method for automated generation of high-coverage test cases for robustness testing. Our method is based on the observation that, even for closed-source programs with proprietary input formats, an implementation that can generate well-formed inputs to the program is typically available. By systematically mutating the program code of such generating programs, we leverage information about the input format encoded in the generating program to produce high-coverage test inputs, capable of reaching deep states in the program under test. Our method works entirely at the machine-code level, enabling use-cases similar to traditional black-box fuzzing. We have implemented the method in our tool MutaGen, and evaluated it on 7 popular Linux programs. We found that, for most programs, our method improves code coverage by one order of magnitude or more, compared to two well-known mutation-based fuzzers. We also found a total of 8 unique bugs. @InProceedings{ESEC/FSE15p782, author = {Ulf Kargén and Nahid Shahmehri}, title = {Turning Programs against Each Other: High Coverage Fuzz-Testing using Binary-Code Mutation and Dynamic Slicing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {782--792}, doi = {}, year = {2015}, } |
|
Karim, Rezwana |
ESEC/FSE '15: "Responsive Designs in a Snap ..."
Responsive Designs in a Snap
Nishant Sinha and Rezwana Karim (IBM Research, India; Rutgers University, USA) With the massive adoption of mobile devices with different form-factors, UI designers face the challenge of designing responsive UIs which are visually appealing across a wide range of devices. Designing responsive UIs requires a deep knowledge of HTML/CSS as well as responsive patterns - juggling through various design configurations and re-designing for multiple devices is laborious and time-consuming. We present DECOR, a recommendation tool for creating multi-device responsive UIs. Given an initial UI design, user-specified design constraints and a list of devices, DECOR provides ranked, device-specific recommendations to the designer for approval. Design space exploration involves a combinatorial explosion: we formulate it as a design repair problem and devise several design space pruning techniques to enable efficient repair. An evaluation over real-life designs shows that DECOR is able to compute the desired recommendations, involving a variety of responsive design patterns, in less than a minute. @InProceedings{ESEC/FSE15p544, author = {Nishant Sinha and Rezwana Karim}, title = {Responsive Designs in a Snap}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {544--554}, doi = {}, year = {2015}, } |
|
Kästner, Christian |
ESEC/FSE '15: "Performance-Influence Models ..."
Performance-Influence Models for Highly Configurable Systems
Norbert Siegmund, Alexander Grebhahn, Sven Apel, and Christian Kästner (University of Passau, Germany; Carnegie Mellon University, USA) Almost every complex software system today is configurable. While configurability has many benefits, it challenges performance prediction, optimization, and debugging. Often, the influences of individual configuration options on performance are unknown. Worse, configuration options may interact, giving rise to a configuration space of possibly exponential size. Addressing this challenge, we propose an approach that derives a performance-influence model for a given configurable system, describing all relevant influences of configuration options and their interactions. Our approach combines machine-learning and sampling heuristics in a novel way. It improves over standard techniques in that it (1) represents influences of options and their interactions explicitly (which eases debugging), (2) smoothly integrates binary and numeric configuration options for the first time, (3) incorporates domain knowledge, if available (which eases learning and increases accuracy), (4) considers complex constraints among options, and (5) systematically reduces the solution space to a tractable size. A series of experiments demonstrates the feasibility of our approach in terms of the accuracy of the models learned as well as the accuracy of the performance predictions one can make with them. @InProceedings{ESEC/FSE15p284, author = {Norbert Siegmund and Alexander Grebhahn and Sven Apel and Christian Kästner}, title = {Performance-Influence Models for Highly Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {284--294}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "Cross-Language Program Slicing ..." Cross-Language Program Slicing for Dynamic Web Applications Hung Viet Nguyen, Christian Kästner, and Tien N. Nguyen (Iowa State University, USA; Carnegie Mellon University, USA) During software maintenance, program slicing is a useful technique to assist developers in understanding the impact of their changes. While different program-slicing techniques have been proposed for traditional software systems, program slicing for dynamic web applications is challenging since the client-side code is generated from the server-side code and data entities are referenced across different languages and are often embedded in string literals in the server-side program. To address those challenges, we introduce WebSlice, an approach to compute program slices across different languages for web applications. We first identify data-flow dependencies among data entities for PHP code based on symbolic execution. We also compute SQL queries and a conditional DOM that represents client-code variations and construct the data flows for embedded languages: SQL, HTML, and JavaScript. Next, we connect the data flows across different languages and across PHP pages. Finally, we compute a program slice for a given entity based on the established data flows. Running WebSlice on five real-world, open-source PHP systems, we found that, out of 40,670 program slices, 10% cross languages, 38% cross files, and 13% cross string fragments, demonstrating the potential benefit of tool support for cross-language program slicing in dynamic web applications. @InProceedings{ESEC/FSE15p369, author = {Hung Viet Nguyen and Christian Kästner and Tien N. Nguyen}, title = {Cross-Language Program Slicing for Dynamic Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {369--380}, doi = {}, year = {2015}, } |
|
Kevic, Katja |
ESEC/FSE '15: "Tracing Software Developers' ..."
Tracing Software Developers' Eyes and Interactions for Change Tasks
Katja Kevic, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz (University of Zurich, Switzerland; Youngstown State University, USA; ABB Research, USA) What are software developers doing during a change task? While an answer to this question opens countless opportunities to support developers in their work, only little is known about developers' detailed navigation behavior for realistic change tasks. Most empirical studies on developers performing change tasks are limited to very small code snippets or are limited by the granularity or the detail of the data collected for the study. In our research, we try to overcome these limitations by combining user interaction monitoring with very fine granular eye-tracking data that is automatically linked to the underlying source code entities in the IDE. In a study with 12 professional and 10 student developers working on three change tasks from an open source system, we used our approach to investigate the detailed navigation of developers for realistic change tasks. The results of our study show, amongst others, that the eye tracking data does indeed capture different aspects than user interaction data and that developers focus on only small parts of methods that are often related by data flow. We discuss our findings and their implications for better developer tool support. @InProceedings{ESEC/FSE15p202, author = {Katja Kevic and Braden M. Walters and Timothy R. Shaffer and Bonita Sharif and David C. Shepherd and Thomas Fritz}, title = {Tracing Software Developers' Eyes and Interactions for Change Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--213}, doi = {}, year = {2015}, } Info |
|
Kim, Sunghun |
ESEC/FSE '15: "Heterogeneous Defect Prediction ..."
Heterogeneous Defect Prediction
Jaechang Nam and Sunghun Kim (Hong Kong University of Science and Technology, China) Software defect prediction is one of the most active research areas in software engineering. We can build a prediction model with defect data collected from a software project and predict defects in the same project, i.e., within-project defect prediction (WPDP). Researchers also proposed cross-project defect prediction (CPDP) to predict defects for new projects lacking defect data by using prediction models built from other projects. Recent studies have shown CPDP to be feasible. However, CPDP requires projects that have the same metric set, meaning the metric sets should be identical between projects. As a result, current techniques for CPDP are difficult to apply across projects with heterogeneous metric sets. To address the limitation, we propose heterogeneous defect prediction (HDP) to predict defects across projects with heterogeneous metric sets. Our HDP approach conducts metric selection and metric matching to build a prediction model between projects with heterogeneous metric sets. Our empirical study on 28 subjects shows that about 68% of predictions using our approach outperform or are comparable to WPDP with statistical significance. @InProceedings{ESEC/FSE15p508, author = {Jaechang Nam and Sunghun Kim}, title = {Heterogeneous Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {508--519}, doi = {}, year = {2015}, } ESEC/FSE '15: "Crowd Debugging ..." Crowd Debugging Fuxiang Chen and Sunghun Kim (Hong Kong University of Science and Technology, China) Research shows that, in general, many people turn to QA sites to solicit answers to their problems. We observe in Stack Overflow a huge number of recurring questions, 1,632,590, despite mechanisms having been put into place to prevent these recurring questions. Recurring questions imply developers are facing similar issues in their source code. However, limitations exist in the QA sites. 
Developers need to visit them frequently and/or should be familiar with all the content to take advantage of the crowd's knowledge. Due to the large and rapid growth of QA data, it is difficult, if not impossible for developers to catch up. To address these limitations, we propose mining the QA site, Stack Overflow, to leverage the huge mass of crowd knowledge to help developers debug their code. Our approach reveals 189 warnings and 171 (90.5%) of them are confirmed by developers from eight high-quality and well-maintained projects. Developers appreciate these findings because the crowd provides solutions and comprehensive explanations to the issues. We compared the confirmed bugs with three popular static analysis tools (FindBugs, JLint and PMD). Of the 171 bugs identified by our approach, only FindBugs detected six of them whereas JLint and PMD detected none. @InProceedings{ESEC/FSE15p320, author = {Fuxiang Chen and Sunghun Kim}, title = {Crowd Debugging}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {320--332}, doi = {}, year = {2015}, } |
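The core of the Heterogeneous Defect Prediction entry above is metric matching: pairing a source project's metrics with target-project metrics whose value distributions look alike, so that a model trained on one metric set can score a project with a different one. A minimal sketch of that idea (the similarity measure and the 0.8 cutoff here are hypothetical stand-ins for the statistical matching analysis the paper describes):

```python
import statistics

def similarity(xs, ys):
    """Crude distribution similarity in [0, 1]: min-max scale both samples,
    then compare their means and standard deviations."""
    def scale(vs):
        lo, hi = min(vs), max(vs)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in vs]
    xs, ys = scale(xs), scale(ys)
    gap = (abs(statistics.mean(xs) - statistics.mean(ys))
           + abs(statistics.pstdev(xs) - statistics.pstdev(ys)))
    return max(0.0, 1.0 - gap)

def match_metrics(source, target, cutoff=0.8):
    """Greedily pair each source metric with the most similar unused target
    metric; pairs scoring below the cutoff are discarded."""
    pairs = sorted(((similarity(sv, tv), sn, tn)
                    for sn, sv in source.items()
                    for tn, tv in target.items()), reverse=True)
    matched, used_src, used_tgt = [], set(), set()
    for score, sn, tn in pairs:
        if score >= cutoff and sn not in used_src and tn not in used_tgt:
            matched.append((sn, tn, round(score, 2)))
            used_src.add(sn)
            used_tgt.add(tn)
    return matched
```

The point of the sketch is that matching is driven purely by value distributions, never by metric names or raw ranges.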
|
Kruegel, Christopher |
ESEC/FSE '15: "CLAPP: Characterizing Loops ..."
CLAPP: Characterizing Loops in Android Applications
Yanick Fratantonio, Aravind Machiry, Antonio Bianchi, Christopher Kruegel, and Giovanni Vigna (University of California at Santa Barbara, USA) When performing program analysis, loops are one of the most important aspects that need to be taken into account. In the past, many approaches have been proposed to analyze loops to perform different tasks, ranging from compiler optimizations to Worst-Case Execution Time (WCET) analysis. While these approaches are powerful, they focus on tackling very specific categories of loops and known loop patterns, such as the ones for which the number of iterations can be statically determined. In this work, we developed a static analysis framework to characterize and analyze generic loops, without relying on techniques based on pattern matching. For this work, we focus on the Android platform, and we implemented a prototype, called CLAPP, that we used to perform the first large-scale empirical study of the usage of loops in Android applications. In particular, we used our tool to analyze a total of 4,110,510 loops found in 11,823 Android applications. As part of our evaluation, we provide the detailed results of our empirical study, we show how our analysis was able to determine that the execution of 63.28% of the loops is bounded, and we discuss several interesting insights related to the performance issues and security aspects associated with loops. @InProceedings{ESEC/FSE15p687, author = {Yanick Fratantonio and Aravind Machiry and Antonio Bianchi and Christopher Kruegel and Giovanni Vigna}, title = {CLAPP: Characterizing Loops in Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {687--697}, doi = {}, year = {2015}, } Info |
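The CLAPP entry above reports that 63.28% of analyzed loops could be shown bounded. As a toy illustration of what "bounded" means in the simplest case, here is a hypothetical check for a counted loop with a single integer induction variable and constant step (CLAPP itself analyzes Dalvik bytecode and handles far more general loops):

```python
def loop_is_bounded(init, op, bound, step):
    """Decide whether `for (i = init; i <op> bound; i += step)` terminates.
    Hypothetical simplification: one induction variable, constant step."""
    if step == 0:
        return False  # the variable never moves, so the bound is never reached
    if op == "<":
        return init >= bound or step > 0  # zero iterations, or grows upward
    if op == ">":
        return init <= bound or step < 0  # zero iterations, or shrinks downward
    if op == "!=":
        # terminates only if the step moves toward the bound and
        # divides the remaining distance exactly
        dist = bound - init
        return dist == 0 or (dist * step > 0 and dist % step == 0)
    return False  # unknown comparison: conservatively report unbounded
```

A `!=` exit with a step that overshoots the bound, for instance, is the classic unbounded case this check flags.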
|
Kulesza, Uirá |
ESEC/FSE '15: "Summarizing and Measuring ..."
Summarizing and Measuring Development Activity
Christoph Treude, Fernando Figueira Filho, and Uirá Kulesza (Federal University of Rio Grande do Norte, Brazil) Software developers pursue a wide range of activities as part of their work, and making sense of what they did in a given time frame is far from trivial as evidenced by the large number of awareness and coordination tools that have been developed in recent years. To inform tool design for making sense of the information available about a developer's activity, we conducted an empirical study with 156 GitHub users to investigate what information they would expect in a summary of development activity, how they would measure development activity, and what factors influence how such activity can be condensed into textual summaries or numbers. We found that unexpected events are as important as expected events in summaries of what a developer did, and that many developers do not believe in measuring development activity. Among the factors that influence summarization and measurement of development activity, we identified development experience and programming languages. @InProceedings{ESEC/FSE15p625, author = {Christoph Treude and Fernando Figueira Filho and Uirá Kulesza}, title = {Summarizing and Measuring Development Activity}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {625--636}, doi = {}, year = {2015}, } Info |
|
Kusano, Markus |
ESEC/FSE '15: "Assertion Guided Symbolic ..."
Assertion Guided Symbolic Execution of Multithreaded Programs
Shengjian Guo, Markus Kusano, Chao Wang, Zijiang Yang, and Aarti Gupta (Virginia Tech, USA; Western Michigan University, USA; Princeton University, USA) Symbolic execution is a powerful technique for systematic testing of sequential and multithreaded programs. However, its application is limited by the high cost of covering all feasible intra-thread paths and inter-thread interleavings. We propose a new assertion guided pruning framework that identifies executions guaranteed not to lead to an error and removes them during symbolic execution. By summarizing the reasons why previously explored executions cannot reach an error and using the information to prune redundant executions in the future, we can soundly reduce the search space. We also use static concurrent program slicing and heuristic minimization of symbolic constraints to further reduce the computational overhead. We have implemented our method in the Cloud9 symbolic execution tool and evaluated it on a large set of multithreaded C/C++ programs. Our experiments show that the new method can reduce the overall computational cost significantly. @InProceedings{ESEC/FSE15p854, author = {Shengjian Guo and Markus Kusano and Chao Wang and Zijiang Yang and Aarti Gupta}, title = {Assertion Guided Symbolic Execution of Multithreaded Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--865}, doi = {}, year = {2015}, } |
|
Le, Tien-Duy B. |
ESEC/FSE '15: "Information Retrieval and ..."
Information Retrieval and Spectrum Based Bug Localization: Better Together
Tien-Duy B. Le, Richard J. Oentaryo, and David Lo (Singapore Management University, Singapore) Debugging often takes much effort and resources. To help developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been proposed. IR-based techniques process textual information in bug reports, while spectrum-based techniques process program spectra (i.e., a record of which program elements are executed for each test case). Both eventually generate a ranked list of program elements that are likely to contain the bug. However, these techniques only consider one source of information, either bug reports or program spectra, which is not optimal. To deal with the limitation of existing techniques, in this work, we propose a new multi-modal technique that considers both bug reports and program spectra to localize bugs. Our approach adaptively creates a bug-specific model to map a particular bug to its possible location, and introduces a novel idea of suspicious words that are highly associated with a bug. We evaluate our approach on 157 real bugs from four software systems, and compare it with a state-of-the-art IR-based bug localization method, a state-of-the-art spectrum-based bug localization method, and three state-of-the-art multi-modal feature location methods that are adapted for bug localization. Experiments show that our approach can outperform the baselines by at least 47.62%, 31.48%, 27.78%, and 28.80% in terms of number of bugs successfully localized when a developer inspects 1, 5, and 10 program elements (i.e., Top 1, Top 5, and Top 10), and Mean Average Precision (MAP) respectively. @InProceedings{ESEC/FSE15p579, author = {Tien-Duy B. Le and Richard J. Oentaryo and David Lo}, title = {Information Retrieval and Spectrum Based Bug Localization: Better Together}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {579--590}, doi = {}, year = {2015}, } |
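The entry above fuses two bug-localization modalities into one ranking. A minimal sketch of the idea, combining a classic spectrum score (Tarantula) with an IR similarity score through a fixed weight (hypothetical: the paper learns a bug-specific model rather than using a fixed alpha):

```python
def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """Spectrum-based suspiciousness of one program element: the share of
    failing runs covering it, normalized against the share of passing runs."""
    if failed_cov == 0:
        return 0.0
    fail_rate = failed_cov / total_failed
    pass_rate = passed_cov / total_passed if total_passed else 0.0
    return fail_rate / (fail_rate + pass_rate)

def combined_ranking(spectrum_scores, ir_scores, alpha=0.5):
    """Rank program elements by a weighted sum of the two modalities:
    spectrum suspiciousness and IR similarity to the bug report."""
    fused = {elem: alpha * s + (1 - alpha) * ir_scores.get(elem, 0.0)
             for elem, s in spectrum_scores.items()}
    return sorted(fused, key=fused.get, reverse=True)
```

An element that is textually close to the bug report can overtake a slightly more suspicious element from the spectrum side, which is exactly the complementarity the paper exploits.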
|
Leitner, Philipp |
ESEC/FSE '15: "The Making of Cloud Applications: ..."
The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud
Jürgen Cito, Philipp Leitner, Thomas Fritz, and Harald C. Gall (University of Zurich, Switzerland) Cloud computing is gaining more and more traction as a deployment and provisioning model for software. While a large body of research already covers how to optimally operate a cloud system, we still lack insights into how professional software engineers actually use clouds, and how the cloud impacts development practices. This paper reports on the first systematic study on how software developers build applications for the cloud. We conducted a mixed-method study, consisting of qualitative interviews of 25 professional developers and a quantitative survey with 294 responses. Our results show that adopting the cloud has a profound impact throughout the software development process, as well as on how developers utilize tools and data in their daily work. Among other things, we found that (1) developers need better means to anticipate runtime problems and rigorously define metrics for improved fault localization and (2) the cloud offers an abundance of operational data; however, developers still often rely on their experience and intuition rather than utilizing metrics. From our findings, we extracted a set of guidelines for cloud development and identified challenges for researchers and tool vendors. @InProceedings{ESEC/FSE15p393, author = {Jürgen Cito and Philipp Leitner and Thomas Fritz and Harald C. Gall}, title = {The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {393--403}, doi = {}, year = {2015}, } |
|
Li, Ding |
ESEC/FSE '15: "String Analysis for Java and ..."
String Analysis for Java and Android Applications
Ding Li, Yingjun Lyu, Mian Wan, and William G. J. Halfond (University of Southern California, USA) String analysis is critical for many verification techniques. However, accurately modeling string variables is a challenging problem. Current approaches are generally customized for certain problem domains or have critical limitations in handling loops, providing context-sensitive inter-procedural analysis, and performing efficient analysis on complicated apps. To address these limitations, we propose a general framework, Violist, for string analysis that allows researchers to more flexibly choose how they will address each of these challenges by separating the representation and interpretation of string operations. In our evaluation, we show that our approach can achieve high accuracy on both Java and Android apps in a reasonable amount of time. We also compared our approach with a popular and widely used string analyzer and found that our approach has higher precision and shorter execution time while maintaining the same level of recall. @InProceedings{ESEC/FSE15p661, author = {Ding Li and Yingjun Lyu and Mian Wan and William G. J. Halfond}, title = {String Analysis for Java and Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--672}, doi = {}, year = {2015}, } |
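The framework in the entry above rests on separating how string operations are represented from how they are interpreted. A tiny hypothetical illustration of that split, with string expressions as trees and evaluation as a separate, swappable interpreter (the node shapes are invented for illustration):

```python
def interpret(node, env):
    """Evaluate a string-expression tree over a concrete environment.
    The tree (representation) stays fixed; this function (one possible
    interpretation) could be swapped for an abstract interpreter instead."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return env.get(node[1], "")  # unknown variables default to empty
    if kind == "concat":
        return "".join(interpret(child, env) for child in node[1:])
    raise ValueError(f"unknown node kind: {kind}")
```

For example, a query string built by concatenation evaluates differently under different environments while the representation never changes, which is what lets one representation serve several analyses.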
|
Lin, Shang-Wei |
ESEC/FSE '15: "TLV: Abstraction through Testing, ..."
TLV: Abstraction through Testing, Learning, and Validation
Jun Sun, Hao Xiao, Yang Liu, Shang-Wei Lin, and Shengchao Qin (Singapore University of Technology and Design, Singapore; Nanyang Technological University, Singapore; Teesside University, UK; Shenzhen University, China) A (Java) class provides a service to its clients (i.e., programs which use the class). The service must satisfy certain specifications. Different specifications might be expected at different levels of abstraction depending on the client's objective. In order to effectively contrast the class against its specifications, whether manually or automatically, one essential step is to automatically construct an abstraction of the given class at a proper level of abstraction. The abstraction should be correct (i.e., over-approximating) and accurate (i.e., with few spurious traces). We present an automatic approach, which combines testing, learning, and validation, to constructing an abstraction. Our approach is designed such that a large part of the abstraction is generated based on testing and learning so as to minimize the use of heavy-weight techniques like symbolic execution. The abstraction is generated through a process of abstraction/refinement, with no user input, and converges to a specific level of abstraction depending on the usage context. The generated abstraction is guaranteed to be correct and accurate. We have implemented the proposed approach in a toolkit named TLV and evaluated TLV with a number of benchmark programs as well as three real-world ones. The results show that TLV generates abstraction for program analysis and verification more efficiently. @InProceedings{ESEC/FSE15p698, author = {Jun Sun and Hao Xiao and Yang Liu and Shang-Wei Lin and Shengchao Qin}, title = {TLV: Abstraction through Testing, Learning, and Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {698--709}, doi = {}, year = {2015}, } Info |
|
Lin, Yun |
ESEC/FSE '15: "Clone-Based and Interactive ..."
Clone-Based and Interactive Recommendation for Modifying Pasted Code
Yun Lin, Xin Peng, Zhenchang Xing, Diwen Zheng, and Wenyun Zhao (Fudan University, China; Nanyang Technological University, Singapore) Developers often need to modify pasted code when programming with copy-and-paste practice. Some modifications on pasted code could involve lots of editing efforts, and any missing or wrong edit could incur bugs. In this paper, we propose a clone-based and interactive approach to recommending where and how to modify the pasted code. In our approach, we regard clones of the pasted code as the results of historical copy-and-paste operations and their differences as historical modifications on the same piece of code. Our approach first retrieves clones of the pasted code from a clone repository and detects syntactically complete differences among them. Then our approach transfers each clone difference into a modification slot on the pasted code, suggests options for each slot, and further mines modifying regulations from the clone differences. Based on the mined modifying regulations, our approach dynamically updates the suggested options and their ranking in each slot according to developer's modifications on the pasted code. We implement a proof-of-concept tool CCDemon based on our approach and evaluate its effectiveness based on code clones detected from five open source projects. The results show that our approach can identify 96.9% of the to-be-modified positions in pasted code and suggest 75.0% of the required modifications. Our human study further confirms that CCDemon can help developers to accomplish their modifications of pasted code more efficiently. @InProceedings{ESEC/FSE15p520, author = {Yun Lin and Xin Peng and Zhenchang Xing and Diwen Zheng and Wenyun Zhao}, title = {Clone-Based and Interactive Recommendation for Modifying Pasted Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {520--531}, doi = {}, year = {2015}, } |
|
Lin, Zhiqiang |
ESEC/FSE '15: "Automatically Deriving Pointer ..."
Automatically Deriving Pointer Reference Expressions from Binary Code for Memory Dump Analysis
Yangchun Fu, Zhiqiang Lin, and David Brumley (University of Texas at Dallas, USA; Carnegie Mellon University, USA) Given a crash dump or a kernel memory snapshot, it is often desirable to have a capability that can traverse its pointers to locate the root cause of the crash, or check their integrity to detect the control flow hijacks. To achieve this, one key challenge lies in how to locate where the pointers are. While locating a pointer usually requires the data structure knowledge of the corresponding program, an important advance made by this work is that we show a technique of extracting address-independent data reference expressions for pointers through dynamic binary analysis. This novel pointer reference expression encodes how a pointer is accessed through the combination of a base address (usually a global variable) with certain offset and further pointer dereferences. We have applied our techniques to OS kernels, and our experimental results with a number of real world kernel malware show that we can correctly identify the hijacked kernel function pointers by locating them using the extracted pointer reference expressions when only given a memory snapshot. @InProceedings{ESEC/FSE15p614, author = {Yangchun Fu and Zhiqiang Lin and David Brumley}, title = {Automatically Deriving Pointer Reference Expressions from Binary Code for Memory Dump Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {614--624}, doi = {}, year = {2015}, } |
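The entry above extracts address-independent pointer reference expressions of the form `*(*(base + o1) + o2)` and replays them against a memory snapshot. A minimal sketch of that replay step (the flat `dict` snapshot layout and 64-bit little-endian words are assumptions made here for illustration):

```python
import struct

def eval_ref_expr(memory, base_addr, offsets):
    """Follow a chain of (offset, dereference) steps from a base address:
    each step reads the pointer value stored at `addr + offset`.
    `memory` maps an address to the 8 raw bytes stored there."""
    addr = base_addr
    for off in offsets:
        addr = struct.unpack("<Q", memory[addr + off])[0]  # one dereference
    return addr
```

A hijack check in this style would compare the final value against the legitimate code region; the expression itself stays valid across reboots because only the base symbol and offsets are recorded, not concrete addresses.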
|
Linares-Vásquez, Mario |
ESEC/FSE '15: "Auto-completing Bug Reports ..."
Auto-completing Bug Reports for Android Applications
Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (College of William and Mary, USA) The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these “apps” has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that directly lead to a bug’s resolution. To address the need for an improved reporting system, we introduce a novel solution, called FUSION, that helps users auto-complete reproduction steps in bug reports for mobile apps. FUSION links user-provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. The approach that FUSION employs is generalizable to other current mobile software platforms, and constitutes a new method by which off-device bug reporting can be conducted for mobile software projects. In a study involving 28 participants we applied FUSION to support the maintenance tasks of reporting and reproducing defects from 15 real-world bugs found in 14 open source Android apps while qualitatively and quantitatively measuring the user experience of the system. Our results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information. @InProceedings{ESEC/FSE15p673, author = {Kevin Moran and Mario Linares-Vásquez and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Auto-completing Bug Reports for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {673--686}, doi = {}, year = {2015}, } Video Info ESEC/FSE '15: "Optimizing Energy Consumption ..." 
Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk (College of William and Mary, USA; Free University of Bolzano, Italy; University of Molise, Italy; University of Sannio, Italy) The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems—including apps—targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirms that in most cases users still perceived the choices of colors as attractive. 
Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements. @InProceedings{ESEC/FSE15p143, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Eduardo Bernal Cárdenas and Rocco Oliveto and Massimiliano Di Penta and Denys Poshyvanyk}, title = {Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {143--154}, doi = {}, year = {2015}, } Info Best-Paper Award |
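The competing objectives in the GEMMA entry above can be made concrete with a toy model: on OLED screens, power grows with subpixel brightness, while readability requires contrast. A hypothetical sketch (the linear energy proxy and WCAG-style contrast below stand in for the paper's calibrated power model and perceptual measures):

```python
def oled_energy(rgb):
    """Crude OLED power proxy: brighter subpixels draw more current."""
    r, g, b = rgb
    return r + g + b

def luminance(rgb):
    """Relative luminance of a color, ignoring gamma for simplicity."""
    r, g, b = (c / 255.0 for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG-style contrast ratio between two colors (1.0 to 21.0)."""
    hi, lo = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

def palette_objectives(fg, bg):
    """The values a multi-objective optimizer would trade off per palette."""
    return {"energy": oled_energy(fg) + oled_energy(bg),
            "contrast": round(contrast_ratio(fg, bg), 2)}
```

A real model would also weight each color by its on-screen pixel count (a mostly-dark background matters far more than thin text); this sketch only scores the colors themselves.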
|
Lipford, Heather Richter |
ESEC/FSE '15: "Questions Developers Ask While ..."
Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis
Justin Smith, Brittany Johnson, Emerson Murphy-Hill, Bill Chu, and Heather Richter Lipford (North Carolina State University, USA; University of North Carolina at Charlotte, USA) Security tools can help developers answer questions about potential vulnerabilities in their code. A better understanding of the types of questions asked by developers may help toolsmiths design more effective tools. In this paper, we describe how we collected and categorized these questions by conducting an exploratory study with novice and experienced software developers. We equipped them with Find Security Bugs, a security-oriented static analysis tool, and observed their interactions with security vulnerabilities in an open-source system that they had previously contributed to. We found that they asked questions not only about security vulnerabilities, associated attacks, and fixes, but also questions about the software itself, the social ecosystem that built the software, and related resources and tools. For example, when participants asked questions about the source of tainted data, their tools forced them to make imperfect tradeoffs between systematic and ad hoc program navigation strategies. @InProceedings{ESEC/FSE15p248, author = {Justin Smith and Brittany Johnson and Emerson Murphy-Hill and Bill Chu and Heather Richter Lipford}, title = {Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {248--259}, doi = {}, year = {2015}, } Info |
|
Liu, Yang |
ESEC/FSE '15: "TLV: Abstraction through Testing, ..."
TLV: Abstraction through Testing, Learning, and Validation
Jun Sun, Hao Xiao, Yang Liu, Shang-Wei Lin, and Shengchao Qin (Singapore University of Technology and Design, Singapore; Nanyang Technological University, Singapore; Teesside University, UK; Shenzhen University, China) A (Java) class provides a service to its clients (i.e., programs which use the class). The service must satisfy certain specifications. Different specifications might be expected at different levels of abstraction depending on the client's objective. In order to effectively contrast the class against its specifications, whether manually or automatically, one essential step is to automatically construct an abstraction of the given class at a proper level of abstraction. The abstraction should be correct (i.e., over-approximating) and accurate (i.e., with few spurious traces). We present an automatic approach, which combines testing, learning, and validation, to constructing an abstraction. Our approach is designed such that a large part of the abstraction is generated based on testing and learning so as to minimize the use of heavy-weight techniques like symbolic execution. The abstraction is generated through a process of abstraction/refinement, with no user input, and converges to a specific level of abstraction depending on the usage context. The generated abstraction is guaranteed to be correct and accurate. We have implemented the proposed approach in a toolkit named TLV and evaluated TLV with a number of benchmark programs as well as three real-world ones. The results show that TLV generates abstraction for program analysis and verification more efficiently. @InProceedings{ESEC/FSE15p698, author = {Jun Sun and Hao Xiao and Yang Liu and Shang-Wei Lin and Shengchao Qin}, title = {TLV: Abstraction through Testing, Learning, and Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {698--709}, doi = {}, year = {2015}, } Info |
|
Livshits, Benjamin |
ESEC/FSE '15: "Detecting JavaScript Races ..."
Detecting JavaScript Races That Matter
Erdal Mutlu, Serdar Tasiran, and Benjamin Livshits (Koç University, Turkey; Microsoft Research, USA) As JavaScript has become virtually omnipresent as the language for programming large and complex web applications in the last several years, we have seen an increase in interest in finding data races in client-side JavaScript. While JavaScript execution is single-threaded, there is still enough potential for data races, created largely by the non-determinism of the scheduler. Recently, several academic efforts have explored both static and run-time analysis approaches in an effort to find data races. However, despite this, we have not seen these analysis techniques deployed in practice and we have only seen scarce evidence that developers find and fix bugs related to data races in JavaScript. In this paper we argue for a different formulation of what it means to have a data race in a JavaScript application and distinguish between benign and harmful races, affecting persistent browser or server state. We further argue that while benign races — the subject of the majority of prior work — do exist, harmful races are exceedingly rare in practice (19 harmful vs. 621 benign). Our results shed a new light on the issues of data race prevalence and importance. To find races, we also propose a novel lightweight run-time symbolic exploration algorithm for finding races in traces of run-time execution. Our algorithm eschews schedule exploration in favor of smaller run-time overheads and thus can be used by beta testers or in crowd-sourced testing. In our experiments on 26 sites, we demonstrate that benign races are considerably more common than harmful ones. @InProceedings{ESEC/FSE15p381, author = {Erdal Mutlu and Serdar Tasiran and Benjamin Livshits}, title = {Detecting JavaScript Races That Matter}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {381--392}, doi = {}, year = {2015}, } Info |
|
Lo, David |
ESEC/FSE '15: "Information Retrieval and ..."
Information Retrieval and Spectrum Based Bug Localization: Better Together
Tien-Duy B. Le, Richard J. Oentaryo, and David Lo (Singapore Management University, Singapore) Debugging often takes much effort and resources. To help developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been proposed. IR-based techniques process textual information in bug reports, while spectrum-based techniques process program spectra (i.e., a record of which program elements are executed for each test case). Both eventually generate a ranked list of program elements that are likely to contain the bug. However, these techniques only consider one source of information, either bug reports or program spectra, which is not optimal. To deal with the limitation of existing techniques, in this work, we propose a new multi-modal technique that considers both bug reports and program spectra to localize bugs. Our approach adaptively creates a bug-specific model to map a particular bug to its possible location, and introduces a novel idea of suspicious words that are highly associated with a bug. We evaluate our approach on 157 real bugs from four software systems, and compare it with a state-of-the-art IR-based bug localization method, a state-of-the-art spectrum-based bug localization method, and three state-of-the-art multi-modal feature location methods that are adapted for bug localization. Experiments show that our approach can outperform the baselines by at least 47.62%, 31.48%, 27.78%, and 28.80% in terms of number of bugs successfully localized when a developer inspects 1, 5, and 10 program elements (i.e., Top 1, Top 5, and Top 10), and Mean Average Precision (MAP) respectively. @InProceedings{ESEC/FSE15p579, author = {Tien-Duy B. Le and Richard J. Oentaryo and David Lo}, title = {Information Retrieval and Spectrum Based Bug Localization: Better Together}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {579--590}, doi = {}, year = {2015}, } ESEC/FSE '15: "How Practitioners Perceive ..." 
How Practitioners Perceive the Relevance of Software Engineering Research David Lo, Nachiappan Nagappan, and Thomas Zimmermann (Singapore Management University, Singapore; Microsoft Research, USA) The number of software engineering research papers over the last few years has grown significantly. An important question here is: how relevant is software engineering research to practitioners in the field? To address this question, we conducted a survey at Microsoft where we invited 3,000 industry practitioners to rate the relevance of research ideas contained in 571 ICSE, ESEC/FSE and FSE papers that were published over a five year period. We received 17,913 ratings by 512 practitioners who labelled ideas as essential, worthwhile, unimportant, or unwise. The results from the survey suggest that practitioners are positive towards studies done by the software engineering research community: 71% of all ratings were essential or worthwhile. We found no correlation between the citation counts and the relevance scores of the papers. Through a qualitative analysis of free text responses, we identify several reasons why practitioners considered certain research ideas to be unwise. The survey approach described in this paper is lightweight: on average, a participant spent only 22.5 minutes to respond to the survey. At the same time, the results can provide useful insight to conference organizers, authors, and participating practitioners. @InProceedings{ESEC/FSE15p415, author = {David Lo and Nachiappan Nagappan and Thomas Zimmermann}, title = {How Practitioners Perceive the Relevance of Software Engineering Research}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {415--425}, doi = {}, year = {2015}, } Best-Paper Award |
|
Long, Fan |
ESEC/FSE '15: "Staged Program Repair with ..."
Staged Program Repair with Condition Synthesis
Fan Long and Martin Rinard (Massachusetts Institute of Technology, USA) We present SPR, a new program repair system that combines staged program repair and condition synthesis. These techniques enable SPR to work productively with a set of parameterized transformation schemas to generate and efficiently search a rich space of program repairs. Together these techniques enable SPR to generate correct repairs for over five times as many defects as previous systems evaluated on the same benchmark set. @InProceedings{ESEC/FSE15p166, author = {Fan Long and Martin Rinard}, title = {Staged Program Repair with Condition Synthesis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {166--178}, doi = {}, year = {2015}, } Info |
|
Lu, Shan |
ESEC/FSE '15: "What Change History Tells ..."
What Change History Tells Us about Thread Synchronization
Rui Gu, Guoliang Jin, Linhai Song, Linjie Zhu, and Shan Lu (Columbia University, USA; North Carolina State University, USA; University of Wisconsin-Madison, USA; University of Chicago, USA) Multi-threaded programs are pervasive, yet difficult to write. Missing proper synchronization leads to correctness bugs and over-synchronization leads to performance problems. To improve the correctness and efficiency of multi-threaded software, we need a better understanding of synchronization challenges faced by real-world developers. This paper studies the code repositories of open-source multi-threaded software projects to obtain a broad and in-depth view of how developers handle synchronizations. We first examine how critical sections are changed when software evolves by checking over 250,000 revisions of four representative open-source software projects. The findings help us answer questions like how often synchronization is an afterthought for developers; whether it is difficult for developers to decide critical section boundaries and lock variables; and what are real-world over-synchronization problems. We then conduct case studies to better understand (1) how critical sections are changed to solve performance problems (i.e., over-synchronization issues) and (2) how software changes lead to synchronization-related correctness problems (i.e., concurrency bugs). This in-depth study shows that tool support is needed to help developers tackle over-synchronization problems; it also shows that concurrency bug avoidance, detection, and testing can be improved through better awareness of code revision history. @InProceedings{ESEC/FSE15p426, author = {Rui Gu and Guoliang Jin and Linhai Song and Linjie Zhu and Shan Lu}, title = {What Change History Tells Us about Thread Synchronization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {426--438}, doi = {}, year = {2015}, } |
|
Lyu, Yingjun |
ESEC/FSE '15: "String Analysis for Java and ..."
String Analysis for Java and Android Applications
Ding Li, Yingjun Lyu, Mian Wan, and William G. J. Halfond (University of Southern California, USA) String analysis is critical for many verification techniques. However, accurately modeling string variables is a challenging problem. Current approaches are generally customized for certain problem domains or have critical limitations in handling loops, providing context-sensitive inter-procedural analysis, and performing efficient analysis on complicated apps. To address these limitations, we propose a general framework, Violist, for string analysis that allows researchers to more flexibly choose how they will address each of these challenges by separating the representation and interpretation of string operations. In our evaluation, we show that our approach can achieve high accuracy on both Java and Android apps in a reasonable amount of time. We also compared our approach with a popular and widely used string analyzer and found that our approach has higher precision and shorter execution time while maintaining the same level of recall. @InProceedings{ESEC/FSE15p661, author = {Ding Li and Yingjun Lyu and Mian Wan and William G. J. Halfond}, title = {String Analysis for Java and Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--672}, doi = {}, year = {2015}, } |
|
Machiry, Aravind |
ESEC/FSE '15: "CLAPP: Characterizing Loops ..."
CLAPP: Characterizing Loops in Android Applications
Yanick Fratantonio, Aravind Machiry, Antonio Bianchi, Christopher Kruegel, and Giovanni Vigna (University of California at Santa Barbara, USA) When performing program analysis, loops are one of the most important aspects that need to be taken into account. In the past, many approaches have been proposed to analyze loops to perform different tasks, ranging from compiler optimizations to Worst-Case Execution Time (WCET) analysis. While these approaches are powerful, they focus on tackling very specific categories of loops and known loop patterns, such as the ones for which the number of iterations can be statically determined. In this work, we developed a static analysis framework to characterize and analyze generic loops, without relying on techniques based on pattern matching. For this work, we focus on the Android platform, and we implemented a prototype, called CLAPP, that we used to perform the first large-scale empirical study of the usage of loops in Android applications. In particular, we used our tool to analyze a total of 4,110,510 loops found in 11,823 Android applications. As part of our evaluation, we provide the detailed results of our empirical study, we show how our analysis was able to determine that the execution of 63.28% of the loops is bounded, and we discuss several interesting insights related to the performance issues and security aspects associated with loops. @InProceedings{ESEC/FSE15p687, author = {Yanick Fratantonio and Aravind Machiry and Antonio Bianchi and Christopher Kruegel and Giovanni Vigna}, title = {CLAPP: Characterizing Loops in Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {687--697}, doi = {}, year = {2015}, } Info |
|
Maggio, Martina |
ESEC/FSE '15: "Automated Multi-objective ..."
Automated Multi-objective Control for Self-Adaptive Software Design
Antonio Filieri, Henry Hoffmann, and Martina Maggio (University of Stuttgart, Germany; University of Chicago, USA; Lund University, Sweden) While software is becoming more complex every day, the requirements on its behavior are not getting any easier to satisfy. An application should offer a certain quality of service, adapt to the current environmental conditions and withstand runtime variations that were simply unpredictable during the design phase. To tackle this complexity, control theory has been proposed as a technique for managing software's dynamic behavior, obviating the need for human intervention. Control-theoretical solutions, however, are either tailored for the specific application or do not handle the complexity of multiple interacting components and multiple goals. In this paper, we develop an automated control synthesis methodology that takes, as input, the configurable software components (or knobs) and the goals to be achieved. Our approach automatically constructs a control system that manages the specified knobs and guarantees the goals are met. These claims are backed up by experimental studies on three different software applications, where we show how the proposed automated approach handles the complexity of multiple knobs and objectives. @InProceedings{ESEC/FSE15p13, author = {Antonio Filieri and Henry Hoffmann and Martina Maggio}, title = {Automated Multi-objective Control for Self-Adaptive Software Design}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {13--24}, doi = {}, year = {2015}, } |
|
Mangal, Ravi |
ESEC/FSE '15: "A User-Guided Approach to ..."
A User-Guided Approach to Program Analysis
Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik (Georgia Tech, USA; Microsoft Research, UK) Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system EUGENE that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. Our technique solves the rules using an off-the-shelf solver in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to real-world analyses and programs. We evaluate EUGENE on two different analyses with labeled output on a suite of seven Java programs of size 131–198 KLOC. We also report upon a user study involving nine users who employ EUGENE to guide an information-flow analysis on three Java micro-benchmarks. In our experiments, EUGENE significantly reduces misclassified reports upon providing limited amounts of feedback. @InProceedings{ESEC/FSE15p462, author = {Ravi Mangal and Xin Zhang and Aditya V. Nori and Mayur Naik}, title = {A User-Guided Approach to Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {462--473}, doi = {}, year = {2015}, } Best-Paper Award |
|
Maoz, Shahar |
ESEC/FSE '15: "GR(1) Synthesis for LTL Specification ..."
GR(1) Synthesis for LTL Specification Patterns
Shahar Maoz and Jan Oliver Ringert (Tel Aviv University, Israel) Reactive synthesis is an automated procedure to obtain a correct-by-construction reactive system from its temporal logic specification. Two of the main challenges in bringing reactive synthesis to software engineering practice are its very high worst-case complexity -- for linear temporal logic (LTL) it is double exponential in the length of the formula -- and the difficulty of writing declarative specifications using basic LTL operators. To address the first challenge, Piterman et al. have suggested the General Reactivity of Rank 1 (GR(1)) fragment of LTL, which has an efficient polynomial time symbolic synthesis algorithm. To address the second challenge, Dwyer et al. have identified 55 LTL specification patterns, which are common in industrial specifications and make writing specifications easier. In this work we show that almost all of the 55 LTL specification patterns identified by Dwyer et al. can be expressed as assumptions and guarantees in the GR(1) fragment of LTL. Specifically, we present an automated, sound and complete translation of the patterns to the GR(1) form, which effectively results in an efficient reactive synthesis procedure for any specification that is written using the patterns. We have validated the correctness of the catalog of GR(1) templates we have created. The work is implemented in our reactive synthesis environment. It provides positive, promising evidence for the potential feasibility of using reactive synthesis in practice. @InProceedings{ESEC/FSE15p96, author = {Shahar Maoz and Jan Oliver Ringert}, title = {GR(1) Synthesis for LTL Specification Patterns}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {96--106}, doi = {}, year = {2015}, } |
|
Marcus, Andrian |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. @InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info |
|
Marinov, Darko |
ESEC/FSE '15: "Comparing and Combining Test-Suite ..."
Comparing and Combining Test-Suite Reduction and Regression Test Selection
August Shi, Tifany Yung, Alex Gyori, and Darko Marinov (University of Illinois at Urbana-Champaign, USA) Regression testing is widely used to check that changes made to software do not break existing functionality, but regression test suites grow, and running them fully can become costly. Researchers have proposed test-suite reduction and regression test selection as two approaches to reduce this cost by not running some of the tests from the test suite. However, previous research has not empirically evaluated how the two approaches compare to each other, and how well a combination of these approaches performs. We present the first extensive study that compares test-suite reduction and regression test selection approaches individually, and also evaluates a combination of the two approaches. We also propose a new criterion to measure the quality of tests with respect to software changes. Our experiments on 4,793 commits from 17 open-source projects show that regression test selection runs on average fewer tests (by 40.15pp) than test-suite reduction. However, test-suite reduction can have a high loss in fault-detection capability with respect to the changes, whereas a (safe) regression test selection has no loss. The experiments also show that a combination of the two approaches runs even fewer tests (on average 5.34pp) than regression test selection, but these tests still have a loss in fault-detection capability with respect to the changes. @InProceedings{ESEC/FSE15p237, author = {August Shi and Tifany Yung and Alex Gyori and Darko Marinov}, title = {Comparing and Combining Test-Suite Reduction and Regression Test Selection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {237--247}, doi = {}, year = {2015}, } |
|
Matinnejad, Reza |
ESEC/FSE '15: "Effective Test Suites for ..."
Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers
Reza Matinnejad, Shiva Nejati, Lionel C. Briand , and Thomas Bruckmann (University of Luxembourg, Luxembourg; Delphi Automotive Systems, Luxembourg) Modeling mixed discrete-continuous controllers using Stateflow is common practice and has a long tradition in the embedded software system industry. Testing Stateflow models is complicated by expensive and manual test oracles that are not amenable to full automation due to the complex continuous behaviors of such models. In this paper, we reduce the cost of manual test oracles by providing test case selection algorithms that help engineers develop small test suites with high fault revealing power for Stateflow models. We present six test selection algorithms for discrete-continuous Stateflows: An adaptive random test selection algorithm that diversifies test inputs, two white-box coverage-based algorithms, a black-box algorithm that diversifies test outputs, and two search-based black-box algorithms that aim to maximize the likelihood of presence of continuous output failure patterns. We evaluate and compare our test selection algorithms, and find that our three output-based algorithms consistently outperform the coverage- and input-based algorithms in revealing faults in discrete-continuous Stateflow models. Further, we show that our output-based algorithms are complementary as the two search-based algorithms perform best in revealing specific failures with small test suites, while the output diversity algorithm is able to identify different failure types better than other algorithms when test suites are above a certain size. @InProceedings{ESEC/FSE15p84, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand and Thomas Bruckmann}, title = {Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--95}, doi = {}, year = {2015}, } Best-Paper Award |
|
Mayhorn, Chris |
ESEC/FSE '15: "Quantifying Developers' ..."
Quantifying Developers' Adoption of Security Tools
Jim Witschey, Olga Zielinska, Allaire Welk, Emerson Murphy-Hill, Chris Mayhorn, and Thomas Zimmermann (North Carolina State University, USA; Microsoft Research, USA) Security tools could help developers find critical vulnerabilities, yet such tools remain underused. We surveyed developers from 14 companies and 5 mailing lists about their reasons for using and not using security tools. The resulting thirty-nine predictors of security tool use provide both expected and unexpected insights. As we expected, developers who perceive security to be important are more likely to use security tools than those who do not. But that was not the strongest predictor of security tool use; instead, it was developers' ability to observe their peers using security tools. @InProceedings{ESEC/FSE15p260, author = {Jim Witschey and Olga Zielinska and Allaire Welk and Emerson Murphy-Hill and Chris Mayhorn and Thomas Zimmermann}, title = {Quantifying Developers' Adoption of Security Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {260--271}, doi = {}, year = {2015}, } |
|
McIntosh, Shane |
ESEC/FSE '15: "An Empirical Study of Goto ..."
An Empirical Study of Goto in C Code from GitHub Repositories
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study - (1) qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Medvidovic, Nenad |
ESEC/FSE '15: "Detecting Event Anomalies ..."
Detecting Event Anomalies in Event-Based Systems
Gholamreza Safi, Arman Shahbazian, William G. J. Halfond , and Nenad Medvidovic (University of Southern California, USA) Event-based interaction is an attractive paradigm because its use can lead to highly flexible and adaptable systems. One problem in this paradigm is that events are sent, received, and processed nondeterministically, due to the systems’ reliance on implicit invocation and implicit concurrency. This nondeterminism can lead to event anomalies, which occur when an event-based system receives multiple events that lead to the write of a shared field or memory location. Event anomalies can lead to unreliable, error-prone, and hard to debug behavior in an event-based system. To detect these anomalies, this paper presents a new static analysis technique, DEvA, for automatically detecting event anomalies. DEvA has been evaluated on a set of open-source event-based systems against a state-of-the-art technique for detecting data races in multithreaded systems, and a recent technique for solving a similar problem with event processing in Android applications. DEvA exhibited high precision with respect to manually constructed ground truths, and was able to locate event anomalies that had not been detected by the existing solutions. @InProceedings{ESEC/FSE15p25, author = {Gholamreza Safi and Arman Shahbazian and William G. J. Halfond and Nenad Medvidovic}, title = {Detecting Event Anomalies in Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {25--37}, doi = {}, year = {2015}, } Video Info |
|
Melski, Eric |
ESEC/FSE '15: "Efficient Dependency Detection ..."
Efficient Dependency Detection for Safe Java Test Acceleration
Jonathan Bell, Gail Kaiser, Eric Melski, and Mohan Dattatreya (Columbia University, USA; Electric Cloud, USA) Slow builds remain a plague for software developers. The frequency with which code can be built (compiled, tested and packaged) directly impacts the productivity of developers: longer build times mean a longer wait before determining if a change to the application being built was successful. We have discovered that in the case of some languages, such as Java, the majority of build time is spent running tests, where dependencies between individual tests are complicated to discover, making many existing test acceleration techniques unsound to deploy in practice. Without knowledge of which tests are dependent on others, we cannot safely parallelize the execution of the tests, nor can we perform incremental testing (i.e., execute only a subset of an application's tests for each build). The previous techniques for detecting these dependencies did not scale to large test suites: given a test suite that normally ran in two hours, the best-case running scenario for the previous tool would have taken over 422 CPU days to find dependencies between all test methods (and would not soundly find all dependencies) — on the same project the exhaustive technique (to find all dependencies) would have taken over 10^300 years. We present a novel approach to detecting all dependencies between test cases in large projects that can enable safe exploitation of parallelism and test selection with a modest analysis cost. @InProceedings{ESEC/FSE15p770, author = {Jonathan Bell and Gail Kaiser and Eric Melski and Mohan Dattatreya}, title = {Efficient Dependency Detection for Safe Java Test Acceleration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {770--781}, doi = {}, year = {2015}, } |
|
Mesbah, Ali |
ESEC/FSE '15: "Assertions Are Strongly Correlated ..."
Assertions Are Strongly Correlated with Test Suite Effectiveness
Yucheng Zhang and Ali Mesbah (University of British Columbia, Canada) Code coverage is a popular test adequacy criterion in practice. Code coverage, however, remains controversial as there is a lack of coherent empirical evidence for its relation with test suite effectiveness. More recently, test suite size has been shown to be highly correlated with effectiveness. However, previous studies treat test methods as the smallest unit of interest, and ignore potential factors influencing this relationship. We propose to go beyond test suite size, by investigating test assertions inside test methods. We empirically evaluate the relationship between a test suite’s effectiveness and the (1) number of assertions, (2) assertion coverage, and (3) different types of assertions. We compose 6,700 test suites in total, using 24,000 assertions of five real-world Java projects. We find that the number of assertions in a test suite strongly correlates with its effectiveness, and this factor directly influences the relationship between test suite size and effectiveness. Our results also indicate that assertion coverage is strongly correlated with effectiveness and different types of assertions can influence the effectiveness of their containing test suites. @InProceedings{ESEC/FSE15p214, author = {Yucheng Zhang and Ali Mesbah}, title = {Assertions Are Strongly Correlated with Test Suite Effectiveness}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {214--224}, doi = {}, year = {2015}, } Info |
|
Meyer, Bertrand |
ESEC/FSE '15: "Efficient and Reasonable Object-Oriented ..."
Efficient and Reasonable Object-Oriented Concurrency
Scott West, Sebastian Nanz, and Bertrand Meyer (Google, Switzerland; ETH Zurich, Switzerland) Making threaded programs safe and easy to reason about is one of the chief difficulties in modern programming. This work provides an efficient execution model for SCOOP, a concurrency approach that provides not only data-race freedom but also pre/postcondition reasoning guarantees between threads. The extensions we propose influence the underlying semantics to increase the amount of concurrent execution that is possible, exclude certain classes of deadlocks, and enable greater performance. These extensions are used as the basis of an efficient runtime and optimization pass that improve performance 15x over a baseline implementation. This new implementation of SCOOP is, on average, also 2x faster than other well-known safe concurrent languages. The measurements are based on both coordination-intensive and data-manipulation-intensive benchmarks designed to offer a mixture of workloads. @InProceedings{ESEC/FSE15p734, author = {Scott West and Sebastian Nanz and Bertrand Meyer}, title = {Efficient and Reasonable Object-Oriented Concurrency}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {734--744}, doi = {}, year = {2015}, } |
|
Mezini, Mira |
ESEC/FSE '15: "Getting to Know You: Towards ..."
Getting to Know You: Towards a Capability Model for Java
Ben Hermann, Michael Reif, Michael Eichberg, and Mira Mezini (TU Darmstadt, Germany) Developing software from reusable libraries lets developers face a security dilemma: either be efficient and reuse libraries as they are or inspect them, know about their resource usage, but possibly miss deadlines as reviews are a time-consuming process. In this paper, we propose a novel capability inference mechanism for libraries written in Java. It uses a coarse-grained capability model for system resources that can be presented to developers. We found that the inferred capabilities agree with expectations derived from project documentation in 86.81% of cases. Moreover, our approach can find capabilities that cannot be discovered using project documentation. It is thus a helpful tool for developers mitigating the aforementioned dilemma. @InProceedings{ESEC/FSE15p758, author = {Ben Hermann and Michael Reif and Michael Eichberg and Mira Mezini}, title = {Getting to Know You: Towards a Capability Model for Java}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {758--769}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "Hidden Truths in Dead Software ..." Hidden Truths in Dead Software Paths Michael Eichberg, Ben Hermann, Mira Mezini, and Leonid Glanz (TU Darmstadt, Germany) Approaches and techniques for statically finding a multitude of issues in source code have been developed in the past. A core property of these approaches is that they are usually targeted towards finding only a very specific kind of issue and that the effort to develop such an analysis is significant. This strictly limits the number of kinds of issues that can be detected. In this paper, we discuss a generic approach based on the detection of infeasible paths in code that can discover a wide range of code smells ranging from useless code that hinders comprehension to real bugs. 
Code issues are identified by calculating the difference between the control-flow graph that contains all technically possible edges and the corresponding graph recorded while performing a more precise analysis using abstract interpretation. We have evaluated the approach using the Java Development Kit as well as the Qualitas Corpus (a curated collection of over 100 Java Applications) and were able to find thousands of issues across a wide range of categories. @InProceedings{ESEC/FSE15p474, author = {Michael Eichberg and Ben Hermann and Mira Mezini and Leonid Glanz}, title = {Hidden Truths in Dead Software Paths}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {474--484}, doi = {}, year = {2015}, } Info |
|
Mockus, Audris |
ESEC/FSE '15: "A Method to Identify and Correct ..."
A Method to Identify and Correct Problematic Software Activity Data: Exploiting Capacity Constraints and Data Redundancies
Qimu Zheng, Audris Mockus, and Minghui Zhou (Peking University, China; University of Tennessee, USA) Mining software repositories to understand and improve software development is a common approach in research and practice. The operational data obtained from these repositories often do not faithfully represent the intended aspects of software development and, therefore, may jeopardize the conclusions derived from it. We propose an approach to identify problematic values based on the constraints of software development and to correct such values using data redundancies. We investigate the approach using issue and commit data of Mozilla project. In particular, we identified problematic data in four types of events and found the fraction of problematic values to exceed 10% and rapidly rising. We found the corrected values to be 50% closer to the most accurate estimate of task completion time. Finally, we found that the models of time until fix changed substantially when data were corrected, with the corrected data providing a 20% better fit. We discuss how the approach may be generalized to other types of operational data to increase fidelity of software measurement in practice and in research. @InProceedings{ESEC/FSE15p637, author = {Qimu Zheng and Audris Mockus and Minghui Zhou}, title = {A Method to Identify and Correct Problematic Software Activity Data: Exploiting Capacity Constraints and Data Redundancies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {637--648}, doi = {}, year = {2015}, } ESEC/FSE '15: "An Empirical Study of Goto ..." An Empirical Study of Goto in C Code from GitHub Repositories Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter , Shane McIntosh, Audris Mockus, and Ahmed E. 
Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study - (1) qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Moran, Kevin |
ESEC/FSE '15: "Auto-completing Bug Reports ..."
Auto-completing Bug Reports for Android Applications
Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (College of William and Mary, USA) The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these “apps” has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that directly lead to a bug’s resolution. To address the need for an improved reporting system, we introduce a novel solution, called FUSION, that helps users auto-complete reproduction steps in bug reports for mobile apps. FUSION links user-provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. The approach that FUSION employs is generalizable to other current mobile software platforms, and constitutes a new method by which off-device bug reporting can be conducted for mobile software projects. In a study involving 28 participants, we applied FUSION to support the maintenance tasks of reporting and reproducing defects from 15 real-world bugs found in 14 open source Android apps while qualitatively and quantitatively measuring the user experience of the system. Our results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information. @InProceedings{ESEC/FSE15p673, author = {Kevin Moran and Mario Linares-Vásquez and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Auto-completing Bug Reports for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {673--686}, doi = {}, year = {2015}, } Video Info |
|
Moreno, Gabriel A. |
ESEC/FSE '15: "Proactive Self-Adaptation ..."
Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach
Gabriel A. Moreno, Javier Cámara, David Garlan, and Bradley Schmerl (SEI, USA; Carnegie Mellon University, USA) Self-adaptive systems tend to be reactive and myopic, adapting in response to changes without anticipating what the subsequent adaptation needs will be. Adapting reactively can result in inefficiencies due to the system performing a suboptimal sequence of adaptations. Furthermore, when adaptations have latency, and take some time to produce their effect, they have to be started with sufficient lead time so that they complete by the time their effect is needed. Proactive latency-aware adaptation addresses these issues by making adaptation decisions with a look-ahead horizon and taking adaptation latency into account. In this paper we present an approach for proactive latency-aware adaptation under uncertainty that uses probabilistic model checking for adaptation decisions. The key idea is to use a formal model of the adaptive system in which the adaptation decision is left underspecified through nondeterminism, and have the model checker resolve the nondeterministic choices so that the accumulated utility over the horizon is maximized. The adaptation decision is optimal over the horizon, and takes into account the inherent uncertainty of the environment predictions needed for looking ahead. Our results show that the decision based on a look-ahead horizon, and the factoring of both tactic latency and environment uncertainty, considerably improve the effectiveness of adaptation decisions. @InProceedings{ESEC/FSE15p1, author = {Gabriel A. Moreno and Javier Cámara and David Garlan and Bradley Schmerl}, title = {Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1--12}, doi = {}, year = {2015}, } |
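The decision criterion described above, resolving the nondeterministic choice of adaptation tactic so that accumulated utility over the look-ahead horizon is maximized, can be illustrated with a brute-force recursion. The paper resolves this nondeterminism with a probabilistic model checker over a formal model; the toy tactics, predictor, and utility function below are invented solely to show why latency-aware look-ahead changes the decision.

```python
# Sketch: choose the adaptation tactic that maximizes accumulated utility
# over a look-ahead horizon. A model checker resolves the same choice
# symbolically; this recursion only illustrates the decision criterion.

def best_plan(state, horizon, tactics, predict, utility):
    """Return (max accumulated utility, first tactic of an optimal plan)."""
    if horizon == 0:
        return 0.0, None
    best = (float("-inf"), None)
    for tactic in tactics:
        nxt = predict(state, tactic)                 # predicted next state
        future, _ = best_plan(nxt, horizon - 1, tactics, predict, utility)
        total = utility(state, tactic) + future
        if total > best[0]:
            best = (total, tactic)
    return best

# Toy model: state is (active servers, servers still booting). Adding a
# server costs utility now and only becomes active one step later (latency).
def predict(state, tactic):
    servers, pending = state
    return (servers + pending, 1 if tactic == "add_server" else 0)

def utility(state, tactic):
    servers, _ = state
    return servers - (0.5 if tactic == "add_server" else 0.0)

value, first = best_plan((1, 0), 3, ["add_server", "do_nothing"], predict, utility)
```

With a horizon of 3, the optimal first tactic is the proactive one: starting the server early pays for its latency within the horizon, which a myopic (horizon-1) decision would miss.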
|
Moreno, Laura |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. @InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info |
|
Murphy, Gail C. |
ESEC/FSE '15: "Impact of Developer Turnover ..."
Impact of Developer Turnover on Quality in Open-Source Software
Matthieu Foucault, Marc Palyart, Xavier Blanc, Gail C. Murphy, and Jean-Rémy Falleri (University of Bordeaux, France; University of British Columbia, Canada) Turnover is the phenomenon of continuous influx and retreat of human resources in a team. Despite being well-studied in many settings, turnover has not been characterized for open-source software projects. We study the source code repositories of five open-source projects to characterize patterns of turnover and to determine the effects of turnover on software quality. We define the base concepts of both external and internal turnover, which are the mobility of developers in and out of a project, and the mobility of developers inside a project, respectively. We provide a qualitative analysis of turnover patterns. We also find, in a quantitative analysis, that the activity of external newcomers negatively impacts software quality. @InProceedings{ESEC/FSE15p829, author = {Matthieu Foucault and Marc Palyart and Xavier Blanc and Gail C. Murphy and Jean-Rémy Falleri}, title = {Impact of Developer Turnover on Quality in Open-Source Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {829--841}, doi = {}, year = {2015}, } Info |
|
Murphy-Hill, Emerson |
ESEC/FSE '15: "Quantifying Developers' ..."
Quantifying Developers' Adoption of Security Tools
Jim Witschey, Olga Zielinska, Allaire Welk, Emerson Murphy-Hill, Chris Mayhorn, and Thomas Zimmermann (North Carolina State University, USA; Microsoft Research, USA) Security tools could help developers find critical vulnerabilities, yet such tools remain underused. We surveyed developers from 14 companies and 5 mailing lists about their reasons for using and not using security tools. The resulting thirty-nine predictors of security tool use provide both expected and unexpected insights. As we expected, developers who perceive security to be important are more likely to use security tools than those who do not. The strongest predictor of security tool use, however, was not perceived importance but developers' ability to observe their peers using security tools. @InProceedings{ESEC/FSE15p260, author = {Jim Witschey and Olga Zielinska and Allaire Welk and Emerson Murphy-Hill and Chris Mayhorn and Thomas Zimmermann}, title = {Quantifying Developers' Adoption of Security Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {260--271}, doi = {}, year = {2015}, } ESEC/FSE '15: "Questions Developers Ask While ..." Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis Justin Smith, Brittany Johnson, Emerson Murphy-Hill, Bill Chu, and Heather Richter Lipford (North Carolina State University, USA; University of North Carolina at Charlotte, USA) Security tools can help developers answer questions about potential vulnerabilities in their code. A better understanding of the types of questions asked by developers may help toolsmiths design more effective tools. In this paper, we describe how we collected and categorized these questions by conducting an exploratory study with novice and experienced software developers. We equipped them with Find Security Bugs, a security-oriented static analysis tool, and observed their interactions with security vulnerabilities in an open-source system that they had previously contributed to. 
We found that they asked questions not only about security vulnerabilities, associated attacks, and fixes, but also questions about the software itself, the social ecosystem that built the software, and related resources and tools. For example, when participants asked questions about the source of tainted data, their tools forced them to make imperfect tradeoffs between systematic and ad hoc program navigation strategies. @InProceedings{ESEC/FSE15p248, author = {Justin Smith and Brittany Johnson and Emerson Murphy-Hill and Bill Chu and Heather Richter Lipford}, title = {Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {248--259}, doi = {}, year = {2015}, } Info |
|
Mutlu, Erdal |
ESEC/FSE '15: "Detecting JavaScript Races ..."
Detecting JavaScript Races That Matter
Erdal Mutlu, Serdar Tasiran, and Benjamin Livshits (Koç University, Turkey; Microsoft Research, USA) As JavaScript has become virtually omnipresent as the language for programming large and complex web applications in the last several years, we have seen an increase in interest in finding data races in client-side JavaScript. While JavaScript execution is single-threaded, there is still enough potential for data races, created largely by the non-determinism of the scheduler. Recently, several academic efforts have explored both static and run-time analysis approaches in an effort to find data races. However, despite this, we have not seen these analysis techniques deployed in practice and we have only seen scarce evidence that developers find and fix bugs related to data races in JavaScript. In this paper we argue for a different formulation of what it means to have a data race in a JavaScript application and distinguish between benign and harmful races, affecting persistent browser or server state. We further argue that while benign races (the subject of the majority of prior work) do exist, harmful races are exceedingly rare in practice (19 harmful vs. 621 benign). Our results shed new light on the issues of data race prevalence and importance. To find races, we also propose a novel lightweight run-time symbolic exploration algorithm for finding races in traces of run-time execution. Our algorithm eschews schedule exploration in favor of smaller run-time overheads and thus can be used by beta testers or in crowd-sourced testing. In our experiments on 26 sites, we demonstrate that benign races are considerably more common than harmful ones. @InProceedings{ESEC/FSE15p381, author = {Erdal Mutlu and Serdar Tasiran and Benjamin Livshits}, title = {Detecting JavaScript Races That Matter}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {381--392}, doi = {}, year = {2015}, } Info |
|
Nagappan, Meiyappan |
ESEC/FSE '15: "An Empirical Study of Goto ..."
An Empirical Study of Goto in C Code from GitHub Repositories
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study: (1) we qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) we quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in the unrestricted manner that Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Nagappan, Nachiappan |
ESEC/FSE '15: "How Practitioners Perceive ..."
How Practitioners Perceive the Relevance of Software Engineering Research
David Lo , Nachiappan Nagappan, and Thomas Zimmermann (Singapore Management University, Singapore; Microsoft Research, USA) The number of software engineering research papers over the last few years has grown significantly. An important question here is: how relevant is software engineering research to practitioners in the field? To address this question, we conducted a survey at Microsoft where we invited 3,000 industry practitioners to rate the relevance of research ideas contained in 571 ICSE, ESEC/FSE and FSE papers that were published over a five year period. We received 17,913 ratings by 512 practitioners who labelled ideas as essential, worthwhile, unimportant, or unwise. The results from the survey suggest that practitioners are positive towards studies done by the software engineering research community: 71% of all ratings were essential or worthwhile. We found no correlation between the citation counts and the relevance scores of the papers. Through a qualitative analysis of free text responses, we identify several reasons why practitioners considered certain research ideas to be unwise. The survey approach described in this paper is lightweight: on average, a participant spent only 22.5 minutes to respond to the survey. At the same time, the results can provide useful insight to conference organizers, authors, and participating practitioners. @InProceedings{ESEC/FSE15p415, author = {David Lo and Nachiappan Nagappan and Thomas Zimmermann}, title = {How Practitioners Perceive the Relevance of Software Engineering Research}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {415--425}, doi = {}, year = {2015}, } Best-Paper Award |
|
Naik, Mayur |
ESEC/FSE '15: "FlexJava: Language Support ..."
FlexJava: Language Support for Safe and Modular Approximate Programming
Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, and William Harris (Georgia Tech, USA) Energy efficiency is a primary constraint in modern systems. Approximate computing is a promising approach that trades quality of result for gains in efficiency and performance. State-of-the-art approximate programming models require extensive manual annotations on program data and operations to guarantee safe execution of approximate programs. The need for extensive manual annotations hinders the practical use of approximation techniques. This paper describes FlexJava, a small set of language extensions that significantly reduces the annotation effort, paving the way for practical approximate programming. These extensions enable programmers to annotate approximation-tolerant method outputs. The FlexJava compiler, which is equipped with an approximation safety analysis, automatically infers the operations and data that affect these outputs and selectively marks them approximable while giving safety guarantees. The automation and the language–compiler codesign relieve programmers from manually and explicitly annotating data declarations or operations as safe to approximate. FlexJava is designed to support safety, modularity, generality, and scalability in software development. We have implemented FlexJava annotations as a Java library and we demonstrate its practicality using a wide range of Java applications and by conducting a user study. Compared to EnerJ, a recent approximate programming system, FlexJava provides the same energy savings with significant reduction (from 2× to 17×) in the number of annotations. In our user study, programmers spend 6× to 12× less time annotating programs using FlexJava than when using EnerJ. 
@InProceedings{ESEC/FSE15p745, author = {Jongse Park and Hadi Esmaeilzadeh and Xin Zhang and Mayur Naik and William Harris}, title = {FlexJava: Language Support for Safe and Modular Approximate Programming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {745--757}, doi = {}, year = {2015}, } ESEC/FSE '15: "A User-Guided Approach to ..." A User-Guided Approach to Program Analysis Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik (Georgia Tech, USA; Microsoft Research, UK) Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system EUGENE that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. Our technique solves the rules using an off-the-shelf solver in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to real-world analyses and programs. We evaluate EUGENE on two different analyses with labeled output on a suite of seven Java programs of size 131–198 KLOC. We also report upon a user study involving nine users who employ EUGENE to guide an information-flow analysis on three Java micro-benchmarks. In our experiments, EUGENE significantly reduces misclassified reports upon providing limited amounts of feedback. @InProceedings{ESEC/FSE15p462, author = {Ravi Mangal and Xin Zhang and Aditya V. Nori and Mayur Naik}, title = {A User-Guided Approach to Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {462--473}, doi = {}, year = {2015}, } Best-Paper Award |
|
Nam, Jaechang |
ESEC/FSE '15: "Heterogeneous Defect Prediction ..."
Heterogeneous Defect Prediction
Jaechang Nam and Sunghun Kim (Hong Kong University of Science and Technology, China) Software defect prediction is one of the most active research areas in software engineering. We can build a prediction model with defect data collected from a software project and predict defects in the same project, i.e., within-project defect prediction (WPDP). Researchers also proposed cross-project defect prediction (CPDP) to predict defects for new projects that lack defect data by using prediction models built from other projects. Recent studies have shown CPDP to be feasible. However, CPDP requires projects that have the same metric set, meaning the metric sets should be identical between projects. As a result, current techniques for CPDP are difficult to apply across projects with heterogeneous metric sets. To address the limitation, we propose heterogeneous defect prediction (HDP) to predict defects across projects with heterogeneous metric sets. Our HDP approach conducts metric selection and metric matching to build a prediction model between projects with heterogeneous metric sets. Our empirical study on 28 subjects shows that about 68% of predictions using our approach outperform or are comparable to WPDP with statistical significance. @InProceedings{ESEC/FSE15p508, author = {Jaechang Nam and Sunghun Kim}, title = {Heterogeneous Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {508--519}, doi = {}, year = {2015}, } |
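The metric-matching step above can be illustrated by pairing each source-project metric with the target-project metric whose value distribution looks most similar, so a model trained on the source metrics can be applied to the matched target metrics. The KS-style distance, greedy pairing, cutoff, and toy data below are illustrative assumptions, not the paper's actual matching analysis.

```python
# Sketch of metric matching across projects with heterogeneous metric sets:
# pair metrics whose value distributions are close. Distance measure,
# cutoff, and data are invented for illustration.

def ks_distance(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic (maximum CDF gap)."""
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    def cdf(sample, v):
        return sum(1 for s in sample if s <= v) / len(sample)
    return max(abs(cdf(xs, v) - cdf(ys, v)) for v in points)

def match_metrics(source, target, cutoff=0.3):
    """Greedily pair each source metric with its most similar target metric."""
    pairs, used = [], set()
    for s_name, s_vals in source.items():
        best = min(
            ((ks_distance(s_vals, t_vals), t_name)
             for t_name, t_vals in target.items() if t_name not in used),
            default=None)
        if best and best[0] <= cutoff:
            pairs.append((s_name, best[1]))
            used.add(best[1])
    return pairs

# Hypothetical projects: 'loc' in the source resembles 'stmt_count' in the target.
source = {"loc": [10, 20, 30, 40, 50]}
target = {"stmt_count": [12, 19, 33, 41, 48], "max_nesting": [1, 1, 2, 2, 3]}
pairs = match_metrics(source, target)
```

In this toy instance, `loc` is matched to `stmt_count` (small CDF gap) rather than `max_nesting` (large gap), even though the two projects share no metric names.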
|
Nanz, Sebastian |
ESEC/FSE '15: "Efficient and Reasonable Object-Oriented ..."
Efficient and Reasonable Object-Oriented Concurrency
Scott West, Sebastian Nanz, and Bertrand Meyer (Google, Switzerland; ETH Zurich, Switzerland) Making threaded programs safe and easy to reason about is one of the chief difficulties in modern programming. This work provides an efficient execution model for SCOOP, a concurrency approach that provides not only data-race freedom but also pre/postcondition reasoning guarantees between threads. The extensions we propose adjust the underlying semantics to increase the amount of concurrent execution that is possible, exclude certain classes of deadlocks, and enable greater performance. These extensions are used as the basis of an efficient runtime and optimization pass that improve performance 15x over a baseline implementation. This new implementation of SCOOP is, on average, also 2x faster than other well-known safe concurrent languages. The measurements are based on both coordination-intensive and data-manipulation-intensive benchmarks designed to offer a mixture of workloads. @InProceedings{ESEC/FSE15p734, author = {Scott West and Sebastian Nanz and Bertrand Meyer}, title = {Efficient and Reasonable Object-Oriented Concurrency}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {734--744}, doi = {}, year = {2015}, } |
|
Necula, George |
ESEC/FSE '15: "MultiSE: Multi-path Symbolic ..."
MultiSE: Multi-path Symbolic Execution using Value Summaries
Koushik Sen, George Necula, Liang Gong, and Wontae Choi (University of California at Berkeley, USA) Dynamic symbolic execution (DSE) has been proposed to effectively generate test inputs for real-world programs. Unfortunately, DSE techniques do not scale well for large realistic programs, because often the number of feasible execution paths of a program increases exponentially with the increase in the length of an execution path. In this paper, we propose MultiSE, a new technique for merging states incrementally during symbolic execution, without using auxiliary variables. The key idea of MultiSE is based on an alternative representation of the state, where we map each variable, including the program counter, to a set of guarded symbolic expressions called a value summary. MultiSE has several advantages over conventional DSE and conventional state merging techniques: value summaries enable sharing of symbolic expressions and path constraints along multiple paths and thus avoid redundant execution. MultiSE does not introduce auxiliary symbolic variables, which enables it to 1) make progress even when merging values not supported by the constraint solver, 2) avoid expensive constraint solver calls when resolving function calls and jumps, and 3) carry out most operations concretely. Moreover, MultiSE updates value summaries incrementally at every assignment instruction, which makes it unnecessary to identify the join points and to keep track of variables to merge at join points. We have implemented MultiSE for JavaScript programs in a publicly available open-source tool. Our evaluation of MultiSE on several programs shows that 1) value summaries are an effective technique for sharing values along multiple execution paths, 2) MultiSE can run significantly faster than traditional dynamic symbolic execution, and 3) MultiSE saves a substantial number of state merges compared to conventional state-merging techniques. 
@InProceedings{ESEC/FSE15p842, author = {Koushik Sen and George Necula and Liang Gong and Wontae Choi}, title = {MultiSE: Multi-path Symbolic Execution using Value Summaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {842--853}, doi = {}, year = {2015}, } Best-Paper Award |
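The value-summary representation described in the abstract above can be sketched concretely: each variable maps to guarded values, and operations distribute over the pairs instead of forking one state per path. Guards here are plain strings standing in for symbolic path constraints; this is an illustration of the idea, not the paper's implementation.

```python
# Sketch of MultiSE-style value summaries: a variable assigned differently
# in two branches gets one guarded entry per branch, and later operations
# apply pointwise with no merge-point bookkeeping. Guards are illustrative
# strings, not real symbolic constraints.

def merge_branch(guard, then_val, else_val):
    """Value summary of a variable assigned differently in two branches."""
    return [(guard, then_val), (f"not({guard})", else_val)]

def apply_op(summary, op):
    """Apply an operation pointwise; guards are shared, no state is forked."""
    return [(g, op(v)) for g, v in summary]

# After `if c: x = 1 else: x = 2`, x's summary has one entry per branch:
x = merge_branch("c", 1, 2)
# `y = x + 10` distributes over the summary without identifying join points:
y = apply_op(x, lambda v: v + 10)
```

The payoff is that the single state `y = {(c, 11), (not(c), 12)}` replaces two forked symbolic states, which is what lets MultiSE avoid redundant execution along shared path suffixes.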
|
Nejati, Shiva |
ESEC/FSE '15: "Effective Test Suites for ..."
Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers
Reza Matinnejad, Shiva Nejati, Lionel C. Briand , and Thomas Bruckmann (University of Luxembourg, Luxembourg; Delphi Automotive Systems, Luxembourg) Modeling mixed discrete-continuous controllers using Stateflow is common practice and has a long tradition in the embedded software system industry. Testing Stateflow models is complicated by expensive and manual test oracles that are not amenable to full automation due to the complex continuous behaviors of such models. In this paper, we reduce the cost of manual test oracles by providing test case selection algorithms that help engineers develop small test suites with high fault revealing power for Stateflow models. We present six test selection algorithms for discrete-continuous Stateflows: An adaptive random test selection algorithm that diversifies test inputs, two white-box coverage-based algorithms, a black-box algorithm that diversifies test outputs, and two search-based black-box algorithms that aim to maximize the likelihood of presence of continuous output failure patterns. We evaluate and compare our test selection algorithms, and find that our three output-based algorithms consistently outperform the coverage- and input-based algorithms in revealing faults in discrete-continuous Stateflow models. Further, we show that our output-based algorithms are complementary as the two search-based algorithms perform best in revealing specific failures with small test suites, while the output diversity algorithm is able to identify different failure types better than other algorithms when test suites are above a certain size. @InProceedings{ESEC/FSE15p84, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand and Thomas Bruckmann}, title = {Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--95}, doi = {}, year = {2015}, } Best-Paper Award |
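One of the output-based ideas above, selecting tests whose output signals are diverse, can be sketched as a greedy farthest-point selection over recorded outputs. The Euclidean distance and toy signals below are illustrative; the paper's selection algorithms and similarity measures for continuous Stateflow outputs differ.

```python
# Sketch of output-diversity test selection: greedily add the test whose
# output signal is farthest from the already-selected suite, so a small
# suite covers varied continuous behaviors. Distance and signals invented.
import math

def distance(a, b):
    """Euclidean distance between two equally sampled output signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_diverse(outputs, k):
    """Return indices of k tests with mutually diverse output signals."""
    chosen = [0]                                   # seed with the first test
    while len(chosen) < k:
        best = max(
            (i for i in range(len(outputs)) if i not in chosen),
            key=lambda i: min(distance(outputs[i], outputs[j]) for j in chosen))
        chosen.append(best)
    return chosen

# Three near-identical step responses and one oscillating outlier:
signals = [[0, 1, 1, 1], [0, 1, 1, 1.1], [0, 0.9, 1, 1], [0, 1, -1, 1]]
picked = select_diverse(signals, 2)
```

With a budget of two tests, the selection keeps the outlier rather than a near-duplicate step response, which is the intuition behind favoring output diversity when manual oracles make large suites expensive.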
|
Nguyen, Hung Viet |
ESEC/FSE '15: "Cross-Language Program Slicing ..."
Cross-Language Program Slicing for Dynamic Web Applications
Hung Viet Nguyen, Christian Kästner , and Tien N. Nguyen (Iowa State University, USA; Carnegie Mellon University, USA) During software maintenance, program slicing is a useful technique to assist developers in understanding the impact of their changes. While different program-slicing techniques have been proposed for traditional software systems, program slicing for dynamic web applications is challenging since the client-side code is generated from the server-side code and data entities are referenced across different languages and are often embedded in string literals in the server-side program. To address those challenges, we introduce WebSlice, an approach to compute program slices across different languages for web applications. We first identify data-flow dependencies among data entities for PHP code based on symbolic execution. We also compute SQL queries and a conditional DOM that represents client-code variations and construct the data flows for embedded languages: SQL, HTML, and JavaScript. Next, we connect the data flows across different languages and across PHP pages. Finally, we compute a program slice for a given entity based on the established data flows. Running WebSlice on five real-world, open-source PHP systems, we found that, out of 40,670 program slices, 10% cross languages, 38% cross files, and 13% cross string fragments, demonstrating the potential benefit of tool support for cross-language program slicing in dynamic web applications. @InProceedings{ESEC/FSE15p369, author = {Hung Viet Nguyen and Christian Kästner and Tien N. Nguyen}, title = {Cross-Language Program Slicing for Dynamic Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {369--380}, doi = {}, year = {2015}, } |
|
Nguyen, Tien N. |
ESEC/FSE '15: "Cross-Language Program Slicing ..."
Cross-Language Program Slicing for Dynamic Web Applications
Hung Viet Nguyen, Christian Kästner , and Tien N. Nguyen (Iowa State University, USA; Carnegie Mellon University, USA) During software maintenance, program slicing is a useful technique to assist developers in understanding the impact of their changes. While different program-slicing techniques have been proposed for traditional software systems, program slicing for dynamic web applications is challenging since the client-side code is generated from the server-side code and data entities are referenced across different languages and are often embedded in string literals in the server-side program. To address those challenges, we introduce WebSlice, an approach to compute program slices across different languages for web applications. We first identify data-flow dependencies among data entities for PHP code based on symbolic execution. We also compute SQL queries and a conditional DOM that represents client-code variations and construct the data flows for embedded languages: SQL, HTML, and JavaScript. Next, we connect the data flows across different languages and across PHP pages. Finally, we compute a program slice for a given entity based on the established data flows. Running WebSlice on five real-world, open-source PHP systems, we found that, out of 40,670 program slices, 10% cross languages, 38% cross files, and 13% cross string fragments, demonstrating the potential benefit of tool support for cross-language program slicing in dynamic web applications. @InProceedings{ESEC/FSE15p369, author = {Hung Viet Nguyen and Christian Kästner and Tien N. Nguyen}, title = {Cross-Language Program Slicing for Dynamic Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {369--380}, doi = {}, year = {2015}, } |
|
Nguyen, Tuong Huan |
ESEC/FSE '15: "Rule-Based Extraction of Goal-Use ..."
Rule-Based Extraction of Goal-Use Case Models from Text
Tuong Huan Nguyen, John Grundy, and Mohamed Almorsy (Swinburne University of Technology, Australia) Goal and use case modeling has been recognized as a key approach for understanding and analyzing requirements. However, in practice, goals and use cases are often buried among other content in requirements specifications documents and written in unstructured styles. It is thus a time-consuming and error-prone process to identify such goals and use cases. In addition, having them embedded in natural language documents greatly limits the possibility of formally analyzing the requirements for problems. To address these issues, we have developed a novel rule-based approach to automatically extract goal and use case models from natural language requirements documents. Our approach is able to automatically categorize goals and ensure they are properly specified. We also provide automated semantic parameterization of artifact textual specifications to promote further analysis on the extracted goal-use case models. Our approach achieves 85% precision and 82% recall rates on average for model extraction and 88% accuracy for the automated parameterization. @InProceedings{ESEC/FSE15p591, author = {Tuong Huan Nguyen and John Grundy and Mohamed Almorsy}, title = {Rule-Based Extraction of Goal-Use Case Models from Text}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {591--601}, doi = {}, year = {2015}, } Info |
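The flavor of rule-based extraction described above can be illustrated with a single pattern rule that recognizes goal-like "shall"/"should" sentences and categorizes them. The rule, categories, and sentences below are invented for illustration; the paper's rule set is far richer and also extracts use case models and semantic parameterizations.

```python
# Sketch of rule-based goal extraction from requirements text: one regex
# rule recognizes goal-like sentences and assigns a category. The rule and
# the example specification are invented.
import re

GOAL_RULE = re.compile(r"^The system (shall|should) (?P<action>.+)\.$")

def extract_goals(sentences):
    """Apply the rule to each sentence; return (category, action) pairs."""
    goals = []
    for s in sentences:
        m = GOAL_RULE.match(s.strip())
        if m:
            kind = "mandatory" if m.group(1) == "shall" else "advisable"
            goals.append((kind, m.group("action")))
    return goals

spec = [
    "The system shall encrypt stored passwords.",
    "Users liked the old interface.",
    "The system should log failed login attempts.",
]
goals = extract_goals(spec)
```

The non-goal sentence is skipped, which mirrors the central difficulty the paper addresses: goals are buried among other content and must be identified before any formal analysis is possible.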
|
Nord, Robert L. |
ESEC/FSE '15: "Measure It? Manage It? Ignore ..."
Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt
Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton (SEI, USA) The technical debt metaphor is widely used to encapsulate numerous software quality problems. The metaphor is attractive to practitioners as it communicates to both technical and nontechnical audiences that if quality problems are not addressed, things may get worse. However, it is unclear whether there are practices that move this metaphor beyond a mere communication mechanism. Existing studies of technical debt have largely focused on code metrics and small surveys of developers. In this paper, we report on our survey of 1,831 participants, primarily software engineers and architects working in long-lived, software-intensive projects from three large organizations, and follow-up interviews of seven software engineers. We analyzed our data using both nonparametric statistics and qualitative text analysis. We found that architectural decisions are the most important source of technical debt. Furthermore, while respondents believe the metaphor is itself important for communication, existing tools are not currently helpful in managing the details. We use our results to motivate a technical debt timeline to focus management and tooling approaches. @InProceedings{ESEC/FSE15p50, author = {Neil A. Ernst and Stephany Bellomo and Ipek Ozkaya and Robert L. Nord and Ian Gorton}, title = {Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {50--60}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Nori, Aditya V. |
ESEC/FSE '15: "A User-Guided Approach to ..."
A User-Guided Approach to Program Analysis
Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik (Georgia Tech, USA; Microsoft Research, UK) Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system EUGENE that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. Our technique solves the rules using an off-the-shelf solver in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to real-world analyses and programs. We evaluate EUGENE on two different analyses with labeled output on a suite of seven Java programs of size 131–198 KLOC. We also report upon a user study involving nine users who employ EUGENE to guide an information-flow analysis on three Java micro-benchmarks. In our experiments, EUGENE significantly reduces misclassified reports upon providing limited amounts of feedback. @InProceedings{ESEC/FSE15p462, author = {Ravi Mangal and Xin Zhang and Aditya V. Nori and Mayur Naik}, title = {A User-Guided Approach to Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {462--473}, doi = {}, year = {2015}, } Best-Paper Award |
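As a toy illustration of the hard-rule/soft-rule formulation described in this abstract (a brute-force sketch with hypothetical rule and variable names, not EUGENE's actual encoding or its off-the-shelf solver):

```python
from itertools import product

# Toy encoding (hypothetical names, not EUGENE's actual rules). Each variable
# is a boolean fact the analysis may derive; hard rules must always hold,
# soft rules carry weights the chosen assignment should maximally satisfy.
VARIABLES = ["source_flows_to_sink", "report_alarm"]

HARD_RULES = [  # soundness: a derived flow must be reported
    lambda a: not a["source_flows_to_sink"] or a["report_alarm"],
]
SOFT_RULES = [  # (weight, rule)
    (5, lambda a: not a["report_alarm"]),      # user feedback: this alarm is a false positive
    (2, lambda a: a["source_flows_to_sink"]),  # analysis evidence for the flow
]

def solve(hard, soft, variables):
    """Brute-force weighted MaxSAT: among assignments satisfying all hard
    rules, pick one maximizing the total weight of satisfied soft rules."""
    best, best_weight = None, -1
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        if not all(rule(a) for rule in hard):
            continue  # hard rules are inviolable
        weight = sum(w for w, rule in soft if rule(a))
        if weight > best_weight:
            best, best_weight = a, weight
    return best, best_weight

assignment, weight = solve(HARD_RULES, SOFT_RULES, VARIABLES)
# Here the user's feedback outweighs the analysis evidence, so the optimal
# assignment drops both the flow fact and the alarm, suppressing the report.
```
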
|
Oentaryo, Richard J. |
ESEC/FSE '15: "Information Retrieval and ..."
Information Retrieval and Spectrum Based Bug Localization: Better Together
Tien-Duy B. Le, Richard J. Oentaryo, and David Lo (Singapore Management University, Singapore) Debugging often takes considerable effort and resources. To help developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been proposed. IR-based techniques process textual information in bug reports, while spectrum-based techniques process program spectra (i.e., a record of which program elements are executed for each test case). Both eventually generate a ranked list of program elements that are likely to contain the bug. However, these techniques only consider one source of information, either bug reports or program spectra, which is not optimal. To deal with the limitation of existing techniques, in this work, we propose a new multi-modal technique that considers both bug reports and program spectra to localize bugs. Our approach adaptively creates a bug-specific model to map a particular bug to its possible location, and introduces a novel idea of suspicious words that are highly associated with a bug. We evaluate our approach on 157 real bugs from four software systems, and compare it with a state-of-the-art IR-based bug localization method, a state-of-the-art spectrum-based bug localization method, and three state-of-the-art multi-modal feature location methods that are adapted for bug localization. Experiments show that our approach can outperform the baselines by at least 47.62%, 31.48%, 27.78%, and 28.80% in terms of number of bugs successfully localized when a developer inspects 1, 5, and 10 program elements (i.e., Top 1, Top 5, and Top 10), and Mean Average Precision (MAP) respectively. @InProceedings{ESEC/FSE15p579, author = {Tien-Duy B. Le and Richard J. Oentaryo and David Lo}, title = {Information Retrieval and Spectrum Based Bug Localization: Better Together}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {579--590}, doi = {}, year = {2015}, } |
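The idea of combining both information sources can be sketched as a simple score fusion (hypothetical suspiciousness scores and a fixed weight; the paper's technique instead learns an adaptive, bug-specific model):

```python
def normalize(scores):
    """Min-max normalize suspiciousness scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {elem: (s - lo) / span for elem, s in scores.items()}

def fuse(ir_scores, spectrum_scores, alpha=0.5):
    """Rank program elements by a weighted sum of normalized IR-based and
    spectrum-based suspiciousness (fixed alpha, for illustration only)."""
    ir, sp = normalize(ir_scores), normalize(spectrum_scores)
    combined = {elem: alpha * ir.get(elem, 0.0) + (1 - alpha) * sp.get(elem, 0.0)
                for elem in set(ir) | set(sp)}
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical suspiciousness scores for three methods
ir = {"Parser.parse": 0.9, "Lexer.next": 0.4, "Cache.get": 0.1}
sp = {"Parser.parse": 0.7, "Lexer.next": 0.9, "Cache.get": 0.2}
ranking = fuse(ir, sp)  # elements a developer would inspect first come first
```
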
|
Oliveto, Rocco |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. 
@InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "Optimizing Energy Consumption ..." Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk (College of William and Mary, USA; Free University of Bolzano, Italy; University of Molise, Italy; University of Sannio, Italy) The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems—including apps—targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirms that in most cases users still perceived the choices of colors as attractive. 
Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements. @InProceedings{ESEC/FSE15p143, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Eduardo Bernal Cárdenas and Rocco Oliveto and Massimiliano Di Penta and Denys Poshyvanyk}, title = {Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {143--154}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Orso, Alessandro |
ESEC/FSE '15: "Users Beware: Preference Inconsistencies ..."
Users Beware: Preference Inconsistencies Ahead
Farnaz Behrang, Myra B. Cohen, and Alessandro Orso (Georgia Tech, USA; University of Nebraska-Lincoln, USA) The structure of preferences for modern highly-configurable software systems has become extremely complex, usually consisting of multiple layers of access that go from the user interface down to the lowest levels of the source code. This complexity can lead to inconsistencies between layers, especially during software evolution. For example, there may be preferences that users can change through the GUI, but that have no effect on the actual behavior of the system because the related source code is not present or has been removed going from one version to the next. These inconsistencies may result in unexpected program behaviors, which range in severity from mild annoyances to more critical security or performance problems. To address this problem, we present SCIC (Software Configuration Inconsistency Checker), a static analysis technique that can automatically detect these kinds of inconsistencies. Unlike other configuration analysis tools, SCIC can handle software that (1) is written in multiple programming languages and (2) has a complex preference structure. In an empirical evaluation that we performed on 10 years' worth of versions of both the widely used Mozilla Core and Firefox, SCIC was able to find 40 real inconsistencies (some determined as severe), whose lifetime spanned multiple versions, and whose detection required the analysis of code written in multiple languages. @InProceedings{ESEC/FSE15p295, author = {Farnaz Behrang and Myra B. Cohen and Alessandro Orso}, title = {Users Beware: Preference Inconsistencies Ahead}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {295--306}, doi = {}, year = {2015}, } Best-Paper Award |
|
Ozkaya, Ipek |
ESEC/FSE '15: "Measure It? Manage It? Ignore ..."
Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt
Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton (SEI, USA) The technical debt metaphor is widely used to encapsulate numerous software quality problems. The metaphor is attractive to practitioners as it communicates to both technical and nontechnical audiences that if quality problems are not addressed, things may get worse. However, it is unclear whether there are practices that move this metaphor beyond a mere communication mechanism. Existing studies of technical debt have largely focused on code metrics and small surveys of developers. In this paper, we report on our survey of 1,831 participants, primarily software engineers and architects working in long-lived, software-intensive projects from three large organizations, and follow-up interviews of seven software engineers. We analyzed our data using both nonparametric statistics and qualitative text analysis. We found that architectural decisions are the most important source of technical debt. Furthermore, while respondents believe the metaphor is itself important for communication, existing tools are not currently helpful in managing the details. We use our results to motivate a technical debt timeline to focus management and tooling approaches. @InProceedings{ESEC/FSE15p50, author = {Neil A. Ernst and Stephany Bellomo and Ipek Ozkaya and Robert L. Nord and Ian Gorton}, title = {Measure It? Manage It? Ignore It? Software Practitioners and Technical Debt}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {50--60}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Palyart, Marc |
ESEC/FSE '15: "Impact of Developer Turnover ..."
Impact of Developer Turnover on Quality in Open-Source Software
Matthieu Foucault, Marc Palyart, Xavier Blanc, Gail C. Murphy, and Jean-Rémy Falleri (University of Bordeaux, France; University of British Columbia, Canada) Turnover is the phenomenon of continuous influx and retreat of human resources in a team. Despite being well-studied in many settings, turnover has not been characterized for open-source software projects. We study the source code repositories of five open-source projects to characterize patterns of turnover and to determine the effects of turnover on software quality. We define the base concepts of both external and internal turnover, which are the mobility of developers in and out of a project, and the mobility of developers inside a project, respectively. We provide a qualitative analysis of turnover patterns. We also found, in a quantitative analysis, that the activity of external newcomers negatively impacts software quality. @InProceedings{ESEC/FSE15p829, author = {Matthieu Foucault and Marc Palyart and Xavier Blanc and Gail C. Murphy and Jean-Rémy Falleri}, title = {Impact of Developer Turnover on Quality in Open-Source Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {829--841}, doi = {}, year = {2015}, } Info |
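A minimal sketch of how external turnover might be computed from per-release author sets (an illustrative formula, not necessarily the paper's exact definition):

```python
def external_turnover(authors_by_release):
    """Per-release external turnover: developers who joined or left,
    relative to all developers active around the two releases.
    (Illustrative metric; the paper's precise definition may differ.)"""
    rates = []
    for prev, curr in zip(authors_by_release, authors_by_release[1:]):
        joined = len(curr - prev)   # newcomers to the project
        left = len(prev - curr)     # developers who retreated
        rates.append((joined + left) / len(prev | curr))
    return rates

# Hypothetical author sets mined from three successive releases
releases = [{"ann", "bob"}, {"bob", "cal"}, {"cal"}]
rates = external_turnover(releases)  # one joiner + one leaver, then one leaver
```
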
|
Panichella, Annibale |
ESEC/FSE '15: "When, How, and Why Developers ..."
When, How, and Why Developers (Do Not) Test in Their IDEs
Moritz Beller, Georgios Gousios, Annibale Panichella, and Andy Zaidman (Delft University of Technology, Netherlands; Radboud University Nijmegen, Netherlands) The research community in Software Engineering and Software Testing in particular builds many of its contributions on a set of mutually shared expectations. Despite the fact that they form the basis of many publications as well as open-source and commercial testing applications, these common expectations and beliefs are rarely ever questioned. For example, Frederick Brooks’ statement that testing takes half of the development time seems to have manifested itself within the community since he first made it in the “Mythical Man-Month” in 1975. With this paper, we report on the surprising results of a large-scale field study with 416 software engineers whose development activity we closely monitored over the course of five months, resulting in over 13 years of recorded work time in their integrated development environments (IDEs). Our findings question several commonly shared assumptions and beliefs about testing and might be contributing factors to the observed bug proneness of software in practice: the majority of developers in our study do not test; developers rarely run their tests in the IDE; Test-Driven Development (TDD) is not widely practiced; and, last but not least, software developers only spend a quarter of their work time engineering tests, whereas they think they test half of their time. @InProceedings{ESEC/FSE15p179, author = {Moritz Beller and Georgios Gousios and Annibale Panichella and Andy Zaidman}, title = {When, How, and Why Developers (Do Not) Test in Their IDEs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {179--190}, doi = {}, year = {2015}, } |
|
Parameshwaran, Inian |
ESEC/FSE '15: "Auto-patching DOM-Based XSS ..."
Auto-patching DOM-Based XSS at Scale
Inian Parameshwaran, Enrico Budianto, Shweta Shinde, Hung Dang, Atul Sadhu, and Prateek Saxena (National University of Singapore, Singapore) DOM-based cross-site scripting (XSS) is a client-side code injection vulnerability that results from unsafe dynamic code generation in JavaScript applications, and has few known practical defenses. We study dynamic code evaluation practices on nearly a quarter million URLs crawled starting from the Alexa Top 1000 websites. Of 777,082 cases of dynamic HTML/JS code generation we observe, 13.3% use unsafe string interpolation for dynamic code generation — a well-known dangerous coding practice. To remedy this, we propose a technique to generate secure patches that replace unsafe string interpolation with safer code that utilizes programmatic DOM construction techniques. Our system transparently auto-patches the vulnerable site while incurring only 5.2–8.07% overhead. The patching mechanism requires no access to server-side code or modification to browsers, and thus is practical as a turnkey defense. @InProceedings{ESEC/FSE15p272, author = {Inian Parameshwaran and Enrico Budianto and Shweta Shinde and Hung Dang and Atul Sadhu and Prateek Saxena}, title = {Auto-patching DOM-Based XSS at Scale}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {272--283}, doi = {}, year = {2015}, } Info |
|
Park, Jongse |
ESEC/FSE '15: "FlexJava: Language Support ..."
FlexJava: Language Support for Safe and Modular Approximate Programming
Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, and William Harris (Georgia Tech, USA) Energy efficiency is a primary constraint in modern systems. Approximate computing is a promising approach that trades quality of result for gains in efficiency and performance. State-of-the-art approximate programming models require extensive manual annotations on program data and operations to guarantee safe execution of approximate programs. The need for extensive manual annotations hinders the practical use of approximation techniques. This paper describes FlexJava, a small set of language extensions that significantly reduces the annotation effort, paving the way for practical approximate programming. These extensions enable programmers to annotate approximation-tolerant method outputs. The FlexJava compiler, which is equipped with an approximation safety analysis, automatically infers the operations and data that affect these outputs and selectively marks them approximable while giving safety guarantees. The automation and the language–compiler codesign relieve programmers from manually and explicitly annotating data declarations or operations as safe to approximate. FlexJava is designed to support safety, modularity, generality, and scalability in software development. We have implemented FlexJava annotations as a Java library and we demonstrate its practicality using a wide range of Java applications and by conducting a user study. Compared to EnerJ, a recent approximate programming system, FlexJava provides the same energy savings with significant reduction (from 2× to 17×) in the number of annotations. In our user study, programmers spend 6× to 12× less time annotating programs using FlexJava than when using EnerJ. 
@InProceedings{ESEC/FSE15p745, author = {Jongse Park and Hadi Esmaeilzadeh and Xin Zhang and Mayur Naik and William Harris}, title = {FlexJava: Language Support for Safe and Modular Approximate Programming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {745--757}, doi = {}, year = {2015}, } |
|
Păsăreanu, Corina S. |
ESEC/FSE '15: "Iterative Distribution-Aware ..."
Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution
Mateus Borges, Antonio Filieri, Marcelo d'Amorim, and Corina S. Păsăreanu (University of Stuttgart, Germany; Federal University of Pernambuco, Brazil; Carnegie Mellon University, USA; NASA Ames Research Center, USA) Probabilistic symbolic execution aims at quantifying the probability of reaching program events of interest assuming that program inputs follow given probabilistic distributions. The technique collects constraints on the inputs that lead to the target events and analyzes them to quantify how likely it is for an input to satisfy the constraints. Current techniques either handle only linear constraints or only support continuous distributions using a “discretization” of the input domain, leading to imprecise and costly results. We propose an iterative distribution-aware sampling approach to support probabilistic symbolic execution for arbitrarily complex mathematical constraints and continuous input distributions. We follow a compositional approach, where the symbolic constraints are decomposed into sub-problems whose solution can be solved independently. At each iteration the convergence rate of the computation is increased by automatically refocusing the analysis on estimating the sub-problems that mostly affect the accuracy of the results, as guided by three different ranking strategies. Experiments on publicly available benchmarks show that the proposed technique improves on previous approaches in terms of scalability and accuracy of the results. @InProceedings{ESEC/FSE15p866, author = {Mateus Borges and Antonio Filieri and Marcelo d'Amorim and Corina S. Păsăreanu}, title = {Iterative Distribution-Aware Sampling for Probabilistic Symbolic Execution}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--877}, doi = {}, year = {2015}, } Info |
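The underlying quantification step can be sketched as plain Monte Carlo sampling over the input distributions (a naive baseline with a made-up constraint; the paper's contribution is the iterative, distribution-aware refinement layered on top of this idea):

```python
import random

def estimate_path_probability(constraint, samplers, n=100_000, seed=0):
    """Monte Carlo estimate of the probability that inputs drawn from the
    given distributions satisfy a path constraint."""
    rng = random.Random(seed)
    hits = sum(constraint(*(draw(rng) for draw in samplers)) for _ in range(n))
    return hits / n

# Hypothetical nonlinear path constraint over two inputs:
# x ~ Uniform(0, 1), y ~ Normal(0, 1); event of interest: x*x + y > 0.5
p = estimate_path_probability(
    lambda x, y: x * x + y > 0.5,
    [lambda rng: rng.random(), lambda rng: rng.gauss(0.0, 1.0)],
)
```
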
|
Pasupathy, Shankar |
ESEC/FSE '15: "Hey, You Have Given Me Too ..."
Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker (University of California at San Diego, USA; Huazhong University of Science and Technology, China; NetApp, USA) Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software. 
@InProceedings{ESEC/FSE15p307, author = {Tianyin Xu and Long Jin and Xuepeng Fan and Yuanyuan Zhou and Shankar Pasupathy and Rukma Talwadker}, title = {Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {307--319}, doi = {}, year = {2015}, } Video Info |
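One guideline in this spirit, dropping knobs that users never change from their defaults, can be sketched as a simple heuristic over collected user configurations (hypothetical knob names and data; not the paper's actual analysis):

```python
from collections import Counter

def removal_candidates(default_config, user_configs, threshold=0.0):
    """Knobs that (almost) no user ever sets away from their default are
    candidates for removal or for hard-coding as constants."""
    changed = Counter()
    for cfg in user_configs:
        for knob, value in cfg.items():
            if value != default_config.get(knob):
                changed[knob] += 1
    n = len(user_configs)
    # keep a knob only if the fraction of users overriding it stays at or
    # below the threshold (0.0 means: literally nobody changes it)
    return [k for k in default_config if changed[k] / n <= threshold]

# Hypothetical knobs and the overrides observed in three user configurations
defaults = {"cache_size": 64, "io_threads": 4, "debug_trace": False}
users = [{"cache_size": 128}, {"cache_size": 256, "io_threads": 8}, {}]
candidates = removal_candidates(defaults, users)
```
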
|
Peng, Xin |
ESEC/FSE '15: "Clone-Based and Interactive ..."
Clone-Based and Interactive Recommendation for Modifying Pasted Code
Yun Lin, Xin Peng, Zhenchang Xing, Diwen Zheng, and Wenyun Zhao (Fudan University, China; Nanyang Technological University, Singapore) Developers often need to modify pasted code when programming with copy-and-paste practice. Some modifications on pasted code could involve a lot of editing effort, and any missing or wrong edit could incur bugs. In this paper, we propose a clone-based and interactive approach to recommending where and how to modify the pasted code. In our approach, we regard clones of the pasted code as the results of historical copy-and-paste operations and their differences as historical modifications on the same piece of code. Our approach first retrieves clones of the pasted code from a clone repository and detects syntactically complete differences among them. Then our approach transfers each clone difference into a modification slot on the pasted code, suggests options for each slot, and further mines modifying regulations from the clone differences. Based on the mined modifying regulations, our approach dynamically updates the suggested options and their ranking in each slot according to the developer's modifications on the pasted code. We implement a proof-of-concept tool CCDemon based on our approach and evaluate its effectiveness based on code clones detected from five open source projects. The results show that our approach can identify 96.9% of the to-be-modified positions in pasted code and suggest 75.0% of the required modifications. Our human study further confirms that CCDemon can help developers to accomplish their modifications of pasted code more efficiently. @InProceedings{ESEC/FSE15p520, author = {Yun Lin and Xin Peng and Zhenchang Xing and Diwen Zheng and Wenyun Zhao}, title = {Clone-Based and Interactive Recommendation for Modifying Pasted Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {520--531}, doi = {}, year = {2015}, } |
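The core idea of turning clone differences into modification slots can be sketched at the token level (a deliberate simplification with a made-up example; the paper detects syntactically complete differences and mines regulations to rank the options):

```python
def modification_slots(pasted_tokens, clone_token_lists):
    """Token positions where historical clones diverge become 'slots';
    the values seen at each position are the suggested options."""
    slots = {}
    for i, token in enumerate(pasted_tokens):
        options = {clone[i] for clone in clone_token_lists if i < len(clone)}
        if options - {token}:  # at least one clone differs here
            slots[i] = sorted(options | {token})
    return slots

# Pasted code and two historical clones, as token lists (hypothetical example)
pasted = ["list", ".", "add", "(", "item", ")"]
clones = [
    ["list", ".", "add", "(", "elem", ")"],
    ["queue", ".", "add", "(", "item", ")"],
]
slots = modification_slots(pasted, clones)
# slot at position 0: receiver may need changing; slot at 4: argument may
```
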
|
Pezzè, Mauro |
ESEC/FSE '15: "Symbolic Execution of Programs ..."
Symbolic Execution of Programs with Heap Inputs
Pietro Braione, Giovanni Denaro, and Mauro Pezzè (University of Milano-Bicocca, Italy; University of Lugano, Switzerland) Symbolic analysis is a core component of many automatic test generation and program verification approaches. To verify complex software systems, test and analysis techniques must deal with the many aspects of the target systems at different granularity levels. In particular, testing software programs that make extensive use of heap data structures at unit and integration levels requires generating suitable input data structures in the heap. This is a major challenge for symbolic testing and analysis techniques that work well when dealing with numeric inputs, but do not satisfactorily cope with heap data structures yet. In this paper we propose a language HEX to specify invariants of partially initialized data structures, and a decision procedure that supports the incremental evaluation of structural properties in HEX. Used in combination with the symbolic execution of heap manipulating programs, HEX prevents the exploration of invalid states, thus improving the efficiency of program testing and analysis, and avoiding false alarms that negatively impact verification activities. The experimental data confirm that HEX is an effective and efficient solution to the problem of testing and analyzing heap manipulating programs, and outperforms the alternative approaches that have been proposed so far. @InProceedings{ESEC/FSE15p602, author = {Pietro Braione and Giovanni Denaro and Mauro Pezzè}, title = {Symbolic Execution of Programs with Heap Inputs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {602--613}, doi = {}, year = {2015}, } |
|
Poshyvanyk, Denys |
ESEC/FSE '15: "Auto-completing Bug Reports ..."
Auto-completing Bug Reports for Android Applications
Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (College of William and Mary, USA) The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these “apps” has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that directly lead to a bug’s resolution. To address the need for an improved reporting system, we introduce a novel solution, called FUSION, that helps users auto-complete reproduction steps in bug reports for mobile apps. FUSION links user-provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. The approach that FUSION employs is generalizable to other current mobile software platforms, and constitutes a new method by which off-device bug reporting can be conducted for mobile software projects. In a study involving 28 participants we applied FUSION to support the maintenance tasks of reporting and reproducing defects from 15 real-world bugs found in 14 open source Android apps while qualitatively and quantitatively measuring the user experience of the system. Our results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information. @InProceedings{ESEC/FSE15p673, author = {Kevin Moran and Mario Linares-Vásquez and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Auto-completing Bug Reports for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {673--686}, doi = {}, year = {2015}, } Video Info ESEC/FSE '15: "Optimizing Energy Consumption ..." 
Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cárdenas, Rocco Oliveto, Massimiliano Di Penta, and Denys Poshyvanyk (College of William and Mary, USA; Free University of Bolzano, Italy; University of Molise, Italy; University of Sannio, Italy) The wide diffusion of mobile devices has motivated research towards optimizing energy consumption of software systems—including apps—targeting such devices. Besides efforts aimed at dealing with various kinds of energy bugs, the adoption of Organic Light-Emitting Diode (OLED) screens has motivated research towards reducing energy consumption by choosing an appropriate color palette. Whilst past research in this area aimed at optimizing energy while keeping an acceptable level of contrast, this paper proposes an approach, named GEMMA (Gui Energy Multi-objective optiMization for Android apps), for generating color palettes using a multi-objective optimization technique, which produces color solutions optimizing energy consumption and contrast while using consistent colors with respect to the original color palette. An empirical evaluation that we performed on 25 Android apps demonstrates not only significant improvements in terms of the three different objectives, but also confirms that in most cases users still perceived the choices of colors as attractive. 
Finally, for several apps we interviewed the original developers, who in some cases expressed the intent to adopt the proposed choice of color palette, whereas in other cases pointed out directions for future improvements. @InProceedings{ESEC/FSE15p143, author = {Mario Linares-Vásquez and Gabriele Bavota and Carlos Eduardo Bernal Cárdenas and Rocco Oliveto and Massimiliano Di Penta and Denys Poshyvanyk}, title = {Optimizing Energy Consumption of GUIs in Android Apps: A Multi-objective Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {143--154}, doi = {}, year = {2015}, } Info Best-Paper Award |
|
Pradel, Michael |
ESEC/FSE '15: "JITProf: Pinpointing JIT-Unfriendly ..."
JITProf: Pinpointing JIT-Unfriendly JavaScript Code
Liang Gong, Michael Pradel, and Koushik Sen (University of California at Berkeley, USA; TU Darmstadt, Germany) Most modern JavaScript engines use just-in-time (JIT) compilation to translate parts of JavaScript code into efficient machine code at runtime. Despite the overall success of JIT compilers, programmers may still write code that uses the dynamic features of JavaScript in a way that prohibits profitable optimizations. Unfortunately, there currently is no way to measure how prevalent such JIT-unfriendly code is and to help developers detect such code locations. This paper presents JITProf, a profiling framework to dynamically identify code locations that prohibit profitable JIT optimizations. The key idea is to associate meta-information with JavaScript objects and code locations, to update this information whenever particular runtime events occur, and to use the meta-information to identify JIT-unfriendly operations. We use JITProf to analyze widely used JavaScript web applications and show that JIT-unfriendly code is prevalent in practice. Furthermore, we show how to use the approach as a profiling technique that finds optimization opportunities in a program. Applying the profiler to popular benchmark programs shows that refactoring these programs to avoid performance problems identified by JITProf leads to statistically significant performance improvements of up to 26.3% in 15 benchmarks. @InProceedings{ESEC/FSE15p357, author = {Liang Gong and Michael Pradel and Koushik Sen}, title = {JITProf: Pinpointing JIT-Unfriendly JavaScript Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {357--368}, doi = {}, year = {2015}, } Info |
|
Prause, Christian R. |
ESEC/FSE '15: "Gamification for Enforcing ..."
Gamification for Enforcing Coding Conventions
Christian R. Prause and Matthias Jarke (DLR, Germany; RWTH Aachen University, Germany) Software is a knowledge-intensive product, which can only evolve if there is effective and efficient information exchange between developers. Complying with coding conventions improves information exchange by improving the readability of source code. However, without some form of enforcement, compliance with coding conventions is limited. We look at the problem of information exchange in code and propose gamification as a way to motivate developers to invest in compliance. Our concept consists of a technical prototype and its integration into a Scrum environment. By means of two experiments with agile software teams and subsequent surveys, we show that gamification can effectively improve adherence to coding conventions. @InProceedings{ESEC/FSE15p649, author = {Christian R. Prause and Matthias Jarke}, title = {Gamification for Enforcing Coding Conventions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {649--660}, doi = {}, year = {2015}, } |
|
Purandare, Rahul |
ESEC/FSE '15: "CLOTHO: Saving Programs from ..."
CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling
Aritra Dhar, Rahul Purandare, Mohan Dhawan, and Suresh Rangaswamy (Xerox Research Center, India; IIIT Delhi, India; IBM Research, India) Software is susceptible to malformed data originating from untrusted sources. Occasionally the programming logic or constructs used are inappropriate to handle the varied constraints imposed by legal and well-formed data. Consequently, software may produce unexpected results or even crash. In this paper, we present CLOTHO, a novel hybrid approach that saves such software from crashing when failures originate from malformed strings or inappropriate handling of strings. CLOTHO statically analyses a program to identify statements that are vulnerable to failures related to associated string data. CLOTHO then generates patches that are likely to satisfy constraints on the data, and in case of failures produces program behavior close to the expected one. The precision of the patches is improved with the help of a dynamic analysis. We have implemented CLOTHO for the Java String API, and our evaluation based on several popular open-source libraries shows that CLOTHO generates patches that are semantically similar to the patches generated by the programmers in later versions. Additionally, these patches are activated only when a failure is detected, and thus CLOTHO incurs no runtime overhead during normal execution, and negligible overhead in case of failures. @InProceedings{ESEC/FSE15p555, author = {Aritra Dhar and Rahul Purandare and Mohan Dhawan and Suresh Rangaswamy}, title = {CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {555--566}, doi = {}, year = {2015}, } Info |
|
Qadeer, Shaz |
ESEC/FSE '15: "Systematic Testing of Asynchronous ..."
Systematic Testing of Asynchronous Reactive Systems
Ankush Desai, Shaz Qadeer, and Sanjit A. Seshia (University of California at Berkeley, USA; Microsoft Research, USA) We introduce the concept of a delaying explorer with the goal of performing prioritized exploration of the behaviors of an asynchronous reactive program. A delaying explorer stratifies the search space using a custom strategy, and a delay operation that allows deviation from that strategy. We show that prioritized search with a delaying explorer performs significantly better than existing prioritization techniques. We also demonstrate empirically the need for writing different delaying explorers for scalable systematic testing and hence, present a flexible delaying explorer interface. We introduce two new techniques to improve the scalability of search based on delaying explorers. First, we present an algorithm for stratified exhaustive search and use efficient state caching to avoid redundant exploration of schedules. We provide soundness and termination guarantees for our algorithm. Second, for the cases where the state of the system cannot be captured or there are resource constraints, we present an algorithm to randomly sample any execution from the stratified search space. This algorithm guarantees that any such execution that requires d delay operations is sampled with probability at least 1/L^d, where L is the maximum number of program steps. We have implemented our algorithms and evaluated them on a collection of real-world fault-tolerant distributed protocols. @InProceedings{ESEC/FSE15p73, author = {Ankush Desai and Shaz Qadeer and Sanjit A. Seshia}, title = {Systematic Testing of Asynchronous Reactive Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {73--83}, doi = {}, year = {2015}, } Info |
|
Qi, Fumin |
ESEC/FSE '15: "Heterogeneous Cross-Company ..."
Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning
Xiaoyuan Jing, Fei Wu, Xiwei Dong, Fumin Qi, and Baowen Xu (Wuhan University, China; Nanjing University of Posts and Telecommunications, China; Nanjing University, China) Cross-company defect prediction (CCDP) learns a prediction model by using training data from one or multiple projects of a source company and then applies the model to the target company data. Existing CCDP methods are based on the assumption that the data of source and target companies should have the same software metrics. However, for CCDP, the source and target company data is usually heterogeneous, namely the metrics used and the size of the metric set differ between the data of the two companies. We call CCDP in this scenario the heterogeneous CCDP (HCCDP) task. In this paper, we aim to provide an effective solution for HCCDP. We propose a unified metric representation (UMR) for the data of source and target companies. The UMR consists of three types of metrics, i.e., the common metrics of the source and target companies, source-company specific metrics and target-company specific metrics. To construct the UMR for source company data, the target-company specific metrics are set to zero, while for the UMR of the target company data, the source-company specific metrics are set to zero. Based on the unified metric representation, we introduce, for the first time, canonical correlation analysis (CCA), an effective transfer learning method, into CCDP to make the data distributions of source and target companies similar. Experiments on 14 public heterogeneous datasets from four companies indicate that: 1) for HCCDP with partially different metrics, our approach significantly outperforms state-of-the-art CCDP methods; 2) for HCCDP with totally different metrics, our approach obtains prediction performance comparable to within-project prediction results. The proposed approach is effective for HCCDP. 
@InProceedings{ESEC/FSE15p496, author = {Xiaoyuan Jing and Fei Wu and Xiwei Dong and Fumin Qi and Baowen Xu}, title = {Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {496--507}, doi = {}, year = {2015}, } |
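The zero-padding construction of the UMR described in the abstract can be sketched as follows; a minimal illustration with hypothetical metric names (the subsequent CCA transfer-learning step from the paper is not reproduced here):

```python
def to_umr(row, common, src_only, tgt_only, side):
    """Project one company's data row (metric name -> value) onto the
    unified metric order [common | source-specific | target-specific].
    Metrics the given company does not collect are filled with zeros,
    as in the paper's UMR construction."""
    known = set(common) | set(src_only if side == "source" else tgt_only)
    return [row[m] if m in known else 0 for m in common + src_only + tgt_only]

# Hypothetical metrics: both companies share loc/complexity, the source
# company additionally measures wmc, the target company cbo.
common, src_only, tgt_only = ["loc", "complexity"], ["wmc"], ["cbo"]
src_vec = to_umr({"loc": 120, "complexity": 7, "wmc": 3},
                 common, src_only, tgt_only, "source")   # [120, 7, 3, 0]
tgt_vec = to_umr({"loc": 80, "complexity": 4, "cbo": 5},
                 common, src_only, tgt_only, "target")   # [80, 4, 0, 5]
```

Both vectors now live in the same four-dimensional metric space, which is what makes the subsequent CCA alignment applicable.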
|
Qin, Shengchao |
ESEC/FSE '15: "TLV: Abstraction through Testing, ..."
TLV: Abstraction through Testing, Learning, and Validation
Jun Sun, Hao Xiao, Yang Liu, Shang-Wei Lin, and Shengchao Qin (Singapore University of Technology and Design, Singapore; Nanyang Technological University, Singapore; Teesside University, UK; Shenzhen University, China) A (Java) class provides a service to its clients (i.e., programs which use the class). The service must satisfy certain specifications. Different specifications might be expected at different levels of abstraction depending on the client's objective. In order to effectively contrast the class against its specifications, whether manually or automatically, one essential step is to automatically construct an abstraction of the given class at a proper level of abstraction. The abstraction should be correct (i.e., over-approximating) and accurate (i.e., with few spurious traces). We present an automatic approach, which combines testing, learning, and validation, to constructing an abstraction. Our approach is designed such that a large part of the abstraction is generated based on testing and learning so as to minimize the use of heavy-weight techniques like symbolic execution. The abstraction is generated through a process of abstraction/refinement, with no user input, and converges to a specific level of abstraction depending on the usage context. The generated abstraction is guaranteed to be correct and accurate. We have implemented the proposed approach in a toolkit named TLV and evaluated TLV with a number of benchmark programs as well as three real-world ones. The results show that TLV generates abstractions for program analysis and verification more efficiently. @InProceedings{ESEC/FSE15p698, author = {Jun Sun and Hao Xiao and Yang Liu and Shang-Wei Lin and Shengchao Qin}, title = {TLV: Abstraction through Testing, Learning, and Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {698--709}, doi = {}, year = {2015}, } Info |
|
Ramanathan, Murali Krishna |
ESEC/FSE '15: "Synthesizing Tests for Detecting ..."
Synthesizing Tests for Detecting Atomicity Violations
Malavika Samak and Murali Krishna Ramanathan (Indian Institute of Science, India) Using thread-safe libraries can help programmers avoid the complexities of multithreading. However, designing libraries that guarantee thread-safety can be challenging. Detecting and eliminating atomicity violations when methods in the libraries are invoked concurrently is vital in building reliable client applications that use the libraries. While there are dynamic analyses to detect atomicity violations, these techniques are critically dependent on effective multithreaded tests. Unfortunately, designing such tests is non-trivial. In this paper, we design a novel and scalable approach for synthesizing multithreaded tests that help detect atomicity violations. The input to the approach is the implementation of the library and a sequential seed test suite that invokes every method in the library with random parameters. We analyze the execution of the sequential tests, generate variable lock dependencies and construct a set of three accesses which when interleaved suitably in a multithreaded execution can cause an atomicity violation. Subsequently, we identify pairs of method invocations that correspond to these accesses and invoke them concurrently from distinct threads with appropriate objects to help expose atomicity violations. We have incorporated these ideas in our tool, named Intruder, and applied it on multiple open-source Java multithreaded libraries. Intruder is able to synthesize 40 multithreaded tests across nine classes in less than two minutes to detect 79 harmful atomicity violations, including previously unknown violations in thread-safe classes. We also demonstrate the effectiveness of Intruder by comparing the results with other approaches designed for synthesizing multithreaded tests. 
@InProceedings{ESEC/FSE15p131, author = {Malavika Samak and Murali Krishna Ramanathan}, title = {Synthesizing Tests for Detecting Atomicity Violations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {131--142}, doi = {}, year = {2015}, } |
|
Rangaswamy, Suresh |
ESEC/FSE '15: "CLOTHO: Saving Programs from ..."
CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling
Aritra Dhar, Rahul Purandare, Mohan Dhawan, and Suresh Rangaswamy (Xerox Research Center, India; IIIT Delhi, India; IBM Research, India) Software is susceptible to malformed data originating from untrusted sources. Occasionally the programming logic or constructs used are inappropriate to handle the varied constraints imposed by legal and well-formed data. Consequently, software may produce unexpected results or even crash. In this paper, we present CLOTHO, a novel hybrid approach that saves such software from crashing when failures originate from malformed strings or inappropriate handling of strings. CLOTHO statically analyses a program to identify statements that are vulnerable to failures related to associated string data. CLOTHO then generates patches that are likely to satisfy constraints on the data, and in case of failures produces program behavior close to the expected one. The precision of the patches is improved with the help of a dynamic analysis. We have implemented CLOTHO for the Java String API, and our evaluation based on several popular open-source libraries shows that CLOTHO generates patches that are semantically similar to the patches generated by the programmers in later versions. Additionally, these patches are activated only when a failure is detected, and thus CLOTHO incurs no runtime overhead during normal execution, and negligible overhead in case of failures. @InProceedings{ESEC/FSE15p555, author = {Aritra Dhar and Rahul Purandare and Mohan Dhawan and Suresh Rangaswamy}, title = {CLOTHO: Saving Programs from Malformed Strings and Incorrect String-Handling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {555--566}, doi = {}, year = {2015}, } Info |
|
Rauchwerger, Lawrence |
ESEC/FSE '15: "Finding Schedule-Sensitive ..."
Finding Schedule-Sensitive Branches
Jeff Huang and Lawrence Rauchwerger (Texas A&M University, USA) This paper presents an automated, precise technique, TAME, for identifying schedule-sensitive branches (SSBs) in concurrent programs, i.e., branches whose decision may vary depending on the actual scheduling of concurrent threads. The technique consists of 1) tracing events at a fine-grained level; 2) deriving the constraints for each branch; and 3) invoking an SMT solver to find possible SSBs, by trying to solve the negated branch condition. To handle the infeasibly huge number of computations that would be generated by the fine-grained tracing, TAME leverages concolic execution and implements several sound approximations to delimit the number of traces to analyse, yet without sacrificing precision. In addition, TAME implements a novel distributed trace partition approach that distributes the analysis into smaller chunks. Evaluation on both popular benchmarks and real applications shows that TAME is effective in finding SSBs and has good scalability. TAME found a total of 34 SSBs, among which 17 are related to concurrency errors, and 9 are ad hoc synchronizations. @InProceedings{ESEC/FSE15p439, author = {Jeff Huang and Lawrence Rauchwerger}, title = {Finding Schedule-Sensitive Branches}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {439--449}, doi = {}, year = {2015}, } |
|
Reif, Michael |
ESEC/FSE '15: "Getting to Know You: Towards ..."
Getting to Know You: Towards a Capability Model for Java
Ben Hermann, Michael Reif, Michael Eichberg, and Mira Mezini (TU Darmstadt, Germany) Developing software from reusable libraries lets developers face a security dilemma: Either be efficient and reuse libraries as they are or inspect them, know about their resource usage, but possibly miss deadlines as reviews are a time consuming process. In this paper, we propose a novel capability inference mechanism for libraries written in Java. It uses a coarse-grained capability model for system resources that can be presented to developers. We found that the inferred capabilities agree with 86.81% of the expectations that can be derived from project documentation. Moreover, our approach can find capabilities that cannot be discovered using project documentation. It is thus a helpful tool for developers mitigating the aforementioned dilemma. @InProceedings{ESEC/FSE15p758, author = {Ben Hermann and Michael Reif and Michael Eichberg and Mira Mezini}, title = {Getting to Know You: Towards a Capability Model for Java}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {758--769}, doi = {}, year = {2015}, } Info |
|
Riccobene, Elvinia |
ESEC/FSE '15: "Improving Model-Based Test ..."
Improving Model-Based Test Generation by Model Decomposition
Paolo Arcaini, Angelo Gargantini, and Elvinia Riccobene (Charles University in Prague, Czech Republic; University of Bergamo, Italy; University of Milan, Italy) One of the well-known techniques for model-based test generation exploits the capability of model checkers to return counterexamples upon property violations. However, this approach is not always optimal in practice due to the required time and memory, or even not feasible due to the state explosion problem of model checking. A way to mitigate these limitations consists in decomposing a system model into suitable subsystem models separately analyzable. In this paper, we show a technique to decompose a system model into subsystems by exploiting the model variables dependency, and then we propose a test generation approach which builds tests for the single subsystems and combines them later in order to obtain tests for the system as a whole. Such an approach mitigates the exponential increase of the test generation time and memory consumption, and, compared with the same model-based test generation technique applied to the whole system, proves to be more efficient. We prove that, although not complete, the approach is sound. @InProceedings{ESEC/FSE15p119, author = {Paolo Arcaini and Angelo Gargantini and Elvinia Riccobene}, title = {Improving Model-Based Test Generation by Model Decomposition}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {119--130}, doi = {}, year = {2015}, } |
|
Rinard, Martin |
ESEC/FSE '15: "Staged Program Repair with ..."
Staged Program Repair with Condition Synthesis
Fan Long and Martin Rinard (Massachusetts Institute of Technology, USA) We present SPR, a new program repair system that combines staged program repair and condition synthesis. These techniques enable SPR to work productively with a set of parameterized transformation schemas to generate and efficiently search a rich space of program repairs. Together these techniques enable SPR to generate correct repairs for over five times as many defects as previous systems evaluated on the same benchmark set. @InProceedings{ESEC/FSE15p166, author = {Fan Long and Martin Rinard}, title = {Staged Program Repair with Condition Synthesis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {166--178}, doi = {}, year = {2015}, } Info |
|
Ringert, Jan Oliver |
ESEC/FSE '15: "GR(1) Synthesis for LTL Specification ..."
GR(1) Synthesis for LTL Specification Patterns
Shahar Maoz and Jan Oliver Ringert (Tel Aviv University, Israel) Reactive synthesis is an automated procedure to obtain a correct-by-construction reactive system from its temporal logic specification. Two of the main challenges in bringing reactive synthesis to software engineering practice are its very high worst-case complexity -- for linear temporal logic (LTL) it is double exponential in the length of the formula, and the difficulty of writing declarative specifications using basic LTL operators. To address the first challenge, Piterman et al. have suggested the General Reactivity of Rank 1 (GR(1)) fragment of LTL, which has an efficient polynomial time symbolic synthesis algorithm. To address the second challenge, Dwyer et al. have identified 55 LTL specification patterns, which are common in industrial specifications and make writing specifications easier. In this work we show that almost all of the 55 LTL specification patterns identified by Dwyer et al. can be expressed as assumptions and guarantees in the GR(1) fragment of LTL. Specifically, we present an automated, sound and complete translation of the patterns to the GR(1) form, which effectively results in an efficient reactive synthesis procedure for any specification that is written using the patterns. We have validated the correctness of the catalog of GR(1) templates we have created. The work is implemented in our reactive synthesis environment. It provides positive, promising evidence, for the potential feasibility of using reactive synthesis in practice. @InProceedings{ESEC/FSE15p96, author = {Shahar Maoz and Jan Oliver Ringert}, title = {GR(1) Synthesis for LTL Specification Patterns}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {96--106}, doi = {}, year = {2015}, } |
|
Robbes, Romain |
ESEC/FSE '15: "An Empirical Study of Goto ..."
An Empirical Study of Goto in C Code from GitHub Repositories
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study - (1) qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Rothermel, Gregg |
ESEC/FSE '15: "On the Use of Delta Debugging ..."
On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications
Mouna Hammoudi, Brian Burg, Gigon Bae, and Gregg Rothermel (University of Nebraska-Lincoln, USA; University of Washington, USA) Recording the sequence of events that lead to a failure of a web application can be an effective aid for debugging. Nevertheless, a recording of an event sequence may include many events that are not related to a failure, and this may render debugging more difficult. To address this problem, we have adapted Delta Debugging to function on recordings of web applications, in a manner that lets it identify and discard portions of those recordings that do not influence the occurrence of a failure. We present the results of three empirical studies that show that (1) recording reduction can achieve significant reductions in recording size and replay time on actual web applications obtained from developer forums, (2) reduced recordings do in fact help programmers locate faults significantly more efficiently than, and no less effectively than, non-reduced recordings, and (3) recording reduction produces even greater reductions on larger, more complex applications. @InProceedings{ESEC/FSE15p333, author = {Mouna Hammoudi and Brian Burg and Gigon Bae and Gregg Rothermel}, title = {On the Use of Delta Debugging to Reduce Recordings and Facilitate Debugging of Web Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {333--344}, doi = {}, year = {2015}, } Info |
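The minimization underlying Delta Debugging is Zeller's classic ddmin algorithm; a self-contained sketch on plain event lists might look like the following, with the `fails` predicate standing in for replaying a recording and checking whether the failure still occurs (a simplification of, not the paper's actual web-recording tooling):

```python
def ddmin(events, fails):
    """Zeller's ddmin: shrink `events` to a 1-minimal sublist that
    still makes the `fails` predicate return True."""
    assert fails(events), "the full recording must reproduce the failure"
    n = 2
    while len(events) >= 2:
        # split the current events into n roughly equal chunks
        subsets, start = [], 0
        for i in range(n):
            stop = start + (len(events) - start) // (n - i)
            subsets.append(events[start:stop])
            start = stop
        reduced = False
        for i in range(n):  # does some chunk alone still fail?
            if subsets[i] and fails(subsets[i]):
                events, n, reduced = subsets[i], 2, True
                break
        if not reduced:
            for i in range(n):  # does removing some chunk still fail?
                complement = [e for j, s in enumerate(subsets)
                              if j != i for e in s]
                if fails(complement):
                    events, n, reduced = complement, max(n - 1, 2), True
                    break
        if not reduced:
            if n >= len(events):
                break  # single-event granularity reached: 1-minimal
            n = min(2 * n, len(events))
    return events

# Toy failure: the replay fails only when events 3 and 7 both occur.
minimal = ddmin(list(range(10)), lambda es: 3 in es and 7 in es)  # [3, 7]
```

The returned list is 1-minimal: removing any single remaining event no longer reproduces the failure, which is exactly the property that makes reduced recordings useful for fault localization.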
|
Russo, Barbara |
ESEC/FSE '15: "Query-Based Configuration ..."
Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Laura Moreno, Gabriele Bavota, Sonia Haiduc, Massimiliano Di Penta, Rocco Oliveto, Barbara Russo, and Andrian Marcus (University of Texas at Dallas, USA; Free University of Bolzano, Italy; Florida State University, USA; University of Sannio, Italy; University of Molise, Italy) Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches need to be configured properly in order to lead to good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, based on a supervised learning approach that determines the TR configuration that performs the best for each query according to its properties. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST is able to recommend one of the top three TR configurations for a query with a 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query with those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that using QUEST we obtain better results than with any of the considered TR configurations. @InProceedings{ESEC/FSE15p567, author = {Laura Moreno and Gabriele Bavota and Sonia Haiduc and Massimiliano Di Penta and Rocco Oliveto and Barbara Russo and Andrian Marcus}, title = {Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {567--578}, doi = {}, year = {2015}, } Info |
|
Sadhu, Atul |
ESEC/FSE '15: "Auto-patching DOM-Based XSS ..."
Auto-patching DOM-Based XSS at Scale
Inian Parameshwaran, Enrico Budianto, Shweta Shinde, Hung Dang, Atul Sadhu, and Prateek Saxena (National University of Singapore, Singapore) DOM-based cross-site scripting (XSS) is a client-side code injection vulnerability that results from unsafe dynamic code generation in JavaScript applications, and has few known practical defenses. We study dynamic code evaluation practices on nearly a quarter million URLs crawled starting from the Alexa Top 1000 websites. Of 777,082 cases of dynamic HTML/JS code generation we observe, 13.3% use unsafe string interpolation for dynamic code generation — a well-known dangerous coding practice. To remedy this, we propose a technique to generate secure patches that replace unsafe string interpolation with safer code that utilizes programmatic DOM construction techniques. Our system transparently auto-patches the vulnerable site while incurring only 5.2–8.07% overhead. The patching mechanism requires no access to server-side code or modification to browsers, and thus is practical as a turnkey defense. @InProceedings{ESEC/FSE15p272, author = {Inian Parameshwaran and Enrico Budianto and Shweta Shinde and Hung Dang and Atul Sadhu and Prateek Saxena}, title = {Auto-patching DOM-Based XSS at Scale}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {272--283}, doi = {}, year = {2015}, } Info |
|
Sadowski, Caitlin |
ESEC/FSE '15: "How Developers Search for ..."
How Developers Search for Code: A Case Study
Caitlin Sadowski, Kathryn T. Stolee, and Sebastian Elbaum (Google, USA; Iowa State University, USA; University of Nebraska-Lincoln, USA) With the advent of large code repositories and sophisticated search capabilities, code search is increasingly becoming a key software development activity. In this work we shed some light into how developers search for code through a case study performed at Google, using a combination of survey and log-analysis methodologies. Our study provides insights into what developers are doing and trying to learn when performing a search, search scope, query properties, and what a search session under different contexts usually entails. Our results indicate that programmers search for code very frequently, conducting an average of five search sessions with 12 total queries each workday. The search queries are often targeted at a particular code location and programmers are typically looking for code with which they are somewhat familiar. Further, programmers are generally seeking answers to questions about how to use an API, what code does, why something is failing, or where code is located. @InProceedings{ESEC/FSE15p191, author = {Caitlin Sadowski and Kathryn T. Stolee and Sebastian Elbaum}, title = {How Developers Search for Code: A Case Study}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2015}, } |
|
Safi, Gholamreza |
ESEC/FSE '15: "Detecting Event Anomalies ..."
Detecting Event Anomalies in Event-Based Systems
Gholamreza Safi, Arman Shahbazian, William G. J. Halfond, and Nenad Medvidovic (University of Southern California, USA) Event-based interaction is an attractive paradigm because its use can lead to highly flexible and adaptable systems. One problem in this paradigm is that events are sent, received, and processed nondeterministically, due to the systems’ reliance on implicit invocation and implicit concurrency. This nondeterminism can lead to event anomalies, which occur when an event-based system receives multiple events that lead to writes to a shared field or memory location. Event anomalies can lead to unreliable, error-prone, and hard to debug behavior in an event-based system. To detect these anomalies, this paper presents a new static analysis technique, DEvA, for automatically detecting event anomalies. DEvA has been evaluated on a set of open-source event-based systems against a state-of-the-art technique for detecting data races in multithreaded systems, and a recent technique for solving a similar problem with event processing in Android applications. DEvA exhibited high precision with respect to manually constructed ground truths, and was able to locate event anomalies that had not been detected by the existing solutions. @InProceedings{ESEC/FSE15p25, author = {Gholamreza Safi and Arman Shahbazian and William G. J. Halfond and Nenad Medvidovic}, title = {Detecting Event Anomalies in Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {25--37}, doi = {}, year = {2015}, } Video Info |
|
Saha, Diptikalyan |
ESEC/FSE '15: "P3: Partitioned Path Profiling ..."
P3: Partitioned Path Profiling
Mohammed Afraz, Diptikalyan Saha, and Aditya Kanade (Indian Institute of Science, India; IBM Research, India) Acyclic path profile is an abstraction of dynamic control flow paths of procedures and has been found to be useful in a wide spectrum of activities. Unfortunately, the runtime overhead of obtaining such a profile can be high, limiting its use in practice. In this paper, we present partitioned path profiling (P3) which runs K copies of the program in parallel, each with the same input but on a separate core, and collects the profile only for a subset of intra-procedural paths in each copy, thereby, distributing the overhead of profiling. P3 identifies “profitable” procedures and assigns disjoint subsets of paths of a profitable procedure to different copies for profiling. To obtain exact execution frequencies of a subset of paths, we design a new algorithm, called PSPP. All paths of an unprofitable procedure are assigned to the same copy. P3 uses the classic Ball-Larus algorithm for profiling unprofitable procedures. Further, P3 attempts to evenly distribute the profiling overhead across the copies. To the best of our knowledge, P3 is the first algorithm for parallel path profiling. We have applied P3 to profile several programs in the SPEC 2006 benchmark. Compared to sequential profiling, P3 substantially reduced the runtime overhead on these programs averaged across all benchmarks. The reduction was 23%, 43% and 56% on average for 2, 4 and 8 cores respectively. P3 also performed better than a coarse-grained approach that treats all procedures as unprofitable and distributes them across available cores. For 2 cores, the profiling overhead of P3 was on average 5% less compared to the coarse-grained approach across these programs. For 4 and 8 cores, it was respectively 18% and 25% less. 
@InProceedings{ESEC/FSE15p485, author = {Mohammed Afraz and Diptikalyan Saha and Aditya Kanade}, title = {P3: Partitioned Path Profiling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {485--495}, doi = {}, year = {2015}, } |
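The P3 entry above builds on classic Ball-Larus acyclic path profiling. As a hedged, hand-instrumented sketch of that underlying idea (a toy of our own, not P3's algorithm): branches get increments chosen so every acyclic path through a procedure sums to a unique path id, and one counter bump per execution records the path taken.

```javascript
// Toy Ball-Larus-style instrumentation (illustrative only): a function with
// two independent branches has 4 acyclic paths; the edge increments below
// (2 and 1) make each path sum to a distinct id in 0..3.
var pathFreq = {};

function proc(a, b) {
  var id = 0;
  if (a > 0) { id += 2; } // increment placed on the then-edge
  if (b > 0) { id += 1; } // increment placed on the then-edge
  pathFreq[id] = (pathFreq[id] || 0) + 1; // single counter update per run
  return id;
}
```

P3's contribution is then to partition these path ids of "profitable" procedures across K parallel copies of the program, so each copy pays only part of the counting overhead.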
|
Samak, Malavika |
ESEC/FSE '15: "Synthesizing Tests for Detecting ..."
Synthesizing Tests for Detecting Atomicity Violations
Malavika Samak and Murali Krishna Ramanathan (Indian Institute of Science, India) Using thread-safe libraries can help programmers avoid the complexities of multithreading. However, designing libraries that guarantee thread-safety can be challenging. Detecting and eliminating atomicity violations when methods in the libraries are invoked concurrently is vital in building reliable client applications that use the libraries. While there are dynamic analyses to detect atomicity violations, these techniques are critically dependent on effective multithreaded tests. Unfortunately, designing such tests is non-trivial. In this paper, we design a novel and scalable approach for synthesizing multithreaded tests that help detect atomicity violations. The input to the approach is the implementation of the library and a sequential seed testsuite that invokes every method in the library with random parameters. We analyze the execution of the sequential tests, generate variable lock dependencies and construct a set of three accesses which when interleaved suitably in a multithreaded execution can cause an atomicity violation. Subsequently, we identify pairs of method invocations that correspond to these accesses and invoke them concurrently from distinct threads with appropriate objects to help expose atomicity violations. We have incorporated these ideas in our tool, named Intruder, and applied it on multiple open-source Java multithreaded libraries. Intruder is able to synthesize 40 multithreaded tests across nine classes in less than two minutes to detect 79 harmful atomicity violations, including previously unknown violations in thread-safe classes. We also demonstrate the effectiveness of Intruder by comparing the results with other approaches designed for synthesizing multithreaded tests. 
@InProceedings{ESEC/FSE15p131, author = {Malavika Samak and Murali Krishna Ramanathan}, title = {Synthesizing Tests for Detecting Atomicity Violations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {131--142}, doi = {}, year = {2015}, } |
|
Saxena, Prateek |
ESEC/FSE '15: "Auto-patching DOM-Based XSS ..."
Auto-patching DOM-Based XSS at Scale
Inian Parameshwaran, Enrico Budianto, Shweta Shinde, Hung Dang, Atul Sadhu, and Prateek Saxena (National University of Singapore, Singapore) DOM-based cross-site scripting (XSS) is a client-side code injection vulnerability that results from unsafe dynamic code generation in JavaScript applications, and has few known practical defenses. We study dynamic code evaluation practices on nearly a quarter million URLs crawled starting from the Alexa Top 1000 websites. Of 777,082 cases of dynamic HTML/JS code generation we observe, 13.3% use unsafe string interpolation for dynamic code generation — a well-known dangerous coding practice. To remedy this, we propose a technique to generate secure patches that replace unsafe string interpolation with safer code that utilizes programmatic DOM construction techniques. Our system transparently auto-patches the vulnerable site while incurring only 5.2–8.07% overhead. The patching mechanism requires no access to server-side code or modification to browsers, and thus is practical as a turnkey defense. @InProceedings{ESEC/FSE15p272, author = {Inian Parameshwaran and Enrico Budianto and Shweta Shinde and Hung Dang and Atul Sadhu and Prateek Saxena}, title = {Auto-patching DOM-Based XSS at Scale}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {272--283}, doi = {}, year = {2015}, } Info |
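The entry above contrasts unsafe string interpolation with programmatic DOM construction. A minimal sketch of that vulnerability class and repair direction (function names and the escape helper are ours, not the paper's; the actual patches emit DOM-construction calls such as `createElement`/`createTextNode`, and since this sketch has no browser DOM, the same "treat input strictly as text" effect is modeled by escaping):

```javascript
// Unsafe: untrusted input spliced into markup by string interpolation.
// Assigned to innerHTML, a payload like "<img src=x onerror=alert(1)>"
// becomes live HTML — the DOM-based XSS pattern the paper measures.
function renderUnsafe(name) {
  return "<div class='greeting'>Hello, " + name + "</div>";
}

// Helper modeling text-node semantics: HTML metacharacters become inert.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;")
          .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

// Safer: the untrusted value can only ever render as text.
function renderPatched(name) {
  return "<div class='greeting'>Hello, " + escapeHtml(name) + "</div>";
}
```

In a browser, the equivalent programmatic construction is `div.appendChild(document.createTextNode(name))`, which is the style of code the generated patches rely on.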
|
Schmerl, Bradley |
ESEC/FSE '15: "Proactive Self-Adaptation ..."
Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach
Gabriel A. Moreno, Javier Cámara, David Garlan, and Bradley Schmerl (SEI, USA; Carnegie Mellon University, USA) Self-adaptive systems tend to be reactive and myopic, adapting in response to changes without anticipating what the subsequent adaptation needs will be. Adapting reactively can result in inefficiencies due to the system performing a suboptimal sequence of adaptations. Furthermore, when adaptations have latency, and take some time to produce their effect, they have to be started with sufficient lead time so that they complete by the time their effect is needed. Proactive latency-aware adaptation addresses these issues by making adaptation decisions with a look-ahead horizon and taking adaptation latency into account. In this paper we present an approach for proactive latency-aware adaptation under uncertainty that uses probabilistic model checking for adaptation decisions. The key idea is to use a formal model of the adaptive system in which the adaptation decision is left underspecified through nondeterminism, and have the model checker resolve the nondeterministic choices so that the accumulated utility over the horizon is maximized. The adaptation decision is optimal over the horizon, and takes into account the inherent uncertainty of the environment predictions needed for looking ahead. Our results show that the decision based on a look-ahead horizon, and the factoring of both tactic latency and environment uncertainty, considerably improve the effectiveness of adaptation decisions. @InProceedings{ESEC/FSE15p1, author = {Gabriel A. Moreno and Javier Cámara and David Garlan and Bradley Schmerl}, title = {Proactive Self-Adaptation under Uncertainty: A Probabilistic Model Checking Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1--12}, doi = {}, year = {2015}, } |
|
Sen, Koushik |
ESEC/FSE '15: "MultiSE: Multi-path Symbolic ..."
MultiSE: Multi-path Symbolic Execution using Value Summaries
Koushik Sen, George Necula, Liang Gong, and Wontae Choi (University of California at Berkeley, USA) Dynamic symbolic execution (DSE) has been proposed to effectively generate test inputs for real-world programs. Unfortunately, DSE techniques do not scale well for large realistic programs, because often the number of feasible execution paths of a program increases exponentially with the increase in the length of an execution path. In this paper, we propose MultiSE, a new technique for merging states incrementally during symbolic execution, without using auxiliary variables. The key idea of MultiSE is based on an alternative representation of the state, where we map each variable, including the program counter, to a set of guarded symbolic expressions called a value summary. MultiSE has several advantages over conventional DSE and conventional state merging techniques: value summaries enable sharing of symbolic expressions and path constraints along multiple paths and thus avoid redundant execution. MultiSE does not introduce auxiliary symbolic variables, which enables it to 1) make progress even when merging values not supported by the constraint solver, 2) avoid expensive constraint solver calls when resolving function calls and jumps, and 3) carry out most operations concretely. Moreover, MultiSE updates value summaries incrementally at every assignment instruction, which makes it unnecessary to identify the join points and to keep track of variables to merge at join points. We have implemented MultiSE for JavaScript programs in a publicly available open-source tool. Our evaluation of MultiSE on several programs shows that 1) value summaries are an effective technique to take advantage of the sharing of values along multiple execution paths, that 2) MultiSE can run significantly faster than traditional dynamic symbolic execution and 3) MultiSE saves a substantial number of state merges compared to conventional state-merging techniques. 
@InProceedings{ESEC/FSE15p842, author = {Koushik Sen and George Necula and Liang Gong and Wontae Choi}, title = {MultiSE: Multi-path Symbolic Execution using Value Summaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {842--853}, doi = {}, year = {2015}, } Best-Paper Award ESEC/FSE '15: "JITProf: Pinpointing JIT-Unfriendly ..." JITProf: Pinpointing JIT-Unfriendly JavaScript Code Liang Gong, Michael Pradel, and Koushik Sen (University of California at Berkeley, USA; TU Darmstadt, Germany) Most modern JavaScript engines use just-in-time (JIT) compilation to translate parts of JavaScript code into efficient machine code at runtime. Despite the overall success of JIT compilers, programmers may still write code that uses the dynamic features of JavaScript in a way that prohibits profitable optimizations. Unfortunately, there currently is no way to measure how prevalent such JIT-unfriendly code is and to help developers detect such code locations. This paper presents JITProf, a profiling framework to dynamically identify code locations that prohibit profitable JIT optimizations. The key idea is to associate meta-information with JavaScript objects and code locations, to update this information whenever particular runtime events occur, and to use the meta-information to identify JIT-unfriendly operations. We use JITProf to analyze widely used JavaScript web applications and show that JIT-unfriendly code is prevalent in practice. Furthermore, we show how to use the approach as a profiling technique that finds optimization opportunities in a program. Applying the profiler to popular benchmark programs shows that refactoring these programs to avoid performance problems identified by JITProf leads to statistically significant performance improvements of up to 26.3% in 15 benchmarks. 
@InProceedings{ESEC/FSE15p357, author = {Liang Gong and Michael Pradel and Koushik Sen}, title = {JITProf: Pinpointing JIT-Unfriendly JavaScript Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {357--368}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "MemInsight: Platform-Independent ..." MemInsight: Platform-Independent Memory Debugging for JavaScript Simon Holm Jensen, Manu Sridharan, Koushik Sen, and Satish Chandra (Snowflake Computing, USA; Samsung Research, USA; University of California at Berkeley, USA) JavaScript programs often suffer from memory issues that can either hurt performance or eventually cause memory exhaustion. While existing snapshot-based profiling tools can be helpful, the information provided is limited to the coarse granularity at which snapshots can be taken. We present MemInsight, a tool that provides detailed, time-varying analysis of the memory behavior of JavaScript applications, including web applications. MemInsight is platform independent and runs on unmodified JavaScript engines. It employs tuned source-code instrumentation to generate a trace of memory allocations and accesses, and it leverages modern browser features to track precise information for DOM (document object model) objects. It also computes exact object lifetimes without any garbage collector assistance, and exposes this information in an easily-consumable manner for further analysis. We describe several client analyses built into MemInsight, including detection of possible memory leaks and opportunities for stack allocation and object inlining. An experimental evaluation showed that with no modifications to the runtime, MemInsight was able to expose memory issues in several real-world applications. 
@InProceedings{ESEC/FSE15p345, author = {Simon Holm Jensen and Manu Sridharan and Koushik Sen and Satish Chandra}, title = {MemInsight: Platform-Independent Memory Debugging for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--356}, doi = {}, year = {2015}, } |
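The JITProf entry above reports that JIT-unfriendly code is prevalent in practice. One classic instance of the kind it can pinpoint is inconsistent object layouts; a sketch of our own (not an example from the paper):

```javascript
// JIT-unfriendly: initializing properties in different orders gives the
// objects different hidden classes, so the property access in the hot
// loop below becomes polymorphic and harder for the engine to optimize.
function makePointsPolymorphic(n) {
  var pts = [];
  for (var i = 0; i < n; i++) {
    var p = {};
    if (i % 2 === 0) { p.x = i; p.y = i; }
    else             { p.y = i; p.x = i; } // different insertion order
    pts.push(p);
  }
  return pts;
}

// JIT-friendly refactoring: one construction site, one property order,
// so every point shares a single hidden class.
function Point(x, y) { this.x = x; this.y = y; }
function makePointsMonomorphic(n) {
  var pts = [];
  for (var i = 0; i < n; i++) pts.push(new Point(i, i));
  return pts;
}

// A hot loop whose property access benefits from monomorphic layouts.
function sumX(pts) {
  var s = 0;
  for (var i = 0; i < pts.length; i++) s += pts[i].x;
  return s;
}
```

Both variants compute the same result; the refactoring changes only how predictably the engine can lay out and access the objects.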
|
Seshia, Sanjit A. |
ESEC/FSE '15: "Systematic Testing of Asynchronous ..."
Systematic Testing of Asynchronous Reactive Systems
Ankush Desai, Shaz Qadeer, and Sanjit A. Seshia (University of California at Berkeley, USA; Microsoft Research, USA) We introduce the concept of a delaying explorer with the goal of performing prioritized exploration of the behaviors of an asynchronous reactive program. A delaying explorer stratifies the search space using a custom strategy, and a delay operation that allows deviation from that strategy. We show that prioritized search with a delaying explorer performs significantly better than existing prioritization techniques. We also demonstrate empirically the need for writing different delaying explorers for scalable systematic testing and hence, present a flexible delaying explorer interface. We introduce two new techniques to improve the scalability of search based on delaying explorers. First, we present an algorithm for stratified exhaustive search and use efficient state caching to avoid redundant exploration of schedules. We provide soundness and termination guarantees for our algorithm. Second, for the cases where the state of the system cannot be captured or there are resource constraints, we present an algorithm to randomly sample any execution from the stratified search space. This algorithm guarantees that any such execution that requires d delay operations is sampled with probability at least 1/L^d, where L is the maximum number of program steps. We have implemented our algorithms and evaluated them on a collection of real-world fault-tolerant distributed protocols. @InProceedings{ESEC/FSE15p73, author = {Ankush Desai and Shaz Qadeer and Sanjit A. Seshia}, title = {Systematic Testing of Asynchronous Reactive Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {73--83}, doi = {}, year = {2015}, } Info |
|
Shaffer, Timothy R. |
ESEC/FSE '15: "Tracing Software Developers' ..."
Tracing Software Developers' Eyes and Interactions for Change Tasks
Katja Kevic, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz (University of Zurich, Switzerland; Youngstown State University, USA; ABB Research, USA) What are software developers doing during a change task? While an answer to this question opens countless opportunities to support developers in their work, only little is known about developers' detailed navigation behavior for realistic change tasks. Most empirical studies on developers performing change tasks are limited to very small code snippets or are limited by the granularity or the detail of the data collected for the study. In our research, we try to overcome these limitations by combining user interaction monitoring with very fine granular eye-tracking data that is automatically linked to the underlying source code entities in the IDE. In a study with 12 professional and 10 student developers working on three change tasks from an open source system, we used our approach to investigate the detailed navigation of developers for realistic change tasks. The results of our study show, amongst others, that the eye tracking data does indeed capture different aspects than user interaction data and that developers focus on only small parts of methods that are often related by data flow. We discuss our findings and their implications for better developer tool support. @InProceedings{ESEC/FSE15p202, author = {Katja Kevic and Braden M. Walters and Timothy R. Shaffer and Bonita Sharif and David C. Shepherd and Thomas Fritz}, title = {Tracing Software Developers' Eyes and Interactions for Change Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--213}, doi = {}, year = {2015}, } Info |
|
Shahbazian, Arman |
ESEC/FSE '15: "Detecting Event Anomalies ..."
Detecting Event Anomalies in Event-Based Systems
Gholamreza Safi, Arman Shahbazian, William G. J. Halfond, and Nenad Medvidovic (University of Southern California, USA) Event-based interaction is an attractive paradigm because its use can lead to highly flexible and adaptable systems. One problem in this paradigm is that events are sent, received, and processed nondeterministically, due to the systems’ reliance on implicit invocation and implicit concurrency. This nondeterminism can lead to event anomalies, which occur when an event-based system receives multiple events that lead to the write of a shared field or memory location. Event anomalies can lead to unreliable, error-prone, and hard to debug behavior in an event-based system. To detect these anomalies, this paper presents a new static analysis technique, DEvA, for automatically detecting event anomalies. DEvA has been evaluated on a set of open-source event-based systems against a state-of-the-art technique for detecting data races in multithreaded systems, and a recent technique for solving a similar problem with event processing in Android applications. DEvA exhibited high precision with respect to manually constructed ground truths, and was able to locate event anomalies that had not been detected by the existing solutions. @InProceedings{ESEC/FSE15p25, author = {Gholamreza Safi and Arman Shahbazian and William G. J. Halfond and Nenad Medvidovic}, title = {Detecting Event Anomalies in Event-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {25--37}, doi = {}, year = {2015}, } Video Info |
|
Shahmehri, Nahid |
ESEC/FSE '15: "Turning Programs against Each ..."
Turning Programs against Each Other: High Coverage Fuzz-Testing using Binary-Code Mutation and Dynamic Slicing
Ulf Kargén and Nahid Shahmehri (Linköping University, Sweden) Mutation-based fuzzing is a popular and widely employed black-box testing technique for finding security and robustness bugs in software. It owes much of its success to its simplicity; a well-formed seed input is mutated, e.g. through random bit-flipping, to produce test inputs. While reducing the need for human effort, and enabling security testing even of closed-source programs with undocumented input formats, the simplicity of mutation-based fuzzing comes at the cost of poor code coverage. Often millions of iterations are needed, and the results are highly dependent on configuration parameters and the choice of seed inputs. In this paper we propose a novel method for automated generation of high-coverage test cases for robustness testing. Our method is based on the observation that, even for closed-source programs with proprietary input formats, an implementation that can generate well-formed inputs to the program is typically available. By systematically mutating the program code of such generating programs, we leverage information about the input format encoded in the generating program to produce high-coverage test inputs, capable of reaching deep states in the program under test. Our method works entirely at the machine-code level, enabling use-cases similar to traditional black-box fuzzing. We have implemented the method in our tool MutaGen, and evaluated it on 7 popular Linux programs. We found that, for most programs, our method improves code coverage by one order of magnitude or more, compared to two well-known mutation-based fuzzers. We also found a total of 8 unique bugs. @InProceedings{ESEC/FSE15p782, author = {Ulf Kargén and Nahid Shahmehri}, title = {Turning Programs against Each Other: High Coverage Fuzz-Testing using Binary-Code Mutation and Dynamic Slicing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {782--792}, doi = {}, year = {2015}, } |
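The entry above contrasts MutaGen with classic mutation-based fuzzing. As a hedged sketch of that classic seed-mutation step (random bit-flipping; all names here are ours, and MutaGen itself instead mutates the machine code of a *generating* program, which is not shown):

```javascript
// Flip nFlips randomly chosen bits of a well-formed seed input.
// `rand` is injectable (e.g. Math.random) so the mutation is testable.
function flipRandomBits(seed, nFlips, rand) {
  var buf = Buffer.from(seed);               // work on a copy; keep the seed
  for (var i = 0; i < nFlips; i++) {
    var bit = Math.floor(rand() * buf.length * 8);
    buf[bit >> 3] ^= 1 << (bit & 7);         // flip one bit in place
  }
  return buf;
}

// Each fuzzing iteration would feed a mutated copy of the seed to the
// program under test, e.g.:
//   var input = flipRandomBits(seedBytes, 3, Math.random);
```

The abstract's point is that this simple loop preserves little knowledge of the input format, which is why coverage stays poor compared to leveraging a generating program.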
|
Sharif, Bonita |
ESEC/FSE '15: "Tracing Software Developers' ..."
Tracing Software Developers' Eyes and Interactions for Change Tasks
Katja Kevic, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz (University of Zurich, Switzerland; Youngstown State University, USA; ABB Research, USA) What are software developers doing during a change task? While an answer to this question opens countless opportunities to support developers in their work, only little is known about developers' detailed navigation behavior for realistic change tasks. Most empirical studies on developers performing change tasks are limited to very small code snippets or are limited by the granularity or the detail of the data collected for the study. In our research, we try to overcome these limitations by combining user interaction monitoring with very fine granular eye-tracking data that is automatically linked to the underlying source code entities in the IDE. In a study with 12 professional and 10 student developers working on three change tasks from an open source system, we used our approach to investigate the detailed navigation of developers for realistic change tasks. The results of our study show, amongst others, that the eye tracking data does indeed capture different aspects than user interaction data and that developers focus on only small parts of methods that are often related by data flow. We discuss our findings and their implications for better developer tool support. @InProceedings{ESEC/FSE15p202, author = {Katja Kevic and Braden M. Walters and Timothy R. Shaffer and Bonita Sharif and David C. Shepherd and Thomas Fritz}, title = {Tracing Software Developers' Eyes and Interactions for Change Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--213}, doi = {}, year = {2015}, } Info |
|
Shepherd, David C. |
ESEC/FSE '15: "Tracing Software Developers' ..."
Tracing Software Developers' Eyes and Interactions for Change Tasks
Katja Kevic, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz (University of Zurich, Switzerland; Youngstown State University, USA; ABB Research, USA) What are software developers doing during a change task? While an answer to this question opens countless opportunities to support developers in their work, only little is known about developers' detailed navigation behavior for realistic change tasks. Most empirical studies on developers performing change tasks are limited to very small code snippets or are limited by the granularity or the detail of the data collected for the study. In our research, we try to overcome these limitations by combining user interaction monitoring with very fine granular eye-tracking data that is automatically linked to the underlying source code entities in the IDE. In a study with 12 professional and 10 student developers working on three change tasks from an open source system, we used our approach to investigate the detailed navigation of developers for realistic change tasks. The results of our study show, amongst others, that the eye tracking data does indeed capture different aspects than user interaction data and that developers focus on only small parts of methods that are often related by data flow. We discuss our findings and their implications for better developer tool support. @InProceedings{ESEC/FSE15p202, author = {Katja Kevic and Braden M. Walters and Timothy R. Shaffer and Bonita Sharif and David C. Shepherd and Thomas Fritz}, title = {Tracing Software Developers' Eyes and Interactions for Change Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--213}, doi = {}, year = {2015}, } Info |
|
Shi, August |
ESEC/FSE '15: "Comparing and Combining Test-Suite ..."
Comparing and Combining Test-Suite Reduction and Regression Test Selection
August Shi, Tifany Yung, Alex Gyori, and Darko Marinov (University of Illinois at Urbana-Champaign, USA) Regression testing is widely used to check that changes made to software do not break existing functionality, but regression test suites grow, and running them fully can become costly. Researchers have proposed test-suite reduction and regression test selection as two approaches to reduce this cost by not running some of the tests from the test suite. However, previous research has not empirically evaluated how the two approaches compare to each other, and how well a combination of these approaches performs. We present the first extensive study that compares test-suite reduction and regression test selection approaches individually, and also evaluates a combination of the two approaches. We also propose a new criterion to measure the quality of tests with respect to software changes. Our experiments on 4,793 commits from 17 open-source projects show that regression test selection runs on average fewer tests (by 40.15pp) than test-suite reduction. However, test-suite reduction can have a high loss in fault-detection capability with respect to the changes, whereas a (safe) regression test selection has no loss. The experiments also show that a combination of the two approaches runs even fewer tests (on average 5.34pp) than regression test selection, but these tests still have a loss in fault-detection capability with respect to the changes. @InProceedings{ESEC/FSE15p237, author = {August Shi and Tifany Yung and Alex Gyori and Darko Marinov}, title = {Comparing and Combining Test-Suite Reduction and Regression Test Selection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {237--247}, doi = {}, year = {2015}, } |
|
Shinde, Shweta |
ESEC/FSE '15: "Auto-patching DOM-Based XSS ..."
Auto-patching DOM-Based XSS at Scale
Inian Parameshwaran, Enrico Budianto, Shweta Shinde, Hung Dang, Atul Sadhu, and Prateek Saxena (National University of Singapore, Singapore) DOM-based cross-site scripting (XSS) is a client-side code injection vulnerability that results from unsafe dynamic code generation in JavaScript applications, and has few known practical defenses. We study dynamic code evaluation practices on nearly a quarter million URLs crawled starting from the Alexa Top 1000 websites. Of 777,082 cases of dynamic HTML/JS code generation we observe, 13.3% use unsafe string interpolation for dynamic code generation — a well-known dangerous coding practice. To remedy this, we propose a technique to generate secure patches that replace unsafe string interpolation with safer code that utilizes programmatic DOM construction techniques. Our system transparently auto-patches the vulnerable site while incurring only 5.2–8.07% overhead. The patching mechanism requires no access to server-side code or modification to browsers, and thus is practical as a turnkey defense. @InProceedings{ESEC/FSE15p272, author = {Inian Parameshwaran and Enrico Budianto and Shweta Shinde and Hung Dang and Atul Sadhu and Prateek Saxena}, title = {Auto-patching DOM-Based XSS at Scale}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {272--283}, doi = {}, year = {2015}, } Info |
|
Siegmund, Norbert |
ESEC/FSE '15: "Performance-Influence Models ..."
Performance-Influence Models for Highly Configurable Systems
Norbert Siegmund, Alexander Grebhahn, Sven Apel, and Christian Kästner (University of Passau, Germany; Carnegie Mellon University, USA) Almost every complex software system today is configurable. While configurability has many benefits, it challenges performance prediction, optimization, and debugging. Often, the influences of individual configuration options on performance are unknown. Worse, configuration options may interact, giving rise to a configuration space of possibly exponential size. Addressing this challenge, we propose an approach that derives a performance-influence model for a given configurable system, describing all relevant influences of configuration options and their interactions. Our approach combines machine-learning and sampling heuristics in a novel way. It improves over standard techniques in that it (1) represents influences of options and their interactions explicitly (which eases debugging), (2) smoothly integrates binary and numeric configuration options for the first time, (3) incorporates domain knowledge, if available (which eases learning and increases accuracy), (4) considers complex constraints among options, and (5) systematically reduces the solution space to a tractable size. A series of experiments demonstrates the feasibility of our approach in terms of the accuracy of the models learned as well as the accuracy of the performance predictions one can make with them. @InProceedings{ESEC/FSE15p284, author = {Norbert Siegmund and Alexander Grebhahn and Sven Apel and Christian Kästner}, title = {Performance-Influence Models for Highly Configurable Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {284--294}, doi = {}, year = {2015}, } Info |
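The entry above emphasizes that the learned models represent option influences and interactions explicitly. A hypothetical sketch of what evaluating such a model could look like once derived (option names and coefficients are invented for illustration):

```javascript
// A performance-influence model: base time plus explicit terms for single
// options and for option interactions (negative coefficient = the options
// partially offset each other). Purely illustrative numbers.
var model = {
  base: 10.0, // seconds for the default configuration
  terms: [
    { options: ["compression"],              coeff: 4.2 },
    { options: ["encryption"],               coeff: 1.5 },
    { options: ["compression", "encryption"], coeff: -0.8 } // interaction
  ]
};

// Predict performance of a configuration: sum every term whose options are
// all enabled. Numeric options would contribute coeff * value instead.
function predict(model, config) {
  var t = model.base;
  model.terms.forEach(function (term) {
    if (term.options.every(function (o) { return config[o]; })) {
      t += term.coeff;
    }
  });
  return t;
}
```

Because each term is explicit, a developer can read off which option (or pair of options) dominates the prediction, which is the debugging benefit the abstract highlights.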
|
Sinha, Nishant |
ESEC/FSE '15: "Responsive Designs in a Snap ..."
Responsive Designs in a Snap
Nishant Sinha and Rezwana Karim (IBM Research, India; Rutgers University, USA) With the massive adoption of mobile devices with different form-factors, UI designers face the challenge of designing responsive UIs which are visually appealing across a wide range of devices. Designing responsive UIs requires a deep knowledge of HTML/CSS as well as responsive patterns - juggling through various design configurations and re-designing for multiple devices is laborious and time-consuming. We present DECOR, a recommendation tool for creating multi-device responsive UIs. Given an initial UI design, user-specified design constraints and a list of devices, DECOR provides ranked, device-specific recommendations to the designer for approval. Design space exploration involves a combinatorial explosion: we formulate it as a design repair problem and devise several design space pruning techniques to enable efficient repair. An evaluation over real-life designs shows that DECOR is able to compute the desired recommendations, involving a variety of responsive design patterns, in less than a minute. @InProceedings{ESEC/FSE15p544, author = {Nishant Sinha and Rezwana Karim}, title = {Responsive Designs in a Snap}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {544--554}, doi = {}, year = {2015}, } |
|
Smith, Edward K. |
ESEC/FSE '15: "Is the Cure Worse Than the ..."
Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair
Edward K. Smith, Earl T. Barr, Claire Le Goues, and Yuriy Brun (University of Massachusetts at Amherst, USA; University College London, UK; Carnegie Mellon University, USA; University of Massachusetts, USA) Automated program repair has shown promise for reducing the significant manual effort debugging requires. This paper addresses a deficit of earlier evaluations of automated repair techniques caused by repairing programs and evaluating generated patches' correctness using the same set of tests. Since tests are an imperfect metric of program correctness, evaluations of this type do not discriminate between correct patches and patches that overfit the available tests and break untested but desired functionality. This paper evaluates two well-studied repair tools, GenProg and TrpAutoRepair, on a publicly available benchmark of bugs, each with a human-written patch. By evaluating patches using tests independent from those used during repair, we find that the tools are unlikely to improve the proportion of independent tests passed, and that the quality of the patches is proportional to the coverage of the test suite used during repair. For programs that pass most tests, the tools are as likely to break tests as to fix them. However, novice developers also overfit, and automated repair performs no worse than these developers. In addition to overfitting, we measure the effects of test suite coverage, test suite provenance, and starting program quality, as well as the difference in quality between novice-developer-written and tool-generated patches when quality is assessed with a test suite independent from the one used for patch generation. @InProceedings{ESEC/FSE15p532, author = {Edward K. Smith and Earl T. Barr and Claire Le Goues and Yuriy Brun}, title = {Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {532--543}, doi = {}, year = {2015}, } |
|
Smith, Justin |
ESEC/FSE '15: "Questions Developers Ask While ..."
Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis
Justin Smith, Brittany Johnson, Emerson Murphy-Hill, Bill Chu, and Heather Richter Lipford (North Carolina State University, USA; University of North Carolina at Charlotte, USA) Security tools can help developers answer questions about potential vulnerabilities in their code. A better understanding of the types of questions asked by developers may help toolsmiths design more effective tools. In this paper, we describe how we collected and categorized these questions by conducting an exploratory study with novice and experienced software developers. We equipped them with Find Security Bugs, a security-oriented static analysis tool, and observed their interactions with security vulnerabilities in an open-source system that they had previously contributed to. We found that they asked questions not only about security vulnerabilities, associated attacks, and fixes, but also questions about the software itself, the social ecosystem that built the software, and related resources and tools. For example, when participants asked questions about the source of tainted data, their tools forced them to make imperfect tradeoffs between systematic and ad hoc program navigation strategies. @InProceedings{ESEC/FSE15p248, author = {Justin Smith and Brittany Johnson and Emerson Murphy-Hill and Bill Chu and Heather Richter Lipford}, title = {Questions Developers Ask While Diagnosing Potential Security Vulnerabilities with Static Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {248--259}, doi = {}, year = {2015}, } Info |
|
Song, Linhai |
ESEC/FSE '15: "What Change History Tells ..."
What Change History Tells Us about Thread Synchronization
Rui Gu, Guoliang Jin, Linhai Song, Linjie Zhu, and Shan Lu (Columbia University, USA; North Carolina State University, USA; University of Wisconsin-Madison, USA; University of Chicago, USA) Multi-threaded programs are pervasive, yet difficult to write. Missing proper synchronization leads to correctness bugs and over-synchronization leads to performance problems. To improve the correctness and efficiency of multi-threaded software, we need a better understanding of synchronization challenges faced by real-world developers. This paper studies the code repositories of open-source multi-threaded software projects to obtain a broad and in-depth view of how developers handle synchronizations. We first examine how critical sections are changed when software evolves by checking over 250,000 revisions of four representative open-source software projects. The findings help us answer questions like how often synchronization is an afterthought for developers; whether it is difficult for developers to decide critical section boundaries and lock variables; and what are real-world over-synchronization problems. We then conduct case studies to better understand (1) how critical sections are changed to solve performance problems (i.e. over-synchronization issues) and (2) how software changes lead to synchronization-related correctness problems (i.e. concurrency bugs). This in-depth study shows that tool support is needed to help developers tackle over-synchronization problems; it also shows that concurrency bug avoidance, detection, and testing can be improved through better awareness of code revision history. @InProceedings{ESEC/FSE15p426, author = {Rui Gu and Guoliang Jin and Linhai Song and Linjie Zhu and Shan Lu}, title = {What Change History Tells Us about Thread Synchronization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {426--438}, doi = {}, year = {2015}, } |
|
Sridharan, Manu |
ESEC/FSE '15: "Mimic: Computing Models for ..."
Mimic: Computing Models for Opaque Code
Stefan Heule, Manu Sridharan, and Satish Chandra (Stanford University, USA; Samsung Research, USA) Opaque code, which is executable but whose source is unavailable or hard to process, can be problematic in a number of scenarios, such as program analysis. Manual construction of models is often used to handle opaque code, but this process is tedious and error-prone. (In this paper, we use model to mean a representation of a piece of code suitable for program analysis.) We present a novel technique for automatic generation of models for opaque code, based on program synthesis. The technique intercepts memory accesses from the opaque code to client objects, and uses this information to construct partial execution traces. Then, it performs a heuristic search inspired by Markov Chain Monte Carlo techniques to discover an executable code model whose behavior matches the opaque code. Native execution, parallelization, and a carefully-designed fitness function are leveraged to increase the effectiveness of the search. We have implemented our technique in a tool Mimic for discovering models of opaque JavaScript functions, and used Mimic to synthesize correct models for a variety of array-manipulating routines. @InProceedings{ESEC/FSE15p710, author = {Stefan Heule and Manu Sridharan and Satish Chandra}, title = {Mimic: Computing Models for Opaque Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {710--720}, doi = {}, year = {2015}, } Info ESEC/FSE '15: "MemInsight: Platform-Independent ..." MemInsight: Platform-Independent Memory Debugging for JavaScript Simon Holm Jensen, Manu Sridharan, Koushik Sen, and Satish Chandra (Snowflake Computing, USA; Samsung Research, USA; University of California at Berkeley, USA) JavaScript programs often suffer from memory issues that can either hurt performance or eventually cause memory exhaustion. 
While existing snapshot-based profiling tools can be helpful, the information provided is limited to the coarse granularity at which snapshots can be taken. We present MemInsight, a tool that provides detailed, time-varying analysis of the memory behavior of JavaScript applications, including web applications. MemInsight is platform independent and runs on unmodified JavaScript engines. It employs tuned source-code instrumentation to generate a trace of memory allocations and accesses, and it leverages modern browser features to track precise information for DOM (document object model) objects. It also computes exact object lifetimes without any garbage collector assistance, and exposes this information in an easily-consumable manner for further analysis. We describe several client analyses built into MemInsight, including detection of possible memory leaks and opportunities for stack allocation and object inlining. An experimental evaluation showed that with no modifications to the runtime, MemInsight was able to expose memory issues in several real-world applications. @InProceedings{ESEC/FSE15p345, author = {Simon Holm Jensen and Manu Sridharan and Koushik Sen and Satish Chandra}, title = {MemInsight: Platform-Independent Memory Debugging for JavaScript}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {345--356}, doi = {}, year = {2015}, } |
|
Stahlbauer, Andreas |
ESEC/FSE '15: "Witness Validation and Stepwise ..."
Witness Validation and Stepwise Testification across Software Verifiers
Dirk Beyer, Matthias Dangl, Daniel Dietsch, Matthias Heizmann, and Andreas Stahlbauer (University of Passau, Germany; University of Freiburg, Germany) It is commonly understood that a verification tool should provide a counterexample to witness a specification violation. Until recently, software verifiers dumped error witnesses in proprietary formats, which are often neither human- nor machine-readable, and an exchange of witnesses between different verifiers was impossible. To close this gap in software-verification technology, we have defined an exchange format for error witnesses that is easy to write and read by verification tools (for further processing, e.g., witness validation) and that is easy to convert into visualizations that conveniently let developers inspect an error path. To eliminate manual inspection of false alarms, we develop the notion of stepwise testification: in a first step, a verifier finds a problematic program path and, in addition to the verification result FALSE, constructs a witness for this path; in the next step, another verifier re-verifies that the witness indeed violates the specification. This process can have more than two steps, each reducing the state space around the error path, making it easier to validate the witness in a later step. An obvious application for testification is the setting where we have two verifiers: one that is efficient but imprecise and another one that is precise but expensive. We have implemented the technique of error-witness-driven program analysis in two state-of-the-art verification tools, CPAchecker and Ultimate Automizer, and show by experimental evaluation that the approach is applicable to a large set of verification tasks. 
@InProceedings{ESEC/FSE15p721, author = {Dirk Beyer and Matthias Dangl and Daniel Dietsch and Matthias Heizmann and Andreas Stahlbauer}, title = {Witness Validation and Stepwise Testification across Software Verifiers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {721--733}, doi = {}, year = {2015}, } Info |
|
Stolee, Kathryn T. |
ESEC/FSE '15: "How Developers Search for ..."
How Developers Search for Code: A Case Study
Caitlin Sadowski, Kathryn T. Stolee, and Sebastian Elbaum (Google, USA; Iowa State University, USA; University of Nebraska-Lincoln, USA) With the advent of large code repositories and sophisticated search capabilities, code search is increasingly becoming a key software development activity. In this work we shed some light on how developers search for code through a case study performed at Google, using a combination of survey and log-analysis methodologies. Our study provides insights into what developers are doing and trying to learn when performing a search, search scope, query properties, and what a search session under different contexts usually entails. Our results indicate that programmers search for code very frequently, conducting an average of five search sessions with 12 total queries each workday. The search queries are often targeted at a particular code location and programmers are typically looking for code with which they are somewhat familiar. Further, programmers are generally seeking answers to questions about how to use an API, what code does, why something is failing, or where code is located. @InProceedings{ESEC/FSE15p191, author = {Caitlin Sadowski and Kathryn T. Stolee and Sebastian Elbaum}, title = {How Developers Search for Code: A Case Study}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {191--201}, doi = {}, year = {2015}, } |
|
Su, Zhendong |
ESEC/FSE '15: "Guided Differential Testing ..."
Guided Differential Testing of Certificate Validation in SSL/TLS Implementations
Yuting Chen and Zhendong Su (Shanghai Jiao Tong University, China; University of California at Davis, USA) Certificate validation in SSL/TLS implementations is critical for Internet security. A recent notable effort, frankencert, automatically synthesizes certificates for stress-testing certificate validation. Despite its early promise, it remains a significant challenge to generate effective test certificates as they are structurally complex with intricate syntactic and semantic constraints. This paper tackles this challenge by introducing mucert, a novel, guided technique to much more effectively test real-world certificate validation code. Our core insight is to (1) leverage easily accessible Internet certificates as seed certificates, and (2) diversify them by adapting Markov Chain Monte Carlo (MCMC) sampling. The diversified certificates are then used to reveal discrepancies, thus potential flaws, among different certificate validation implementations. We have implemented mucert and extensively evaluated it against frankencert. Our experimental results show that mucert is significantly more cost-effective than frankencert. Indeed, 1K mucerts (i.e., mucert-mutated certificates) yield three times as many distinct discrepancies as 8M frankencerts (i.e., frankencert-synthesized certificates), and 200 mucerts can achieve higher code coverage than 100,000 frankencerts. This improvement is significant because testing each generated certificate is costly. We have analyzed and reported 20+ latent discrepancies (presumably missed by frankencert), and reported an additional 357 discrepancy-triggering certificates to SSL/TLS developers, who have already confirmed some of our reported issues and are investigating causes of all the reported discrepancies. In particular, our reports have led to bug fixes, active discussions in the community, and proposed changes to relevant IETF RFCs. 
We believe that mucert is practical and effective for helping improve the robustness of SSL/TLS implementations. @InProceedings{ESEC/FSE15p793, author = {Yuting Chen and Zhendong Su}, title = {Guided Differential Testing of Certificate Validation in SSL/TLS Implementations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {793--804}, doi = {}, year = {2015}, } Info |
|
Sun, Jun |
ESEC/FSE '15: "TLV: Abstraction through Testing, ..."
TLV: Abstraction through Testing, Learning, and Validation
Jun Sun, Hao Xiao, Yang Liu, Shang-Wei Lin, and Shengchao Qin (Singapore University of Technology and Design, Singapore; Nanyang Technological University, Singapore; Teesside University, UK; Shenzhen University, China) A (Java) class provides a service to its clients (i.e., programs which use the class). The service must satisfy certain specifications. Different specifications might be expected at different levels of abstraction depending on the client's objective. In order to effectively contrast the class against its specifications, whether manually or automatically, one essential step is to automatically construct an abstraction of the given class at a proper level of abstraction. The abstraction should be correct (i.e., over-approximating) and accurate (i.e., with few spurious traces). We present an automatic approach, which combines testing, learning, and validation, to constructing an abstraction. Our approach is designed such that a large part of the abstraction is generated based on testing and learning so as to minimize the use of heavy-weight techniques like symbolic execution. The abstraction is generated through a process of abstraction/refinement, with no user input, and converges to a specific level of abstraction depending on the usage context. The generated abstraction is guaranteed to be correct and accurate. We have implemented the proposed approach in a toolkit named TLV and evaluated TLV with a number of benchmark programs as well as three real-world ones. The results show that TLV generates abstraction for program analysis and verification more efficiently. @InProceedings{ESEC/FSE15p698, author = {Jun Sun and Hao Xiao and Yang Liu and Shang-Wei Lin and Shengchao Qin}, title = {TLV: Abstraction through Testing, Learning, and Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {698--709}, doi = {}, year = {2015}, } Info |
|
Sutton, Charles |
ESEC/FSE '15: "Suggesting Accurate Method ..."
Suggesting Accurate Method and Class Names
Miltiadis Allamanis, Earl T. Barr, Christian Bird, and Charles Sutton (University of Edinburgh, UK; University College London, UK; Microsoft Research, USA) Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state-of-the-art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering. @InProceedings{ESEC/FSE15p38, author = {Miltiadis Allamanis and Earl T. Barr and Christian Bird and Charles Sutton}, title = {Suggesting Accurate Method and Class Names}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {38--49}, doi = {}, year = {2015}, } Info |
|
Talwadker, Rukma |
ESEC/FSE '15: "Hey, You Have Given Me Too ..."
Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker (University of California at San Diego, USA; Huazhong University of Science and Technology, China; NetApp, USA) Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software. 
@InProceedings{ESEC/FSE15p307, author = {Tianyin Xu and Long Jin and Xuepeng Fan and Yuanyuan Zhou and Shankar Pasupathy and Rukma Talwadker}, title = {Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {307--319}, doi = {}, year = {2015}, } Video Info |
|
Tanter, Éric |
ESEC/FSE '15: "An Empirical Study of Goto ..."
An Empirical Study of Goto in C Code from GitHub Repositories
Meiyappan Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, and Ahmed E. Hassan (Rochester Institute of Technology, USA; University of Chile, Chile; Kyushu University, Japan; McGill University, Canada; University of Tennessee, USA; Queen's University, Canada) It is nearly 50 years since Dijkstra argued that goto obscures the flow of control in program execution and urged programmers to abandon the goto statement. While past research has shown that goto is still in use, little is known about whether goto is used in the unrestricted manner that Dijkstra feared, and if it is ‘harmful’ enough to be a part of a post-release bug. We, therefore, conduct a two-part empirical study - (1) qualitatively analyze a statistically representative sample of 384 files from a population of almost 250K C programming language files collected from over 11K GitHub repositories and find that developers use goto in C files for error handling (80.21±5%) and cleaning up resources at the end of a procedure (40.36±5%); and (2) quantitatively analyze the commit history from the release branches of six OSS projects and find that no goto statement was removed/modified in the post-release phase of four of the six projects. We conclude that developers limit themselves to using goto appropriately in most cases, and not in an unrestricted manner like Dijkstra feared, thus suggesting that goto does not appear to be harmful in practice. @InProceedings{ESEC/FSE15p404, author = {Meiyappan Nagappan and Romain Robbes and Yasutaka Kamei and Éric Tanter and Shane McIntosh and Audris Mockus and Ahmed E. Hassan}, title = {An Empirical Study of Goto in C Code from GitHub Repositories}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {404--414}, doi = {}, year = {2015}, } |
|
Tasiran, Serdar |
ESEC/FSE '15: "Detecting JavaScript Races ..."
Detecting JavaScript Races That Matter
Erdal Mutlu, Serdar Tasiran, and Benjamin Livshits (Koç University, Turkey; Microsoft Research, USA) As JavaScript has become virtually omnipresent as the language for programming large and complex web applications in the last several years, we have seen an increase in interest in finding data races in client-side JavaScript. While JavaScript execution is single-threaded, there is still enough potential for data races, created largely by the non-determinism of the scheduler. Recently, several academic efforts have explored both static and run-time analysis approaches in an effort to find data races. However, despite this, we have not seen these analysis techniques deployed in practice and we have only seen scarce evidence that developers find and fix bugs related to data races in JavaScript. In this paper we argue for a different formulation of what it means to have a data race in a JavaScript application and distinguish between benign and harmful races, affecting persistent browser or server state. We further argue that while benign races — the subject of the majority of prior work — do exist, harmful races are exceedingly rare in practice (19 harmful vs. 621 benign). Our results shed a new light on the issues of data race prevalence and importance. To find races, we also propose a novel lightweight run-time symbolic exploration algorithm for finding races in traces of run-time execution. Our algorithm eschews schedule exploration in favor of smaller run-time overheads and thus can be used by beta testers or in crowd-sourced testing. In our experiments on 26 sites, we demonstrate that benign races are considerably more common than harmful ones. @InProceedings{ESEC/FSE15p381, author = {Erdal Mutlu and Serdar Tasiran and Benjamin Livshits}, title = {Detecting JavaScript Races That Matter}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {381--392}, doi = {}, year = {2015}, } Info |
|
Treude, Christoph |
ESEC/FSE '15: "Summarizing and Measuring ..."
Summarizing and Measuring Development Activity
Christoph Treude, Fernando Figueira Filho, and Uirá Kulesza (Federal University of Rio Grande do Norte, Brazil) Software developers pursue a wide range of activities as part of their work, and making sense of what they did in a given time frame is far from trivial as evidenced by the large number of awareness and coordination tools that have been developed in recent years. To inform tool design for making sense of the information available about a developer's activity, we conducted an empirical study with 156 GitHub users to investigate what information they would expect in a summary of development activity, how they would measure development activity, and what factors influence how such activity can be condensed into textual summaries or numbers. We found that unexpected events are as important as expected events in summaries of what a developer did, and that many developers do not believe in measuring development activity. Among the factors that influence summarization and measurement of development activity, we identified development experience and programming languages. @InProceedings{ESEC/FSE15p625, author = {Christoph Treude and Fernando Figueira Filho and Uirá Kulesza}, title = {Summarizing and Measuring Development Activity}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {625--636}, doi = {}, year = {2015}, } Info |
|
Vasilescu, Bogdan |
ESEC/FSE '15: "Developer Onboarding in GitHub: ..."
Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience
Casey Casalnuovo, Bogdan Vasilescu, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA) The team aspects of software engineering have been a subject of great interest since early work by Fred Brooks and others: how well do people work together in teams? why do people join teams? what happens if teams are distributed? Recently, the emergence of project ecosystems such as GitHub have created an entirely new, higher level of organization. GitHub supports numerous teams; they share a common technical platform (for work activities) and a common social platform (via following, commenting, etc.). We explore the GitHub evidence for socialization as a precursor to joining a project, and how the technical factors of past experience and social factors of past connections to team members of a project affect productivity both initially and in the long run. We find developers preferentially join projects in GitHub where they have pre-existing relationships; furthermore, we find that the presence of past social connections combined with prior experience in languages dominant in the project leads to higher productivity both initially and cumulatively. Interestingly, we also find that stronger social connections are associated with slightly less productivity initially, but slightly more productivity in the long run. @InProceedings{ESEC/FSE15p817, author = {Casey Casalnuovo and Bogdan Vasilescu and Premkumar Devanbu and Vladimir Filkov}, title = {Developer Onboarding in GitHub: The Role of Prior Social Links and Language Experience}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {817--828}, doi = {}, year = {2015}, } ESEC/FSE '15: "Quality and Productivity Outcomes ..." 
Quality and Productivity Outcomes Relating to Continuous Integration in GitHub Bogdan Vasilescu, Yue Yu, Huaimin Wang, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA; National University of Defense Technology, China) Software processes comprise many steps; coding is followed by building, integration testing, system testing, deployment, operations, among others. Software process integration and automation have been areas of key concern in software engineering, ever since the pioneering work of Osterweil; market pressures for Agility, and open, decentralized, software development have provided additional pressures for progress in this area. But do these innovations actually help projects? Given the numerous confounding factors that can influence project performance, it can be a challenge to discern the effects of process integration and automation. Software project ecosystems such as GitHub provide a new opportunity in this regard: one can readily find large numbers of projects in various stages of process integration and automation, and gather data on various influencing factors as well as productivity and quality outcomes. In this paper we use large, historical data on process metrics and outcomes in GitHub projects to discern the effects of one specific innovation in process automation: continuous integration. Our main finding is that continuous integration improves the productivity of project teams, who can integrate more outside contributions, without an observable diminishment in code quality. @InProceedings{ESEC/FSE15p805, author = {Bogdan Vasilescu and Yue Yu and Huaimin Wang and Premkumar Devanbu and Vladimir Filkov}, title = {Quality and Productivity Outcomes Relating to Continuous Integration in GitHub}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {805--816}, doi = {}, year = {2015}, } |
|
Vigna, Giovanni |
ESEC/FSE '15: "CLAPP: Characterizing Loops ..."
CLAPP: Characterizing Loops in Android Applications
Yanick Fratantonio, Aravind Machiry, Antonio Bianchi, Christopher Kruegel, and Giovanni Vigna (University of California at Santa Barbara, USA) When performing program analysis, loops are one of the most important aspects that need to be taken into account. In the past, many approaches have been proposed to analyze loops to perform different tasks, ranging from compiler optimizations to Worst-Case Execution Time (WCET) analysis. While these approaches are powerful, they focus on tackling very specific categories of loops and known loop patterns, such as the ones for which the number of iterations can be statically determined. In this work, we developed a static analysis framework to characterize and analyze generic loops, without relying on techniques based on pattern matching. For this work, we focus on the Android platform, and we implemented a prototype, called CLAPP, that we used to perform the first large-scale empirical study of the usage of loops in Android applications. In particular, we used our tool to analyze a total of 4,110,510 loops found in 11,823 Android applications. As part of our evaluation, we provide the detailed results of our empirical study, we show how our analysis was able to determine that the execution of 63.28% of the loops is bounded, and we discuss several interesting insights related to the performance issues and security aspects associated with loops. @InProceedings{ESEC/FSE15p687, author = {Yanick Fratantonio and Aravind Machiry and Antonio Bianchi and Christopher Kruegel and Giovanni Vigna}, title = {CLAPP: Characterizing Loops in Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {687--697}, doi = {}, year = {2015}, } Info |
|
Walters, Braden M. |
ESEC/FSE '15: "Tracing Software Developers' ..."
Tracing Software Developers' Eyes and Interactions for Change Tasks
Katja Kevic, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz (University of Zurich, Switzerland; Youngstown State University, USA; ABB Research, USA) What are software developers doing during a change task? While an answer to this question opens countless opportunities to support developers in their work, only little is known about developers' detailed navigation behavior for realistic change tasks. Most empirical studies on developers performing change tasks are limited to very small code snippets or are limited by the granularity or the detail of the data collected for the study. In our research, we try to overcome these limitations by combining user interaction monitoring with very fine granular eye-tracking data that is automatically linked to the underlying source code entities in the IDE. In a study with 12 professional and 10 student developers working on three change tasks from an open source system, we used our approach to investigate the detailed navigation of developers for realistic change tasks. The results of our study show, amongst others, that the eye tracking data does indeed capture different aspects than user interaction data and that developers focus on only small parts of methods that are often related by data flow. We discuss our findings and their implications for better developer tool support. @InProceedings{ESEC/FSE15p202, author = {Katja Kevic and Braden M. Walters and Timothy R. Shaffer and Bonita Sharif and David C. Shepherd and Thomas Fritz}, title = {Tracing Software Developers' Eyes and Interactions for Change Tasks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {202--213}, doi = {}, year = {2015}, } Info |
|
Wan, Mian |
ESEC/FSE '15: "String Analysis for Java and ..."
String Analysis for Java and Android Applications
Ding Li, Yingjun Lyu, Mian Wan, and William G. J. Halfond (University of Southern California, USA) String analysis is critical for many verification techniques. However, accurately modeling string variables is a challenging problem. Current approaches are generally customized for certain problem domains or have critical limitations in handling loops, providing context-sensitive inter-procedural analysis, and performing efficient analysis on complicated apps. To address these limitations, we propose a general framework, Violist, for string analysis that allows researchers to more flexibly choose how they will address each of these challenges by separating the representation and interpretation of string operations. In our evaluation, we show that our approach can achieve high accuracy on both Java and Android apps in a reasonable amount of time. We also compared our approach with a popular and widely used string analyzer and found that our approach has higher precision and shorter execution time while maintaining the same level of recall. @InProceedings{ESEC/FSE15p661, author = {Ding Li and Yingjun Lyu and Mian Wan and William G. J. Halfond}, title = {String Analysis for Java and Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--672}, doi = {}, year = {2015}, } |
|
Wang, Chao |
ESEC/FSE '15: "Assertion Guided Symbolic ..."
Assertion Guided Symbolic Execution of Multithreaded Programs
Shengjian Guo, Markus Kusano, Chao Wang, Zijiang Yang, and Aarti Gupta (Virginia Tech, USA; Western Michigan University, USA; Princeton University, USA) Symbolic execution is a powerful technique for systematic testing of sequential and multithreaded programs. However, its application is limited by the high cost of covering all feasible intra-thread paths and inter-thread interleavings. We propose a new assertion guided pruning framework that identifies executions guaranteed not to lead to an error and removes them during symbolic execution. By summarizing the reasons why previously explored executions cannot reach an error and using the information to prune redundant executions in the future, we can soundly reduce the search space. We also use static concurrent program slicing and heuristic minimization of symbolic constraints to further reduce the computational overhead. We have implemented our method in the Cloud9 symbolic execution tool and evaluated it on a large set of multithreaded C/C++ programs. Our experiments show that the new method can reduce the overall computational cost significantly. @InProceedings{ESEC/FSE15p854, author = {Shengjian Guo and Markus Kusano and Chao Wang and Zijiang Yang and Aarti Gupta}, title = {Assertion Guided Symbolic Execution of Multithreaded Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--865}, doi = {}, year = {2015}, } |
|
Wang, Huaimin |
ESEC/FSE '15: "Quality and Productivity Outcomes ..."
Quality and Productivity Outcomes Relating to Continuous Integration in GitHub
Bogdan Vasilescu, Yue Yu, Huaimin Wang, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA; National University of Defense Technology, China) Software processes comprise many steps; coding is followed by building, integration testing, system testing, deployment, operations, among others. Software process integration and automation have been areas of key concern in software engineering, ever since the pioneering work of Osterweil; market pressures for Agility, and open, decentralized, software development have provided additional pressures for progress in this area. But do these innovations actually help projects? Given the numerous confounding factors that can influence project performance, it can be a challenge to discern the effects of process integration and automation. Software project ecosystems such as GitHub provide a new opportunity in this regard: one can readily find large numbers of projects in various stages of process integration and automation, and gather data on various influencing factors as well as productivity and quality outcomes. In this paper we use large, historical data on process metrics and outcomes in GitHub projects to discern the effects of one specific innovation in process automation: continuous integration. Our main finding is that continuous integration improves the productivity of project teams, who can integrate more outside contributions, without an observable diminishment in code quality. @InProceedings{ESEC/FSE15p805, author = {Bogdan Vasilescu and Yue Yu and Huaimin Wang and Premkumar Devanbu and Vladimir Filkov}, title = {Quality and Productivity Outcomes Relating to Continuous Integration in GitHub}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {805--816}, doi = {}, year = {2015}, } |
|
Weimer, Westley |
ESEC/FSE '15: "Modeling Readability to Improve ..."
Modeling Readability to Improve Unit Tests
Ermira Daka, José Campos, Gordon Fraser, Jonathan Dorn, and Westley Weimer (University of Sheffield, UK; University of Virginia, USA) Writing good unit tests can be tedious and error prone, but even once they are written, the job is not done: Developers need to reason about unit tests throughout software development and evolution, in order to diagnose test failures, maintain the tests, and to understand code written by other developers. Unreadable tests are more difficult to maintain and lose some of their value to developers. To overcome this problem, we propose a domain-specific model of unit test readability based on human judgements, and use this model to augment automated unit test generation. The resulting approach can automatically generate test suites with both high coverage and also improved readability. In human studies users prefer our improved tests and are able to answer maintenance questions about them 14% more quickly at the same level of accuracy. @InProceedings{ESEC/FSE15p107, author = {Ermira Daka and José Campos and Gordon Fraser and Jonathan Dorn and Westley Weimer}, title = {Modeling Readability to Improve Unit Tests}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {107--118}, doi = {}, year = {2015}, } Best-Paper Award |
|
Welk, Allaire |
ESEC/FSE '15: "Quantifying Developers' ..."
Quantifying Developers' Adoption of Security Tools
Jim Witschey, Olga Zielinska, Allaire Welk, Emerson Murphy-Hill, Chris Mayhorn, and Thomas Zimmermann (North Carolina State University, USA; Microsoft Research, USA) Security tools could help developers find critical vulnerabilities, yet such tools remain underused. We surveyed developers from 14 companies and 5 mailing lists about their reasons for using and not using security tools. The resulting thirty-nine predictors of security tool use provide both expected and unexpected insights. As we expected, developers who perceive security to be important are more likely to use security tools than those who do not. But that was not the strongest predictor of security tool use; instead, it was developers' ability to observe their peers using security tools. @InProceedings{ESEC/FSE15p260, author = {Jim Witschey and Olga Zielinska and Allaire Welk and Emerson Murphy-Hill and Chris Mayhorn and Thomas Zimmermann}, title = {Quantifying Developers' Adoption of Security Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {260--271}, doi = {}, year = {2015}, } |
|
West, Scott |
ESEC/FSE '15: "Efficient and Reasonable Object-Oriented ..."
Efficient and Reasonable Object-Oriented Concurrency
Scott West, Sebastian Nanz, and Bertrand Meyer (Google, Switzerland; ETH Zurich, Switzerland) Making threaded programs safe and easy to reason about is one of the chief difficulties in modern programming. This work provides an efficient execution model for SCOOP, a concurrency approach that provides not only data-race freedom but also pre/postcondition reasoning guarantees between threads. The extensions we propose influence the underlying semantics to increase the amount of concurrent execution that is possible, to exclude certain classes of deadlocks, and to enable greater performance. These extensions are used as the basis of an efficient runtime and optimization pass that improve performance 15x over a baseline implementation. This new implementation of SCOOP is, on average, also 2x faster than other well-known safe concurrent languages. The measurements are based on both coordination-intensive and data-manipulation-intensive benchmarks designed to offer a mixture of workloads. @InProceedings{ESEC/FSE15p734, author = {Scott West and Sebastian Nanz and Bertrand Meyer}, title = {Efficient and Reasonable Object-Oriented Concurrency}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {734--744}, doi = {}, year = {2015}, } |
|
Witschey, Jim |
ESEC/FSE '15: "Quantifying Developers' ..."
Quantifying Developers' Adoption of Security Tools
Jim Witschey, Olga Zielinska, Allaire Welk, Emerson Murphy-Hill, Chris Mayhorn, and Thomas Zimmermann (North Carolina State University, USA; Microsoft Research, USA) Security tools could help developers find critical vulnerabilities, yet such tools remain underused. We surveyed developers from 14 companies and 5 mailing lists about their reasons for using and not using security tools. The resulting thirty-nine predictors of security tool use provide both expected and unexpected insights. As we expected, developers who perceive security to be important are more likely to use security tools than those who do not. But that was not the strongest predictor of security tool use; instead, it was developers' ability to observe their peers using security tools. @InProceedings{ESEC/FSE15p260, author = {Jim Witschey and Olga Zielinska and Allaire Welk and Emerson Murphy-Hill and Chris Mayhorn and Thomas Zimmermann}, title = {Quantifying Developers' Adoption of Security Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {260--271}, doi = {}, year = {2015}, } |
|
Wu, Fei |
ESEC/FSE '15: "Heterogeneous Cross-Company ..."
Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning
Xiaoyuan Jing, Fei Wu, Xiwei Dong, Fumin Qi, and Baowen Xu (Wuhan University, China; Nanjing University of Posts and Telecommunications, China; Nanjing University, China) Cross-company defect prediction (CCDP) learns a prediction model by using training data from one or multiple projects of a source company and then applies the model to the target company data. Existing CCDP methods are based on the assumption that the data of source and target companies should have the same software metrics. However, for CCDP, the source and target company data is usually heterogeneous, namely, the metrics used and the sizes of the metric sets differ between the two companies' data. We refer to CCDP in this scenario as the heterogeneous CCDP (HCCDP) task. In this paper, we aim to provide an effective solution for HCCDP. We propose a unified metric representation (UMR) for the data of source and target companies. The UMR consists of three types of metrics, i.e., the common metrics of the source and target companies, source-company specific metrics, and target-company specific metrics. To construct UMR for source company data, the target-company specific metrics are set as zeros, while for UMR of the target company data, the source-company specific metrics are set as zeros. Based on the unified metric representation, we for the first time introduce canonical correlation analysis (CCA), an effective transfer learning method, into CCDP to make the data distributions of source and target companies similar. Experiments on 14 public heterogeneous datasets from four companies indicate that: 1) for HCCDP with partially different metrics, our approach significantly outperforms state-of-the-art CCDP methods; 2) for HCCDP with totally different metrics, our approach obtains comparable prediction performance in contrast with within-project prediction results. The proposed approach is effective for HCCDP. 
@InProceedings{ESEC/FSE15p496, author = {Xiaoyuan Jing and Fei Wu and Xiwei Dong and Fumin Qi and Baowen Xu}, title = {Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {496--507}, doi = {}, year = {2015}, } |
|
Xiao, Hao |
ESEC/FSE '15: "TLV: Abstraction through Testing, ..."
TLV: Abstraction through Testing, Learning, and Validation
Jun Sun, Hao Xiao, Yang Liu, Shang-Wei Lin, and Shengchao Qin (Singapore University of Technology and Design, Singapore; Nanyang Technological University, Singapore; Teesside University, UK; Shenzhen University, China) A (Java) class provides a service to its clients (i.e., programs which use the class). The service must satisfy certain specifications. Different specifications might be expected at different levels of abstraction depending on the client's objective. In order to effectively contrast the class against its specifications, whether manually or automatically, one essential step is to automatically construct an abstraction of the given class at a proper level of abstraction. The abstraction should be correct (i.e., over-approximating) and accurate (i.e., with few spurious traces). We present an automatic approach, which combines testing, learning, and validation, to constructing an abstraction. Our approach is designed such that a large part of the abstraction is generated based on testing and learning so as to minimize the use of heavy-weight techniques like symbolic execution. The abstraction is generated through a process of abstraction/refinement, with no user input, and converges to a specific level of abstraction depending on the usage context. The generated abstraction is guaranteed to be correct and accurate. We have implemented the proposed approach in a toolkit named TLV and evaluated TLV with a number of benchmark programs as well as three real-world ones. The results show that TLV generates abstraction for program analysis and verification more efficiently. @InProceedings{ESEC/FSE15p698, author = {Jun Sun and Hao Xiao and Yang Liu and Shang-Wei Lin and Shengchao Qin}, title = {TLV: Abstraction through Testing, Learning, and Validation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {698--709}, doi = {}, year = {2015}, } Info |
|
Xing, Zhenchang |
ESEC/FSE '15: "Clone-Based and Interactive ..."
Clone-Based and Interactive Recommendation for Modifying Pasted Code
Yun Lin, Xin Peng, Zhenchang Xing, Diwen Zheng, and Wenyun Zhao (Fudan University, China; Nanyang Technological University, Singapore) Developers often need to modify pasted code when programming with copy-and-paste practice. Some modifications on pasted code could involve lots of editing efforts, and any missing or wrong edit could incur bugs. In this paper, we propose a clone-based and interactive approach to recommending where and how to modify the pasted code. In our approach, we regard clones of the pasted code as the results of historical copy-and-paste operations and their differences as historical modifications on the same piece of code. Our approach first retrieves clones of the pasted code from a clone repository and detects syntactically complete differences among them. Then our approach transfers each clone difference into a modification slot on the pasted code, suggests options for each slot, and further mines modifying regulations from the clone differences. Based on the mined modifying regulations, our approach dynamically updates the suggested options and their ranking in each slot according to developer's modifications on the pasted code. We implement a proof-of-concept tool CCDemon based on our approach and evaluate its effectiveness based on code clones detected from five open source projects. The results show that our approach can identify 96.9% of the to-be-modified positions in pasted code and suggest 75.0% of the required modifications. Our human study further confirms that CCDemon can help developers to accomplish their modifications of pasted code more efficiently. @InProceedings{ESEC/FSE15p520, author = {Yun Lin and Xin Peng and Zhenchang Xing and Diwen Zheng and Wenyun Zhao}, title = {Clone-Based and Interactive Recommendation for Modifying Pasted Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {520--531}, doi = {}, year = {2015}, } |
|
Xu, Baowen |
ESEC/FSE '15: "Test Report Prioritization ..."
Test Report Prioritization to Assist Crowdsourced Testing
Yang Feng, Zhenyu Chen, James A. Jones, Chunrong Fang, and Baowen Xu (Nanjing University, China; University of California at Irvine, USA) In crowdsourced testing, users can be incentivized to perform testing tasks and report their results, and because crowdsourced workers are often paid per task, there is a financial incentive to complete tasks quickly rather than well. These reports of the crowdsourced testing tasks are called "test reports" and are composed of simple natural language and screenshots. Back at the software-development organization, developers must manually inspect the test reports to judge their value for revealing faults. Due to the nature of crowdsourced work, the test reports are often too numerous to comprehensively inspect and process. In order to help with this daunting task, we created the first technique of its kind, to the best of our knowledge, to prioritize test reports for manual inspection. Our technique utilizes two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these strategies form our DivRisk strategy to prioritize test reports in crowdsourced testing. Three industrial projects have been used to evaluate the effectiveness of test report prioritization methods. The results of the empirical study show that: (1) DivRisk can significantly outperform random prioritization; (2) DivRisk can approximate the best theoretical result for a real-world industrial mobile application. In addition, we provide some practical guidelines of test report prioritization for crowdsourced testing based on the empirical study and our experiences. @InProceedings{ESEC/FSE15p225, author = {Yang Feng and Zhenyu Chen and James A. 
Jones and Chunrong Fang and Baowen Xu}, title = {Test Report Prioritization to Assist Crowdsourced Testing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {225--236}, doi = {}, year = {2015}, } ESEC/FSE '15: "Heterogeneous Cross-Company ..." Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning Xiaoyuan Jing, Fei Wu, Xiwei Dong, Fumin Qi, and Baowen Xu (Wuhan University, China; Nanjing University of Posts and Telecommunications, China; Nanjing University, China) Cross-company defect prediction (CCDP) learns a prediction model by using training data from one or multiple projects of a source company and then applies the model to the target company data. Existing CCDP methods are based on the assumption that the data of source and target companies should have the same software metrics. However, for CCDP, the source and target company data is usually heterogeneous, namely, the metrics used and the sizes of the metric sets differ between the two companies' data. We refer to CCDP in this scenario as the heterogeneous CCDP (HCCDP) task. In this paper, we aim to provide an effective solution for HCCDP. We propose a unified metric representation (UMR) for the data of source and target companies. The UMR consists of three types of metrics, i.e., the common metrics of the source and target companies, source-company specific metrics, and target-company specific metrics. To construct UMR for source company data, the target-company specific metrics are set as zeros, while for UMR of the target company data, the source-company specific metrics are set as zeros. Based on the unified metric representation, we for the first time introduce canonical correlation analysis (CCA), an effective transfer learning method, into CCDP to make the data distributions of source and target companies similar. 
Experiments on 14 public heterogeneous datasets from four companies indicate that: 1) for HCCDP with partially different metrics, our approach significantly outperforms state-of-the-art CCDP methods; 2) for HCCDP with totally different metrics, our approach obtains comparable prediction performances in contrast with within-project prediction results. The proposed approach is effective for HCCDP. @InProceedings{ESEC/FSE15p496, author = {Xiaoyuan Jing and Fei Wu and Xiwei Dong and Fumin Qi and Baowen Xu}, title = {Heterogeneous Cross-Company Defect Prediction by Unified Metric Representation and CCA-Based Transfer Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {496--507}, doi = {}, year = {2015}, } |
|
Xu, Tianyin |
ESEC/FSE '15: "Hey, You Have Given Me Too ..."
Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker (University of California at San Diego, USA; Huazhong University of Science and Technology, China; NetApp, USA) Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software. 
@InProceedings{ESEC/FSE15p307, author = {Tianyin Xu and Long Jin and Xuepeng Fan and Yuanyuan Zhou and Shankar Pasupathy and Rukma Talwadker}, title = {Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {307--319}, doi = {}, year = {2015}, } Video Info |
|
Yang, Zijiang |
ESEC/FSE '15: "Assertion Guided Symbolic ..."
Assertion Guided Symbolic Execution of Multithreaded Programs
Shengjian Guo, Markus Kusano, Chao Wang, Zijiang Yang, and Aarti Gupta (Virginia Tech, USA; Western Michigan University, USA; Princeton University, USA) Symbolic execution is a powerful technique for systematic testing of sequential and multithreaded programs. However, its application is limited by the high cost of covering all feasible intra-thread paths and inter-thread interleavings. We propose a new assertion guided pruning framework that identifies executions guaranteed not to lead to an error and removes them during symbolic execution. By summarizing the reasons why previously explored executions cannot reach an error and using the information to prune redundant executions in the future, we can soundly reduce the search space. We also use static concurrent program slicing and heuristic minimization of symbolic constraints to further reduce the computational overhead. We have implemented our method in the Cloud9 symbolic execution tool and evaluated it on a large set of multithreaded C/C++ programs. Our experiments show that the new method can reduce the overall computational cost significantly. @InProceedings{ESEC/FSE15p854, author = {Shengjian Guo and Markus Kusano and Chao Wang and Zijiang Yang and Aarti Gupta}, title = {Assertion Guided Symbolic Execution of Multithreaded Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--865}, doi = {}, year = {2015}, } |
|
Yu, Yue |
ESEC/FSE '15: "Quality and Productivity Outcomes ..."
Quality and Productivity Outcomes Relating to Continuous Integration in GitHub
Bogdan Vasilescu, Yue Yu, Huaimin Wang, Premkumar Devanbu, and Vladimir Filkov (University of California at Davis, USA; National University of Defense Technology, China) Software processes comprise many steps; coding is followed by building, integration testing, system testing, deployment, operations, among others. Software process integration and automation have been areas of key concern in software engineering, ever since the pioneering work of Osterweil; market pressures for Agility, and open, decentralized, software development have provided additional pressures for progress in this area. But do these innovations actually help projects? Given the numerous confounding factors that can influence project performance, it can be a challenge to discern the effects of process integration and automation. Software project ecosystems such as GitHub provide a new opportunity in this regard: one can readily find large numbers of projects in various stages of process integration and automation, and gather data on various influencing factors as well as productivity and quality outcomes. In this paper we use large, historical data on process metrics and outcomes in GitHub projects to discern the effects of one specific innovation in process automation: continuous integration. Our main finding is that continuous integration improves the productivity of project teams, who can integrate more outside contributions, without an observable diminishment in code quality. @InProceedings{ESEC/FSE15p805, author = {Bogdan Vasilescu and Yue Yu and Huaimin Wang and Premkumar Devanbu and Vladimir Filkov}, title = {Quality and Productivity Outcomes Relating to Continuous Integration in GitHub}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {805--816}, doi = {}, year = {2015}, } |
|
Yung, Tifany |
ESEC/FSE '15: "Comparing and Combining Test-Suite ..."
Comparing and Combining Test-Suite Reduction and Regression Test Selection
August Shi, Tifany Yung, Alex Gyori, and Darko Marinov (University of Illinois at Urbana-Champaign, USA) Regression testing is widely used to check that changes made to software do not break existing functionality, but regression test suites grow, and running them fully can become costly. Researchers have proposed test-suite reduction and regression test selection as two approaches to reduce this cost by not running some of the tests from the test suite. However, previous research has not empirically evaluated how the two approaches compare to each other, and how well a combination of these approaches performs. We present the first extensive study that compares test-suite reduction and regression test selection approaches individually, and also evaluates a combination of the two approaches. We also propose a new criterion to measure the quality of tests with respect to software changes. Our experiments on 4,793 commits from 17 open-source projects show that regression test selection runs on average fewer tests (by 40.15pp) than test-suite reduction. However, test-suite reduction can have a high loss in fault-detection capability with respect to the changes, whereas a (safe) regression test selection has no loss. The experiments also show that a combination of the two approaches runs even fewer tests (on average 5.34pp) than regression test selection, but these tests still have a loss in fault-detection capability with respect to the changes. @InProceedings{ESEC/FSE15p237, author = {August Shi and Tifany Yung and Alex Gyori and Darko Marinov}, title = {Comparing and Combining Test-Suite Reduction and Regression Test Selection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {237--247}, doi = {}, year = {2015}, } |
|
Zaidman, Andy |
ESEC/FSE '15: "When, How, and Why Developers ..."
When, How, and Why Developers (Do Not) Test in Their IDEs
Moritz Beller, Georgios Gousios, Annibale Panichella , and Andy Zaidman (Delft University of Technology, Netherlands; Radboud University Nijmegen, Netherlands) The research community in Software Engineering and Software Testing in particular builds many of its contributions on a set of mutually shared expectations. Despite the fact that they form the basis of many publications as well as open-source and commercial testing applications, these common expectations and beliefs are rarely ever questioned. For example, Frederic Brooks’ statement that testing takes half of the development time seems to have manifested itself within the community since he first made it in the “Mythical Man Month” in 1975. With this paper, we report on the surprising results of a large-scale field study with 416 software engineers whose development activity we closely monitored over the course of five months, resulting in over 13 years of recorded work time in their integrated development environments (IDEs). Our findings question several commonly shared assumptions and beliefs about testing and might be contributing factors to the observed bug proneness of software in practice: the majority of developers in our study does not test; developers rarely run their tests in the IDE; Test-Driven Development (TDD) is not widely practiced; and, last but not least, software developers only spend a quarter of their work time engineering tests, whereas they think they test half of their time. @InProceedings{ESEC/FSE15p179, author = {Moritz Beller and Georgios Gousios and Annibale Panichella and Andy Zaidman}, title = {When, How, and Why Developers (Do Not) Test in Their IDEs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {179--190}, doi = {}, year = {2015}, } |
|
Zhang, Xin |
ESEC/FSE '15: "FlexJava: Language Support ..."
FlexJava: Language Support for Safe and Modular Approximate Programming
Jongse Park, Hadi Esmaeilzadeh, Xin Zhang, Mayur Naik, and William Harris (Georgia Tech, USA) Energy efficiency is a primary constraint in modern systems. Approximate computing is a promising approach that trades quality of result for gains in efficiency and performance. State-of-the-art approximate programming models require extensive manual annotations on program data and operations to guarantee safe execution of approximate programs. The need for extensive manual annotations hinders the practical use of approximation techniques. This paper describes FlexJava, a small set of language extensions that significantly reduces the annotation effort, paving the way for practical approximate programming. These extensions enable programmers to annotate approximation-tolerant method outputs. The FlexJava compiler, which is equipped with an approximation safety analysis, automatically infers the operations and data that affect these outputs and selectively marks them approximable while giving safety guarantees. The automation and the language–compiler codesign relieve programmers from manually and explicitly annotating data declarations or operations as safe to approximate. FlexJava is designed to support safety, modularity, generality, and scalability in software development. We have implemented FlexJava annotations as a Java library and we demonstrate its practicality using a wide range of Java applications and by conducting a user study. Compared to EnerJ, a recent approximate programming system, FlexJava provides the same energy savings with significant reduction (from 2× to 17×) in the number of annotations. In our user study, programmers spend 6× to 12× less time annotating programs using FlexJava than when using EnerJ. 
@InProceedings{ESEC/FSE15p745, author = {Jongse Park and Hadi Esmaeilzadeh and Xin Zhang and Mayur Naik and William Harris}, title = {FlexJava: Language Support for Safe and Modular Approximate Programming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {745--757}, doi = {}, year = {2015}, } ESEC/FSE '15: "A User-Guided Approach to ..." A User-Guided Approach to Program Analysis Ravi Mangal, Xin Zhang, Aditya V. Nori, and Mayur Naik (Georgia Tech, USA; Microsoft Research, UK) Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system EUGENE that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. Our technique solves the rules using an off-the-shelf solver in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to real-world analyses and programs. We evaluate EUGENE on two different analyses with labeled output on a suite of seven Java programs of size 131–198 KLOC. We also report upon a user study involving nine users who employ EUGENE to guide an information-flow analysis on three Java micro-benchmarks. In our experiments, EUGENE significantly reduces misclassified reports upon providing limited amounts of feedback. @InProceedings{ESEC/FSE15p462, author = {Ravi Mangal and Xin Zhang and Aditya V. Nori and Mayur Naik}, title = {A User-Guided Approach to Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {462--473}, doi = {}, year = {2015}, } Best-Paper Award |
|
Zhang, Yucheng |
ESEC/FSE '15: "Assertions Are Strongly Correlated ..."
Assertions Are Strongly Correlated with Test Suite Effectiveness
Yucheng Zhang and Ali Mesbah (University of British Columbia, Canada) Code coverage is a popular test adequacy criterion in practice. Code coverage, however, remains controversial as there is a lack of coherent empirical evidence for its relation with test suite effectiveness. More recently, test suite size has been shown to be highly correlated with effectiveness. However, previous studies treat test methods as the smallest unit of interest, and ignore potential factors influencing this relationship. We propose to go beyond test suite size, by investigating test assertions inside test methods. We empirically evaluate the relationship between a test suite’s effectiveness and the (1) number of assertions, (2) assertion coverage, and (3) different types of assertions. We compose 6,700 test suites in total, using 24,000 assertions from five real-world Java projects. We find that the number of assertions in a test suite strongly correlates with its effectiveness, and this factor directly influences the relationship between test suite size and effectiveness. Our results also indicate that assertion coverage is strongly correlated with effectiveness and different types of assertions can influence the effectiveness of their containing test suites. @InProceedings{ESEC/FSE15p214, author = {Yucheng Zhang and Ali Mesbah}, title = {Assertions Are Strongly Correlated with Test Suite Effectiveness}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {214--224}, doi = {}, year = {2015}, } Info |
|
Zhao, Wenyun |
ESEC/FSE '15: "Clone-Based and Interactive ..."
Clone-Based and Interactive Recommendation for Modifying Pasted Code
Yun Lin, Xin Peng, Zhenchang Xing, Diwen Zheng, and Wenyun Zhao (Fudan University, China; Nanyang Technological University, Singapore) Developers often need to modify pasted code when programming with copy-and-paste practice. Some modifications on pasted code can involve a lot of editing effort, and any missing or wrong edit could introduce bugs. In this paper, we propose a clone-based and interactive approach to recommending where and how to modify the pasted code. In our approach, we regard clones of the pasted code as the results of historical copy-and-paste operations and their differences as historical modifications on the same piece of code. Our approach first retrieves clones of the pasted code from a clone repository and detects syntactically complete differences among them. Then our approach translates each clone difference into a modification slot on the pasted code, suggests options for each slot, and further mines modifying regulations from the clone differences. Based on the mined modifying regulations, our approach dynamically updates the suggested options and their ranking in each slot according to the developer's modifications on the pasted code. We implement a proof-of-concept tool, CCDemon, based on our approach and evaluate its effectiveness on code clones detected from five open-source projects. The results show that our approach can identify 96.9% of the to-be-modified positions in pasted code and suggest 75.0% of the required modifications. Our human study further confirms that CCDemon can help developers accomplish their modifications of pasted code more efficiently. @InProceedings{ESEC/FSE15p520, author = {Yun Lin and Xin Peng and Zhenchang Xing and Diwen Zheng and Wenyun Zhao}, title = {Clone-Based and Interactive Recommendation for Modifying Pasted Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {520--531}, doi = {}, year = {2015}, } |
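The approach's central idea — positions where clones of the pasted code disagree become modification slots with frequency-ranked options — can be illustrated with a deliberately crude token-level sketch. The clone snippets are invented, and real clone differencing is syntax-aware rather than string-based:

```python
from collections import Counter

# hypothetical clones of a pasted snippet; positions where the clones
# disagree become "modification slots" with frequency-ranked options
clones = [
    "assertEquals(expected, actual)",
    "assertEquals(expectedName, name)",
    "assertEquals(expected, result)",
]

# crude tokenization for illustration only; the paper works on
# syntactically complete differences, not flat token streams
token_lists = [
    c.replace("(", " ").replace(")", "").replace(",", "").split()
    for c in clones
]

slots = {}
for i in range(len(token_lists[0])):
    options = Counter(tokens[i] for tokens in token_lists)
    if len(options) > 1:  # clones disagree here -> a modification slot
        slots[i] = [tok for tok, _ in options.most_common()]

print(slots)  # slot 1 ranks "expected" first (seen in 2 of 3 clones)
```

The interactive part of the approach would then re-rank each slot's options as the developer edits, using co-occurrence patterns mined from the clone differences.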
|
Zheng, Diwen |
ESEC/FSE '15: "Clone-Based and Interactive ..."
Clone-Based and Interactive Recommendation for Modifying Pasted Code
Yun Lin, Xin Peng, Zhenchang Xing, Diwen Zheng, and Wenyun Zhao (Fudan University, China; Nanyang Technological University, Singapore) Developers often need to modify pasted code when programming with copy-and-paste practice. Some modifications on pasted code can involve a lot of editing effort, and any missing or wrong edit could introduce bugs. In this paper, we propose a clone-based and interactive approach to recommending where and how to modify the pasted code. In our approach, we regard clones of the pasted code as the results of historical copy-and-paste operations and their differences as historical modifications on the same piece of code. Our approach first retrieves clones of the pasted code from a clone repository and detects syntactically complete differences among them. Then our approach translates each clone difference into a modification slot on the pasted code, suggests options for each slot, and further mines modifying regulations from the clone differences. Based on the mined modifying regulations, our approach dynamically updates the suggested options and their ranking in each slot according to the developer's modifications on the pasted code. We implement a proof-of-concept tool, CCDemon, based on our approach and evaluate its effectiveness on code clones detected from five open-source projects. The results show that our approach can identify 96.9% of the to-be-modified positions in pasted code and suggest 75.0% of the required modifications. Our human study further confirms that CCDemon can help developers accomplish their modifications of pasted code more efficiently. @InProceedings{ESEC/FSE15p520, author = {Yun Lin and Xin Peng and Zhenchang Xing and Diwen Zheng and Wenyun Zhao}, title = {Clone-Based and Interactive Recommendation for Modifying Pasted Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {520--531}, doi = {}, year = {2015}, } |
|
Zheng, Qimu |
ESEC/FSE '15: "A Method to Identify and Correct ..."
A Method to Identify and Correct Problematic Software Activity Data: Exploiting Capacity Constraints and Data Redundancies
Qimu Zheng, Audris Mockus, and Minghui Zhou (Peking University, China; University of Tennessee, USA) Mining software repositories to understand and improve software development is a common approach in research and practice. The operational data obtained from these repositories often do not faithfully represent the intended aspects of software development and, therefore, may jeopardize the conclusions derived from it. We propose an approach to identify problematic values based on the constraints of software development and to correct such values using data redundancies. We investigate the approach using issue and commit data from the Mozilla project. In particular, we identified problematic data in four types of events and found the fraction of problematic values to exceed 10% and to be rising rapidly. We found the corrected values to be 50% closer to the most accurate estimate of task completion time. Finally, we found that the models of time until fix changed substantially when data were corrected, with the corrected data providing a 20% better fit. We discuss how the approach may be generalized to other types of operational data to increase the fidelity of software measurement in practice and in research. @InProceedings{ESEC/FSE15p637, author = {Qimu Zheng and Audris Mockus and Minghui Zhou}, title = {A Method to Identify and Correct Problematic Software Activity Data: Exploiting Capacity Constraints and Data Redundancies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {637--648}, doi = {}, year = {2015}, } |
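The capacity-constraint idea can be sketched generically: flag activity bursts that no single developer could plausibly produce by hand. The threshold and the event log below are invented for illustration; the paper derives its constraints from the specifics of issue and commit data:

```python
from collections import Counter

MAX_PER_MINUTE = 3  # assumed capacity bound, not a value from the paper

# hypothetical event log: (developer, minute in which an issue was closed)
events = [
    ("alice", "2015-03-01 10:00"), ("alice", "2015-03-01 10:00"),
    ("alice", "2015-03-01 10:00"), ("alice", "2015-03-01 10:00"),
    ("alice", "2015-03-01 10:00"),  # 5 closes in one minute: implausible
    ("bob", "2015-03-01 10:05"),
]

per_minute = Counter(events)
# events violating the capacity constraint are candidates for correction,
# e.g. by recovering a more plausible timestamp from redundant records
problematic = {key for key, n in per_minute.items() if n > MAX_PER_MINUTE}
print(problematic)  # {('alice', '2015-03-01 10:00')}
```

Such bursts typically come from bulk or automated edits; the correction step would then exploit redundant fields (e.g. comment timestamps) rather than discard the events.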
|
Zhou, Minghui |
ESEC/FSE '15: "A Method to Identify and Correct ..."
A Method to Identify and Correct Problematic Software Activity Data: Exploiting Capacity Constraints and Data Redundancies
Qimu Zheng, Audris Mockus, and Minghui Zhou (Peking University, China; University of Tennessee, USA) Mining software repositories to understand and improve software development is a common approach in research and practice. The operational data obtained from these repositories often do not faithfully represent the intended aspects of software development and, therefore, may jeopardize the conclusions derived from it. We propose an approach to identify problematic values based on the constraints of software development and to correct such values using data redundancies. We investigate the approach using issue and commit data from the Mozilla project. In particular, we identified problematic data in four types of events and found the fraction of problematic values to exceed 10% and to be rising rapidly. We found the corrected values to be 50% closer to the most accurate estimate of task completion time. Finally, we found that the models of time until fix changed substantially when data were corrected, with the corrected data providing a 20% better fit. We discuss how the approach may be generalized to other types of operational data to increase the fidelity of software measurement in practice and in research. @InProceedings{ESEC/FSE15p637, author = {Qimu Zheng and Audris Mockus and Minghui Zhou}, title = {A Method to Identify and Correct Problematic Software Activity Data: Exploiting Capacity Constraints and Data Redundancies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {637--648}, doi = {}, year = {2015}, } |
|
Zhou, Yuanyuan |
ESEC/FSE '15: "Hey, You Have Given Me Too ..."
Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker (University of California at San Diego, USA; Huazhong University of Science and Technology, China; NetApp, USA) Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper takes a first step toward understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software. 
@InProceedings{ESEC/FSE15p307, author = {Tianyin Xu and Long Jin and Xuepeng Fan and Yuanyuan Zhou and Shankar Pasupathy and Rukma Talwadker}, title = {Hey, You Have Given Me Too Many Knobs!: Understanding and Dealing with Over-Designed Configuration in System Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {307--319}, doi = {}, year = {2015}, } Video Info |
|
Zhu, Linjie |
ESEC/FSE '15: "What Change History Tells ..."
What Change History Tells Us about Thread Synchronization
Rui Gu, Guoliang Jin, Linhai Song, Linjie Zhu, and Shan Lu (Columbia University, USA; North Carolina State University, USA; University of Wisconsin-Madison, USA; University of Chicago, USA) Multi-threaded programs are pervasive, yet difficult to write. Missing proper synchronization leads to correctness bugs and over-synchronization leads to performance problems. To improve the correctness and efficiency of multi-threaded software, we need a better understanding of synchronization challenges faced by real-world developers. This paper studies the code repositories of open-source multi-threaded software projects to obtain a broad and in-depth view of how developers handle synchronizations. We first examine how critical sections are changed when software evolves by checking over 250,000 revisions of four representative open-source software projects. The findings help us answer questions like how often synchronization is an afterthought for developers; whether it is difficult for developers to decide critical section boundaries and lock variables; and what are real-world over-synchronization problems. We then conduct case studies to better understand (1) how critical sections are changed to solve performance problems (i.e. over-synchronization issues) and (2) how software changes lead to synchronization-related correctness problems (i.e. concurrency bugs). This in-depth study shows that tool support is needed to help developers tackle over-synchronization problems; it also shows that concurrency bug avoidance, detection, and testing can be improved through better awareness of code revision history. @InProceedings{ESEC/FSE15p426, author = {Rui Gu and Guoliang Jin and Linhai Song and Linjie Zhu and Shan Lu}, title = {What Change History Tells Us about Thread Synchronization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {426--438}, doi = {}, year = {2015}, } |
|
Zielinska, Olga |
ESEC/FSE '15: "Quantifying Developers' ..."
Quantifying Developers' Adoption of Security Tools
Jim Witschey, Olga Zielinska, Allaire Welk, Emerson Murphy-Hill, Chris Mayhorn, and Thomas Zimmermann (North Carolina State University, USA; Microsoft Research, USA) Security tools could help developers find critical vulnerabilities, yet such tools remain underused. We surveyed developers from 14 companies and 5 mailing lists about their reasons for using and not using security tools. The resulting thirty-nine predictors of security tool use provide both expected and unexpected insights. As we expected, developers who perceive security to be important are more likely to use security tools than those who do not. But perceived importance was not the strongest predictor of security tool use; that was instead developers' ability to observe their peers using security tools. @InProceedings{ESEC/FSE15p260, author = {Jim Witschey and Olga Zielinska and Allaire Welk and Emerson Murphy-Hill and Chris Mayhorn and Thomas Zimmermann}, title = {Quantifying Developers' Adoption of Security Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {260--271}, doi = {}, year = {2015}, } |
|
Zimmermann, Thomas |
ESEC/FSE '15: "Quantifying Developers' ..."
Quantifying Developers' Adoption of Security Tools
Jim Witschey, Olga Zielinska, Allaire Welk, Emerson Murphy-Hill, Chris Mayhorn, and Thomas Zimmermann (North Carolina State University, USA; Microsoft Research, USA) Security tools could help developers find critical vulnerabilities, yet such tools remain underused. We surveyed developers from 14 companies and 5 mailing lists about their reasons for using and not using security tools. The resulting thirty-nine predictors of security tool use provide both expected and unexpected insights. As we expected, developers who perceive security to be important are more likely to use security tools than those who do not. But perceived importance was not the strongest predictor of security tool use; that was instead developers' ability to observe their peers using security tools. @InProceedings{ESEC/FSE15p260, author = {Jim Witschey and Olga Zielinska and Allaire Welk and Emerson Murphy-Hill and Chris Mayhorn and Thomas Zimmermann}, title = {Quantifying Developers' Adoption of Security Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {260--271}, doi = {}, year = {2015}, } ESEC/FSE '15: "How Practitioners Perceive ..." How Practitioners Perceive the Relevance of Software Engineering Research David Lo, Nachiappan Nagappan, and Thomas Zimmermann (Singapore Management University, Singapore; Microsoft Research, USA) The number of software engineering research papers over the last few years has grown significantly. An important question here is: how relevant is software engineering research to practitioners in the field? To address this question, we conducted a survey at Microsoft where we invited 3,000 industry practitioners to rate the relevance of research ideas contained in 571 ICSE, ESEC/FSE and FSE papers that were published over a five-year period. We received 17,913 ratings by 512 practitioners who labelled ideas as essential, worthwhile, unimportant, or unwise. 
The results from the survey suggest that practitioners are positive towards studies done by the software engineering research community: 71% of all ratings were essential or worthwhile. We found no correlation between the citation counts and the relevance scores of the papers. Through a qualitative analysis of free text responses, we identify several reasons why practitioners considered certain research ideas to be unwise. The survey approach described in this paper is lightweight: on average, a participant spent only 22.5 minutes to respond to the survey. At the same time, the results can provide useful insight to conference organizers, authors, and participating practitioners. @InProceedings{ESEC/FSE15p415, author = {David Lo and Nachiappan Nagappan and Thomas Zimmermann}, title = {How Practitioners Perceive the Relevance of Software Engineering Research}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {415--425}, doi = {}, year = {2015}, } Best-Paper Award |
252 authors