ESEC/FSE 2017 – Author Index
Aafer, Yousra |
ESEC/FSE '17: "LAMP: Data Provenance for ..."
LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation
Shiqing Ma, Yousra Aafer, Zhaogui Xu, Wen-Chuan Lee, Juan Zhai, Yingqi Liu, and Xiangyu Zhang (Purdue University, USA; Nanjing University, China) Data provenance tracking determines the set of inputs related to a given output. It enables quality control and problem diagnosis in data engineering. Most existing techniques work by tracking program dependencies. They cannot quantitatively assess the importance of related inputs, which is critical to machine learning algorithms, in which an output tends to depend on a huge set of inputs while only some of them are of importance. In this paper, we propose LAMP, a provenance computation system for machine learning algorithms. Inspired by automatic differentiation (AD), LAMP quantifies the importance of an input for an output by computing the partial derivative. LAMP separates the original data processing and the more expensive derivative computation to different processes to achieve cost-effectiveness. In addition, it allows quantifying importance for inputs related to discrete behavior, such as control flow selection. The evaluation on a set of real world programs and data sets illustrates that LAMP produces more precise and succinct provenance than program dependence based techniques, with much less overhead. Our case studies demonstrate the potential of LAMP in problem diagnosis in data engineering. @InProceedings{ESEC/FSE17p786, author = {Shiqing Ma and Yousra Aafer and Zhaogui Xu and Wen-Chuan Lee and Juan Zhai and Yingqi Liu and Xiangyu Zhang}, title = {LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {786--797}, doi = {}, year = {2017}, } |
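The importance measure described in this abstract, quantifying how strongly an input influences an output via the partial derivative, can be illustrated with a small sketch. The toy computation, function names, and finite-difference estimation below are assumptions for illustration only; LAMP itself instruments the program and runs the derivative computation in a separate process.

# Sketch of derivative-based provenance (illustrative, not LAMP itself):
# rank the inputs of a numeric computation by the magnitude of the partial
# derivative of the output with respect to each input, estimated with
# central finite differences.

def output(inputs):
    # Toy stand-in for a graph-based ML computation: many inputs exist,
    # but only a few matter for the final value.
    return 3.0 * inputs[0] + 0.001 * inputs[1] + inputs[2] ** 2

def importance(f, inputs, eps=1e-6):
    ranked = []
    for i in range(len(inputs)):
        hi, lo = list(inputs), list(inputs)
        hi[i] += eps
        lo[i] -= eps
        deriv = (f(hi) - f(lo)) / (2 * eps)   # ~ d f / d x_i
        ranked.append((i, abs(deriv)))
    # A larger |derivative| means the input matters more for this output.
    return sorted(ranked, key=lambda p: p[1], reverse=True)

if __name__ == "__main__":
    print(importance(output, [2.0, 5.0, 4.0]))
    # input 2 (derivative ~8) and input 0 (~3) dominate input 1 (~0.001)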
|
Abdalkareem, Rabe |
ESEC/FSE '17: "Why Do Developers Use Trivial ..."
Why Do Developers Use Trivial Packages? An Empirical Case Study on npm
Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad Shihab (Concordia University, Canada) Code reuse is traditionally seen as good practice. Recent trends have pushed the concept of code reuse to an extreme, by using packages that implement simple and trivial tasks, which we call `trivial packages'. A recent incident where a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix made it imperative to question the growing use of trivial packages. Therefore, in this paper, we mine more than 230,000 npm packages and 38,000 JavaScript applications in order to study the prevalence of trivial packages. We found that trivial packages are common and are increasing in popularity, making up 16.8% of the studied npm packages. We performed a survey with 88 Node.js developers who use trivial packages to understand the reasons and drawbacks of their use. Our survey revealed that trivial packages are used because they are perceived to be well implemented and tested pieces of code. However, developers are concerned about maintaining and the risks of breakages due to the extra dependencies trivial packages introduce. To objectively verify the survey results, we empirically validate the most cited reason and drawback and find that, contrary to developers' beliefs, only 45.2% of trivial packages even have tests. However, trivial packages appear to be `deployment tested' and to have similar test, usage and community interest as non-trivial packages. On the other hand, we found that 11.5% of the studied trivial packages have more than 20 dependencies. Hence, developers should be careful about which trivial packages they decide to use. @InProceedings{ESEC/FSE17p385, author = {Rabe Abdalkareem and Olivier Nourry and Sultan Wehaibi and Suhaib Mujahid and Emad Shihab}, title = {Why Do Developers Use Trivial Packages? An Empirical Case Study on npm}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {385--395}, doi = {}, year = {2017}, } ESEC/FSE '17-SRC: "Reasons and Drawbacks of using ..." Reasons and Drawbacks of using Trivial npm Packages: The Developers' Perspective Rabe Abdalkareem (Concordia University, Canada) Code reuse is traditionally seen as good practice. Recent trends have pushed the idea of code reuse to an extreme, by using packages that implement simple and trivial tasks, which we call ‘trivial packages’. A recent incident where a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix, put the spotlight on whether using trivial packages should be encouraged. Therefore, in this research, we mine more than 230,000 npm packages and 38,000 JavaScript projects in order to study the prevalence of trivial packages. We found that trivial packages are common, making up 16.8% of the studied npm packages. We performed a survey with 88 Node.js developers who use trivial packages to understand the reasons for and drawbacks of their use. We found that trivial packages are used because they are perceived to be well-implemented and tested pieces of code. However, developers are concerned about maintaining and the risks of breakages due to the extra dependencies trivial packages introduce. @InProceedings{ESEC/FSE17p1062, author = {Rabe Abdalkareem}, title = {Reasons and Drawbacks of using Trivial npm Packages: The Developers' Perspective}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1062--1064}, doi = {}, year = {2017}, } |
|
Abusair, Mai |
ESEC/FSE '17-DOC: "User- and Analysis-Driven ..."
User- and Analysis-Driven Context Aware Software Development in Mobile Computing
Mai Abusair (University of L'Aquila, Italy) Mobile applications may benefit from context awareness since they are exposed to context changes during their execution and their success depends on the user-perceived quality. Context awareness requires context monitoring and system adaptation; these two tasks are very expensive, especially in mobile applications. Our research aims at developing a methodology that enables effective context awareness techniques for mobile applications, allowing the mobile app to adapt to context changes so that the desired system quality properties are met and user satisfaction is maximized. Here, effective means selecting a minimum set of context variables to monitor and a minimum set of adaptive tactics to inject into mobile applications that allow us to guarantee the required software quality and to maximize user satisfaction. In this paper, we show the devised methodology on a motivating example, detailing the ongoing work. @InProceedings{ESEC/FSE17p1022, author = {Mai Abusair}, title = {User- and Analysis-Driven Context Aware Software Development in Mobile Computing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1022--1025}, doi = {}, year = {2017}, } |
|
Adzic, Gojko |
ESEC/FSE '17-IND: "Serverless Computing: Economic ..."
Serverless Computing: Economic and Architectural Impact
Gojko Adzic and Robert Chatley (Neuri Consulting, UK; Imperial College London, UK) Amazon Web Services unveiled their ‘Lambda’ platform in late 2014. Since then, each of the major cloud computing infrastructure providers has released services supporting a similar style of deployment and operation, where rather than deploying and running monolithic services, or dedicated virtual machines, users are able to deploy individual functions, and pay only for the time that their code is actually executing. These technologies are gathered together under the marketing term ‘serverless’ and the providers suggest that they have the potential to significantly change how client/server applications are designed, developed and operated. This paper presents two industrial case studies of early adopters, showing how migrating an application to the Lambda deployment architecture reduced hosting costs – by between 66% and 95% – and discusses how further adoption of this trend might influence common software architecture design practices. @InProceedings{ESEC/FSE17p884, author = {Gojko Adzic and Robert Chatley}, title = {Serverless Computing: Economic and Architectural Impact}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {884--889}, doi = {}, year = {2017}, } |
|
Aguirre, Nazareno |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired in dynamic logic, for the description of dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models resorts to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries and further details available in: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info ESEC/FSE '17-DEMO: "CLTSA: Labelled Transition ..." CLTSA: Labelled Transition System Analyser with Counting Fluent Support Germán Regis, Renzo Degiovanni, Nicolas D'Ippolito, and Nazareno Aguirre (Universidad Nacional de Río Cuarto, Argentina; CONICET, Argentina; University of Buenos Aires, Argentina) In this paper we present CLTSA (Counting Fluents Labelled Transition System Analyser), an extension of LTSA (Labelled Transition System Analyser) that incorporates counting fluents, a useful mechanism to capture properties related to counting events. Counting fluent temporal logic is a formalism for specifying properties of event-based systems, which complements the notion of fluent by the related concept of counting fluent. While fluents allow us to capture boolean properties of the behaviour of a reactive system, counting fluents are numerical values, that enumerate event occurrences. The tool supports a superset of FSP (Finite State Processes), that allows one to define LTL properties involving counting fluents, which can be model checked on FSP processes. Detailed information can be found at http://dc.exa.unrc.edu.ar/tools/cltsa. @InProceedings{ESEC/FSE17p979, author = {Germán Regis and Renzo Degiovanni and Nicolas D'Ippolito and Nazareno Aguirre}, title = {CLTSA: Labelled Transition System Analyser with Counting Fluent Support}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {979--983}, doi = {}, year = {2017}, } Video Info |
|
Ahmadian, Amir Shayan |
ESEC/FSE '17-DEMO: "Model-Based Privacy and Security ..."
Model-Based Privacy and Security Analysis with CARiSMA
Amir Shayan Ahmadian, Sven Peldszus, Qusai Ramadan, and Jan Jürjens (University of Koblenz-Landau, Germany; Fraunhofer ISST, Germany) We present CARiSMA, a tool that is originally designed to support model-based security analysis of IT systems. In our recent work, we added several new functionalities to CARiSMA to support the privacy of personal data. Moreover, we introduced a mechanism to assist the system designers to perform a CARiSMA analysis by automatically initializing an appropriate CARiSMA analysis concerning security and privacy requirements. The motivation for our work is Article 25 of Regulation (EU) 2016/679, which requires appropriate technical and organizational controls must be implemented for ensuring that, by default, the processing of personal data complies with the principles on processing of personal data. This implies that initially IT systems must be analyzed to verify if such principles are respected. System models allow the system developers to handle the complexity of systems and to focus on key aspects such as privacy and security. CARiSMA is available at http://carisma.umlsec.de and our screen cast at https://youtu.be/b5zeHig3ARw. @InProceedings{ESEC/FSE17p989, author = {Amir Shayan Ahmadian and Sven Peldszus and Qusai Ramadan and Jan Jürjens}, title = {Model-Based Privacy and Security Analysis with CARiSMA}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {989--993}, doi = {}, year = {2017}, } Video Info |
|
Ahmed, Umair Z. |
ESEC/FSE '17: "A Feasibility Study of Using ..."
A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments
Jooyong Yi, Umair Z. Ahmed, Amey Karkare, Shin Hwei Tan, and Abhik Roychoudhury (Innopolis University, Russia; IIT Kanpur, India; National University of Singapore, Singapore) Despite the fact an intelligent tutoring system for programming (ITSP) education has long attracted interest, its widespread use has been hindered by the difficulty of generating personalized feedback automatically. Meanwhile, automated program repair (APR) is an emerging new technology that automatically fixes software bugs, and it has been shown that APR can fix the bugs of large real-world software. In this paper, we study the feasibility of marrying intelligent programming tutoring and APR. We perform our feasibility study with four state-of-the-art APR tools (GenProg, AE, Angelix, and Prophet), and 661 programs written by the students taking an introductory programming course. We found that when APR tools are used out of the box, only about 30% of the programs in our dataset are repaired. This low repair rate is largely due to the student programs often being significantly incorrect — in contrast, professional software for which APR was successfully applied typically fails only a small portion of tests. To bridge this gap, we adopt in APR a new repair policy akin to the hint generation policy employed in the existing ITSP. This new repair policy admits partial repairs that address part of failing tests, which results in 84% improvement of repair rate. We also performed a user study with 263 novice students and 37 graders, and identified an understudied problem; while novice students do not seem to know how to effectively make use of generated repairs as hints, the graders do seem to gain benefits from repairs. @InProceedings{ESEC/FSE17p740, author = {Jooyong Yi and Umair Z. Ahmed and Amey Karkare and Shin Hwei Tan and Abhik Roychoudhury}, title = {A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {740--751}, doi = {}, year = {2017}, } Info Artifacts Functional |
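The repair policy described in this abstract admits partial repairs that address only part of the failing tests. A minimal sketch of such an acceptance criterion follows; the test-runner interface and names are assumptions for illustration and do not reflect the actual tooling (GenProg, AE, Angelix, Prophet) used in the study.

# Sketch of a "partial repair" acceptance policy: accept a candidate patch
# if it makes at least one previously failing test pass without breaking
# any previously passing test (a full repair is the special case where no
# failing test remains).

def accept_partial_repair(run_test, tests, patched_program, baseline):
    results = {t: run_test(patched_program, t) for t in tests}
    newly_passing = [t for t in tests if not baseline[t] and results[t]]
    regressions = [t for t in tests if baseline[t] and not results[t]]
    return len(newly_passing) > 0 and len(regressions) == 0

if __name__ == "__main__":
    tests = ["t1", "t2", "t3"]
    baseline = {"t1": True, "t2": False, "t3": False}   # program fails t2, t3
    fake_runner = lambda prog, t: {"t1": True, "t2": True, "t3": False}[t]
    print(accept_partial_repair(fake_runner, tests, "patched.c", baseline))  # True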
|
Albarghouthi, Aws |
ESEC/FSE '17: "Discovering Relational Specifications ..."
Discovering Relational Specifications
Calvin Smith, Gabriel Ferns, and Aws Albarghouthi (University of Wisconsin-Madison, USA) Formal specifications of library functions play a critical role in a number of program analysis and development tasks. We present Bach, a technique for discovering likely relational specifications from data describing input–output behavior of a set of functions comprising a library or a program. Relational specifications correlate different executions of different functions; for instance, commutativity, transitivity, equivalence of two functions, etc. Bach combines novel insights from program synthesis and databases to discover a rich array of specifications. We apply Bach to learn specifications from data generated for a number of standard libraries. Our experimental evaluation demonstrates Bach’s ability to learn useful and deep specifications in a small amount of time. @InProceedings{ESEC/FSE17p616, author = {Calvin Smith and Gabriel Ferns and Aws Albarghouthi}, title = {Discovering Relational Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {616--626}, doi = {}, year = {2017}, } Best-Paper Award |
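A relational specification such as commutativity can be checked against observed input-output data in a few lines; the sketch below illustrates only this mining idea and is not Bach, which searches a much richer space of relations over databases of recorded executions.

# Sketch: decide whether a binary function is "likely commutative" on the
# evidence of a finite sample of inputs. A relation that survives all
# observed data becomes a candidate (likely) specification.

from itertools import product

def likely_commutative(f, sample):
    return all(f(a, b) == f(b, a) for a, b in product(sample, repeat=2))

if __name__ == "__main__":
    sample = [-3, 0, 1, 2, 5]
    print(likely_commutative(lambda a, b: a + b, sample))  # True
    print(likely_commutative(lambda a, b: a - b, sample))  # False (counterexample found)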
|
Aliabadi, Maryam Raiyat |
ESEC/FSE '17: "ARTINALI: Dynamic Invariant ..."
ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security
Maryam Raiyat Aliabadi, Amita Ajith Kamath, Julien Gascon-Samson, and Karthik Pattabiraman (University of British Columbia, Canada; National Institute of Technology Karnataka, India) Cyber-Physical Systems (CPSes) are being widely deployed in security critical scenarios such as smart homes and medical devices. Unfortunately, the connectedness of these systems and their relative lack of security measures makes them ripe targets for attacks. Specification-based Intrusion Detection Systems (IDS) have been shown to be effective for securing CPSs. Unfortunately, deriving invariants for capturing the specifications of CPS systems is a tedious and error-prone process. Therefore, it is important to dynamically monitor the CPS system to learn its common behaviors and formulate invariants for detecting security attacks. Existing techniques for invariant mining only incorporate data and events, but not time. However, time is central to most CPS systems, and hence incorporating time in addition to data and events, is essential for achieving low false positives and false negatives. This paper proposes ARTINALI, which mines dynamic system properties by incorporating time as a first-class property of the system. We build ARTINALI-based Intrusion Detection Systems (IDSes) for two CPSes, namely smart meters and smart medical devices, and measure their efficacy. We find that the ARTINALI-based IDSes significantly reduce the ratio of false positives and false negatives by 16 to 48% (average 30.75%) and 89 to 95% (average 93.4%) respectively over other dynamic invariant detection tools. @InProceedings{ESEC/FSE17p349, author = {Maryam Raiyat Aliabadi and Amita Ajith Kamath and Julien Gascon-Samson and Karthik Pattabiraman}, title = {ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {349--361}, doi = {}, year = {2017}, } |
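The abstract's argument that time must be a first-class part of mined invariants can be illustrated with a small sketch that learns a maximum allowed delay between two events from a benign trace and flags later traces that exceed it. The event names, millisecond timestamps, and single-invariant setup are assumptions for illustration, not ARTINALI's actual property classes.

# Illustrative sketch of a data/event/time invariant: learn the maximum
# observed delay between a "request" event and the following "log" event
# from a benign training trace, then report traces that violate the bound.

def mine_max_gap(trace, first, second):
    gaps, start = [], None
    for t, event in trace:                 # trace: list of (timestamp_ms, event)
        if event == first:
            start = t
        elif event == second and start is not None:
            gaps.append(t - start)
            start = None
    return max(gaps) if gaps else None

def respects_gap(trace, first, second, max_gap):
    start = None
    for t, event in trace:
        if event == first:
            start = t
        elif event == second and start is not None:
            if t - start > max_gap:
                return False               # invariant violated: flag for the IDS
            start = None
    return True

if __name__ == "__main__":
    benign = [(0, "request"), (200, "log"), (1000, "request"), (1300, "log")]
    bound = mine_max_gap(benign, "request", "log")                   # 300 ms
    suspicious = [(0, "request"), (2500, "log")]
    print(bound, respects_gap(suspicious, "request", "log", bound))  # 300 False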
|
Alrajeh, Dalal |
ESEC/FSE '17: "On Evidence Preservation Requirements ..."
On Evidence Preservation Requirements for Forensic-Ready Systems
Dalal Alrajeh, Liliana Pasquale, and Bashar Nuseibeh (Imperial College London, UK; University College Dublin, Ireland; Open University, UK; Lero, Ireland) Forensic readiness denotes the capability of a system to support digital forensic investigations of potential, known incidents by preserving in advance data that could serve as evidence explaining how an incident occurred. Given the increasing rate at which (potentially criminal) incidents occur, designing software systems that are forensic-ready can facilitate and reduce the costs of digital forensic investigations. However, to date, little or no attention has been given to how forensic-ready software systems can be designed systematically. In this paper we propose to explicitly represent evidence preservation requirements prescribing preservation of the minimal amount of data that would be relevant to a future digital investigation. We formalise evidence preservation requirements and propose an approach for synthesising specifications for systems to meet these requirements. We present our prototype implementation—based on a satisfiability solver and a logic-based learner—which we use to evaluate our approach, applying it to two digital forensic corpora. Our evaluation suggests that our approach preserves relevant data that could support hypotheses of potential incidents. Moreover, it enables significant reduction in the volume of data that would need to be examined during an investigation. @InProceedings{ESEC/FSE17p559, author = {Dalal Alrajeh and Liliana Pasquale and Bashar Nuseibeh}, title = {On Evidence Preservation Requirements for Forensic-Ready Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {559--569}, doi = {}, year = {2017}, } |
|
Amann, Sven |
ESEC/FSE '17: "CodeMatch: Obfuscation Won't ..."
CodeMatch: Obfuscation Won't Conceal Your Repackaged App
Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, and Mira Mezini (TU Darmstadt, Germany) An established way to steal the income of app developers, or to trick users into installing malware, is the creation of repackaged apps. These are clones of – typically – successful apps. To conceal their nature, they are often obfuscated by their creators. But, given that it is a common best practice to obfuscate apps, a trivial identification of repackaged apps is not possible. The problem is further intensified by the prevalent usage of libraries. In many apps, the size of the overall code base is basically determined by the used libraries. Therefore, two apps, where the obfuscated code bases are very similar, do not have to be repackages of each other. To reliably detect repackaged apps, we propose a two step approach which first focuses on the identification and removal of the library code in obfuscated apps. This approach – LibDetect – relies on code representations which abstract over several parts of the underlying bytecode to be resilient against certain obfuscation techniques. Using this approach, we are able to identify on average 70% more used libraries per app than previous approaches. After the removal of an app’s library code, we then fuzzy hash the most abstract representation of the remaining app code to ensure that we can identify repackaged apps even if very advanced obfuscation techniques are used. This makes it possible to identify repackaged apps. Using our approach, we found that ≈ 15% of all apps in Android app stores are repackages @InProceedings{ESEC/FSE17p638, author = {Leonid Glanz and Sven Amann and Michael Eichberg and Michael Reif and Ben Hermann and Johannes Lerch and Mira Mezini}, title = {CodeMatch: Obfuscation Won't Conceal Your Repackaged App}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {638--648}, doi = {}, year = {2017}, } Info |
|
Amidon, Peter |
ESEC/FSE '17: "Automatic Inference of Code ..."
Automatic Inference of Code Transforms for Patch Generation
Fan Long, Peter Amidon, and Martin Rinard (Massachusetts Institute of Technology, USA; University of California at San Diego, USA) We present a new system, Genesis, that processes human patches to automatically infer code transforms for automatic patch generation. We present results that characterize the effectiveness of the Genesis inference algorithms and the complete Genesis patch generation system working with real-world patches and defects collected from 372 Java projects. To the best of our knowledge, Genesis is the first system to automatically infer patch generation transforms or candidate patch search spaces from previous successful patches. @InProceedings{ESEC/FSE17p727, author = {Fan Long and Peter Amidon and Martin Rinard}, title = {Automatic Inference of Code Transforms for Patch Generation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {727--739}, doi = {}, year = {2017}, } Info Artifacts Functional |
|
Angelov, Samuil |
ESEC/FSE '17-IND: "Reference Architectures and ..."
Reference Architectures and Scrum: Friends or Foes?
Matthias Galster, Samuil Angelov, Silverio Martínez-Fernández, and Dan Tofan (University of Canterbury, New Zealand; Fontys University of Applied Sciences, Netherlands; Fraunhofer IESE, Germany) Software reference architectures provide templates and guidelines for designing systems in a particular domain. Companies use them to achieve interoperability of (parts of) their software, standardization, and faster development. In contrast to system-specific software architectures that "emerge" during development, reference architectures dictate significant parts of the software design early on. Agile software development frameworks (such as Scrum) acknowledge changing software requirements and the need to adapt the software design accordingly. In this paper, we present lessons learned about how reference architectures interact with Scrum (the most frequently used agile process framework). These lessons are based on observing software development projects in five companies. We found that reference architectures can support good practice in Scrum: They provide enough design upfront without too much effort, reduce documentation activities, facilitate knowledge sharing, and contribute to "architectural thinking" of developers. However, reference architectures can impose risks or even threats to the success of Scrum (e.g., to self-organizing and motivated teams). @InProceedings{ESEC/FSE17p896, author = {Matthias Galster and Samuil Angelov and Silverio Martínez-Fernández and Dan Tofan}, title = {Reference Architectures and Scrum: Friends or Foes?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {896--901}, doi = {}, year = {2017}, } |
|
Antonopoulos, Timos |
ESEC/FSE '17: "Counterexample-Guided Approach ..."
Counterexample-Guided Approach to Finding Numerical Invariants
ThanhVu Nguyen, Timos Antonopoulos, Andrew Ruef, and Michael Hicks (University of Nebraska-Lincoln, USA; Yale University, USA; University of Maryland, USA) Numerical invariants, e.g., relationships among numerical variables in a program, represent a useful class of properties to analyze programs. General polynomial invariants represent more complex numerical relations, but they are often required in many scientific and engineering applications. We present NumInv, a tool that implements a counterexample-guided invariant generation (CEGIR) technique to automatically discover numerical invariants, which are polynomial equality and inequality relations among numerical variables. This CEGIR technique infers candidate invariants from program traces and then checks them against the program source code using the KLEE test-input generation tool. If the invariants are incorrect KLEE returns counterexample traces, which help the dynamic inference obtain better results. Existing CEGIR approaches often require sound invariants, however NumInv sacrifices soundness and produces results that KLEE cannot refute within certain time bounds. This design and the use of KLEE as a verifier allow NumInv to discover useful and important numerical invariants for many challenging programs. Preliminary results show that NumInv generates required invariants for understanding and verifying correctness of programs involving complex arithmetic. We also show that NumInv discovers polynomial invariants that capture precise complexity bounds of programs used to benchmark existing static complexity analysis techniques. Finally, we show that NumInv performs competitively comparing to state of the art numerical invariant analysis tools. @InProceedings{ESEC/FSE17p605, author = {ThanhVu Nguyen and Timos Antonopoulos and Andrew Ruef and Michael Hicks}, title = {Counterexample-Guided Approach to Finding Numerical Invariants}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {605--615}, doi = {}, year = {2017}, } |
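The counterexample-guided invariant generation (CEGIR) loop described in this abstract can be sketched schematically: infer a candidate invariant from traces, look for a counterexample, and feed it back. NumInv infers polynomial equalities and inequalities and uses KLEE as the checker; the toy program, inference rule, and checker below are assumptions that only illustrate the shape of the loop.

# Schematic sketch of a CEGIR loop (not NumInv itself).

def infer_candidate(traces):
    x, y = traces[0]
    return y // x if x else 1                 # guess y == c * x from one trace

def find_counterexample(c, program, inputs):
    for x in inputs:
        y = program(x)
        if y != c * x:
            return (x, y)                     # a trace refuting the candidate
    return None

def cegir(program, seed_inputs, rounds=10):
    traces = [(x, program(x)) for x in seed_inputs]
    for _ in range(rounds):
        c = infer_candidate(traces)
        cex = find_counterexample(c, program, range(1, 50))
        if cex is None:
            return f"y == {c} * x"            # no refutation within the budget
        traces.insert(0, cex)                 # refine inference with the counterexample
    return None

if __name__ == "__main__":
    print(cegir(lambda x: 3 * x, seed_inputs=[0]))   # converges to "y == 3 * x"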
|
Apel, Sven |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info ESEC/FSE '17: "Attributed Variability Models: ..." Attributed Variability Models: Outside the Comfort Zone Norbert Siegmund, Stefan Sobernig, and Sven Apel (Bauhaus-University Weimar, Germany; WU Vienna, Austria; University of Passau, Germany) Variability models are often enriched with attributes, such as performance, that encode the influence of features on the respective attribute. In spite of their importance, there are only few attributed variability models available that have attribute values obtained from empirical, real-world observations and that cover interactions between features. But, what does it mean for research and practice when staying in the comfort zone of developing algorithms and tools in a setting where artificial attribute values are used and where interactions are neglected? This is the central question that we want to answer here. To leave the comfort zone, we use a combination of kernel density estimation and a genetic algorithm to rescale a given (real-world) attribute-value profile to a given variability model. To demonstrate the influence and relevance of realistic attribute values and interactions, we present a replication of a widely recognized, third-party study, into which we introduce realistic attribute values and interactions. We found statistically significant differences between the original study and the replication. We infer lessons learned to conduct experiments that involve attributed variability models. We also provide the accompanying tool Thor for generating attribute values including interactions. Our solution is shown to be agnostic about the given input distribution and to scale to large variability models. 
@InProceedings{ESEC/FSE17p268, author = {Norbert Siegmund and Stefan Sobernig and Sven Apel}, title = {Attributed Variability Models: Outside the Comfort Zone}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {268--278}, doi = {}, year = {2017}, } Info ESEC/FSE '17: "Using Bad Learners to Find ..." Using Bad Learners to Find Good Configurations Vivek Nair, Tim Menzies, Norbert Siegmund, and Sven Apel (North Carolina State University, USA; Bauhaus-University Weimar, Germany; University of Passau, Germany) Finding the optimally performing configuration of a software system for a given setting is often challenging. Recent approaches address this challenge by learning performance models based on a sample set of configurations. However, building an accurate performance model can be very expensive (and is often infeasible in practice). The central insight of this paper is that exact performance values (e.g., the response time of a software system) are not required to rank configurations and to identify the optimal one. As shown by our experiments, performance models that are cheap to learn but inaccurate (with respect to the difference between actual and predicted performance) can still be used to rank configurations and hence find the optimal configuration. This novel rank-based approach allows us to significantly reduce the cost (in terms of the number of measurements of sample configurations) as well as the time required to build performance models. We evaluate our approach with 21 scenarios based on 9 software systems and demonstrate that our approach is beneficial in 16 scenarios; for the remaining 5 scenarios, an accurate model can be built by using very few samples anyway, without the need for a rank-based approach. @InProceedings{ESEC/FSE17p257, author = {Vivek Nair and Tim Menzies and Norbert Siegmund and Sven Apel}, title = {Using Bad Learners to Find Good Configurations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2017}, } |
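The central insight of "Using Bad Learners to Find Good Configurations", that a model can be far off in absolute terms yet still order configurations correctly, is easy to see in a toy sketch. Both performance functions below are invented for illustration; the paper's approach learns its cheap models from sampled measurements.

# Sketch of the rank-based idea: a "bad" model that is wildly off in absolute
# terms can still induce the same ordering as the ground truth, which is all
# that is needed to pick the best configuration.

def true_perf(cfg):            # ground truth: expensive to measure in practice
    return 10.0 + 4.0 * cfg["cache"] - 2.5 * cfg["threads"]

def cheap_model(cfg):          # inaccurate model: wrong offset and scale
    return 100.0 + 8.0 * cfg["cache"] - 5.0 * cfg["threads"]

configs = [{"cache": c, "threads": t} for c in (0, 1) for t in (1, 2, 4)]

best_by_truth = min(configs, key=true_perf)
best_by_model = min(configs, key=cheap_model)
print(best_by_truth == best_by_model)                        # True: same winner,
print(true_perf(best_by_truth), cheap_model(best_by_truth))  # despite very different values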
|
Atlee, Joanne M. |
ESEC/FSE '17: "Continuous Variable-Specific ..."
Continuous Variable-Specific Resolutions of Feature Interactions
M. Hadi Zibaeenejad, Chi Zhang, and Joanne M. Atlee (University of Waterloo, Canada) Systems that are assembled from independently developed features suffer from feature interactions, in which features affect one another’s behaviour in surprising ways. The Feature Interaction Problem results from trying to implement an appropriate resolution for each interaction within each possible context, because the number of possible contexts to consider increases exponentially with the number of features in the system. Resolution strategies aim to combat the Feature Interaction Problem by offering default strategies that resolve entire classes of interactions, thereby reducing the work needed to resolve lots of interactions. However most such approaches employ coarse-grained resolution strategies (e.g., feature priority) or a centralized arbitrator. Our work focuses on employing variable-specific default-resolution strategies that aim to resolve at runtime features’ conflicting actions on a system’s outputs. In this paper, we extend prior work to enable co-resolution of interactions on coupled output variables and to promote smooth continuous resolutions over execution paths. We implemented our approach within the PreScan simulator and performed a case study involving 15 automotive features; this entailed our devising and implementing three resolution strategies for three output variables. The results of the case study show that the approach produces smooth and continuous resolutions of interactions throughout interesting scenarios. @InProceedings{ESEC/FSE17p408, author = {M. Hadi Zibaeenejad and Chi Zhang and Joanne M. Atlee}, title = {Continuous Variable-Specific Resolutions of Feature Interactions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {408--418}, doi = {}, year = {2017}, } Info |
|
Aydin, Abdulbaki |
ESEC/FSE '17: "Constraint Normalization and ..."
Constraint Normalization and Parameterized Caching for Quantitative Program Analysis
Tegan Brennan, Nestan Tsiskaridze, Nicolás Rosner, Abdulbaki Aydin, and Tevfik Bultan (University of California at Santa Barbara, USA) Symbolic program analysis techniques rely on satisfiability-checking constraint solvers, while quantitative program analysis techniques rely on model-counting constraint solvers. Hence, the efficiency of satisfiability checking and model counting is crucial for efficiency of modern program analysis techniques. In this paper, we present a constraint caching framework to expedite potentially expensive satisfiability and model-counting queries. Integral to this framework is our new constraint normalization procedure under which the cardinality of the solution set of a constraint, but not necessarily the solution set itself, is preserved. We extend these constraint normalization techniques to string constraints in order to support analysis of string-manipulating code. A group-theoretic framework which generalizes earlier results on constraint normalization is used to express our normalization techniques. We also present a parameterized caching approach where, in addition to storing the result of a model-counting query, we also store a model-counter object in the constraint store that allows us to efficiently recount the number of satisfying models for different maximum bounds. We implement our caching framework in our tool Cashew, which is built as an extension of the Green caching framework, and integrate it with the symbolic execution tool Symbolic PathFinder (SPF) and the model-counting constraint solver ABC. Our experiments show that constraint caching can significantly improve the performance of symbolic and quantitative program analyses. For instance, Cashew can normalize the 10,104 unique constraints in the SMC/Kaluza benchmark down to 394 normal forms, achieve a 10x speedup on the SMC/Kaluza-Big dataset, and an average 3x speedup in our SPF-based side-channel analysis experiments. @InProceedings{ESEC/FSE17p535, author = {Tegan Brennan and Nestan Tsiskaridze and Nicolás Rosner and Abdulbaki Aydin and Tevfik Bultan}, title = {Constraint Normalization and Parameterized Caching for Quantitative Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {535--546}, doi = {}, year = {2017}, } Info Artifacts Reusable |
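The combination described in this abstract, caching model counts under a normalization that preserves the cardinality of the solution set rather than the set itself, can be shown with a toy sketch: two constraints that differ only in variable names share one cache entry. The constraint representation, the brute-force counter, and the renaming-only normal form are assumptions for illustration; Cashew's normalization and its parameterized model-counter objects are far richer.

# Toy count-preserving caching: normalize a constraint by renaming variables
# canonically (this changes the solution set but not its cardinality), then
# cache the model count under that normal form.

from itertools import product

def normalize(constraint):
    op, a, b = constraint                  # e.g. ("<", "x", "y") over 8-bit variables
    names = {}
    for v in (a, b):
        names.setdefault(v, f"v{len(names)}")
    return (op, names[a], names[b])

def count_models(constraint, domain=range(256)):
    op, a, b = constraint
    ops = {"<": lambda x, y: x < y, "==": lambda x, y: x == y}
    if a == b:                             # same variable on both sides
        return sum(1 for x in domain if ops[op](x, x))
    return sum(1 for x, y in product(domain, repeat=2) if ops[op](x, y))

cache = {}

def cached_count(constraint):
    key = normalize(constraint)
    if key not in cache:
        cache[key] = count_models(key)     # the expensive call happens once per normal form
    return cache[key]

if __name__ == "__main__":
    print(cached_count(("<", "x", "y")))   # 32640, computed
    print(cached_count(("<", "a", "b")))   # 32640, cache hit (same normal form)
    print(len(cache))                      # 1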
|
Bagherzadeh, Mojtaba |
ESEC/FSE '17: "Model-Level, Platform-Independent ..."
Model-Level, Platform-Independent Debugging in the Context of the Model-Driven Development of Real-Time Systems
Mojtaba Bagherzadeh, Nicolas Hili, and Juergen Dingel (Queen's University, Canada) Providing proper support for debugging models at model-level is one of the main barriers to a broader adoption of Model Driven Development (MDD). In this paper, we focus on the use of MDD for the development of real-time embedded systems (RTE). We introduce a new platform-independent approach to implement model-level debuggers. We describe how to realize support for model-level debugging entirely in terms of the modeling language and show how to implement this support in terms of a model-to-model transformation. Key advantages of the approach over existing work are that (1) it does not require a program debugger for the code generated from the model, and that (2) any changes to, e.g., the code generator, the target language, or the hardware platform leave the debugger completely unaffected. We also describe an implementation of the approach in the context of Papyrus-RT, an open source MDD tool based on the modeling language UML-RT. We summarize the results of the use of our model-based debugger on several use cases to determine its overhead in terms of size and performance. Despite being a prototype, the performance overhead is in the order of microseconds, while the size overhead is comparable with that of GDB, the GNU Debugger. @InProceedings{ESEC/FSE17p419, author = {Mojtaba Bagherzadeh and Nicolas Hili and Juergen Dingel}, title = {Model-Level, Platform-Independent Debugging in the Context of the Model-Driven Development of Real-Time Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {419--430}, doi = {}, year = {2017}, } Video Info Artifacts Functional |
|
Barnett, Michael |
ESEC/FSE '17-IND: "Static Analysis for Optimizing ..."
Static Analysis for Optimizing Big Data Queries
Diego Garbervetsky, Zvonimir Pavlinovic, Michael Barnett, Madanlal Musuvathi, Todd Mytkowicz, and Edgardo Zoppi (University of Buenos Aires, Argentina; New York University, USA; Microsoft Research, USA) Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect what columns of an input table are never used or pass-through a UDO unchanged. This information can be used to significantly improve execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. @InProceedings{ESEC/FSE17p932, author = {Diego Garbervetsky and Zvonimir Pavlinovic and Michael Barnett and Madanlal Musuvathi and Todd Mytkowicz and Edgardo Zoppi}, title = {Static Analysis for Optimizing Big Data Queries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {932--937}, doi = {}, year = {2017}, } |
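The two column properties established by the analyses in this abstract, columns a UDO never uses and columns it passes through unchanged, can be illustrated dynamically on a few example rows. The hypothetical udo, the rows, and the perturbation trick below are assumptions for illustration only; the paper's analyses prove these properties statically and soundly.

# Illustration of "unused" and "pass-through" columns for a user-defined
# operator (UDO): a column is unused if perturbing it never changes the
# output, and pass-through if it reaches the output unchanged.

def udo(row):
    # Hypothetical UDO: copies 'id', derives 'flag' from 'score', ignores 'comment'.
    return {"id": row["id"], "flag": row["score"] > 10}

def perturb(value):
    return value + 100 if isinstance(value, (int, float)) else str(value) + "_x"

def column_report(rows, columns):
    base = [udo(r) for r in rows]
    report = {}
    for col in columns:
        changed = [udo({**r, col: perturb(r[col])}) for r in rows]
        report[col] = {
            "unused": all(b == c for b, c in zip(base, changed)),
            "pass_through": all(col in b and b[col] == r[col]
                                for r, b in zip(rows, base)),
        }
    return report

if __name__ == "__main__":
    rows = [{"id": 1, "score": 4, "comment": "ok"},
            {"id": 2, "score": 42, "comment": "late"}]
    print(column_report(rows, ["id", "score", "comment"]))
    # id: pass-through; comment: unused; score: used but not passed through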
|
Batory, Don |
ESEC/FSE '17: "Finding Near-Optimal Configurations ..."
Finding Near-Optimal Configurations in Product Lines by Random Sampling
Jeho Oh, Don Batory, Margaret Myers, and Norbert Siegmund (University of Texas at Austin, USA; Bauhaus-University Weimar, Germany) Software Product Lines (SPLs) are highly configurable systems. This raises the challenge to find optimal performing configurations for an anticipated workload. As SPL configuration spaces are huge, it is infeasible to benchmark all configurations to find an optimal one. Prior work focused on building performance models to predict and optimize SPL configurations. Instead, we randomly sample and recursively search a configuration space directly to find near-optimal configurations without constructing a prediction model. Our algorithms are simpler and have higher accuracy and efficiency. @InProceedings{ESEC/FSE17p61, author = {Jeho Oh and Don Batory and Margaret Myers and Norbert Siegmund}, title = {Finding Near-Optimal Configurations in Product Lines by Random Sampling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {61--71}, doi = {}, year = {2017}, } |
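The overall shape of the search described in this abstract, sample configurations at random, keep the incumbent, and recursively narrow the space around it, is sketched below. The benchmark function and the narrowing rule are invented for illustration; they are not the paper's algorithms or their statistical machinery.

# Sketch: find a near-optimal configuration by direct random sampling and a
# simple recursive narrowing of the search space, with no performance model.

import random

def benchmark(cfg):
    # Stand-in for actually measuring the configured system under a workload.
    return 50 - 6 * cfg["cache"] + 3 * cfg["threads"] + random.random()

def random_config(space):
    return {k: random.choice(v) for k, v in space.items()}

def near_optimal(space, rounds=3, samples_per_round=20):
    space = {k: list(v) for k, v in space.items()}
    best_cfg, best_cost = None, float("inf")
    for _ in range(rounds):
        for _ in range(samples_per_round):
            cfg = random_config(space)
            cost = benchmark(cfg)
            if cost < best_cost:
                best_cfg, best_cost = cfg, cost
        for k in space:                       # narrow one option around the incumbent
            if len(space[k]) > 1:
                space[k] = [best_cfg[k]]
                break
    return best_cfg, best_cost

if __name__ == "__main__":
    space = {"cache": [0, 1, 2, 4], "threads": [1, 2, 4, 8]}
    print(near_optimal(space))                # a near-minimal-cost configuration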
|
Bavota, Gabriele |
ESEC/FSE '17: "Enabling Mutation Testing ..."
Enabling Mutation Testing for Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (Universidad de los Andes, Colombia; University of Lugano, Switzerland; College of William and Mary, USA; University of Sannio, Italy) Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identified a set of 38 mutation operators, and implemented an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated as compared to well-known mutation tools, and their capacity to represent real faults in Android apps. @InProceedings{ESEC/FSE17p233, author = {Mario Linares-Vásquez and Gabriele Bavota and Michele Tufano and Kevin Moran and Massimiliano Di Penta and Christopher Vendome and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Enabling Mutation Testing for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {233--244}, doi = {}, year = {2017}, } Info ESEC/FSE '17: "Detecting Missing Information ..." Detecting Missing Information in Bug Descriptions Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug description quality by alerting reporters about missing EB and S2R at reporting time.
@InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
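The detection idea in "Detecting Missing Information in Bug Descriptions", matching discourse patterns to decide whether Expected Behavior (EB) and Steps to Reproduce (S2R) are present, can be sketched minimally as below. The handful of regexes are obvious stand-ins, not the 154 patterns or the trained detectors evaluated in the paper.

# Sketch: flag a bug description that appears to lack Expected Behavior (EB)
# or Steps to Reproduce (S2R), using a couple of illustrative discourse patterns.

import re

EB_PATTERNS = [r"\bexpected\b", r"\bshould\s+(?:be|have|return|show)\b"]
S2R_PATTERNS = [r"\bsteps?\s+to\s+reproduce\b", r"^\s*\d+[.)]\s", r"\bwhen\s+i\s+click\b"]

def mentions(patterns, text):
    return any(re.search(p, text, re.IGNORECASE | re.MULTILINE) for p in patterns)

def completeness(description):
    return {"expected_behavior": mentions(EB_PATTERNS, description),
            "steps_to_reproduce": mentions(S2R_PATTERNS, description)}

if __name__ == "__main__":
    bug = "The app crashes on startup. It should show the login screen."
    print(completeness(bug))   # EB detected, S2R missing -> alert the reporter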
|
Begel, Andrew |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Bergmann, Gábor |
ESEC/FSE '17-DEMO: "The MONDO Collaboration Framework: ..."
The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems
Csaba Debreceni, Gábor Bergmann, Márton Búr, István Ráth, and Dániel Varró (Budapest University of Technology and Economics, Hungary; MTA-BME Lendület Research Group on Cyber-Physical Systems, Hungary; McGill University, Canada) Model-based systems engineering of critical cyber-physical systems necessitates effective collaboration between different stakeholders while still providing secure protection of intellectual properties of all involved parties. While engineering artifacts are frequently stored in version control repositories, secure access control is limited to file-level strategies in most existing frameworks where models are split into multiple fragments with all-or-nothing permissions, which becomes a scalability and usability bottleneck in case of complex industrial models. In this paper, we introduce the MONDO Collaboration Framework, which provides rule-based fine-grained model-level secure access control, property-based locking and automated model merge integrated over existing version control systems such as Subversion (SVN) for storage and version control. Our framework simultaneously supports offline collaboration (asynchronous checkout-modify-commit) on top of off-the-shelf modeling tools and online scenarios (GoogleDocs-style short transactions) scenarios by offering a web-based modeling frontend. Screencast Demo: https://youtu.be/Ix3CgmsYIU0 @InProceedings{ESEC/FSE17p984, author = {Csaba Debreceni and Gábor Bergmann and Márton Búr and István Ráth and Dániel Varró}, title = {The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {984--988}, doi = {}, year = {2017}, } |
|
Bernal-Cárdenas, Carlos |
ESEC/FSE '17: "Enabling Mutation Testing ..."
Enabling Mutation Testing for Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (Universidad de los Andes, Colombia; University of Lugano, Switzerland; College of William and Mary, USA; University of Sannio, Italy) Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identified a set of 38 mutation operators, and implemented an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated as compared to well-known mutation tools, and their capacity to represent real faults in Android apps. @InProceedings{ESEC/FSE17p233, author = {Mario Linares-Vásquez and Gabriele Bavota and Michele Tufano and Kevin Moran and Massimiliano Di Penta and Christopher Vendome and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Enabling Mutation Testing for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {233--244}, doi = {}, year = {2017}, } Info |
|
Bethmann, Anja |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Bianchi, Francesco A. |
ESEC/FSE '17: "Reproducing Concurrency Failures ..."
Reproducing Concurrency Failures from Crash Stacks
Francesco A. Bianchi, Mauro Pezzè, and Valerio Terragni (University of Lugano, Switzerland) Reproducing field failures is the first essential step for understanding, localizing and removing faults. Reproducing concurrency field failures is hard due to the need of synthesizing a test code jointly with a thread interleaving that induce the failure in the presence of limited information from the field. Current techniques for reproducing concurrency failures focus on identifying failure-inducing interleavings, leaving largely open the problem of synthesizing the test code that manifests such interleavings. In this paper, we present ConCrash, a technique to automatically generate test codes that reproduce concurrency failures that violate thread-safety from crash stacks, which commonly summarize the conditions of field failures. ConCrash efficiently explores the huge space of possible test codes to identify a failure-inducing one by using a suitable set of search pruning strategies. Combined with existing techniques for exploring interleavings, ConCrash automatically reproduces a given concurrency failure that violates the thread-safety of a class by identifying both a failure-inducing test code and corresponding interleaving. In the paper, we define the ConCrash approach, present a prototype implementation of ConCrash, and discuss the experimental results that we obtained on a known set of ten field failures that witness the effectiveness of the approach. @InProceedings{ESEC/FSE17p705, author = {Francesco A. Bianchi and Mauro Pezzè and Valerio Terragni}, title = {Reproducing Concurrency Failures from Crash Stacks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {705--716}, doi = {}, year = {2017}, } |
|
Bianculli, Domenico |
ESEC/FSE '17-DEMO: "JoanAudit: A Tool for Auditing ..."
JoanAudit: A Tool for Auditing Common Injection Vulnerabilities
Julian Thomé, Lwin Khin Shar, Domenico Bianculli, and Lionel C. Briand (University of Luxembourg, Luxembourg) JoanAudit is a static analysis tool to assist security auditors in auditing Web applications and Web services for common injection vulnerabilities during software development. It automatically identifies parts of the program code that are relevant for security and generates an HTML report to guide security auditors in auditing the source code in a scalable way. JoanAudit is configured with various security-sensitive input sources and sinks relevant to injection vulnerabilities and standard sanitization procedures that prevent these vulnerabilities. It can also automatically fix some cases of vulnerabilities in source code — cases where inputs are directly used in sinks without any form of sanitization — by using standard sanitization procedures. Our evaluation shows that by using JoanAudit, security auditors are required to inspect only 1% of the total code for auditing common injection vulnerabilities. The screen-cast demo is available at https://github.com/julianthome/joanaudit. @InProceedings{ESEC/FSE17p1004, author = {Julian Thomé and Lwin Khin Shar and Domenico Bianculli and Lionel C. Briand}, title = {JoanAudit: A Tool for Auditing Common Injection Vulnerabilities}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1004--1008}, doi = {}, year = {2017}, } Video Info |
|
Binkley, David |
ESEC/FSE '17: "Generalized Observational ..."
Generalized Observational Slicing for Tree-Represented Modelling Languages
Nicolas E. Gold, David Binkley, Mark Harman, Syed Islam, Jens Krinke, and Shin Yoo (University College London, UK; Loyola University Maryland, USA; University of East London, UK; KAIST, South Korea) Model-driven software engineering raises the abstraction level making complex systems easier to understand than if written in textual code. Nevertheless, large complicated software systems can have large models, motivating the need for slicing techniques that reduce the size of a model. We present a generalization of observation-based slicing that allows the criterion to be defined using a variety of kinds of observable behavior and does not require any complex dependence analysis. We apply our implementation of generalized observational slicing for tree-structured representations to Simulink models. The resulting slice might be the subset of the original model responsible for an observed failure or simply the sub-model semantically related to a classic slicing criterion. Unlike its predecessors, the algorithm is also capable of slicing embedded Stateflow state machines. A study of nine real-world models drawn from four different application domains demonstrates the effectiveness of our approach at dramatically reducing Simulink model sizes for realistic observation scenarios: for 9 out of 20 cases, the resulting model has fewer than 25% of the original model's elements. @InProceedings{ESEC/FSE17p547, author = {Nicolas E. Gold and David Binkley and Mark Harman and Syed Islam and Jens Krinke and Shin Yoo}, title = {Generalized Observational Slicing for Tree-Represented Modelling Languages}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {547--558}, doi = {}, year = {2017}, } |
|
Böhme, Marcel |
ESEC/FSE '17: "Where Is the Bug and How Is ..."
Where Is the Bug and How Is It Fixed? An Experiment with Practitioners
Marcel Böhme, Ezekiel O. Soremekun, Sudipta Chattopadhyay, Emamurho Ugherughe, and Andreas Zeller (National University of Singapore, Singapore; Saarland University, Germany; Singapore University of Technology and Design, Singapore; SAP, Germany) Research has produced many approaches to automatically locate, explain, and repair software bugs. But do these approaches relate to the way practitioners actually locate, understand, and fix bugs? To help answer this question, we have collected a dataset named DBGBENCH --- the correct fault locations, bug diagnoses, and software patches of 27 real errors in open-source C projects that were consolidated from hundreds of debugging sessions of professional software engineers. Moreover, we shed light on the entire debugging process, from constructing a hypothesis to submitting a patch, and how debugging time, difficulty, and strategies vary across practitioners and types of errors. Most notably, DBGBENCH can serve as reality check for novel automated debugging and repair techniques. @InProceedings{ESEC/FSE17p117, author = {Marcel Böhme and Ezekiel O. Soremekun and Sudipta Chattopadhyay and Emamurho Ugherughe and Andreas Zeller}, title = {Where Is the Bug and How Is It Fixed? An Experiment with Practitioners}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {117--128}, doi = {}, year = {2017}, } Info Artifacts Reusable |
|
Bos, Jeroen van den |
ESEC/FSE '17-IND: "Model-Driven Software Engineering ..."
Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic
Roel van Dijk, Christophe Creeten, Jeroen van der Ham, and Jeroen van den Bos (Netherlands Forensic Institute, Netherlands; Delft University of Technology, Netherlands; National Cyber Security Centre, Netherlands; Zuyd University of Applied Sciences, Netherlands) Network traffic data contains a wealth of information for use in security analysis and application development. Unfortunately, it also usually contains confidential or otherwise sensitive information, prohibiting sharing and analysis. Existing automated anonymization solutions are hard to maintain and tend to be outdated. We present Privacy-Enhanced Filtering (PEF), a model-driven prototype framework that relies on declarative descriptions of protocols and a set of filter rules, which are used to automatically transform network traffic data to remove sensitive information. This paper discusses the design, implementation and application of PEF, which is available as open-source software and configured for use in a typical malware detection scenario. @InProceedings{ESEC/FSE17p860, author = {Roel van Dijk and Christophe Creeten and Jeroen van der Ham and Jeroen van den Bos}, title = {Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {860--865}, doi = {}, year = {2017}, } |
|
Brechmann, André |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,'' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Brennan, Tegan |
ESEC/FSE '17: "Constraint Normalization and ..."
Constraint Normalization and Parameterized Caching for Quantitative Program Analysis
Tegan Brennan, Nestan Tsiskaridze, Nicolás Rosner, Abdulbaki Aydin, and Tevfik Bultan (University of California at Santa Barbara, USA) Symbolic program analysis techniques rely on satisfiability-checking constraint solvers, while quantitative program analysis techniques rely on model-counting constraint solvers. Hence, the efficiency of satisfiability checking and model counting is crucial for efficiency of modern program analysis techniques. In this paper, we present a constraint caching framework to expedite potentially expensive satisfiability and model-counting queries. Integral to this framework is our new constraint normalization procedure under which the cardinality of the solution set of a constraint, but not necessarily the solution set itself, is preserved. We extend these constraint normalization techniques to string constraints in order to support analysis of string-manipulating code. A group-theoretic framework which generalizes earlier results on constraint normalization is used to express our normalization techniques. We also present a parameterized caching approach where, in addition to storing the result of a model-counting query, we also store a model-counter object in the constraint store that allows us to efficiently recount the number of satisfying models for different maximum bounds. We implement our caching framework in our tool Cashew, which is built as an extension of the Green caching framework, and integrate it with the symbolic execution tool Symbolic PathFinder (SPF) and the model-counting constraint solver ABC. Our experiments show that constraint caching can significantly improve the performance of symbolic and quantitative program analyses. For instance, Cashew can normalize the 10,104 unique constraints in the SMC/Kaluza benchmark down to 394 normal forms, achieve a 10x speedup on the SMC/Kaluza-Big dataset, and an average 3x speedup in our SPF-based side-channel analysis experiments. @InProceedings{ESEC/FSE17p535, author = {Tegan Brennan and Nestan Tsiskaridze and Nicolás Rosner and Abdulbaki Aydin and Tevfik Bultan}, title = {Constraint Normalization and Parameterized Caching for Quantitative Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {535--546}, doi = {}, year = {2017}, } Info Artifacts Reusable |
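The normalization sketched above preserves only the cardinality of a constraint's solution set, so one cached model count can answer many syntactically different queries. A hand-worked toy example of that idea follows; the concrete normal form is invented here for illustration and is not Cashew's actual canonical form:

```latex
% Both constraints have exactly five integer models, so a count cached for the shared
% normal form answers either query (illustrative normal form, not the tool's own rules).
\begin{align*}
C_1 &: (x > 5) \wedge (x \le 10) &\Rightarrow\;& |\{x \in \mathbb{Z} : C_1\}| = 5\\
C_2 &: (y \ge 16) \wedge (y < 21) &\Rightarrow\;& |\{y \in \mathbb{Z} : C_2\}| = 5\\
N   &: (0 < v) \wedge (v \le 5)   & &\text{shared normal form: rename the variable, shift to a zero base}
\end{align*}
```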
|
Briand, Lionel C. |
ESEC/FSE '17-DEMO: "JoanAudit: A Tool for Auditing ..."
JoanAudit: A Tool for Auditing Common Injection Vulnerabilities
Julian Thomé, Lwin Khin Shar, Domenico Bianculli, and Lionel C. Briand (University of Luxembourg, Luxembourg) JoanAudit is a static analysis tool to assist security auditors in auditing Web applications and Web services for common injection vulnerabilities during software development. It automatically identifies parts of the program code that are relevant for security and generates an HTML report to guide security auditors in auditing the source code in a scalable way. JoanAudit is configured with various security-sensitive input sources and sinks relevant to injection vulnerabilities and standard sanitization procedures that prevent these vulnerabilities. It can also automatically fix some cases of vulnerabilities in source code — cases where inputs are directly used in sinks without any form of sanitization — by using standard sanitization procedures. Our evaluation shows that by using JoanAudit, security auditors are required to inspect only 1% of the total code for auditing common injection vulnerabilities. The screen-cast demo is available at https://github.com/julianthome/joanaudit. @InProceedings{ESEC/FSE17p1004, author = {Julian Thomé and Lwin Khin Shar and Domenico Bianculli and Lionel C. Briand}, title = {JoanAudit: A Tool for Auditing Common Injection Vulnerabilities}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1004--1008}, doi = {}, year = {2017}, } Video Info ESEC/FSE '17-IND: "Automated Testing of Hybrid ..." Automated Testing of Hybrid Simulink/Stateflow Controllers: Industrial Case Studies Reza Matinnejad, Shiva Nejati, and Lionel C. Briand (University of Luxembourg, Luxembourg) We present the results of applying our approach for testing Simulink controllers to one public and one proprietary model, both industrial. Our approach combines explorative and exploitative search algorithms to visualize the controller behavior over its input space and to identify test scenarios in the controller input space that violate or are likely to violate the controller requirements. The engineers' feedback shows that our approach is easy to use in practice and gives them confidence about the behavior of their models. @InProceedings{ESEC/FSE17p938, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand}, title = {Automated Testing of Hybrid Simulink/Stateflow Controllers: Industrial Case Studies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {938--943}, doi = {}, year = {2017}, } |
|
Brown, David Bingham |
ESEC/FSE '17: "The Care and Feeding of Wild-Caught ..."
The Care and Feeding of Wild-Caught Mutants
David Bingham Brown, Michael Vaughn, Ben Liblit, and Thomas Reps (University of Wisconsin-Madison, USA) Mutation testing of a test suite and a program provides a way to measure the quality of the test suite. In essence, mutation testing is a form of sensitivity testing: by running mutated versions of the program against the test suite, mutation testing measures the suite’s sensitivity for detecting bugs that a programmer might introduce into the program. This paper introduces a technique to improve mutation testing that we call wild-caught mutants; it provides a method for creating potential faults that are more closely coupled with changes made by actual programmers. This technique allows the mutation tester to have more certainty that the test suite is sensitive to the kind of changes that have been observed to have been made by programmers in real-world cases. @InProceedings{ESEC/FSE17p511, author = {David Bingham Brown and Michael Vaughn and Ben Liblit and Thomas Reps}, title = {The Care and Feeding of Wild-Caught Mutants}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {511--522}, doi = {}, year = {2017}, } Video Info Artifacts Reusable |
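To make the contrast concrete, the hypothetical Java snippet below places a conventional operator-replacement mutant next to a "wild-caught"-style mutant that mirrors a realistic programmer change (an omitted rounding adjustment). The example and its names are invented for illustration and are not drawn from the paper's harvested mutants:

```java
class Paginator {
    // Original code under test: number of pages needed, rounding up.
    static int lastPage(int items, int pageSize) {
        return (items + pageSize - 1) / pageSize;
    }

    // Classic synthetic mutant: a purely syntactic arithmetic-operator flip ('+' -> '-').
    static int lastPageClassicMutant(int items, int pageSize) {
        return (items - pageSize - 1) / pageSize;
    }

    // Wild-caught-style mutant: reintroduces the kind of off-by-one a real developer
    // once made and fixed, i.e., forgetting the rounding-up adjustment entirely.
    static int lastPageWildCaughtMutant(int items, int pageSize) {
        return items / pageSize;
    }
}
```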
|
Brun, Yuriy |
ESEC/FSE '17: "Fairness Testing: Testing ..."
Fairness Testing: Testing Software for Discrimination
Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou (University of Massachusetts at Amherst, USA) This paper defines software fairness and discrimination and develops a testing-based method for measuring if and how much software discriminates, focusing on causality in discriminatory behavior. Evidence of software discrimination has been found in modern software systems that recommend criminal sentences, grant access to financial products, and determine who is allowed to participate in promotions. Our approach, Themis, generates efficient test suites to measure discrimination. Given a schema describing valid system inputs, Themis generates discrimination tests automatically and does not require an oracle. We evaluate Themis on 20 software systems, 12 of which come from prior work with explicit focus on avoiding discrimination. We find that (1) Themis is effective at discovering software discrimination, (2) state-of-the-art techniques for removing discrimination from algorithms fail in many situations, at times discriminating against as much as 98% of an input subdomain, (3) Themis optimizations are effective at producing efficient test suites for measuring discrimination, and (4) Themis is more efficient on systems that exhibit more discrimination. We thus demonstrate that fairness testing is a critical aspect of the software development cycle in domains with possible discrimination and provide initial tools for measuring software discrimination. @InProceedings{ESEC/FSE17p498, author = {Sainyam Galhotra and Yuriy Brun and Alexandra Meliou}, title = {Fairness Testing: Testing Software for Discrimination}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {498--510}, doi = {}, year = {2017}, } Info Best-Paper Award |
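A minimal sketch of the causal test idea described above: hold every input attribute fixed, toggle only the sensitive attribute, and report discrimination if the decision flips. The decision function, attribute names, and values below are hypothetical; this is not Themis's implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class CausalFairnessCheck {
    // Returns true if changing only the sensitive attribute changes the decision,
    // i.e., this concrete input exhibits causal discrimination.
    static boolean discriminates(Function<Map<String, String>, Boolean> decide,
                                 Map<String, String> applicant,
                                 String sensitiveAttr, String altValue) {
        Map<String, String> counterfactual = new HashMap<>(applicant);
        counterfactual.put(sensitiveAttr, altValue);
        return !decide.apply(applicant).equals(decide.apply(counterfactual));
    }
    // A fairness-testing tool would generate many such input pairs from a schema of
    // valid values and report the fraction of pairs on which the decision flips.
}
```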
|
Bu, Wenqi |
ESEC/FSE '17-IND: "When Program Analysis Meets ..."
When Program Analysis Meets Mobile Security: An Industrial Study of Misusing Android Internet Sockets
Wenqi Bu, Minhui Xue, Lihua Xu, Yajin Zhou, Zhushou Tang, and Tao Xie (East China Normal University, China; New York University Shanghai, China; Pwnzen Infotech, China; University of Illinois at Urbana-Champaign, USA) Despite recent progress in program analysis techniques to identify vulnerabilities in Android apps, significant challenges still remain for applying these techniques to large-scale industrial environments. Modern software-security providers, such as Qihoo 360 and Pwnzen (two leading companies in China), are often required to process more than 10 million mobile apps at each run. In this work, we focus on effectively and efficiently identifying vulnerable usage of Internet sockets in an industrial setting. To achieve this goal, we propose a practical hybrid approach that enables lightweight yet precise detection in the industrial setting. In particular, we integrate the process of categorizing potential vulnerable apps with analysis techniques, to reduce the inevitable human inspection effort. We categorize potential vulnerable apps based on characteristics of vulnerability signatures, to reduce the burden on static analysis. We flexibly integrate static and dynamic analyses for apps in each identified family, to refine the family signatures and hence target on precise detection. We implement our approach in a practical system and deploy the system on the Pwnzen platform. By using the system, we identify and report potential vulnerabilities of 24 vulnerable apps (falling into 3 vulnerability families) to their developers, and some of these reported vulnerabilities are previously unknown. The apps of each vulnerability family in total have over 50 million downloads. We also propose countermeasures and highlight promising directions for technology transfer. @InProceedings{ESEC/FSE17p842, author = {Wenqi Bu and Minhui Xue and Lihua Xu and Yajin Zhou and Zhushou Tang and Tao Xie}, title = {When Program Analysis Meets Mobile Security: An Industrial Study of Misusing Android Internet Sockets}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {842--847}, doi = {}, year = {2017}, } |
|
Bultan, Tevfik |
ESEC/FSE '17: "Constraint Normalization and ..."
Constraint Normalization and Parameterized Caching for Quantitative Program Analysis
Tegan Brennan, Nestan Tsiskaridze, Nicolás Rosner, Abdulbaki Aydin, and Tevfik Bultan (University of California at Santa Barbara, USA) Symbolic program analysis techniques rely on satisfiability-checking constraint solvers, while quantitative program analysis techniques rely on model-counting constraint solvers. Hence, the efficiency of satisfiability checking and model counting is crucial for efficiency of modern program analysis techniques. In this paper, we present a constraint caching framework to expedite potentially expensive satisfiability and model-counting queries. Integral to this framework is our new constraint normalization procedure under which the cardinality of the solution set of a constraint, but not necessarily the solution set itself, is preserved. We extend these constraint normalization techniques to string constraints in order to support analysis of string-manipulating code. A group-theoretic framework which generalizes earlier results on constraint normalization is used to express our normalization techniques. We also present a parameterized caching approach where, in addition to storing the result of a model-counting query, we also store a model-counter object in the constraint store that allows us to efficiently recount the number of satisfying models for different maximum bounds. We implement our caching framework in our tool Cashew, which is built as an extension of the Green caching framework, and integrate it with the symbolic execution tool Symbolic PathFinder (SPF) and the model-counting constraint solver ABC. Our experiments show that constraint caching can significantly improve the performance of symbolic and quantitative program analyses. For instance, Cashew can normalize the 10,104 unique constraints in the SMC/Kaluza benchmark down to 394 normal forms, achieve a 10x speedup on the SMC/Kaluza-Big dataset, and an average 3x speedup in our SPF-based side-channel analysis experiments. @InProceedings{ESEC/FSE17p535, author = {Tegan Brennan and Nestan Tsiskaridze and Nicolás Rosner and Abdulbaki Aydin and Tevfik Bultan}, title = {Constraint Normalization and Parameterized Caching for Quantitative Program Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {535--546}, doi = {}, year = {2017}, } Info Artifacts Reusable |
|
Bunyakiati, Panuchart |
ESEC/FSE '17-DEMO: "Cherry-Picking of Code Commits ..."
Cherry-Picking of Code Commits in Long-Running, Multi-release Software
Panuchart Bunyakiati and Chadarat Phipathananunth (University of the Thai Chamber of Commerce, Thailand) This paper presents Tartarian, a tool that supports maintenance of software with long-running, multi-release branches in distributed version control systems. When new maintenance code, such as bug fixes and code improvement, is committed into a branch, it is likely that such code can be applied or reused with some other branches. To do so, a developer may manually identify a commit and cherry pick it. Tartarian can support this activity by providing commit hashtags, which the developer uses as metadata to specify their intentions when committing the code. With these tags, Tartarian uses a dependency graph, which represents the dependency constraints of the branches, and a Branch Identifier, which matches the commit hashtags against the dependency graph, to identify the applicable branches for the commits. Using Tartarian, developers may be able to maintain software with multiple releases more efficiently. A video demo of Tartarian is available at www.github.com/tartarian. @InProceedings{ESEC/FSE17p994, author = {Panuchart Bunyakiati and Chadarat Phipathananunth}, title = {Cherry-Picking of Code Commits in Long-Running, Multi-release Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {994--998}, doi = {}, year = {2017}, } Info |
|
Búr, Márton |
ESEC/FSE '17-DEMO: "The MONDO Collaboration Framework: ..."
The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems
Csaba Debreceni, Gábor Bergmann, Márton Búr, István Ráth, and Dániel Varró (Budapest University of Technology and Economics, Hungary; MTA-BME Lendület Research Group on Cyber-Physical Systems, Hungary; McGill University, Canada) Model-based systems engineering of critical cyber-physical systems necessitates effective collaboration between different stakeholders while still providing secure protection of intellectual properties of all involved parties. While engineering artifacts are frequently stored in version control repositories, secure access control is limited to file-level strategies in most existing frameworks where models are split into multiple fragments with all-or-nothing permissions, which becomes a scalability and usability bottleneck in case of complex industrial models. In this paper, we introduce the MONDO Collaboration Framework, which provides rule-based fine-grained model-level secure access control, property-based locking and automated model merge integrated over existing version control systems such as Subversion (SVN) for storage and version control. Our framework simultaneously supports offline collaboration scenarios (asynchronous checkout-modify-commit) on top of off-the-shelf modeling tools and online scenarios (GoogleDocs-style short transactions) by offering a web-based modeling frontend. Screencast Demo: https://youtu.be/Ix3CgmsYIU0 @InProceedings{ESEC/FSE17p984, author = {Csaba Debreceni and Gábor Bergmann and Márton Búr and István Ráth and Dániel Varró}, title = {The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {984--988}, doi = {}, year = {2017}, } |
|
Cai, Yan |
ESEC/FSE '17: "AtexRace: Across Thread and ..."
AtexRace: Across Thread and Execution Sampling for In-House Race Detection
Yu Guo, Yan Cai, and Zijiang Yang (Western Michigan University, USA; Institute of Software at Chinese Academy of Sciences, China) Data race is a major source of concurrency bugs. Dynamic data race detection tools (e.g., FastTrack) monitor the executions of a program to report data races occurring at runtime. However, such tools incur significant overhead that slows down and perturbs executions. To address the issue, the state-of-the-art dynamic data race detection tools (e.g., LiteRace) apply sampling techniques to selectively monitor memory accesses. Although they reduce overhead, they also miss many data races as confirmed by existing studies. Thus, practitioners face a dilemma on whether to use FastTrack, which detects more data races but is much slower, or LiteRace, which is faster but detects fewer data races. In this paper, we propose a new sampling approach to address the major limitations of current sampling techniques, which ignore the facts that a data race involves two threads and that a program under testing is repeatedly executed. We develop a tool called AtexRace to sample memory accesses across both threads and executions. By selectively monitoring the pairs of memory accesses that have not been frequently observed in current and previous executions, AtexRace detects as many data races as FastTrack at a cost as low as LiteRace. We have compared AtexRace against FastTrack and LiteRace on both the Parsec benchmark suite and a large-scale real-world MySQL Server with 223 test cases. The experiments confirm that AtexRace can be a replacement for FastTrack and LiteRace. @InProceedings{ESEC/FSE17p315, author = {Yu Guo and Yan Cai and Zijiang Yang}, title = {AtexRace: Across Thread and Execution Sampling for In-House Race Detection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {315--325}, doi = {}, year = {2017}, } ESEC/FSE '17: "Adaptively Generating High ..." Adaptively Generating High Quality Fixes for Atomicity Violations Yan Cai, Lingwei Cao, and Jing Zhao (Institute of Software at Chinese Academy of Sciences, China; University at Chinese Academy of Sciences, China; Harbin Engineering University, China) It is difficult to fix atomicity violations correctly. Existing gate lock algorithm (GLA) simply inserts gate locks to serialize executions, which may introduce performance bugs and deadlocks. Synthesized context-aware gate locks (by Grail) require complex source code synthesis. We propose Fixer to adaptively fix atomicity violations. It firstly analyses the lock acquisitions of an atomicity violation. Then it either adjusts the existing lock scope or inserts a gate lock. The former addresses cases where some locks are used but fail to provide atomic accesses. For the latter, it infers the visibility (being global or a field of a class/struct) of the gate lock such that the lock only protects related accesses. For both cases, Fixer further eliminates new lock orders to avoid introducing deadlocks. Of course, Fixer can produce both kinds of fixes on atomicity violations with locks. The experimental results on 15 previously used atomicity violations show that: Fixer correctly fixed all 15 atomicity violations without introducing deadlocks. However, GLA and Grail both introduced 5 deadlocks. HFix (that only targets on fixing certain types of atomicity violations) only fixed 2 atomicity violations and introduced 4 deadlocks. Fixer also provides an alternative way to insert gate locks (by inserting gate locks with proper visibility) considering fix acceptance.
@InProceedings{ESEC/FSE17p303, author = {Yan Cai and Lingwei Cao and Jing Zhao}, title = {Adaptively Generating High Quality Fixes for Atomicity Violations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {303--314}, doi = {}, year = {2017}, } |
|
Cao, Lingwei |
ESEC/FSE '17: "Adaptively Generating High ..."
Adaptively Generating High Quality Fixes for Atomicity Violations
Yan Cai, Lingwei Cao, and Jing Zhao (Institute of Software at Chinese Academy of Sciences, China; University at Chinese Academy of Sciences, China; Harbin Engineering University, China) It is difficult to fix atomicity violations correctly. Existing gate lock algorithm (GLA) simply inserts gate locks to serialize executions, which may introduce performance bugs and deadlocks. Synthesized context-aware gate locks (by Grail) require complex source code synthesis. We propose Fixer to adaptively fix atomicity violations. It firstly analyses the lock acquisitions of an atomicity violation. Then it either adjusts the existing lock scope or inserts a gate lock. The former addresses cases where some locks are used but fail to provide atomic accesses. For the latter, it infers the visibility (being global or a field of a class/struct) of the gate lock such that the lock only protects related accesses. For both cases, Fixer further eliminates new lock orders to avoid introducing deadlocks. Of course, Fixer can produce both kinds of fixes on atomicity violations with locks. The experimental results on 15 previously used atomicity violations show that: Fixer correctly fixed all 15 atomicity violations without introducing deadlocks. However, GLA and Grail both introduced 5 deadlocks. HFix (that only targets on fixing certain types of atomicity violations) only fixed 2 atomicity violations and introduced 4 deadlocks. Fixer also provides an alternative way to insert gate locks (by inserting gate locks with proper visibility) considering fix acceptance. @InProceedings{ESEC/FSE17p303, author = {Yan Cai and Lingwei Cao and Jing Zhao}, title = {Adaptively Generating High Quality Fixes for Atomicity Violations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {303--314}, doi = {}, year = {2017}, } |
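For concreteness, the small Java example below sketches the two repair styles the abstract mentions — widening an existing lock's scope versus inserting a gate lock with an appropriate visibility. It is an illustration with invented names, not output produced by Fixer:

```java
import java.util.HashMap;
import java.util.Map;

class Registry {
    private final Map<String, Integer> counts = new HashMap<>();
    private final Object gate = new Object(); // gate lock declared as a class field

    // Buggy: check and update are individually synchronized, so another thread can
    // interleave between them and increments can be lost (an atomicity violation).
    void incrementBuggy(String key) {
        Integer old;
        synchronized (counts) { old = counts.get(key); }
        int next = (old == null) ? 1 : old + 1;
        synchronized (counts) { counts.put(key, next); }
    }

    // Repair style 1: adjust the existing lock's scope so check and update are atomic.
    void incrementScopeFix(String key) {
        synchronized (counts) {
            Integer old = counts.get(key);
            counts.put(key, (old == null) ? 1 : old + 1);
        }
    }

    // Repair style 2: insert a gate lock that serializes only the related accesses;
    // acquiring gate before counts everywhere keeps the lock order consistent.
    void incrementGateFix(String key) {
        synchronized (gate) {
            Integer old;
            synchronized (counts) { old = counts.get(key); }
            int next = (old == null) ? 1 : old + 1;
            synchronized (counts) { counts.put(key, next); }
        }
    }
}
```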
|
Cappos, Justin |
ESEC/FSE '17: "Understanding Misunderstandings ..."
Understanding Misunderstandings in Source Code
Dan Gopstein, Jake Iannacone, Yu Yan, Lois DeLong, Yanyan Zhuang, Martin K.-C. Yeh, and Justin Cappos (New York University, USA; Pennsylvania State University, USA; University of Colorado at Colorado Springs, USA) Humans often mistake the meaning of source code, and so misjudge a program's true behavior. These mistakes can be caused by extremely small, isolated patterns in code, which can lead to significant runtime errors. These patterns are used in large, popular software projects and even recommended in style guides. To identify code patterns that may confuse programmers we extracted a preliminary set of `atoms of confusion' from known confusing code. We show empirically in an experiment with 73 participants that these code patterns can lead to a significantly increased rate of misunderstanding versus equivalent code without the patterns. We then go on to take larger confusing programs and measure (in an experiment with 43 participants) the impact, in terms of programmer confusion, of removing these confusing patterns. All of our instruments, analysis code, and data are publicly available online for replication, experimentation, and feedback. @InProceedings{ESEC/FSE17p129, author = {Dan Gopstein and Jake Iannacone and Yu Yan and Lois DeLong and Yanyan Zhuang and Martin K.-C. Yeh and Justin Cappos}, title = {Understanding Misunderstandings in Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {129--139}, doi = {}, year = {2017}, } Info Best-Paper Award |
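As an illustrative analogue of such a pattern (the study's stimuli were C snippets; this Java version and its names are ours), the sketch below folds side-effecting increments into a single expression and then spells the same computation out step by step:

```java
class Confusion {
    // Confusing: post- and pre-increment buried inside one expression.
    static int confusing(int a) {
        int v = a++ - ++a;   // evaluates to a0 - (a0 + 2), i.e., always -2
        return v;
    }

    // Equivalent, with each step made explicit.
    static int clearer(int a) {
        int original = a;
        a = a + 1;           // effect of the post-increment
        a = a + 1;           // effect of the pre-increment
        return original - a; // same value as above: -2
    }
}
```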
|
Casalnuovo, Casey |
ESEC/FSE '17: "Recovering Clear, Natural ..."
Recovering Clear, Natural Identifiers from Obfuscated JS Names
Bogdan Vasilescu, Casey Casalnuovo, and Premkumar Devanbu (Carnegie Mellon University, USA; University of California at Davis, USA) Well-chosen variable names are critical to source code readability, reusability, and maintainability. Unfortunately, in deployed JavaScript code (which is ubiquitous on the web) the identifier names are frequently minified and overloaded. This is done both for efficiency and also to protect potentially proprietary intellectual property. In this paper, we describe an approach based on statistical machine translation (SMT) that recovers some of the original names from the JavaScript programs minified by the very popular UglifyJS. This simple tool, Autonym, performs comparably to the best currently available deobfuscator for JavaScript, JSNice, which uses sophisticated static analysis. In fact, Autonym is quite complementary to JSNice, performing well when it does not, and vice versa. We also introduce a new tool, JSNaughty, which blends Autonym and JSNice, and significantly outperforms both at identifier name recovery, while remaining just as easy to use as JSNice. JSNaughty is available online at http://jsnaughty.org. @InProceedings{ESEC/FSE17p683, author = {Bogdan Vasilescu and Casey Casalnuovo and Premkumar Devanbu}, title = {Recovering Clear, Natural Identifiers from Obfuscated JS Names}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {683--693}, doi = {}, year = {2017}, } |
|
Castelluccio, Marco |
ESEC/FSE '17: "Automatically Analyzing Groups ..."
Automatically Analyzing Groups of Crashes for Finding Correlations
Marco Castelluccio, Carlo Sansone, Luisa Verdoliva, and Giovanni Poggi (Federico II University of Naples, Italy; Mozilla, UK) We devised an algorithm, inspired by contrast-set mining algorithms such as STUCCO, to automatically find statistically significant properties (correlations) in crash groups. Many earlier works focused on improving the clustering of crashes but, to the best of our knowledge, the problem of automatically describing properties of a cluster of crashes is so far unexplored. This means developers currently spend a fair amount of time analyzing the groups themselves, which in turn means that a) they are not spending their time actually developing a fix for the crash; and b) they might miss something in their exploration of the crash data (there is a large number of attributes in crash reports and it is hard and error-prone to manually analyze everything). Our algorithm helps developers and release managers understand crash reports more easily and in an automated way, helping in pinpointing the root cause of the crash. The tool implementing the algorithm has been deployed on Mozilla's crash reporting service. @InProceedings{ESEC/FSE17p717, author = {Marco Castelluccio and Carlo Sansone and Luisa Verdoliva and Giovanni Poggi}, title = {Automatically Analyzing Groups of Crashes for Finding Correlations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {717--726}, doi = {}, year = {2017}, } |
|
Cedrim, Diego |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggest refactorings tend to more frequently introduce long-living smells instead of eliminating existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
Celik, Ahmet |
ESEC/FSE '17: "Regression Test Selection ..."
Regression Test Selection Across JVM Boundaries
Ahmet Celik, Marko Vasic, Aleksandar Milicevic, and Milos Gligoric (University of Texas at Austin, USA; Microsoft, USA) Modern software development processes recommend that changes be integrated into the main development line of a project multiple times a day. Before a new revision may be integrated, developers practice regression testing to ensure that the latest changes do not break any previously established functionality. The cost of regression testing is high, due to an increase in the number of revisions that are introduced per day, as well as the number of tests developers write per revision. Regression test selection (RTS) optimizes regression testing by skipping tests that are not affected by recent project changes. Existing dynamic RTS techniques support only projects written in a single programming language, which is unfortunate knowing that an open-source project is on average written in several programming languages. We present the first dynamic RTS technique that does not stop at predefined language boundaries. Our technique dynamically detects, at the operating system level, all file artifacts a test depends on. Our technique is, hence, oblivious to the specific means the test uses to actually access the files: be it through spawning a new process, invoking a system call, invoking a library written in a different language, invoking a library that spawns a process which makes a system call, etc. We also provide a set of extension points which allow for a smooth integration with testing frameworks and build systems. We implemented our technique in a tool called RTSLinux as a loadable Linux kernel module and evaluated it on 21 Java projects that escape JVM by spawning new processes or invoking native code, totaling 2,050,791 lines of code. Our results show that RTSLinux, on average, skips 74.17% of tests and saves 52.83% of test execution time compared to executing all tests. @InProceedings{ESEC/FSE17p809, author = {Ahmet Celik and Marko Vasic and Aleksandar Milicevic and Milos Gligoric}, title = {Regression Test Selection Across JVM Boundaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {809--820}, doi = {}, year = {2017}, } |
|
Chandramohan, Mahinthan |
ESEC/FSE '17: "Steelix: Program-State Based ..."
Steelix: Program-State Based Binary Fuzzing
Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu (Nanyang Technological University, Singapore; Fudan University, China) Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques suffer from the difficulty in exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches have been proposed to use heavy-weight program analysis to break through magic bytes comparisons, and hence are less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slow down of the execution speed. In particular, we use light-weight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs. @InProceedings{ESEC/FSE17p627, author = {Yuekang Li and Bihuan Chen and Mahinthan Chandramohan and Shang-Wei Lin and Yang Liu and Alwen Tiu}, title = {Steelix: Program-State Based Binary Fuzzing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {627--637}, doi = {}, year = {2017}, } |
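A toy Java illustration of the magic-bytes problem described above: random byte mutations almost never satisfy the whole comparison at once, but per-byte comparison progress gives a fuzzer a gradient to climb toward the guarded code. The parser and magic value are invented for this sketch and are not from the paper's benchmarks:

```java
class Parser {
    static final byte[] MAGIC = {'F', 'U', 'Z', 'Z'};

    static boolean parse(byte[] input) {
        if (input.length < 4) return false;
        for (int i = 0; i < MAGIC.length; i++) {
            // Instrumentation in the spirit of the abstract could report i (how many
            // leading bytes already match) as comparison-progress feedback.
            if (input[i] != MAGIC[i]) return false;
        }
        return deepLogic(input); // rarely reached by blind coverage-guided mutation
    }

    private static boolean deepLogic(byte[] input) {
        return input.length > 8; // stands in for the code we actually want to cover
    }
}
```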
|
Chaparro, Oscar |
ESEC/FSE '17: "Detecting Missing Information ..."
Detecting Missing Information in Bug Descriptions
Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug descriptions quality by alerting reporters about missing EB and S2R at reporting time. @InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
|
Chatley, Robert |
ESEC/FSE '17-IND: "Serverless Computing: Economic ..."
Serverless Computing: Economic and Architectural Impact
Gojko Adzic and Robert Chatley (Neuri Consulting, UK; Imperial College London, UK) Amazon Web Services unveiled their ‘Lambda’ platform in late 2014. Since then, each of the major cloud computing infrastructure providers has released services supporting a similar style of deployment and operation, where rather than deploying and running monolithic services, or dedicated virtual machines, users are able to deploy individual functions, and pay only for the time that their code is actually executing. These technologies are gathered together under the marketing term ‘serverless’ and the providers suggest that they have the potential to significantly change how client/server applications are designed, developed and operated. This paper presents two industrial case studies of early adopters, showing how migrating an application to the Lambda deployment architecture reduced hosting costs – by between 66% and 95% – and discusses how further adoption of this trend might influence common software architecture design practices. @InProceedings{ESEC/FSE17p884, author = {Gojko Adzic and Robert Chatley}, title = {Serverless Computing: Economic and Architectural Impact}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {884--889}, doi = {}, year = {2017}, } |
|
Chattopadhyay, Sudipta |
ESEC/FSE '17: "Where Is the Bug and How Is ..."
Where Is the Bug and How Is It Fixed? An Experiment with Practitioners
Marcel Böhme, Ezekiel O. Soremekun, Sudipta Chattopadhyay, Emamurho Ugherughe, and Andreas Zeller (National University of Singapore, Singapore; Saarland University, Germany; Singapore University of Technology and Design, Singapore; SAP, Germany) Research has produced many approaches to automatically locate, explain, and repair software bugs. But do these approaches relate to the way practitioners actually locate, understand, and fix bugs? To help answer this question, we have collected a dataset named DBGBENCH --- the correct fault locations, bug diagnoses, and software patches of 27 real errors in open-source C projects that were consolidated from hundreds of debugging sessions of professional software engineers. Moreover, we shed light on the entire debugging process, from constructing a hypothesis to submitting a patch, and how debugging time, difficulty, and strategies vary across practitioners and types of errors. Most notably, DBGBENCH can serve as reality check for novel automated debugging and repair techniques. @InProceedings{ESEC/FSE17p117, author = {Marcel Böhme and Ezekiel O. Soremekun and Sudipta Chattopadhyay and Emamurho Ugherughe and Andreas Zeller}, title = {Where Is the Bug and How Is It Fixed? An Experiment with Practitioners}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {117--128}, doi = {}, year = {2017}, } Info Artifacts Reusable |
|
Chaudhuri, Swarat |
ESEC/FSE '17: "Bayesian Specification Learning ..."
Bayesian Specification Learning for Finding API Usage Errors
Vijayaraghavan Murali, Swarat Chaudhuri, and Chris Jermaine (Rice University, USA) We present a Bayesian framework for learning probabilistic specifications from large, unstructured code corpora, and then using these specifications to statically detect anomalous, hence likely buggy, program behavior. Our key insight is to build a statistical model that correlates all specifications hidden inside a corpus with the syntax and observed behavior of programs that implement these specifications. During the analysis of a particular program, this model is conditioned into a posterior distribution that prioritizes specifications that are relevant to the program. The problem of finding anomalies is now framed quantitatively, as a problem of computing a distance between a "reference distribution" over program behaviors that our model expects from the program, and the distribution over behaviors that the program actually produces. We implement our ideas in a system, called Salento, for finding anomalous API usage in Android programs. Salento learns specifications using a combination of a topic model and a neural network model. Our encouraging experimental results show that the system can automatically discover subtle errors in Android applications in the wild, and has high precision and recall compared to competing probabilistic approaches. @InProceedings{ESEC/FSE17p151, author = {Vijayaraghavan Murali and Swarat Chaudhuri and Chris Jermaine}, title = {Bayesian Specification Learning for Finding API Usage Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {151--162}, doi = {}, year = {2017}, } |
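The abstract frames anomaly detection as a distance between a reference distribution over behaviors and the distribution the program actually produces. One standard way to instantiate such a distance, shown here only for concreteness (the paper may use a different measure and threshold), is the KL divergence with an anomaly threshold τ:

```latex
% P: reference distribution the model expects; Q: distribution the program produces;
% B: the space of observed behaviors. Instantiation chosen for illustration only.
D_{\mathrm{KL}}(Q \,\|\, P) \;=\; \sum_{b \in \mathcal{B}} Q(b)\,\log \frac{Q(b)}{P(b)},
\qquad \text{flag the program as anomalous when } D_{\mathrm{KL}}(Q \,\|\, P) > \tau .
```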
|
Chávez, Alexander |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggest refactorings tend to more frequently introduce long-living smells instead of eliminating existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
Chen, Bihuan |
ESEC/FSE '17: "Loopster: Static Loop Termination ..."
Loopster: Static Loop Termination Analysis
Xiaofei Xie, Bihuan Chen, Liang Zou, Shang-Wei Lin, Yang Liu, and Xiaohong Li (Tianjin University, China; Nanyang Technological University, Singapore) Loop termination is an important problem for proving the correctness of a system and ensuring that the system always reacts. Existing loop termination analysis techniques mainly depend on the synthesis of ranking functions, which is often expensive. In this paper, we present a novel approach, named Loopster, which performs an efficient static analysis to decide the termination for loops based on path termination analysis and path dependency reasoning. Loopster adopts a divide-and-conquer approach: (1) we extract individual paths from a target multi-path loop and analyze the termination of each path, (2) analyze the dependencies between each two paths, and then (3) determine the overall termination of the target loop based on the relations among paths. We evaluate Loopster by applying it on the loop termination competition benchmark and three real-world projects. The results show that Loopster is effective in a majority of loops with better accuracy and a more than 20× performance improvement compared to the state-of-the-art tools. @InProceedings{ESEC/FSE17p84, author = {Xiaofei Xie and Bihuan Chen and Liang Zou and Shang-Wei Lin and Yang Liu and Xiaohong Li}, title = {Loopster: Static Loop Termination Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--94}, doi = {}, year = {2017}, } ESEC/FSE '17: "Steelix: Program-State Based ..." Steelix: Program-State Based Binary Fuzzing Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu (Nanyang Technological University, Singapore; Fudan University, China) Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques suffer from the difficulty in exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches have been proposed to use heavy-weight program analysis to break through magic bytes comparisons, and hence are less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slow down of the execution speed. In particular, we use light-weight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs. @InProceedings{ESEC/FSE17p627, author = {Yuekang Li and Bihuan Chen and Mahinthan Chandramohan and Shang-Wei Lin and Yang Liu and Alwen Tiu}, title = {Steelix: Program-State Based Binary Fuzzing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {627--637}, doi = {}, year = {2017}, } |
|
Chen, Qingying |
ESEC/FSE '17: "On the Scalability of Linux ..."
On the Scalability of Linux Kernel Maintainers' Work
Minghui Zhou, Qingying Chen, Audris Mockus, and Fengguang Wu (Peking University, China; University of Tennessee, USA; Intel, China) Open source software ecosystems evolve ways to balance the workload among groups of participants ranging from core groups to peripheral groups. As ecosystems grow, it is not clear whether the mechanisms that previously made them work will continue to be relevant or whether new mechanisms will need to evolve. The impact of failure for critical ecosystems such as Linux is enormous, yet the understanding of why they function and are effective is limited. We, therefore, aim to understand how the Linux kernel sustains its growth, how to characterize the workload of maintainers, and whether or not the existing mechanisms are scalable. We quantify maintainers’ work through the files that are maintained, and the change activity and the numbers of contributors in those files. We find systematic differences among modules; these differences are stable over time, which suggests that certain architectural features, commercial interests, or module-specific practices lead to distinct sustainable equilibria. We find that most of the modules have not grown appreciably over the last decade; most growth has been absorbed by a few modules. We also find that the effort per maintainer does not increase, even though the community has hypothesized that required effort might increase. However, the distribution of work among maintainers is highly unbalanced, suggesting that a few maintainers may experience increasing workload. We find that the practice of assigning multiple maintainers to a file yields only a power of 1/2 increase in productivity. We expect that our proposed framework to quantify maintainer practices will help clarify the factors that allow rapidly growing ecosystems to be sustainable. @InProceedings{ESEC/FSE17p27, author = {Minghui Zhou and Qingying Chen and Audris Mockus and Fengguang Wu}, title = {On the Scalability of Linux Kernel Maintainers' Work}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {27--37}, doi = {}, year = {2017}, } Info |
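One way to read the "power of 1/2" finding, paraphrased here as a scaling law rather than the paper's own notation: if W(m) denotes the work handled when a file has m maintainers, then

```latex
% Our paraphrase of the finding, for illustration; c is a file-specific constant.
W(m) \approx c \cdot m^{1/2},
\qquad\Rightarrow\qquad
\frac{W(2m)}{W(m)} \approx \sqrt{2} \approx 1.41 ,
```

i.e., under this reading, doubling the maintainers of a file buys roughly a 41% gain in handled work rather than 100%.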
|
Chen, Yuting |
ESEC/FSE '17: "Guided, Stochastic Model-Based ..."
Guided, Stochastic Model-Based GUI Testing of Android Apps
Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su (East China Normal University, China; Nanyang Technological University, Singapore; Shanghai Jiao Tong University, China; University of California at Davis, USA) Mobile apps are ubiquitous, operate in complex environments and are developed under the time-to-market pressure. Ensuring their correctness and reliability thus becomes an important challenge. This paper introduces Stoat, a novel guided approach to perform stochastic model-based testing on Android apps. Stoat operates in two phases: (1) Given an app as input, it uses dynamic analysis enhanced by a weighted UI exploration strategy and static analysis to reverse engineer a stochastic model of the app's GUI interactions; and (2) it adapts Gibbs sampling to iteratively mutate/refine the stochastic model and guides test generation from the mutated models toward achieving high code and model coverage and exhibiting diverse sequences. During testing, system-level events are randomly injected to further enhance the testing effectiveness. Stoat was evaluated on 93 open-source apps. The results show (1) the models produced by Stoat cover 17~31% more code than those by existing modeling tools; (2) Stoat detects 3X more unique crashes than two state-of-the-art testing tools, Monkey and Sapienz. Furthermore, Stoat tested 1661 most popular Google Play apps, and detected 2110 previously unknown and unique crashes. So far, 43 developers have responded that they are investigating our reports. 20 of reported crashes have been confirmed, and 8 already fixed. @InProceedings{ESEC/FSE17p245, author = {Ting Su and Guozhu Meng and Yuting Chen and Ke Wu and Weiming Yang and Yao Yao and Geguang Pu and Yang Liu and Zhendong Su}, title = {Guided, Stochastic Model-Based GUI Testing of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {245--256}, doi = {}, year = {2017}, } |
|
Chen, Zhenbang |
ESEC/FSE '17-DEMO: "RGSE: A Regular Property Guided ..."
RGSE: A Regular Property Guided Symbolic Executor for Java
Hengbiao Yu, Zhenbang Chen, Yufeng Zhang, Ji Wang, and Wei Dong (National University of Defense Technology, China; Jiangnan Institute of Computing, China) It is challenging to effectively check a regular property of a program. This paper presents RGSE, a regular property guided dynamic symbolic execution (DSE) engine, for finding a program path satisfying a regular property as soon as possible. The key idea is to evaluate the candidate branches based on the history and future information, and explore the branches along which the paths are more likely to satisfy the property in priority. We have applied RGSE to 16 real-world open source Java programs, totaling 270K lines of code. Compared with the state-of-the-art, RGSE achieves two orders of magnitude speedups for finding the first target path. RGSE can benefit many research topics of software testing and analysis, such as path-oriented test case generation, typestate bug finding, and performance tuning. The demo video is at: https://youtu.be/7zAhvRIdaUU, and RGSE can be accessed at: http://jrgse.github.io. @InProceedings{ESEC/FSE17p954, author = {Hengbiao Yu and Zhenbang Chen and Yufeng Zhang and Ji Wang and Wei Dong}, title = {RGSE: A Regular Property Guided Symbolic Executor for Java}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {954--958}, doi = {}, year = {2017}, } Video Info |
|
Cheung, Shing-Chi |
ESEC/FSE '17: "OASIS: Prioritizing Static ..."
OASIS: Prioritizing Static Analysis Warnings for Android Apps Based on App User Reviews
Lili Wei, Yepang Liu, and Shing-Chi Cheung (Hong Kong University of Science and Technology, China) Lint is a widely-used static analyzer for detecting bugs/issues in Android apps. However, it can generate many false warnings. One existing solution to this problem is to leverage project history data (e.g., bug fixing statistics) for warning prioritization. Unfortunately, such techniques are biased toward a project’s archived warnings and can easily miss new issues. Another weakness is that developers cannot readily relate the warnings to the impacts perceivable by users. To overcome these weaknesses, in this paper, we propose a semantics-aware approach, OASIS, to prioritizing Lint warnings by leveraging app user reviews. OASIS combines program analysis and NLP techniques to recover the intrinsic links between the Lint warnings for a given app and the user complaints on the app problems caused by the issues of concern. OASIS leverages the strength of such links to prioritize warnings. We evaluated OASIS on six popular and large-scale open-source Android apps. The results show that OASIS can effectively prioritize Lint warnings and help identify new issues that are previously-unknown to app developers. @InProceedings{ESEC/FSE17p672, author = {Lili Wei and Yepang Liu and Shing-Chi Cheung}, title = {OASIS: Prioritizing Static Analysis Warnings for Android Apps Based on App User Reviews}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {672--682}, doi = {}, year = {2017}, } |
|
Christakis, Maria |
ESEC/FSE '17: "Failure-Directed Program Trimming ..."
Failure-Directed Program Trimming
Kostas Ferles, Valentin Wüstholz, Maria Christakis, and Isil Dillig (University of Texas at Austin, USA; University of Kent, UK) This paper describes a new program simplification technique called program trimming that aims to improve the scalability and precision of safety checking tools. Given a program P, program trimming generates a new program P′ such that P and P′ are equi-safe (i.e., P′ has a bug if and only if P has a bug), but P′ has fewer execution paths than P. Since many program analyzers are sensitive to the number of execution paths, program trimming has the potential to improve the effectiveness of safety checking tools. In addition to introducing the concept of program trimming, this paper also presents a lightweight static analysis that can be used as a pre-processing step to remove program paths while retaining equi-safety. We have implemented the proposed technique in a tool called Trimmer and evaluate it in the context of two program analysis techniques, namely abstract interpretation and dynamic symbolic execution. Our experiments show that program trimming significantly improves the effectiveness of both techniques. @InProceedings{ESEC/FSE17p174, author = {Kostas Ferles and Valentin Wüstholz and Maria Christakis and Isil Dillig}, title = {Failure-Directed Program Trimming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {174--185}, doi = {}, year = {2017}, } |
|
Chu, Duc-Hiep |
ESEC/FSE '17: "S3: Syntax- and Semantic-Guided ..."
S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples
Xuan-Bach D. Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser (Singapore Management University, Singapore; IST Austria, Austria; Carnegie Mellon University, USA; Stellenbosch University, South Africa) A notable class of techniques for automatic program repair is known as semantics-based. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution, and then use program synthesis to construct new code that satisfies those inferred specifications. However, the obtained specifications are naturally incomplete, leaving the synthesis engine with a difficult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifications but that do not necessarily generalize. We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs. The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) An efficient enumeration-based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4. S3 can successfully and correctly fix at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs. @InProceedings{ESEC/FSE17p593, author = {Xuan-Bach D. Le and Duc-Hiep Chu and David Lo and Claire Le Goues and Willem Visser}, title = {S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {593--604}, doi = {}, year = {2017}, } |
|
Coelho, Jailton |
ESEC/FSE '17: "Why Modern Open Source Projects ..."
Why Modern Open Source Projects Fail
Jailton Coelho and Marco Tulio Valente (Federal University of Minas Gerais, Brazil) Open source is experiencing a renaissance period, due to the appearance of modern platforms and workflows for developing and maintaining public code. As a result, developers are creating open source software at speeds never seen before. Consequently, these projects are also facing unprecedented mortality rates. To better understand the reasons for the failure of modern open source projects, this paper describes the results of a survey with the maintainers of 104 popular GitHub systems that have been deprecated. We provide a set of nine reasons for the failure of these open source projects. We also show that some maintenance practices---specifically the adoption of contributing guidelines and continuous integration---have an important association with a project failure or success. Finally, we discuss and reveal the principal strategies developers have tried to overcome the failure of the studied projects. @InProceedings{ESEC/FSE17p186, author = {Jailton Coelho and Marco Tulio Valente}, title = {Why Modern Open Source Projects Fail}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {186--196}, doi = {}, year = {2017}, } |
|
Cornejo, César |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired by dynamic logic, for the description of dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models resorts to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries and further details available at: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Creeten, Christophe |
ESEC/FSE '17-IND: "Model-Driven Software Engineering ..."
Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic
Roel van Dijk, Christophe Creeten, Jeroen van der Ham, and Jeroen van den Bos (Netherlands Forensic Institute, Netherlands; Delft University of Technology, Netherlands; National Cyber Security Centre, Netherlands; Zuyd University of Applied Sciences, Netherlands) Network traffic data contains a wealth of information for use in security analysis and application development. Unfortunately, it also usually contains confidential or otherwise sensitive information, prohibiting sharing and analysis. Existing automated anonymization solutions are hard to maintain and tend to be outdated. We present Privacy-Enhanced Filtering (PEF), a model-driven prototype framework that relies on declarative descriptions of protocols and a set of filter rules, which are used to automatically transform network traffic data to remove sensitive information. This paper discusses the design, implementation and application of PEF, which is available as open-source software and configured for use in a typical malware detection scenario. @InProceedings{ESEC/FSE17p860, author = {Roel van Dijk and Christophe Creeten and Jeroen van der Ham and Jeroen van den Bos}, title = {Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {860--865}, doi = {}, year = {2017}, } |
|
Dahlke, Jannik |
ESEC/FSE '17-DEMO: "From Scenario Modeling to ..."
From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology
Joel Greenyer, Daniel Gritzner, Florian König, Jannik Dahlke, Jianwei Shi, and Eric Wete (Leibniz Universität Hannover, Germany) Software-intensive systems often consist of cooperating reactive components. In mobile and reconfigurable systems, their topology changes at run-time, which influences how the components must cooperate. The Scenario Modeling Language (SML) offers a formal approach for specifying the reactive behavior of such systems that aligns with how humans conceive and communicate behavioral requirements. Simulation and formal checks can find specification flaws early. We present a framework for Scenario-based Programming (SBP) that reflects the concepts of SML in Java and makes the scenario modeling approach available for programming. SBP code can also be generated from SML and extended with platform-specific code, thus streamlining the transition from design to implementation. A car-to-x communication system serves as an example. Demo video and artifact: http://scenariotools.org/esecfse-2017-tool-demo/ @InProceedings{ESEC/FSE17p974, author = {Joel Greenyer and Daniel Gritzner and Florian König and Jannik Dahlke and Jianwei Shi and Eric Wete}, title = {From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {974--978}, doi = {}, year = {2017}, } Video Info |
|
Danas, Natasha |
ESEC/FSE '17: "The Power of "Why" ..."
The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance
Tim Nelson, Natasha Danas, Daniel J. Dougherty, and Shriram Krishnamurthi (Brown University, USA; Worcester Polytechnic Institute, USA) Scenario-finding tools like the Alloy Analyzer are widely used in numerous concrete domains like security, network analysis, UML analysis, and so on. They can help to verify properties and, more generally, aid in exploring a system's behavior. While scenario finders are valuable for their ability to produce concrete examples, individual scenarios only give insight into what is possible, leaving the user to make their own conclusions about what might be necessary. This paper enriches scenario finding by allowing users to ask ``why?'' and ``why not?'' questions about the examples they are given. We show how to distinguish parts of an example that cannot be consistently removed (or changed) from those that merely reflect underconstraint in the specification. In the former case we show how to determine which elements of the specification and which other components of the example together explain the presence of such facts. This paper formalizes the act of computing provenance in scenario-finding. We present Amalgam, an extension of the popular Alloy scenario-finder, which implements these foundations and provides interactive exploration of examples. We also evaluate Amalgam's algorithmics on a variety of both textbook and real-world examples. @InProceedings{ESEC/FSE17p106, author = {Tim Nelson and Natasha Danas and Daniel J. Dougherty and Shriram Krishnamurthi}, title = {The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {106--116}, doi = {}, year = {2017}, } Info Artifacts Reusable Best-Paper Award |
|
D'Antoni, Loris |
ESEC/FSE '17: "NoFAQ: Synthesizing Command ..."
NoFAQ: Synthesizing Command Repairs from Examples
Loris D'Antoni, Rishabh Singh, and Michael Vaughn (University of Wisconsin-Madison, USA; Microsoft Research, USA) Command-line tools are confusing and hard to use due to their cryptic error messages and lack of documentation. Novice users often resort to online help-forums for finding corrections to their buggy commands, but have a hard time in searching precisely for posts that are relevant to their problem and then applying the suggested solutions to their buggy command. We present NoFAQ, a tool that uses a set of rules to suggest possible fixes when users write buggy commands that trigger commonly occurring errors. The rules are expressed in a language called FIXIT and each rule pattern-matches against the user's buggy command and corresponding error message, and uses these inputs to produce a possible fixed command. NoFAQ automatically learns FIXIT rules from examples of buggy and repaired commands. We evaluate NoFAQ on two fronts. First, we use 92 benchmark problems drawn from an existing tool and show that NoFAQ is able to synthesize rules for 81 benchmark problems in real time using just 2 to 5 input-output examples for each rule. Second, we run our learning algorithm on the examples obtained through a crowd-sourcing interface and show that the learning algorithm scales to large sets of examples. @InProceedings{ESEC/FSE17p582, author = {Loris D'Antoni and Rishabh Singh and Michael Vaughn}, title = {NoFAQ: Synthesizing Command Repairs from Examples}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {582--592}, doi = {}, year = {2017}, } |
|
Davidson, Drew |
ESEC/FSE '17: "Cimplifier: Automatically ..."
Cimplifier: Automatically Debloating Containers
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel (University of Wisconsin-Madison, USA; Tala Security, USA; Colorado State University, USA; Pennsylvania State University, USA) Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packed together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of a significant amount of unnecessary components, within a single container. These practices needlessly increase the container size—sometimes by orders of magnitude. They also decrease the overall security, as each included component—necessary or not—may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to a reduction in image size of up to 95%, and processes even large containers in under thirty seconds. @InProceedings{ESEC/FSE17p476, author = {Vaibhav Rastogi and Drew Davidson and Lorenzo De Carli and Somesh Jha and Patrick McDaniel}, title = {Cimplifier: Automatically Debloating Containers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {476--486}, doi = {}, year = {2017}, } |
|
Debreceni, Csaba |
ESEC/FSE '17-DEMO: "The MONDO Collaboration Framework: ..."
The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems
Csaba Debreceni, Gábor Bergmann, Márton Búr, István Ráth, and Dániel Varró (Budapest University of Technology and Economics, Hungary; MTA-BME Lendület Research Group on Cyber-Physical Systems, Hungary; McGill University, Canada) Model-based systems engineering of critical cyber-physical systems necessitates effective collaboration between different stakeholders while still providing secure protection of intellectual properties of all involved parties. While engineering artifacts are frequently stored in version control repositories, secure access control is limited to file-level strategies in most existing frameworks where models are split into multiple fragments with all-or-nothing permissions, which becomes a scalability and usability bottleneck in case of complex industrial models. In this paper, we introduce the MONDO Collaboration Framework, which provides rule-based fine-grained model-level secure access control, property-based locking and automated model merge integrated over existing version control systems such as Subversion (SVN) for storage and version control. Our framework simultaneously supports offline collaboration (asynchronous checkout-modify-commit) on top of off-the-shelf modeling tools and online (GoogleDocs-style short transactions) scenarios by offering a web-based modeling frontend. Screencast Demo: https://youtu.be/Ix3CgmsYIU0 @InProceedings{ESEC/FSE17p984, author = {Csaba Debreceni and Gábor Bergmann and Márton Búr and István Ráth and Dániel Varró}, title = {The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {984--988}, doi = {}, year = {2017}, } |
|
De Carli, Lorenzo |
ESEC/FSE '17: "Cimplifier: Automatically ..."
Cimplifier: Automatically Debloating Containers
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel (University of Wisconsin-Madison, USA; Tala Security, USA; Colorado State University, USA; Pennsylvania State University, USA) Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packed together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of a significant amount of unnecessary components, within a single container. These practices needlessly increase the container size—sometimes by orders of magnitude. They also decrease the overall security, as each included component—necessary or not—may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to a reduction in image size of up to 95%, and processes even large containers in under thirty seconds. @InProceedings{ESEC/FSE17p476, author = {Vaibhav Rastogi and Drew Davidson and Lorenzo De Carli and Somesh Jha and Patrick McDaniel}, title = {Cimplifier: Automatically Debloating Containers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {476--486}, doi = {}, year = {2017}, } |
|
Degiovanni, Renzo |
ESEC/FSE '17-DEMO: "CLTSA: Labelled Transition ..."
CLTSA: Labelled Transition System Analyser with Counting Fluent Support
Germán Regis, Renzo Degiovanni, Nicolas D'Ippolito, and Nazareno Aguirre (Universidad Nacional de Río Cuarto, Argentina; CONICET, Argentina; University of Buenos Aires, Argentina) In this paper we present CLTSA (Counting Fluents Labelled Transition System Analyser), an extension of LTSA (Labelled Transition System Analyser) that incorporates counting fluents, a useful mechanism to capture properties related to counting events. Counting fluent temporal logic is a formalism for specifying properties of event-based systems, which complements the notion of fluent by the related concept of counting fluent. While fluents allow us to capture boolean properties of the behaviour of a reactive system, counting fluents are numerical values that enumerate event occurrences. The tool supports a superset of FSP (Finite State Processes) that allows one to define LTL properties involving counting fluents, which can be model checked on FSP processes. Detailed information can be found at http://dc.exa.unrc.edu.ar/tools/cltsa. @InProceedings{ESEC/FSE17p979, author = {Germán Regis and Renzo Degiovanni and Nicolas D'Ippolito and Nazareno Aguirre}, title = {CLTSA: Labelled Transition System Analyser with Counting Fluent Support}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {979--983}, doi = {}, year = {2017}, } Video Info |
|
DeLong, Lois |
ESEC/FSE '17: "Understanding Misunderstandings ..."
Understanding Misunderstandings in Source Code
Dan Gopstein, Jake Iannacone, Yu Yan, Lois DeLong, Yanyan Zhuang, Martin K.-C. Yeh, and Justin Cappos (New York University, USA; Pennsylvania State University, USA; University of Colorado at Colorado Springs, USA) Humans often mistake the meaning of source code, and so misjudge a program's true behavior. These mistakes can be caused by extremely small, isolated patterns in code, which can lead to significant runtime errors. These patterns are used in large, popular software projects and even recommended in style guides. To identify code patterns that may confuse programmers we extracted a preliminary set of `atoms of confusion' from known confusing code. We show empirically in an experiment with 73 participants that these code patterns can lead to a significantly increased rate of misunderstanding versus equivalent code without the patterns. We then go on to take larger confusing programs and measure (in an experiment with 43 participants) the impact, in terms of programmer confusion, of removing these confusing patterns. All of our instruments, analysis code, and data are publicly available online for replication, experimentation, and feedback. @InProceedings{ESEC/FSE17p129, author = {Dan Gopstein and Jake Iannacone and Yu Yan and Lois DeLong and Yanyan Zhuang and Martin K.-C. Yeh and Justin Cappos}, title = {Understanding Misunderstandings in Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {129--139}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
De Mello, Rafael |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggests that refactorings tend to more frequently introduce long-living smells instead of eliminating existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
Deng, Yuetang |
ESEC/FSE '17-IND: "Record and Replay for Android: ..."
Record and Replay for Android: Are We There Yet in Industrial Cases?
Wing Lam, Zhengkai Wu, Dengfeng Li, Wenyu Wang, Haibing Zheng, Hui Luo, Peng Yan, Yuetang Deng, and Tao Xie (University of Illinois at Urbana-Champaign, USA; Tencent, China) Mobile applications, or apps for short, are gaining popularity. The input sources (e.g., touchscreen, sensors, transmitters) of the smart devices that host these apps enable the apps to offer a rich experience to the users, but these input sources pose testing complications to the developers (e.g., writing tests to accurately utilize multiple input sources together and be able to replay such tests at a later time). To alleviate these complications, researchers and practitioners in recent years have developed a variety of record-and-replay tools to support the testing expressiveness of smart devices. These tools allow developers to easily record and automate the replay of complicated usage scenarios of their app. Due to Android's large share of the smart-device market, numerous record-and-replay tools have been developed using a variety of techniques to test Android apps. To better understand the strengths and weaknesses of these tools, we present a comparison of popular record-and-replay tools from researchers and practitioners, by applying these tools to test three popular industrial apps downloaded from the Google Play store. Our comparison is based on three main metrics: (1) ability to reproduce common usage scenarios, (2) space overhead of traces created by the tools, and (3) robustness of traces created by the tools (when being replayed on devices with different resolutions). The results from our comparison show which record-and-replay tools may be the best for developers and identify future directions for improving these tools to better address testing complications of smart devices. @InProceedings{ESEC/FSE17p854, author = {Wing Lam and Zhengkai Wu and Dengfeng Li and Wenyu Wang and Haibing Zheng and Hui Luo and Peng Yan and Yuetang Deng and Tao Xie}, title = {Record and Replay for Android: Are We There Yet in Industrial Cases?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--859}, doi = {}, year = {2017}, } |
|
De Oliveira, Marcos César |
ESEC/FSE '17-DOC: "DRACO: Discovering Refactorings ..."
DRACO: Discovering Refactorings That Improve Architecture using Fine-Grained Co-change Dependencies
Marcos César de Oliveira (University of Brasília, Brazil) Co-change dependencies arise whenever two source code entities, such as classes, methods, or fields, change frequently together. Similar to other kinds of software dependencies, it is possible to build software clusters from co-change relationships and, as such, previous studies explored the use of this kind of dependency in several software engineering tasks, such as predicting software faults, recommending related source code changes, and assessing software modularity. In this ongoing work, our goal is to provide tools to discover refactoring opportunities—such as move method, move field, split class, or merge classes—that are revealed when comparing the co-change clusters of fine-grained source code entities (methods, fields, constructors) to the original class decomposition; specifically when a source code entity is in the same class but in different clusters (or vice-versa). Our approach, named Draco, aims to produce minimal refactoring sequences that improve architecture decomposition. @InProceedings{ESEC/FSE17p1018, author = {Marcos César de Oliveira}, title = {DRACO: Discovering Refactorings That Improve Architecture using Fine-Grained Co-change Dependencies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1018--1021}, doi = {}, year = {2017}, } |
|
Devanbu, Premkumar |
ESEC/FSE '17: "Are Deep Neural Networks the ..."
Are Deep Neural Networks the Best Choice for Modeling Source Code?
Vincent J. Hellendoorn and Premkumar Devanbu (University of California at Davis, USA) Current statistical language modeling techniques, including deep-learning based models, have proven to be quite effective for source code. We argue here that the special properties of source code can be exploited for further improvements. In this work, we enhance established language modeling approaches to handle the special challenges of modeling source code, such as: frequent changes, larger, changing vocabularies, deeply nested scopes, etc. We present a fast, nested language modeling toolkit specifically designed for software, with the ability to add & remove text, and mix & swap out many models. Specifically, we improve upon prior cache-modeling work and present a model with a much more expansive, multi-level notion of locality that we show to be well-suited for modeling software. We present results on varying corpora in comparison with traditional N-gram, as well as RNN, and LSTM deep-learning language models, and release all our source code for public use. Our evaluations suggest that carefully adapting N-gram models for source code can yield performance that surpasses even RNN and LSTM based deep-learning models. @InProceedings{ESEC/FSE17p763, author = {Vincent J. Hellendoorn and Premkumar Devanbu}, title = {Are Deep Neural Networks the Best Choice for Modeling Source Code?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {763--773}, doi = {}, year = {2017}, } Info ESEC/FSE '17: "Recovering Clear, Natural ..." Recovering Clear, Natural Identifiers from Obfuscated JS Names Bogdan Vasilescu, Casey Casalnuovo, and Premkumar Devanbu (Carnegie Mellon University, USA; University of California at Davis, USA) Well-chosen variable names are critical to source code readability, reusability, and maintainability. Unfortunately, in deployed JavaScript code (which is ubiquitous on the web) the identifier names are frequently minified and overloaded. This is done both for efficiency and also to protect potentially proprietary intellectual property. In this paper, we describe an approach based on statistical machine translation (SMT) that recovers some of the original names from the JavaScript programs minified by the very popular UglifyJS. This simple tool, Autonym, performs comparably to the best currently available deobfuscator for JavaScript, JSNice, which uses sophisticated static analysis. In fact, Autonym is quite complementary to JSNice, performing well when it does not, and vice versa. We also introduce a new tool, JSNaughty, which blends Autonym and JSNice, and significantly outperforms both at identifier name recovery, while remaining just as easy to use as JSNice. JSNaughty is available online at http://jsnaughty.org. @InProceedings{ESEC/FSE17p683, author = {Bogdan Vasilescu and Casey Casalnuovo and Premkumar Devanbu}, title = {Recovering Clear, Natural Identifiers from Obfuscated JS Names}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {683--693}, doi = {}, year = {2017}, } |
|
Dietsch, Daniel |
ESEC/FSE '17: "Craig vs. Newton in Software ..."
Craig vs. Newton in Software Model Checking
Daniel Dietsch, Matthias Heizmann, Betim Musa, Alexander Nutz, and Andreas Podelski (University of Freiburg, Germany) Ever since the seminal work on SLAM and BLAST, software model checking with counterexample-guided abstraction refinement (CEGAR) has been an active topic of research. The crucial procedure here is to analyze a sequence of program statements (the counterexample) to find building blocks for the overall proof of the program. We can distinguish two approaches (which we name Craig and Newton) to implement the procedure. The historically first approach, Newton (named after the tool from the SLAM toolkit), is based on symbolic execution. The second approach, Craig, is based on Craig interpolation. It was widely believed that Craig is substantially more effective than Newton. In fact, 12 out of the 15 CEGAR-based tools in SV-COMP are based on Craig. Advances in software model checkers based on Craig, however, can go only lockstep with advances in SMT solvers with Craig interpolation. It may be time to revisit Newton and ask whether Newton can be as effective as Craig. We have implemented a total of 11 variants of Craig and Newton in two different state-of-the-art software model checking tools and present the outcome of our experimental comparison. @InProceedings{ESEC/FSE17p487, author = {Daniel Dietsch and Matthias Heizmann and Betim Musa and Alexander Nutz and Andreas Podelski}, title = {Craig vs. Newton in Software Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {487--497}, doi = {}, year = {2017}, } |
|
Dig, Danny |
ESEC/FSE '17: "Trade-Offs in Continuous Integration: ..."
Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility
Michael Hilton, Nicholas Nelson, Timothy Tunnell, Darko Marinov, and Danny Dig (Oregon State University, USA; University of Illinois at Urbana-Champaign, USA) Continuous integration (CI) systems automate the compilation, building, and testing of software. Despite CI being a widely used activity in software engineering, we do not know what motivates developers to use CI, and what barriers and unmet needs they face. Without such knowledge, developers make easily avoidable errors, tool builders invest in the wrong direction, and researchers miss opportunities for improving the practice of CI. We present a qualitative study of the barriers and needs developers face when using CI. We conduct semi-structured interviews with developers from different industries and development scales. We triangulate our findings by running two surveys. We find that developers face trade-offs between speed and certainty (Assurance), between better access and information security (Security), and between more configuration options and greater ease of use (Flexibility). We present implications of these trade-offs for developers, tool builders, and researchers. @InProceedings{ESEC/FSE17p197, author = {Michael Hilton and Nicholas Nelson and Timothy Tunnell and Darko Marinov and Danny Dig}, title = {Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {197--207}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
Dijk, Roel van |
ESEC/FSE '17-IND: "Model-Driven Software Engineering ..."
Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic
Roel van Dijk, Christophe Creeten, Jeroen van der Ham, and Jeroen van den Bos (Netherlands Forensic Institute, Netherlands; Delft University of Technology, Netherlands; National Cyber Security Centre, Netherlands; Zuyd University of Applied Sciences, Netherlands) Network traffic data contains a wealth of information for use in security analysis and application development. Unfortunately, it also usually contains confidential or otherwise sensitive information, prohibiting sharing and analysis. Existing automated anonymization solutions are hard to maintain and tend to be outdated. We present Privacy-Enhanced Filtering (PEF), a model-driven prototype framework that relies on declarative descriptions of protocols and a set of filter rules, which are used to automatically transform network traffic data to remove sensitive information. This paper discusses the design, implementation and application of PEF, which is available as open-source software and configured for use in a typical malware detection scenario. @InProceedings{ESEC/FSE17p860, author = {Roel van Dijk and Christophe Creeten and Jeroen van der Ham and Jeroen van den Bos}, title = {Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {860--865}, doi = {}, year = {2017}, } |
|
Dillig, Isil |
ESEC/FSE '17: "Failure-Directed Program Trimming ..."
Failure-Directed Program Trimming
Kostas Ferles, Valentin Wüstholz, Maria Christakis, and Isil Dillig (University of Texas at Austin, USA; University of Kent, UK) This paper describes a new program simplification technique called program trimming that aims to improve the scalability and precision of safety checking tools. Given a program P, program trimming generates a new program P′ such that P and P′ are equi-safe (i.e., P′ has a bug if and only if P has a bug), but P′ has fewer execution paths than P. Since many program analyzers are sensitive to the number of execution paths, program trimming has the potential to improve the effectiveness of safety checking tools. In addition to introducing the concept of program trimming, this paper also presents a lightweight static analysis that can be used as a pre-processing step to remove program paths while retaining equi-safety. We have implemented the proposed technique in a tool called Trimmer and evaluate it in the context of two program analysis techniques, namely abstract interpretation and dynamic symbolic execution. Our experiments show that program trimming significantly improves the effectiveness of both techniques. @InProceedings{ESEC/FSE17p174, author = {Kostas Ferles and Valentin Wüstholz and Maria Christakis and Isil Dillig}, title = {Failure-Directed Program Trimming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {174--185}, doi = {}, year = {2017}, } |
|
Dingel, Juergen |
ESEC/FSE '17: "Model-Level, Platform-Independent ..."
Model-Level, Platform-Independent Debugging in the Context of the Model-Driven Development of Real-Time Systems
Mojtaba Bagherzadeh, Nicolas Hili, and Juergen Dingel (Queen's University, Canada) Providing proper support for debugging models at model-level is one of the main barriers to a broader adoption of Model Driven Development (MDD). In this paper, we focus on the use of MDD for the development of real-time embedded systems (RTE). We introduce a new platform-independent approach to implement model-level debuggers. We describe how to realize support for model-level debugging entirely in terms of the modeling language and show how to implement this support in terms of a model-to-model transformation. Key advantages of the approach over existing work are that (1) it does not require a program debugger for the code generated from the model, and that (2) any changes to, e.g., the code generator, the target language, or the hardware platform leave the debugger completely unaffected. We also describe an implementation of the approach in the context of Papyrus-RT, an open source MDD tool based on the modeling language UML-RT. We summarize the results of the use of our model-based debugger on several use cases to determine its overhead in terms of size and performance. Despite being a prototype, the performance overhead is in the order of microseconds, while the size overhead is comparable with that of GDB, the GNU Debugger. @InProceedings{ESEC/FSE17p419, author = {Mojtaba Bagherzadeh and Nicolas Hili and Juergen Dingel}, title = {Model-Level, Platform-Independent Debugging in the Context of the Model-Driven Development of Real-Time Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {419--430}, doi = {}, year = {2017}, } Video Info Artifacts Functional |
|
Di Penta, Massimiliano |
ESEC/FSE '17: "Enabling Mutation Testing ..."
Enabling Mutation Testing for Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (Universidad de los Andes, Colombia; University of Lugano, Switzerland; College of William and Mary, USA; University of Sannio, Italy) Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identified a set of 38 mutation operators, and implemented an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated as compared to well-known mutation tools, and their capacity to represent real faults in Android apps. @InProceedings{ESEC/FSE17p233, author = {Mario Linares-Vásquez and Gabriele Bavota and Michele Tufano and Kevin Moran and Massimiliano Di Penta and Christopher Vendome and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Enabling Mutation Testing for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {233--244}, doi = {}, year = {2017}, } Info ESEC/FSE '17: "Detecting Missing Information ..." Detecting Missing Information in Bug Descriptions Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, such pieces of information are often missing from bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug description quality by alerting reporters about missing EB and S2R at reporting time. 
@InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
|
D'Ippolito, Nicolas |
ESEC/FSE '17-DEMO: "CLTSA: Labelled Transition ..."
CLTSA: Labelled Transition System Analyser with Counting Fluent Support
Germán Regis, Renzo Degiovanni, Nicolas D'Ippolito, and Nazareno Aguirre (Universidad Nacional de Río Cuarto, Argentina; CONICET, Argentina; University of Buenos Aires, Argentina) In this paper we present CLTSA (Counting Fluents Labelled Transition System Analyser), an extension of LTSA (Labelled Transition System Analyser) that incorporates counting fluents, a useful mechanism to capture properties related to counting events. Counting fluent temporal logic is a formalism for specifying properties of event-based systems, which complements the notion of fluent by the related concept of counting fluent. While fluents allow us to capture boolean properties of the behaviour of a reactive system, counting fluents are numerical values that enumerate event occurrences. The tool supports a superset of FSP (Finite State Processes) that allows one to define LTL properties involving counting fluents, which can be model checked on FSP processes. Detailed information can be found at http://dc.exa.unrc.edu.ar/tools/cltsa. @InProceedings{ESEC/FSE17p979, author = {Germán Regis and Renzo Degiovanni and Nicolas D'Ippolito and Nazareno Aguirre}, title = {CLTSA: Labelled Transition System Analyser with Counting Fluent Support}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {979--983}, doi = {}, year = {2017}, } Video Info |
|
Donaldson, Alastair F. |
ESEC/FSE '17: "Cooperative Kernels: GPU Multitasking ..."
Cooperative Kernels: GPU Multitasking for Blocking Algorithms
Tyler Sorensen, Hugues Evrard, and Alastair F. Donaldson (Imperial College London, UK) There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g. OpenCL) do not mandate fair scheduling, and GPU schedulers are unfair in practice. Current approaches avoid this issue by exploiting scheduling quirks of today's GPUs in a manner that does not allow the GPU to be shared with other workloads (such as graphics rendering tasks). We propose cooperative kernels, an extension to the traditional GPU programming model geared towards writing blocking algorithms. Workgroups of a cooperative kernel are fairly scheduled, and multitasking is supported via a small set of language extensions through which the kernel and scheduler cooperate. We describe a prototype implementation of a cooperative kernel framework implemented in OpenCL 2.0 and evaluate our approach by porting a set of blocking GPU applications to cooperative kernels and examining their performance under multitasking. Our prototype exploits no vendor-specific hardware, driver or compiler support, thus our results provide a lower-bound on the efficiency with which cooperative kernels can be implemented in practice. @InProceedings{ESEC/FSE17p431, author = {Tyler Sorensen and Hugues Evrard and Alastair F. Donaldson}, title = {Cooperative Kernels: GPU Multitasking for Blocking Algorithms}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {431--441}, doi = {}, year = {2017}, } Best-Paper Award |
|
Dong, Wei |
ESEC/FSE '17-DEMO: "RGSE: A Regular Property Guided ..."
RGSE: A Regular Property Guided Symbolic Executor for Java
Hengbiao Yu, Zhenbang Chen, Yufeng Zhang, Ji Wang, and Wei Dong (National University of Defense Technology, China; Jiangnan Institute of Computing, China) It is challenging to effectively check a regular property of a program. This paper presents RGSE, a regular property guided dynamic symbolic execution (DSE) engine, for finding a program path satisfying a regular property as soon as possible. The key idea is to evaluate the candidate branches based on the history and future information, and explore the branches along which the paths are more likely to satisfy the property in priority. We have applied RGSE to 16 real-world open source Java programs, totaling 270K lines of code. Compared with the state-of-the-art, RGSE achieves two orders of magnitude speedups for finding the first target path. RGSE can benefit many research topics of software testing and analysis, such as path-oriented test case generation, typestate bug finding, and performance tuning. The demo video is at: https://youtu.be/7zAhvRIdaUU, and RGSE can be accessed at: http://jrgse.github.io. @InProceedings{ESEC/FSE17p954, author = {Hengbiao Yu and Zhenbang Chen and Yufeng Zhang and Ji Wang and Wei Dong}, title = {RGSE: A Regular Property Guided Symbolic Executor for Java}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {954--958}, doi = {}, year = {2017}, } Video Info |
|
Dotzler, Georg |
ESEC/FSE '17: "More Accurate Recommendations ..."
More Accurate Recommendations for Method-Level Changes
Georg Dotzler, Marius Kamp, Patrick Kreutzer, and Michael Philippsen (Friedrich-Alexander University Erlangen-Nürnberg, Germany) During the life span of large software projects, developers often apply the same code changes to different code locations in slight variations. Since the application of these changes to all locations is time-consuming and error-prone, tools exist that learn change patterns from input examples, search for possible pattern applications, and generate corresponding recommendations. In many cases, the generated recommendations are syntactically or semantically wrong due to code movements in the input examples. Thus, they are of low accuracy and developers cannot directly copy them into their projects without adjustments. We present the Accurate REcommendation System (ARES) that achieves a higher accuracy than other tools because its algorithms take care of code movements when creating patterns and recommendations. On average, the recommendations by ARES have an accuracy of 96% with respect to code changes that developers have manually performed in commits of source code archives. At the same time ARES achieves precision and recall values that are on par with other tools. @InProceedings{ESEC/FSE17p798, author = {Georg Dotzler and Marius Kamp and Patrick Kreutzer and Michael Philippsen}, title = {More Accurate Recommendations for Method-Level Changes}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {798--808}, doi = {}, year = {2017}, } Info |
|
Dougherty, Daniel J. |
ESEC/FSE '17: "The Power of "Why" ..."
The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance
Tim Nelson, Natasha Danas, Daniel J. Dougherty, and Shriram Krishnamurthi (Brown University, USA; Worcester Polytechnic Institute, USA) Scenario-finding tools like the Alloy Analyzer are widely used in numerous concrete domains like security, network analysis, UML analysis, and so on. They can help to verify properties and, more generally, aid in exploring a system's behavior. While scenario finders are valuable for their ability to produce concrete examples, individual scenarios only give insight into what is possible, leaving the user to make their own conclusions about what might be necessary. This paper enriches scenario finding by allowing users to ask ``why?'' and ``why not?'' questions about the examples they are given. We show how to distinguish parts of an example that cannot be consistently removed (or changed) from those that merely reflect underconstraint in the specification. In the former case we show how to determine which elements of the specification and which other components of the example together explain the presence of such facts. This paper formalizes the act of computing provenance in scenario-finding. We present Amalgam, an extension of the popular Alloy scenario-finder, which implements these foundations and provides interactive exploration of examples. We also evaluate Amalgam's algorithmics on a variety of both textbook and real-world examples. @InProceedings{ESEC/FSE17p106, author = {Tim Nelson and Natasha Danas and Daniel J. Dougherty and Shriram Krishnamurthi}, title = {The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {106--116}, doi = {}, year = {2017}, } Info Artifacts Reusable Best-Paper Award |
|
Dovgalyuk, Pavel |
ESEC/FSE '17-DEMO: "QEMU-Based Framework for Non-intrusive ..."
QEMU-Based Framework for Non-intrusive Virtual Machine Instrumentation and Introspection
Pavel Dovgalyuk, Natalia Fursova, Ivan Vasiliev, and Vladimir Makarov (Novgorod State University, Russia) This paper presents a framework based on the QEMU emulator. Our framework provides a set of multi-platform analysis tools for virtual machines and a mechanism for creating instrumentation and analysis tools. Our framework is based on a lightweight approach to dynamic analysis of binary code executed in virtual machines. This approach is non-intrusive and provides system-wide analysis capabilities. It does not require loading any guest agents or the source code of the OS. Therefore it may be applied to ROM-based guest systems and enables the use of record/replay of the system execution. We use the application binary interface (ABI) of the platform to be analyzed to create introspection tools. These tools recover the part of the kernel-level information related to the system calls executed on the guest machine. @InProceedings{ESEC/FSE17p944, author = {Pavel Dovgalyuk and Natalia Fursova and Ivan Vasiliev and Vladimir Makarov}, title = {QEMU-Based Framework for Non-intrusive Virtual Machine Instrumentation and Introspection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {944--948}, doi = {}, year = {2017}, } Info |
|
Easterbrook, Steve |
ESEC/FSE '17-INV: "Verifying the Forecast: How ..."
Verifying the Forecast: How Climate Models Are Developed and Tested (Invited Talk)
Steve Easterbrook (University of Toronto, Canada) Stolen passwords, compromised medical records, taking the internet out through video cameras– cybersecurity breaches are in the news every day. Despite all this, the practice of cybersecurity today is generally reactive rather than proactive. That is, rather than improving their defenses in advance, organizations react to attacks once they have occurred by patching the individual vulnerabilities that led to those attacks. Researchers engineer solutions to the latest form of attack. What we need, instead, are scientifically founded design principles for building in security mechanisms from the beginning, giving protection against broad classes of attacks. Through scientific measurement, we can improve our ability to make decisions that are evidence-based, proactive, and long-sighted. Recognizing these needs, the US National Security Agency (NSA) devised a new framework for collaborative research, the “Lablet” structure, with the intent to more aggressively advance the science of cybersecurity. A key motivation was to catalyze a shift in relevant areas towards a more organized and cohesive scientific community. The NSA named Carnegie Mellon University, North Carolina State University, and the University of Illinois – Urbana Champaign its initial Lablets in 2011, and added the University of Maryland in 2014. This talk will reflect on the structure of the collaborative research efforts of the Lablets, lessons learned in the transition to more scientific concepts to cybersecurity, research results in solving five hard security problems, and methods that are being used for the measurement of scientific progress of the Lablet research. @InProceedings{ESEC/FSE17p2, author = {Steve Easterbrook}, title = {Verifying the Forecast: How Climate Models Are Developed and Tested (Invited Talk)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {2--2}, doi = {}, year = {2017}, } |
|
Eden, Anthony |
ESEC/FSE '17: "CodeCarbonCopy ..."
CodeCarbonCopy
Stelios Sidiroglou-Douskos, Eric Lahtinen, Anthony Eden, Fan Long, and Martin Rinard (Massachusetts Institute of Technology, USA) We present CodeCarbonCopy (CCC), a system for transferring code from a donor application into a recipient application. CCC starts with functionality identified by the developer to transfer into an insertion point (again identified by the developer) in the recipient. CCC uses paired executions of the donor and recipient on the same input file to obtain a translation between the data representation and name space of the recipient and the data representation and name space of the donor. It also implements a static analysis that identifies and removes irrelevant functionality useful in the donor but not in the recipient. We evaluate CCC on eight transfers between six applications. Our results show that CCC can successfully transfer donor functionality into recipient applications. @InProceedings{ESEC/FSE17p95, author = {Stelios Sidiroglou-Douskos and Eric Lahtinen and Anthony Eden and Fan Long and Martin Rinard}, title = {CodeCarbonCopy}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {95--105}, doi = {}, year = {2017}, } |
|
Eichberg, Michael |
ESEC/FSE '17: "CodeMatch: Obfuscation Won't ..."
CodeMatch: Obfuscation Won't Conceal Your Repackaged App
Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, and Mira Mezini (TU Darmstadt, Germany) An established way to steal the income of app developers, or to trick users into installing malware, is the creation of repackaged apps. These are clones of – typically – successful apps. To conceal their nature, they are often obfuscated by their creators. But, given that it is a common best practice to obfuscate apps, a trivial identification of repackaged apps is not possible. The problem is further intensified by the prevalent usage of libraries. In many apps, the size of the overall code base is basically determined by the used libraries. Therefore, two apps whose obfuscated code bases are very similar do not have to be repackages of each other. To reliably detect repackaged apps, we propose a two-step approach which first focuses on the identification and removal of the library code in obfuscated apps. This approach – LibDetect – relies on code representations which abstract over several parts of the underlying bytecode to be resilient against certain obfuscation techniques. Using this approach, we are able to identify on average 70% more used libraries per app than previous approaches. After the removal of an app’s library code, we then fuzzy hash the most abstract representation of the remaining app code to ensure that we can identify repackaged apps even if very advanced obfuscation techniques are used. Using our approach, we found that ≈ 15% of all apps in Android app stores are repackages. @InProceedings{ESEC/FSE17p638, author = {Leonid Glanz and Sven Amann and Michael Eichberg and Michael Reif and Ben Hermann and Johannes Lerch and Mira Mezini}, title = {CodeMatch: Obfuscation Won't Conceal Your Repackaged App}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {638--648}, doi = {}, year = {2017}, } Info |
|
Ellmann, Mathias |
ESEC/FSE '17-DOC: "On the Similarity of Software ..."
On the Similarity of Software Development Documentation
Mathias Ellmann (University of Hamburg, Germany) Software developers spend 20% of their time seeking information on Stack Overflow, YouTube, or API reference documentation. Developers can search within Stack Overflow for duplicate or similar posts. They can also look at software development documentation that contains information similar to, or going beyond, a Stack Overflow post or a development screencast in order to get new inspiration on how to solve their current development problem. Linking the same and different types of software development documentation might save time when evolving new software solutions and might increase the productivity of the developer’s work day. In this paper we discuss our approach to gaining a broader understanding of different similarity types (exact, similar, and maybe) within and between software documentation, as well as an understanding of how different kinds of software documentation can be extended. @InProceedings{ESEC/FSE17p1030, author = {Mathias Ellmann}, title = {On the Similarity of Software Development Documentation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1030--1033}, doi = {}, year = {2017}, } |
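One way such similarity between documentation artifacts could be measured is sketched below with TF-IDF cosine similarity (an assumption made for illustration; the paper does not prescribe this exact measure, and the snippets, thresholds, and labels are invented). Requires scikit-learn.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical snippets standing in for a Stack Overflow post, an API page,
    # and a screencast transcript.
    docs = [
        "How do I read a file line by line in Java using BufferedReader?",
        "BufferedReader.readLine() reads a line of text from a character stream.",
        "In this video we sort a list in Python with the sorted builtin.",
    ]

    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
    sims = cosine_similarity(tfidf)

    # Bucket pairs into rough similarity types by threshold (thresholds are made up).
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            s = sims[i, j]
            label = "similar" if s > 0.2 else "maybe" if s > 0.05 else "different"
            print(f"doc{i} vs doc{j}: {s:.2f} ({label})")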
|
Emmerich, Wolfgang |
ESEC/FSE '17-INV: "Software Engineering Research ..."
Software Engineering Research Results in Industrial Practice: A Tale of Two Projects (Invited Talk)
Wolfgang Emmerich (Zuhlke, UK) In this talk, I will discuss the use of software engineering research results in industrial practice, based on two projects I have been involved with. The first project addressed the challenge that manipulation of financial market data had to be expressed precisely for a large number of different financial markets. The challenge was addressed by defining a functional Domain Specific Language (DSL) that was geared towards expressing these manipulations at a high level of abstraction. An environment that implements the DSL was built using the Eclipse platform together with a compiler that generates a Java-based reference implementation of these manipulations. The implementation is used as a test oracle to generate test cases, which are in turn used to validate a soft real-time system that implements these manipulations. In another project that is still ongoing, I have proposed the use of software product line research to engineer a family of mobile banking applications. I will reflect on the experience of integrating software product line principles and modern Agile development practices. I will then discuss a few areas of software engineering research, that I have personally been involved in, that I have found not to be very useful in practice. I will conclude by outlining some topics where novel research results would be very beneficial from an industrial point of view. @InProceedings{ESEC/FSE17p3, author = {Wolfgang Emmerich}, title = {Software Engineering Research Results in Industrial Practice: A Tale of Two Projects (Invited Talk)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {3--3}, doi = {}, year = {2017}, } |
|
Erata, Ferhat |
ESEC/FSE '17-DEMO: "A Tool for Automated Reasoning ..."
A Tool for Automated Reasoning about Traces Based on Configurable Formal Semantics
Ferhat Erata, Arda Goknil, Bedir Tekinerdogan, and Geylani Kardas (Wageningen University, Netherlands; University of Luxembourg, Luxembourg; Ege University, Turkey) We present Tarski, a tool for specifying configurable trace semantics to facilitate automated reasoning about traces. Software development projects require that various types of traces be modeled between and within development artifacts. For any given artifact (e.g., requirements, architecture models and source code), Tarski allows the user to specify new trace types and their configurable semantics, while, using the semantics, it automatically infers new traces based on existing traces provided by the user, and checks the consistency of traces. It has been evaluated on three industrial case studies in the automotive domain (https://modelwriter.github.io/Tarski/). @InProceedings{ESEC/FSE17p959, author = {Ferhat Erata and Arda Goknil and Bedir Tekinerdogan and Geylani Kardas}, title = {A Tool for Automated Reasoning about Traces Based on Configurable Formal Semantics}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {959--963}, doi = {}, year = {2017}, } Video Info |
|
Erenkrantz, Justin R. |
ESEC/FSE '17-INV: "Reflections on the REST Architectural ..."
Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)
Roy T. Fielding, Richard N. Taylor, Justin R. Erenkrantz, Michael M. Gorlick, Jim Whitehead, Rohit Khare, and Peyman Oreizy (Adobe, USA; University of California at Irvine, USA; Bloomberg, USA; University of California at Santa Cruz, USA; Google, USA; Dynamic Variable, USA) Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work. @InProceedings{ESEC/FSE17p4, author = {Roy T. Fielding and Richard N. Taylor and Justin R. Erenkrantz and Michael M. Gorlick and Jim Whitehead and Rohit Khare and Peyman Oreizy}, title = {Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {4--14}, doi = {}, year = {2017}, } |
|
Evrard, Hugues |
ESEC/FSE '17: "Cooperative Kernels: GPU Multitasking ..."
Cooperative Kernels: GPU Multitasking for Blocking Algorithms
Tyler Sorensen, Hugues Evrard, and Alastair F. Donaldson (Imperial College London, UK) There is growing interest in accelerating irregular data-parallel algorithms on GPUs. These algorithms are typically blocking, so they require fair scheduling. But GPU programming models (e.g. OpenCL) do not mandate fair scheduling, and GPU schedulers are unfair in practice. Current approaches avoid this issue by exploiting scheduling quirks of today's GPUs in a manner that does not allow the GPU to be shared with other workloads (such as graphics rendering tasks). We propose cooperative kernels, an extension to the traditional GPU programming model geared towards writing blocking algorithms. Workgroups of a cooperative kernel are fairly scheduled, and multitasking is supported via a small set of language extensions through which the kernel and scheduler cooperate. We describe a prototype implementation of a cooperative kernel framework implemented in OpenCL 2.0 and evaluate our approach by porting a set of blocking GPU applications to cooperative kernels and examining their performance under multitasking. Our prototype exploits no vendor-specific hardware, driver or compiler support, thus our results provide a lower-bound on the efficiency with which cooperative kernels can be implemented in practice. @InProceedings{ESEC/FSE17p431, author = {Tyler Sorensen and Hugues Evrard and Alastair F. Donaldson}, title = {Cooperative Kernels: GPU Multitasking for Blocking Algorithms}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {431--441}, doi = {}, year = {2017}, } Best-Paper Award |
|
Ferles, Kostas |
ESEC/FSE '17: "Failure-Directed Program Trimming ..."
Failure-Directed Program Trimming
Kostas Ferles, Valentin Wüstholz, Maria Christakis, and Isil Dillig (University of Texas at Austin, USA; University of Kent, UK) This paper describes a new program simplification technique called program trimming that aims to improve the scalability and precision of safety checking tools. Given a program P, program trimming generates a new program P′ such that P and P′ are equi-safe (i.e., P′ has a bug if and only if P has a bug), but P′ has fewer execution paths than P. Since many program analyzers are sensitive to the number of execution paths, program trimming has the potential to improve the effectiveness of safety checking tools. In addition to introducing the concept of program trimming, this paper also presents a lightweight static analysis that can be used as a pre-processing step to remove program paths while retaining equi-safety. We have implemented the proposed technique in a tool called Trimmer and evaluate it in the context of two program analysis techniques, namely abstract interpretation and dynamic symbolic execution. Our experiments show that program trimming significantly improves the effectiveness of both techniques. @InProceedings{ESEC/FSE17p174, author = {Kostas Ferles and Valentin Wüstholz and Maria Christakis and Isil Dillig}, title = {Failure-Directed Program Trimming}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {174--185}, doi = {}, year = {2017}, } |
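A toy Python sketch of the equi-safety idea (illustrative only; Trimmer itself uses a static analysis over real program paths): the else-branch below can never violate the assertion, so a trimmed variant prunes it with an analyzer-style assumption while preserving exactly the bug the original has.

    def assume(cond):
        # Models an analyzer-inserted assumption: executions not satisfying
        # the condition are simply not explored further.
        if not cond:
            raise RuntimeError("path pruned")

    def original(x, y):
        if x > 0:
            z = y + 1            # can violate the assertion when y == -1
        else:
            z = abs(y) + 1       # provably non-zero, can never fail
        assert z != 0            # safety property under check

    def trimmed(x, y):
        assume(x > 0)            # the provably safe else-path has been trimmed away
        z = y + 1
        assert z != 0            # equi-safe: fails exactly when original fails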
|
Ferns, Gabriel |
ESEC/FSE '17: "Discovering Relational Specifications ..."
Discovering Relational Specifications
Calvin Smith, Gabriel Ferns, and Aws Albarghouthi (University of Wisconsin-Madison, USA) Formal specifications of library functions play a critical role in a number of program analysis and development tasks. We present Bach, a technique for discovering likely relational specifications from data describing input–output behavior of a set of functions comprising a library or a program. Relational specifications correlate different executions of different functions; for instance, commutativity, transitivity, equivalence of two functions, etc. Bach combines novel insights from program synthesis and databases to discover a rich array of specifications. We apply Bach to learn specifications from data generated for a number of standard libraries. Our experimental evaluation demonstrates Bach’s ability to learn useful and deep specifications in a small amount of time. @InProceedings{ESEC/FSE17p616, author = {Calvin Smith and Gabriel Ferns and Aws Albarghouthi}, title = {Discovering Relational Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {616--626}, doi = {}, year = {2017}, } Best-Paper Award |
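As a rough, hypothetical illustration of mining relational specifications from input-output data (a toy sketch, not Bach's synthesis-plus-database approach), one can test a candidate template such as commutativity against the observed behaviour of a set of functions:

    from itertools import product

    def is_commutative(f, samples):
        # Likely-specification check: f(x, y) == f(y, x) on all sampled inputs.
        return all(f(x, y) == f(y, x) for x, y in product(samples, repeat=2))

    def discover_commutative(functions, samples):
        return [name for name, f in functions.items() if is_commutative(f, samples)]

    if __name__ == "__main__":
        library = {"add": lambda a, b: a + b,
                   "sub": lambda a, b: a - b,
                   "max": max}
        print(discover_commutative(library, samples=range(-3, 4)))  # ['add', 'max']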
|
Fielding, Roy T. |
ESEC/FSE '17-INV: "Reflections on the REST Architectural ..."
Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)
Roy T. Fielding, Richard N. Taylor, Justin R. Erenkrantz, Michael M. Gorlick, Jim Whitehead, Rohit Khare, and Peyman Oreizy (Adobe, USA; University of California at Irvine, USA; Bloomberg, USA; University of California at Santa Cruz, USA; Google, USA; Dynamic Variable, USA) Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work. @InProceedings{ESEC/FSE17p4, author = {Roy T. Fielding and Richard N. Taylor and Justin R. Erenkrantz and Michael M. Gorlick and Jim Whitehead and Rohit Khare and Peyman Oreizy}, title = {Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {4--14}, doi = {}, year = {2017}, } |
|
Filieri, Antonio |
ESEC/FSE '17: "Automated Control of Multiple ..."
Automated Control of Multiple Software Goals using Multiple Actuators
Martina Maggio, Alessandro Vittorio Papadopoulos, Antonio Filieri, and Henry Hoffmann (Lund University, Sweden; Mälardalen University, Sweden; Imperial College London, UK; University of Chicago, USA) Modern software should satisfy multiple goals simultaneously: it should provide predictable performance, be robust to failures, handle peak loads and deal seamlessly with unexpected conditions and changes in the execution environment. For this to happen, software designs should account for the possibility of runtime changes and provide formal guarantees of the software's behavior. Control theory is one of the possible design drivers for runtime adaptation, but adopting control theoretic principles often requires additional, specialized knowledge. To overcome this limitation, automated methodologies have been proposed to extract the necessary information from experimental data and design a control system for runtime adaptation. These proposals, however, only process one goal at a time, creating a chain of controllers. In this paper, we propose and evaluate the first automated strategy that takes into account multiple goals without separating them into multiple control strategies. Avoiding the separation allows us to tackle a larger class of problems and provide stronger guarantees. We test our methodology's generality with three case studies that demonstrate its broad applicability in meeting performance, reliability, quality, security, and energy goals despite environmental or requirements changes. @InProceedings{ESEC/FSE17p373, author = {Martina Maggio and Alessandro Vittorio Papadopoulos and Antonio Filieri and Henry Hoffmann}, title = {Automated Control of Multiple Software Goals using Multiple Actuators}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {373--384}, doi = {}, year = {2017}, } Info |
|
Fonseca, Baldoino |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggests that refactorings more frequently introduce long-living smells than eliminate existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
Frias, Marcelo |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired by dynamic logic, for describing dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models reduces to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries, and further details are available at: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Fu, Wei |
ESEC/FSE '17: "Revisiting Unsupervised Learning ..."
Revisiting Unsupervised Learning for Defect Prediction
Wei Fu and Tim Menzies (North Carolina State University, USA) Collecting quality data from software projects can be time-consuming and expensive. Hence, some researchers explore “unsupervised” approaches to quality prediction that do not require labelled data. An alternate technique is to use “supervised” approaches that learn models from project data labelled with, say, “defective” or “not-defective”. Most researchers use these supervised models since, it is argued, they can exploit more knowledge of the projects. At FSE’16, Yang et al. reported startling results where unsupervised defect predictors outperformed supervised predictors for effort-aware just-in-time defect prediction. If confirmed, these results would lead to a dramatic simplification of a seemingly complex task (data mining) that is widely explored in the software engineering literature. This paper repeats and refutes those results as follows. (1) There is much variability in the efficacy of the Yang et al. predictors, so even with their approach, some supervised data is required to prune weaker predictors away. (2) Their findings were grouped across N projects. When we repeat their analysis on a project-by-project basis, supervised predictors are seen to work better. Even though this paper rejects the specific conclusions of Yang et al., we still endorse their general goal. In our experiments, supervised predictors did not perform outstandingly better than unsupervised ones for effort-aware just-in-time defect prediction. Hence, there may indeed be some combination of unsupervised learners that achieves performance comparable to supervised ones. We therefore encourage others to work in this promising area. @InProceedings{ESEC/FSE17p72, author = {Wei Fu and Tim Menzies}, title = {Revisiting Unsupervised Learning for Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {72--83}, doi = {}, year = {2017}, } ESEC/FSE '17: "Easy over Hard: A Case Study ..." Easy over Hard: A Case Study on Deep Learning Wei Fu and Tim Menzies (North Carolina State University, USA) While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (to weeks) to train the model. Such long training time limits the ability of (a) a researcher to test the stability of their conclusion via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work. For example, recently, deep learning was used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that applying a very simple optimizer called DE to fine-tune SVM can achieve similar (and sometimes better) results. The DE approach terminated in 10 minutes, i.e., 84 times faster than the deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives. @InProceedings{ESEC/FSE17p49, author = {Wei Fu and Tim Menzies}, title = {Easy over Hard: A Case Study on Deep Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {49--60}, doi = {}, year = {2017}, } |
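The "simple optimizer" baseline can be sketched as follows (an illustrative stand-in: the paper tunes SVM for linking Stack Overflow questions, whereas this toy tunes C and gamma on a bundled scikit-learn dataset using SciPy's differential evolution):

    from scipy.optimize import differential_evolution
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)

    def objective(params):
        C, gamma = params
        clf = SVC(C=C, gamma=gamma)
        # Negate mean cross-validated accuracy because the optimizer minimizes.
        return -cross_val_score(clf, X, y, cv=3).mean()

    result = differential_evolution(objective, bounds=[(0.1, 100.0), (1e-4, 1.0)],
                                    maxiter=10, seed=0)
    print("best (C, gamma):", result.x, "cv accuracy:", -result.fun)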
|
Fursova, Natalia |
ESEC/FSE '17-DEMO: "QEMU-Based Framework for Non-intrusive ..."
QEMU-Based Framework for Non-intrusive Virtual Machine Instrumentation and Introspection
Pavel Dovgalyuk, Natalia Fursova, Ivan Vasiliev, and Vladimir Makarov (Novgorod State University, Russia) This paper presents a framework based on the QEMU emulator. The framework provides a set of multi-platform analysis tools for virtual machines and a mechanism for creating instrumentation and analysis tools. It is based on a lightweight approach to dynamic analysis of binary code executed in virtual machines. The approach is non-intrusive and provides system-wide analysis capabilities. It requires neither loading guest agents nor the source code of the OS; therefore it may be applied to ROM-based guest systems and enables record/replay of the system execution. We use the application binary interface (ABI) of the analyzed platform to create introspection tools. These tools recover the part of kernel-level information related to the system calls executed on the guest machine. @InProceedings{ESEC/FSE17p944, author = {Pavel Dovgalyuk and Natalia Fursova and Ivan Vasiliev and Vladimir Makarov}, title = {QEMU-Based Framework for Non-intrusive Virtual Machine Instrumentation and Introspection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {944--948}, doi = {}, year = {2017}, } Info |
|
Galeotti, Juan Pablo |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired by dynamic logic, for describing dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models reduces to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries, and further details are available at: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Galhotra, Sainyam |
ESEC/FSE '17: "Fairness Testing: Testing ..."
Fairness Testing: Testing Software for Discrimination
Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou (University of Massachusetts at Amherst, USA) This paper defines software fairness and discrimination and develops a testing-based method for measuring if and how much software discriminates, focusing on causality in discriminatory behavior. Evidence of software discrimination has been found in modern software systems that recommend criminal sentences, grant access to financial products, and determine who is allowed to participate in promotions. Our approach, Themis, generates efficient test suites to measure discrimination. Given a schema describing valid system inputs, Themis generates discrimination tests automatically and does not require an oracle. We evaluate Themis on 20 software systems, 12 of which come from prior work with explicit focus on avoiding discrimination. We find that (1) Themis is effective at discovering software discrimination, (2) state-of-the-art techniques for removing discrimination from algorithms fail in many situations, at times discriminating against as much as 98% of an input subdomain, (3) Themis optimizations are effective at producing efficient test suites for measuring discrimination, and (4) Themis is more efficient on systems that exhibit more discrimination. We thus demonstrate that fairness testing is a critical aspect of the software development cycle in domains with possible discrimination and provide initial tools for measuring software discrimination. @InProceedings{ESEC/FSE17p498, author = {Sainyam Galhotra and Yuriy Brun and Alexandra Meliou}, title = {Fairness Testing: Testing Software for Discrimination}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {498--510}, doi = {}, year = {2017}, } Info Best-Paper Award |
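The causal flavour of such testing can be sketched in a few lines of Python (a hypothetical toy, not Themis; the schema and decision procedure below are invented for illustration): draw inputs from a schema, flip only the sensitive attribute, and count how often the decision changes.

    import random

    def causal_discrimination_rate(model, schema, sensitive, trials=1000, seed=0):
        rng = random.Random(seed)
        flips = 0
        for _ in range(trials):
            x = {attr: rng.choice(values) for attr, values in schema.items()}
            # Flip only the sensitive attribute; any change in outcome is causal.
            if any(model(dict(x, **{sensitive: v})) != model(x)
                   for v in schema[sensitive] if v != x[sensitive]):
                flips += 1
        return flips / trials

    def loan_model(applicant):          # hypothetical, intentionally biased
        return applicant["income"] > 50 and applicant["gender"] != "female"

    schema = {"income": list(range(101)), "gender": ["male", "female"]}
    print(causal_discrimination_rate(loan_model, schema, sensitive="gender"))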
|
Galster, Matthias |
ESEC/FSE '17-IND: "Reference Architectures and ..."
Reference Architectures and Scrum: Friends or Foes?
Matthias Galster, Samuil Angelov, Silverio Martínez-Fernández, and Dan Tofan (University of Canterbury, New Zealand; Fontys University of Applied Sciences, Netherlands; Fraunhofer IESE, Germany) Software reference architectures provide templates and guidelines for designing systems in a particular domain. Companies use them to achieve interoperability of (parts of) their software, standardization, and faster development. In contrast to system-specific software architectures that "emerge" during development, reference architectures dictate significant parts of the software design early on. Agile software development frameworks (such as Scrum) acknowledge changing software requirements and the need to adapt the software design accordingly. In this paper, we present lessons learned about how reference architectures interact with Scrum (the most frequently used agile process framework). These lessons are based on observing software development projects in five companies. We found that reference architectures can support good practice in Scrum: They provide enough design upfront without too much effort, reduce documentation activities, facilitate knowledge sharing, and contribute to "architectural thinking" of developers. However, reference architectures can impose risks or even threats to the success of Scrum (e.g., to self-organizing and motivated teams). @InProceedings{ESEC/FSE17p896, author = {Matthias Galster and Samuil Angelov and Silverio Martínez-Fernández and Dan Tofan}, title = {Reference Architectures and Scrum: Friends or Foes?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {896--901}, doi = {}, year = {2017}, } |
|
Gantayat, Neelamadhav |
ESEC/FSE '17-IND: "Natural Language Querying ..."
Natural Language Querying in SAP-ERP Platform
Diptikalyan Saha, Neelamadhav Gantayat, Senthil Mani, and Barry Mitchell (IBM Research, India; IBM, USA) With the omnipresence of mobile devices coupled with recent advances in automatic speech recognition capabilities, there has been a growing demand for natural language query (NLQ) interface to retrieve information from the knowledge bases. Business users particularly find this useful as NLQ interface enables them to ask questions without the knowledge of the query language or the data schema. In this paper, we apply an existing research technology called ``ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores'' in the industry domain of SAP-ERP systems. The goal is to enable users to query SAP-ERP data using natural language. We present the challenges and their solutions of such a technology transfer. We present the effectiveness of the natural language query interface on a set of questions given by a set of SAP practitioners. @InProceedings{ESEC/FSE17p878, author = {Diptikalyan Saha and Neelamadhav Gantayat and Senthil Mani and Barry Mitchell}, title = {Natural Language Querying in SAP-ERP Platform}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {878--883}, doi = {}, year = {2017}, } |
|
Garbervetsky, Diego |
ESEC/FSE '17-IND: "Static Analysis for Optimizing ..."
Static Analysis for Optimizing Big Data Queries
Diego Garbervetsky, Zvonimir Pavlinovic, Michael Barnett, Madanlal Musuvathi, Todd Mytkowicz, and Edgardo Zoppi (University of Buenos Aires, Argentina; New York University, USA; Microsoft Research, USA) Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand-coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect which columns of an input table are never used or pass through a UDO unchanged. This information can be used to significantly improve the execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. @InProceedings{ESEC/FSE17p932, author = {Diego Garbervetsky and Zvonimir Pavlinovic and Michael Barnett and Madanlal Musuvathi and Todd Mytkowicz and Edgardo Zoppi}, title = {Static Analysis for Optimizing Big Data Queries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {932--937}, doi = {}, year = {2017}, } ESEC/FSE '17: "Toward Full Elasticity in ..." Toward Full Elasticity in Distributed Static Analysis: The Case of Callgraph Analysis Diego Garbervetsky, Edgardo Zoppi, and Benjamin Livshits (University of Buenos Aires, Argentina; Imperial College London, UK) In this paper we present the design and implementation of a distributed, whole-program static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. Our reliance on a cloud cluster provides a degree of elasticity for CPU, memory, and storage resources. To demonstrate the potential of our technique, we show how a typical call graph analysis can be implemented in a distributed setting. The vision that motivates this work is that every large-scale software repository such as GitHub, BitBucket, or Visual Studio Online will be able to perform static analysis on a large scale. We experimentally validate our implementation of the distributed call graph analysis using a combination of both synthetic and real benchmarks. To show scalability, we demonstrate how the analysis presented in this paper is able to handle inputs that are almost 10 million lines of code (LOC) in size, without running out of memory. Our results show that the analysis scales well in terms of memory pressure independently of the input size, as we add more virtual machines (VMs). As the number of worker VMs increases, we observe that the analysis time generally improves as well. Lastly, we demonstrate that querying the results can be performed with a median latency of 15 ms.
@InProceedings{ESEC/FSE17p442, author = {Diego Garbervetsky and Edgardo Zoppi and Benjamin Livshits}, title = {Toward Full Elasticity in Distributed Static Analysis: The Case of Callgraph Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {442--453}, doi = {}, year = {2017}, } |
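A dynamic toy stand-in for the column-dependency idea (the paper's analyses are static, and the operator and rows below are hypothetical): flag columns a user-defined operator never reads and columns it passes through unchanged, both of which a query optimizer can exploit.

    def column_summary(udo, rows):
        read = set()

        class TracingRow(dict):
            def __getitem__(self, key):          # record every column the UDO reads
                read.add(key)
                return dict.__getitem__(self, key)

        outputs = [udo(TracingRow(r)) for r in rows]
        columns = rows[0].keys()
        unused = [c for c in columns if c not in read]
        passthrough = [c for c in columns
                       if all(out.get(c) == r[c] for out, r in zip(outputs, rows))]
        return unused, passthrough

    sample = [{"id": 1, "score": 10, "debug": "x"},
              {"id": 2, "score": 20, "debug": "y"}]
    print(column_summary(lambda r: {"id": r["id"], "score": r["score"] * 2}, sample))
    # -> (['debug'], ['id'])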
|
Garcés, Kelly |
ESEC/FSE '17-IND: "Improving Understanding of ..."
Improving Understanding of Dynamically Typed Software Developed by Agile Practitioners
Jair García and Kelly Garcés (Grupo Saberes, Colombia; Universidad de los Andes, Colombia) Agile Development values working software over documentation. Therefore, in maintenance stages of existing software, the source code is the sole software artifact that developers have for analyzing the viability and impact of a new user story. Since functionality is often spread in hundreds of lines of code, it is hard for the developer to understand the system, which may lead to under-/overestimation of the new feature cost and rework/delays in the subsequent phases of development. In a previous work, we proposed a Model-Driven Reverse Engineering approach for obtaining software visualizations from source code. Two case studies of comprehension of applications written in statically typed languages have shown the applicability of this approach. A recent experience with an industrial partner, where the systems are developed on dynamically typed languages, has motivated us to adapt the previous proposal to take as input not only the source code but also the application data schema to complete the information that is missing in the code, and then automatically generate more meaningful diagrams that help developers in maintenance tasks. In this article, we present the adaptation of the general approach to support data schema as an additional input and its instrumentation in an industrial case study where the technology is Ruby on Rails. The paper ends by explaining the precision and performance of the instrumentation when used in a Colombian company as well as lessons learned. @InProceedings{ESEC/FSE17p908, author = {Jair García and Kelly Garcés}, title = {Improving Understanding of Dynamically Typed Software Developed by Agile Practitioners}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {908--913}, doi = {}, year = {2017}, } |
|
Garcia, Alessandro |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggests that refactorings more frequently introduce long-living smells than eliminate existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
García, Jair |
ESEC/FSE '17-IND: "Improving Understanding of ..."
Improving Understanding of Dynamically Typed Software Developed by Agile Practitioners
Jair García and Kelly Garcés (Grupo Saberes, Colombia; Universidad de los Andes, Colombia) Agile Development values working software over documentation. Therefore, in maintenance stages of existing software, the source code is the sole software artifact that developers have for analyzing the viability and impact of a new user story. Since functionality is often spread in hundreds of lines of code, it is hard for the developer to understand the system, which may lead to under-/overestimation of the new feature cost and rework/delays in the subsequent phases of development. In a previous work, we proposed a Model-Driven Reverse Engineering approach for obtaining software visualizations from source code. Two case studies of comprehension of applications written in statically typed languages have shown the applicability of this approach. A recent experience with an industrial partner, where the systems are developed on dynamically typed languages, has motivated us to adapt the previous proposal to take as input not only the source code but also the application data schema to complete the information that is missing in the code, and then automatically generate more meaningful diagrams that help developers in maintenance tasks. In this article, we present the adaptation of the general approach to support data schema as an additional input and its instrumentation in an industrial case study where the technology is Ruby on Rails. The paper ends by explaining the precision and performance of the instrumentation when used in a Colombian company as well as lessons learned. @InProceedings{ESEC/FSE17p908, author = {Jair García and Kelly Garcés}, title = {Improving Understanding of Dynamically Typed Software Developed by Agile Practitioners}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {908--913}, doi = {}, year = {2017}, } |
|
Garcia, Joshua |
ESEC/FSE '17: "Automatic Generation of Inter-Component ..."
Automatic Generation of Inter-Component Communication Exploits for Android Applications
Joshua Garcia, Mahmoud Hammad, Negar Ghorbani, and Sam Malek (University of California at Irvine, USA) Although a wide variety of approaches identify vulnerabilities in Android apps, none attempt to determine exploitability of those vulnerabilities. Exploitability can aid in reducing false positives of vulnerability analysis, and can help engineers triage bugs. Specifically, one of the main attack vectors of Android apps is their inter-component communication interface, where apps may receive messages called Intents. In this paper, we provide the first approach for automatically generating exploits for Android apps, called LetterBomb, relying on a combined path-sensitive symbolic execution-based static analysis, and the use of software instrumentation and test oracles. We run LetterBomb on 10,000 Android apps from Google Play, where we identify 181 exploits from 835 vulnerable apps. Compared to a state-of-the-art detection approach for three ICC-based vulnerabilities, LetterBomb obtains 33%-60% more vulnerabilities at a 6.66 to 7 times faster speed. @InProceedings{ESEC/FSE17p661, author = {Joshua Garcia and Mahmoud Hammad and Negar Ghorbani and Sam Malek}, title = {Automatic Generation of Inter-Component Communication Exploits for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--671}, doi = {}, year = {2017}, } Info |
|
Gascon-Samson, Julien |
ESEC/FSE '17: "ARTINALI: Dynamic Invariant ..."
ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security
Maryam Raiyat Aliabadi, Amita Ajith Kamath, Julien Gascon-Samson, and Karthik Pattabiraman (University of British Columbia, Canada; National Institute of Technology Karnataka, India) Cyber-Physical Systems (CPSes) are being widely deployed in security critical scenarios such as smart homes and medical devices. Unfortunately, the connectedness of these systems and their relative lack of security measures makes them ripe targets for attacks. Specification-based Intrusion Detection Systems (IDS) have been shown to be effective for securing CPSs. Unfortunately, deriving invariants for capturing the specifications of CPS systems is a tedious and error-prone process. Therefore, it is important to dynamically monitor the CPS system to learn its common behaviors and formulate invariants for detecting security attacks. Existing techniques for invariant mining only incorporate data and events, but not time. However, time is central to most CPS systems, and hence incorporating time in addition to data and events, is essential for achieving low false positives and false negatives. This paper proposes ARTINALI, which mines dynamic system properties by incorporating time as a first-class property of the system. We build ARTINALI-based Intrusion Detection Systems (IDSes) for two CPSes, namely smart meters and smart medical devices, and measure their efficacy. We find that the ARTINALI-based IDSes significantly reduce the ratio of false positives and false negatives by 16 to 48% (average 30.75%) and 89 to 95% (average 93.4%) respectively over other dynamic invariant detection tools. @InProceedings{ESEC/FSE17p349, author = {Maryam Raiyat Aliabadi and Amita Ajith Kamath and Julien Gascon-Samson and Karthik Pattabiraman}, title = {ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {349--361}, doi = {}, year = {2017}, } |
|
Gazzillo, Paul |
ESEC/FSE '17: "Kmax: Finding All Configurations ..."
Kmax: Finding All Configurations of Kbuild Makefiles Statically
Paul Gazzillo (Yale University, USA) Feature-oriented software design is a useful paradigm for building and reasoning about highly-configurable software. By making variability explicit, feature-oriented tools and languages make program analysis tasks easier, such as bug-finding, maintenance, and more. But critical software, such as Linux, coreboot, and BusyBox rely instead on brittle tools, such as Makefiles, to encode variability, impeding variability-aware tool development. Summarizing Makefile behavior for all configurations is difficult, because Makefiles have unusual semantics, and exhaustive enumeration of all configurations is intractable in practice. Existing approaches use ad-hoc heuristics, missing much of the encoded variability in Makefiles. We present Kmax, a new static analysis algorithm and tool for Kbuild Makefiles. It is a family-based variability analysis algorithm, where paths are Boolean expressions of configuration options, called reaching configurations, and its abstract state enumerates string values for all configurations. Kmax localizes configuration explosion to the statement level, making precise analysis tractable. The implementation analyzes Makefiles from the Kbuild build system used by several low-level systems projects. Evaluation of Kmax on the Linux and BusyBox build systems shows it to be accurate, precise, and fast. It is the first tool to collect all source files and their configurations from Linux. Compared to previous approaches, Kmax is far more accurate and precise, performs with little overhead, and scales better. @InProceedings{ESEC/FSE17p279, author = {Paul Gazzillo}, title = {Kmax: Finding All Configurations of Kbuild Makefiles Statically}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {279--290}, doi = {}, year = {2017}, } Info |
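A greatly simplified sketch of the "reaching configuration" idea on a toy Kbuild fragment (illustrative only; Kmax handles full Makefile semantics, nested conditionals, and Boolean combinations of options):

    import re

    kbuild = """
    obj-$(CONFIG_USB) += usb.o
    obj-$(CONFIG_NET) += net.o
    obj-y += core.o
    """

    # Map each object file to the Boolean condition under which it is compiled.
    reaching = {}
    for line in kbuild.strip().splitlines():
        line = line.strip()
        m = re.match(r"obj-\$\((\w+)\)\s*\+=\s*(\S+)", line)
        if m:
            reaching[m.group(2)] = m.group(1)   # built only when the option is set
            continue
        m = re.match(r"obj-y\s*\+=\s*(\S+)", line)
        if m:
            reaching[m.group(1)] = "True"       # built in every configuration

    for obj, cond in reaching.items():
        print(obj, "compiled when", cond)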
|
Gheyi, Rohit |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggests that refactorings more frequently introduce long-living smells than eliminate existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
Ghezzi, Carlo |
ESEC/FSE '17: "Modeling and Verification ..."
Modeling and Verification of Evolving Cyber-Physical Spaces
Christos Tsigkanos, Timo Kehrer, and Carlo Ghezzi (Politecnico di Milano, Italy) We increasingly live in cyber-physical spaces -- spaces that are both physical and digital, and where the two aspects are intertwined. Such spaces are highly dynamic and typically undergo continuous change. Software engineering can have a profound impact in this domain, by defining suitable modeling and specification notations as well as supporting design-time formal verification. In this paper, we present a methodology and a technical framework which support modeling of evolving cyber-physical spaces and reasoning about their spatio-temporal properties. We utilize a discrete, graph-based formalism for modeling cyber-physical spaces as well as primitives of change, giving rise to a reactive system consisting of rewriting rules with both local and global application conditions. Formal reasoning facilities are implemented adopting logic-based specification of properties and according model checking procedures, in both spatial and temporal fragments. We evaluate our approach using a case study of a disaster scenario in a smart city. @InProceedings{ESEC/FSE17p38, author = {Christos Tsigkanos and Timo Kehrer and Carlo Ghezzi}, title = {Modeling and Verification of Evolving Cyber-Physical Spaces}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {38--48}, doi = {}, year = {2017}, } |
|
Ghorbani, Negar |
ESEC/FSE '17: "Automatic Generation of Inter-Component ..."
Automatic Generation of Inter-Component Communication Exploits for Android Applications
Joshua Garcia, Mahmoud Hammad, Negar Ghorbani, and Sam Malek (University of California at Irvine, USA) Although a wide variety of approaches identify vulnerabilities in Android apps, none attempt to determine exploitability of those vulnerabilities. Exploitability can aid in reducing false positives of vulnerability analysis, and can help engineers triage bugs. Specifically, one of the main attack vectors of Android apps is their inter-component communication interface, where apps may receive messages called Intents. In this paper, we provide the first approach for automatically generating exploits for Android apps, called LetterBomb, relying on a combined path-sensitive symbolic execution-based static analysis, and the use of software instrumentation and test oracles. We run LetterBomb on 10,000 Android apps from Google Play, where we identify 181 exploits from 835 vulnerable apps. Compared to a state-of-the-art detection approach for three ICC-based vulnerabilities, LetterBomb obtains 33%-60% more vulnerabilities at a 6.66 to 7 times faster speed. @InProceedings{ESEC/FSE17p661, author = {Joshua Garcia and Mahmoud Hammad and Negar Ghorbani and Sam Malek}, title = {Automatic Generation of Inter-Component Communication Exploits for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--671}, doi = {}, year = {2017}, } Info |
|
Glanz, Leonid |
ESEC/FSE '17: "CodeMatch: Obfuscation Won't ..."
CodeMatch: Obfuscation Won't Conceal Your Repackaged App
Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, and Mira Mezini (TU Darmstadt, Germany) An established way to steal the income of app developers, or to trick users into installing malware, is the creation of repackaged apps. These are clones of – typically – successful apps. To conceal their nature, they are often obfuscated by their creators. But, given that it is a common best practice to obfuscate apps, a trivial identification of repackaged apps is not possible. The problem is further intensified by the prevalent usage of libraries. In many apps, the size of the overall code base is basically determined by the used libraries. Therefore, two apps, where the obfuscated code bases are very similar, do not have to be repackages of each other. To reliably detect repackaged apps, we propose a two-step approach which first focuses on the identification and removal of the library code in obfuscated apps. This approach – LibDetect – relies on code representations which abstract over several parts of the underlying bytecode to be resilient against certain obfuscation techniques. Using this approach, we are able to identify on average 70% more used libraries per app than previous approaches. After the removal of an app’s library code, we then fuzzy hash the most abstract representation of the remaining app code to ensure that we can identify repackaged apps even if very advanced obfuscation techniques are used. This makes it possible to identify repackaged apps. Using our approach, we found that ≈ 15% of all apps in Android app stores are repackages. @InProceedings{ESEC/FSE17p638, author = {Leonid Glanz and Sven Amann and Michael Eichberg and Michael Reif and Ben Hermann and Johannes Lerch and Mira Mezini}, title = {CodeMatch: Obfuscation Won't Conceal Your Repackaged App}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {638--648}, doi = {}, year = {2017}, } Info |
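The entry above describes abstracting an app's remaining code and fuzzy-hashing its most abstract representation to match repackaged apps. As a rough illustration of the underlying idea only (not the authors' LibDetect/CodeMatch pipeline), the sketch below compares two apps by Jaccard similarity over n-grams of coarsely abstracted opcodes; the abstraction map, n-gram size, and the 0.8 threshold are all assumptions made for the example.

    # Minimal sketch: similarity of two apps after abstracting their code.
    # NOT the CodeMatch implementation; the abstraction, n-gram size and the
    # threshold below are illustrative assumptions.

    def abstract_opcodes(method_instructions):
        """Map concrete instruction mnemonics to coarse categories (hypothetical)."""
        categories = {"invoke": "CALL", "const": "CONST", "if": "BRANCH", "return": "RET"}
        out = []
        for insn in method_instructions:
            for prefix, cat in categories.items():
                if insn.startswith(prefix):
                    out.append(cat)
                    break
            else:
                out.append("OTHER")
        return out

    def ngrams(seq, n=4):
        return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

    def similarity(app_a_methods, app_b_methods, n=4):
        """Jaccard similarity over opcode n-grams as a stand-in for fuzzy hashing."""
        grams_a = set().union(*(ngrams(abstract_opcodes(m), n) for m in app_a_methods))
        grams_b = set().union(*(ngrams(abstract_opcodes(m), n) for m in app_b_methods))
        if not grams_a or not grams_b:
            return 0.0
        return len(grams_a & grams_b) / len(grams_a | grams_b)

    def likely_repackage(app_a_methods, app_b_methods, threshold=0.8):
        return similarity(app_a_methods, app_b_methods) >= threshold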
|
Gligoric, Milos |
ESEC/FSE '17-IND: "File-Level vs. Module-Level ..."
File-Level vs. Module-Level Regression Test Selection for .NET
Marko Vasic, Zuhair Parvez, Aleksandar Milicevic, and Milos Gligoric (University of Texas at Austin, USA; Microsoft, USA) Regression testing is used to check the correctness of evolving software. With the adoption of Agile development methodology, the number of tests and software revisions has dramatically increased, and hence has the cost of regression testing. Researchers proposed regression test selection (RTS) techniques that optimize regression testing by skipping tests that are not impacted by recent program changes. Ekstazi is one such state-of-the art technique; Ekstazi is implemented for the Java programming language and has been adopted by several companies and open-source projects. We report on our experience implementing and evaluating Ekstazi#, an Ekstazi-like tool for .NET. We describe the key challenges of bringing the Ekstazi idea to the .NET platform. We evaluate Ekstazi# on 11 open-source projects, as well as an internal Microsoft project substantially larger than each of the open-source projects. Finally, we compare Ekstazi# to an incremental build system (also developed at Microsoft), which, out of the box, provides module-level dependency tracking and skipping tasks (including test execution) whenever dependencies of a task do not change between the current and the last successful build. Ekstazi# on average reduced regression testing time by 43.70% for the open-source projects and by 65.26% for the Microsoft project (the latter is in addition to the savings provided by incremental builds). @InProceedings{ESEC/FSE17p848, author = {Marko Vasic and Zuhair Parvez and Aleksandar Milicevic and Milos Gligoric}, title = {File-Level vs. Module-Level Regression Test Selection for .NET}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {848--853}, doi = {}, year = {2017}, } ESEC/FSE '17: "Regression Test Selection ..." Regression Test Selection Across JVM Boundaries Ahmet Celik, Marko Vasic, Aleksandar Milicevic, and Milos Gligoric (University of Texas at Austin, USA; Microsoft, USA) Modern software development processes recommend that changes be integrated into the main development line of a project multiple times a day. Before a new revision may be integrated, developers practice regression testing to ensure that the latest changes do not break any previously established functionality. The cost of regression testing is high, due to an increase in the number of revisions that are introduced per day, as well as the number of tests developers write per revision. Regression test selection (RTS) optimizes regression testing by skipping tests that are not affected by recent project changes. Existing dynamic RTS techniques support only projects written in a single programming language, which is unfortunate knowing that an open-source project is on average written in several programming languages. We present the first dynamic RTS technique that does not stop at predefined language boundaries. Our technique dynamically detects, at the operating system level, all file artifacts a test depends on. Our technique is, hence, oblivious to the specific means the test uses to actually access the files: be it through spawning a new process, invoking a system call, invoking a library written in a different language, invoking a library that spawns a process which makes a system call, etc. We also provide a set of extension points which allow for a smooth integration with testing frameworks and build systems. 
We implemented our technique in a tool called RTSLinux as a loadable Linux kernel module and evaluated it on 21 Java projects that escape JVM by spawning new processes or invoking native code, totaling 2,050,791 lines of code. Our results show that RTSLinux, on average, skips 74.17% of tests and saves 52.83% of test execution time compared to executing all tests. @InProceedings{ESEC/FSE17p809, author = {Ahmet Celik and Marko Vasic and Aleksandar Milicevic and Milos Gligoric}, title = {Regression Test Selection Across JVM Boundaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {809--820}, doi = {}, year = {2017}, } |
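Both Gligoric entries above rest on the same selection idea: a test is re-run only if some file artifact it depended on in its last run has changed. A minimal, hypothetical sketch of that check follows; the JSON dependency store, file names, and function names are assumptions for illustration, not the Ekstazi#/RTSLinux implementation.

    # Minimal sketch of file-level regression test selection: re-run a test only
    # if a checksum of one of its recorded dependencies differs from the last run.
    import hashlib, json, os

    def checksum(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def load_dependencies(store="rts-deps.json"):
        """Assumed layout: test name -> {file path: checksum at last run}."""
        if not os.path.exists(store):
            return {}
        with open(store) as f:
            return json.load(f)

    def select_tests(all_tests, store="rts-deps.json"):
        deps = load_dependencies(store)
        selected = []
        for test in all_tests:
            recorded = deps.get(test)
            if recorded is None:                      # never run before: select it
                selected.append(test)
                continue
            for path, old_sum in recorded.items():
                if not os.path.exists(path) or checksum(path) != old_sum:
                    selected.append(test)             # a dependency changed
                    break
        return selected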
|
Goknil, Arda |
ESEC/FSE '17-DEMO: "A Tool for Automated Reasoning ..."
A Tool for Automated Reasoning about Traces Based on Configurable Formal Semantics
Ferhat Erata, Arda Goknil, Bedir Tekinerdogan, and Geylani Kardas (Wageningen University, Netherlands; University of Luxembourg, Luxembourg; Ege University, Turkey) We present Tarski, a tool for specifying configurable trace semantics to facilitate automated reasoning about traces. Software development projects require that various types of traces be modeled between and within development artifacts. For any given artifact (e.g., requirements, architecture models and source code), Tarski allows the user to specify new trace types and their configurable semantics, while, using the semantics, it automatically infers new traces based on existing traces provided by the user, and checks the consistency of traces. It has been evaluated on three industrial case studies in the automotive domain (https://modelwriter.github.io/Tarski/). @InProceedings{ESEC/FSE17p959, author = {Ferhat Erata and Arda Goknil and Bedir Tekinerdogan and Geylani Kardas}, title = {A Tool for Automated Reasoning about Traces Based on Configurable Formal Semantics}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {959--963}, doi = {}, year = {2017}, } Video Info |
|
Gold, Nicolas E. |
ESEC/FSE '17: "Generalized Observational ..."
Generalized Observational Slicing for Tree-Represented Modelling Languages
Nicolas E. Gold, David Binkley, Mark Harman, Syed Islam, Jens Krinke, and Shin Yoo (University College London, UK; Loyola University Maryland, USA; University of East London, UK; KAIST, South Korea) Model-driven software engineering raises the abstraction level making complex systems easier to understand than if written in textual code. Nevertheless, large complicated software systems can have large models, motivating the need for slicing techniques that reduce the size of a model. We present a generalization of observation-based slicing that allows the criterion to be defined using a variety of kinds of observable behavior and does not require any complex dependence analysis. We apply our implementation of generalized observational slicing for tree-structured representations to Simulink models. The resulting slice might be the subset of the original model responsible for an observed failure or simply the sub-model semantically related to a classic slicing criterion. Unlike its predecessors, the algorithm is also capable of slicing embedded Stateflow state machines. A study of nine real-world models drawn from four different application domains demonstrates the effectiveness of our approach at dramatically reducing Simulink model sizes for realistic observation scenarios: for 9 out of 20 cases, the resulting model has fewer than 25% of the original model's elements. @InProceedings{ESEC/FSE17p547, author = {Nicolas E. Gold and David Binkley and Mark Harman and Syed Islam and Jens Krinke and Shin Yoo}, title = {Generalized Observational Slicing for Tree-Represented Modelling Languages}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {547--558}, doi = {}, year = {2017}, } |
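Observation-based slicing, as summarized above, needs no dependence analysis: it repeatedly deletes parts of the tree-represented model and keeps a deletion whenever the behavior observed at the slicing criterion is unchanged. A small, hedged sketch of that loop follows; the tree interface, delete_subtree, and run_and_observe are placeholders for whatever the host tool (e.g., a Simulink harness) would provide, not the authors' algorithm verbatim.

    # Minimal sketch of generalized observational slicing over a tree of model
    # elements. The model/execution interface is hypothetical; only the
    # delete-and-observe loop reflects the idea described in the abstract.

    def observational_slice(tree, criterion, run_and_observe, delete_subtree):
        """Greedily remove subtrees whose deletion leaves the observed behavior unchanged."""
        baseline = run_and_observe(tree, criterion)
        changed = True
        while changed:
            changed = False
            for node in list(tree.nodes()):           # candidate subtrees to delete
                candidate = delete_subtree(tree, node)
                if candidate is None:                 # deletion not syntactically valid
                    continue
                if run_and_observe(candidate, criterion) == baseline:
                    tree = candidate                  # keep the smaller model
                    changed = True
                    break                             # restart the scan on the reduced model
        return tree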
|
Gopstein, Dan |
ESEC/FSE '17: "Understanding Misunderstandings ..."
Understanding Misunderstandings in Source Code
Dan Gopstein, Jake Iannacone, Yu Yan, Lois DeLong, Yanyan Zhuang, Martin K.-C. Yeh, and Justin Cappos (New York University, USA; Pennsylvania State University, USA; University of Colorado at Colorado Springs, USA) Humans often mistake the meaning of source code, and so misjudge a program's true behavior. These mistakes can be caused by extremely small, isolated patterns in code, which can lead to significant runtime errors. These patterns are used in large, popular software projects and even recommended in style guides. To identify code patterns that may confuse programmers we extracted a preliminary set of `atoms of confusion' from known confusing code. We show empirically in an experiment with 73 participants that these code patterns can lead to a significantly increased rate of misunderstanding versus equivalent code without the patterns. We then go on to take larger confusing programs and measure (in an experiment with 43 participants) the impact, in terms of programmer confusion, of removing these confusing patterns. All of our instruments, analysis code, and data are publicly available online for replication, experimentation, and feedback. @InProceedings{ESEC/FSE17p129, author = {Dan Gopstein and Jake Iannacone and Yu Yan and Lois DeLong and Yanyan Zhuang and Martin K.-C. Yeh and Justin Cappos}, title = {Understanding Misunderstandings in Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {129--139}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
Gorlick, Michael M. |
ESEC/FSE '17-INV: "Reflections on the REST Architectural ..."
Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)
Roy T. Fielding, Richard N. Taylor, Justin R. Erenkrantz, Michael M. Gorlick, Jim Whitehead, Rohit Khare, and Peyman Oreizy (Adobe, USA; University of California at Irvine, USA; Bloomberg, USA; University of California at Santa Cruz, USA; Google, USA; Dynamic Variable, USA) Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work. @InProceedings{ESEC/FSE17p4, author = {Roy T. Fielding and Richard N. Taylor and Justin R. Erenkrantz and Michael M. Gorlick and Jim Whitehead and Rohit Khare and Peyman Oreizy}, title = {Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {4--14}, doi = {}, year = {2017}, } |
|
Gousios, Georgios |
ESEC/FSE '17-IND: "Strong Agile Metrics: Mining ..."
Strong Agile Metrics: Mining Log Data to Determine Predictive Power of Software Metrics for Continuous Delivery Teams
Hennie Huijgens, Robert Lamping, Dick Stevens, Hartger Rothengatter, Georgios Gousios, and Daniele Romano (Delft University of Technology, Netherlands; ING Bank, Netherlands; CGI, Netherlands) ING Bank, a large Netherlands-based internationally operating bank, implemented a fully automated continuous delivery pipeline for its software engineering activities in more than 300 teams that perform more than 2500 deployments to production each month on more than 750 different applications. Our objective is to examine how strong metrics for agile (Scrum) DevOps teams can be set in an iterative fashion. We perform an exploratory case study that focuses on the classification of software metrics based on their predictive power, in which we analyze log data derived from two initial sources within this pipeline. We analyzed a subset of 16 metrics from 59 squads. We identified two lagging metrics and assessed four leading metrics to be strong. @InProceedings{ESEC/FSE17p866, author = {Hennie Huijgens and Robert Lamping and Dick Stevens and Hartger Rothengatter and Georgios Gousios and Daniele Romano}, title = {Strong Agile Metrics: Mining Log Data to Determine Predictive Power of Software Metrics for Continuous Delivery Teams}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--871}, doi = {}, year = {2017}, } |
|
Greenyer, Joel |
ESEC/FSE '17-DEMO: "From Scenario Modeling to ..."
From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology
Joel Greenyer, Daniel Gritzner, Florian König, Jannik Dahlke, Jianwei Shi, and Eric Wete (Leibniz Universität Hannover, Germany) Software-intensive systems often consist of cooperating reactive components. In mobile and reconfigurable systems, their topology changes at run-time, which influences how the components must cooperate. The Scenario Modeling Language (SML) offers a formal approach for specifying the reactive behavior of such systems that aligns with how humans conceive and communicate behavioral requirements. Simulation and formal checks can find specification flaws early. We present a framework for Scenario-based Programming (SBP) that reflects the concepts of SML in Java and makes the scenario modeling approach available for programming. SBP code can also be generated from SML and extended with platform-specific code, thus streamlining the transition from design to implementation. A car-to-x communication system serves as an example. Demo video and artifact: http://scenariotools.org/esecfse-2017-tool-demo/ @InProceedings{ESEC/FSE17p974, author = {Joel Greenyer and Daniel Gritzner and Florian König and Jannik Dahlke and Jianwei Shi and Eric Wete}, title = {From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {974--978}, doi = {}, year = {2017}, } Video Info |
|
Gritzner, Daniel |
ESEC/FSE '17-DEMO: "From Scenario Modeling to ..."
From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology
Joel Greenyer, Daniel Gritzner, Florian König, Jannik Dahlke, Jianwei Shi, and Eric Wete (Leibniz Universität Hannover, Germany) Software-intensive systems often consist of cooperating reactive components. In mobile and reconfigurable systems, their topology changes at run-time, which influences how the components must cooperate. The Scenario Modeling Language (SML) offers a formal approach for specifying the reactive behavior of such systems that aligns with how humans conceive and communicate behavioral requirements. Simulation and formal checks can find specification flaws early. We present a framework for Scenario-based Programming (SBP) that reflects the concepts of SML in Java and makes the scenario modeling approach available for programming. SBP code can also be generated from SML and extended with platform-specific code, thus streamlining the transition from design to implementation. A car-to-x communication system serves as an example. Demo video and artifact: http://scenariotools.org/esecfse-2017-tool-demo/ @InProceedings{ESEC/FSE17p974, author = {Joel Greenyer and Daniel Gritzner and Florian König and Jannik Dahlke and Jianwei Shi and Eric Wete}, title = {From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {974--978}, doi = {}, year = {2017}, } Video Info |
|
Guo, Shengjian |
ESEC/FSE '17: "Symbolic Execution of Programmable ..."
Symbolic Execution of Programmable Logic Controller Code
Shengjian Guo, Meng Wu, and Chao Wang (Virginia Tech, USA; University of Southern California, USA) Programmable logic controllers (PLCs) are specialized computers for automating a wide range of cyber-physical systems. Since these systems are often safety-critical, software running on PLCs needs to be free of programming errors. However, automated tools for testing PLC software are lacking despite the pervasive use of PLCs in industry. We propose a symbolic execution based method, named SymPLC, for automatically testing PLC software written in programming languages specified in the IEC 61131-3 standard. SymPLC takes the PLC source code as input and translates it into C before applying symbolic execution, to systematically generate test inputs that cover both paths in each periodic task and interleavings of these tasks. Toward this end, we propose a number of PLC-specific reduction techniques for identifying and eliminating redundant interleavings. We have evaluated SymPLC on a large set of benchmark programs with both single and multiple tasks. Our experiments show that SymPLC can handle these programs efficiently, and for multi-task PLC programs, our new reduction techniques outperform the state-of-the-art partial order reduction technique by more than two orders of magnitude. @InProceedings{ESEC/FSE17p326, author = {Shengjian Guo and Meng Wu and Chao Wang}, title = {Symbolic Execution of Programmable Logic Controller Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {326--336}, doi = {}, year = {2017}, } |
|
Guo, Yu |
ESEC/FSE '17: "AtexRace: Across Thread and ..."
AtexRace: Across Thread and Execution Sampling for In-House Race Detection
Yu Guo, Yan Cai, and Zijiang Yang (Western Michigan University, USA; Institute of Software at Chinese Academy of Sciences, China) Data race is a major source of concurrency bugs. Dynamic data race detection tools (e.g., FastTrack) monitor the executions of a program to report data races occurring at runtime. However, such tools incur significant overhead that slows down and perturbs executions. To address the issue, the state-of-the-art dynamic data race detection tools (e.g., LiteRace) apply sampling techniques to selectively monitor memory accesses. Although they reduce overhead, they also miss many data races as confirmed by existing studies. Thus, practitioners face a dilemma on whether to use FastTrack, which detects more data races but is much slower, or LiteRace, which is faster but detects fewer data races. In this paper, we propose a new sampling approach to address the major limitations of current sampling techniques, which ignore the facts that a data race involves two threads and a program under testing is repeatedly executed. We develop a tool called AtexRace to sample memory accesses across both threads and executions. By selectively monitoring the pairs of memory accesses that have not been frequently observed in current and previous executions, AtexRace detects as many data races as FastTrack at a cost as low as LiteRace. We have compared AtexRace against FastTrack and LiteRace on both the Parsec benchmark suite and a large-scale real-world MySQL Server with 223 test cases. The experiments confirm that AtexRace can be a replacement for FastTrack and LiteRace. @InProceedings{ESEC/FSE17p315, author = {Yu Guo and Yan Cai and Zijiang Yang}, title = {AtexRace: Across Thread and Execution Sampling for In-House Race Detection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {315--325}, doi = {}, year = {2017}, } |
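AtexRace's central idea, per the abstract, is to pay the monitoring cost only for pairs of memory accesses that have not been observed often across threads and previous executions. The sketch below shows one hypothetical way such a sampling decision could be persisted and queried; the file format, threshold, and function names are assumptions for illustration, not the tool's actual design.

    # Minimal sketch of across-thread, across-execution sampling for race
    # detection: monitor an access-site pair only while it is still "rare".
    import json, os
    from collections import defaultdict

    HISTORY_FILE = "access_pair_counts.json"    # assumed persistence across executions
    RARE_THRESHOLD = 5                          # assumed cutoff

    def load_history():
        if os.path.exists(HISTORY_FILE):
            with open(HISTORY_FILE) as f:
                return defaultdict(int, json.load(f))
        return defaultdict(int)

    counts = load_history()

    def should_monitor(site_a, site_b):
        """Decide whether to pay the detection cost for this pair of access sites."""
        key = "|".join(sorted((site_a, site_b)))
        counts[key] += 1
        return counts[key] <= RARE_THRESHOLD    # frequently seen pairs stop being sampled

    def save_history():
        with open(HISTORY_FILE, "w") as f:
            json.dump(dict(counts), f)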
|
Gutiérrez Brida, Simón |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired in dynamic logic, for the description of dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models resorts to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries and further details available in: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Ham, Jeroen van der |
ESEC/FSE '17-IND: "Model-Driven Software Engineering ..."
Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic
Roel van Dijk, Christophe Creeten, Jeroen van der Ham, and Jeroen van den Bos (Netherlands Forensic Institute, Netherlands; Delft University of Technology, Netherlands; National Cyber Security Centre, Netherlands; Zuyd University of Applied Sciences, Netherlands) Network traffic data contains a wealth of information for use in security analysis and application development. Unfortunately, it also usually contains confidential or otherwise sensitive information, prohibiting sharing and analysis. Existing automated anonymization solutions are hard to maintain and tend to be outdated. We present Privacy-Enhanced Filtering (PEF), a model-driven prototype framework that relies on declarative descriptions of protocols and a set of filter rules, which are used to automatically transform network traffic data to remove sensitive information. This paper discusses the design, implementation and application of PEF, which is available as open-source software and configured for use in a typical malware detection scenario. @InProceedings{ESEC/FSE17p860, author = {Roel van Dijk and Christophe Creeten and Jeroen van der Ham and Jeroen van den Bos}, title = {Model-Driven Software Engineering in Practice: Privacy-Enhanced Filtering of Network Traffic}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {860--865}, doi = {}, year = {2017}, } |
|
Hammad, Mahmoud |
ESEC/FSE '17: "Automatic Generation of Inter-Component ..."
Automatic Generation of Inter-Component Communication Exploits for Android Applications
Joshua Garcia, Mahmoud Hammad, Negar Ghorbani, and Sam Malek (University of California at Irvine, USA) Although a wide variety of approaches identify vulnerabilities in Android apps, none attempt to determine exploitability of those vulnerabilities. Exploitability can aid in reducing false positives of vulnerability analysis, and can help engineers triage bugs. Specifically, one of the main attack vectors of Android apps is their inter-component communication interface, where apps may receive messages called Intents. In this paper, we provide the first approach for automatically generating exploits for Android apps, called LetterBomb, relying on a combined path-sensitive symbolic execution-based static analysis, and the use of software instrumentation and test oracles. We run LetterBomb on 10,000 Android apps from Google Play, where we identify 181 exploits from 835 vulnerable apps. Compared to a state-of-the-art detection approach for three ICC-based vulnerabilities, LetterBomb obtains 33%-60% more vulnerabilities at a 6.66 to 7 times faster speed. @InProceedings{ESEC/FSE17p661, author = {Joshua Garcia and Mahmoud Hammad and Negar Ghorbani and Sam Malek}, title = {Automatic Generation of Inter-Component Communication Exploits for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--671}, doi = {}, year = {2017}, } Info |
|
Harman, Mark |
ESEC/FSE '17: "Generalized Observational ..."
Generalized Observational Slicing for Tree-Represented Modelling Languages
Nicolas E. Gold, David Binkley, Mark Harman, Syed Islam, Jens Krinke, and Shin Yoo (University College London, UK; Loyola University Maryland, USA; University of East London, UK; KAIST, South Korea) Model-driven software engineering raises the abstraction level making complex systems easier to understand than if written in textual code. Nevertheless, large complicated software systems can have large models, motivating the need for slicing techniques that reduce the size of a model. We present a generalization of observation-based slicing that allows the criterion to be defined using a variety of kinds of observable behavior and does not require any complex dependence analysis. We apply our implementation of generalized observational slicing for tree-structured representations to Simulink models. The resulting slice might be the subset of the original model responsible for an observed failure or simply the sub-model semantically related to a classic slicing criterion. Unlike its predecessors, the algorithm is also capable of slicing embedded Stateflow state machines. A study of nine real-world models drawn from four different application domains demonstrates the effectiveness of our approach at dramatically reducing Simulink model sizes for realistic observation scenarios: for 9 out of 20 cases, the resulting model has fewer than 25% of the original model's elements. @InProceedings{ESEC/FSE17p547, author = {Nicolas E. Gold and David Binkley and Mark Harman and Syed Islam and Jens Krinke and Shin Yoo}, title = {Generalized Observational Slicing for Tree-Represented Modelling Languages}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {547--558}, doi = {}, year = {2017}, } |
|
Harms, Holger |
ESEC/FSE '17-IND: "Guidelines for Adopting Frontend ..."
Guidelines for Adopting Frontend Architectures and Patterns in Microservices-Based Systems
Holger Harms, Collin Rogowski, and Luigi Lo Iacono (Funke Digital, Germany; inovex, Germany; Cologne University of Applied Sciences, Germany) Microservice-based systems enable the independent development, deployment, and scalability for separate system components of enterprise applications. A significant aspect during development is the microservice integration in frontends of web, mobile, and desktop applications. One challenge here is the selection of an adequate frontend architecture as well as suitable patterns that satisfy the application requirements. This paper analyses available strategies for organizing and implementing microservices frontends. These approaches are then evaluated based on a quality model and various prototypes of the same application implemented using the distinct approaches. The results of this analysis are generalized to a guideline that supports the selection of a suitable architecture. @InProceedings{ESEC/FSE17p902, author = {Holger Harms and Collin Rogowski and Luigi Lo Iacono}, title = {Guidelines for Adopting Frontend Architectures and Patterns in Microservices-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {902--907}, doi = {}, year = {2017}, } |
|
Heizmann, Matthias |
ESEC/FSE '17: "Craig vs. Newton in Software ..."
Craig vs. Newton in Software Model Checking
Daniel Dietsch, Matthias Heizmann, Betim Musa, Alexander Nutz, and Andreas Podelski (University of Freiburg, Germany) Ever since the seminal work on SLAM and BLAST, software model checking with counterexample-guided abstraction refinement (CEGAR) has been an active topic of research. The crucial procedure here is to analyze a sequence of program statements (the counterexample) to find building blocks for the overall proof of the program. We can distinguish two approaches (which we name Craig and Newton) to implement the procedure. The historically first approach, Newton (named after the tool from the SLAM toolkit), is based on symbolic execution. The second approach, Craig, is based on Craig interpolation. It was widely believed that Craig is substantially more effective than Newton. In fact, 12 out of the 15 CEGAR-based tools in SV-COMP are based on Craig. Advances in software model checkers based on Craig, however, can go only lockstep with advances in SMT solvers with Craig interpolation. It may be time to revisit Newton and ask whether Newton can be as effective as Craig. We have implemented a total of 11 variants of Craig and Newton in two different state-of-the-art software model checking tools and present the outcome of our experimental comparison. @InProceedings{ESEC/FSE17p487, author = {Daniel Dietsch and Matthias Heizmann and Betim Musa and Alexander Nutz and Andreas Podelski}, title = {Craig vs. Newton in Software Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {487--497}, doi = {}, year = {2017}, } |
|
Hellendoorn, Vincent J. |
ESEC/FSE '17: "Are Deep Neural Networks the ..."
Are Deep Neural Networks the Best Choice for Modeling Source Code?
Vincent J. Hellendoorn and Premkumar Devanbu (University of California at Davis, USA) Current statistical language modeling techniques, including deep-learning based models, have proven to be quite effective for source code. We argue here that the special properties of source code can be exploited for further improvements. In this work, we enhance established language modeling approaches to handle the special challenges of modeling source code, such as: frequent changes, larger, changing vocabularies, deeply nested scopes, etc. We present a fast, nested language modeling toolkit specifically designed for software, with the ability to add & remove text, and mix & swap out many models. Specifically, we improve upon prior cache-modeling work and present a model with a much more expansive, multi-level notion of locality that we show to be well-suited for modeling software. We present results on varying corpora in comparison with traditional N-gram, as well as RNN, and LSTM deep-learning language models, and release all our source code for public use. Our evaluations suggest that carefully adapting N-gram models for source code can yield performance that surpasses even RNN and LSTM based deep-learning models. @InProceedings{ESEC/FSE17p763, author = {Vincent J. Hellendoorn and Premkumar Devanbu}, title = {Are Deep Neural Networks the Best Choice for Modeling Source Code?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {763--773}, doi = {}, year = {2017}, } Info |
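The abstract above argues that an N-gram model augmented with a cache capturing local, file-level regularities can rival neural models on source code. The snippet below sketches only the general idea of mixing a global N-gram estimate with a local cache estimate; the bigram order, smoothing, and the 0.7/0.3 mixture weight are arbitrary assumptions, and this is not the authors' toolkit.

    # Minimal sketch of a cache-augmented bigram model for code tokens.
    # Counts, add-one smoothing and the mixture weight are illustrative assumptions.
    from collections import defaultdict, deque

    class CachedBigramModel:
        def __init__(self, cache_size=500, mix=0.7):
            self.global_counts = defaultdict(lambda: defaultdict(int))
            self.cache = deque(maxlen=cache_size)   # recent tokens from the current file
            self.mix = mix

        def train(self, tokens):
            for prev, cur in zip(tokens, tokens[1:]):
                self.global_counts[prev][cur] += 1

        def _prob(self, counts, prev, cur, vocab=10000):
            total = sum(counts.get(prev, {}).values())
            return (counts.get(prev, {}).get(cur, 0) + 1) / (total + vocab)

        def prob(self, prev, cur):
            cache_counts = defaultdict(lambda: defaultdict(int))
            recent = list(self.cache)
            for a, b in zip(recent, recent[1:]):
                cache_counts[a][b] += 1
            p_global = self._prob(self.global_counts, prev, cur)
            p_cache = self._prob(cache_counts, prev, cur)
            return self.mix * p_cache + (1 - self.mix) * p_global

        def observe(self, token):
            self.cache.append(token)                # locality: update the cache as we read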
|
Heo, Min-Jae |
ESEC/FSE '17-IND: "Applying Deep Learning Based ..."
Applying Deep Learning Based Automatic Bug Triager to Industrial Projects
Sun-Ro Lee, Min-Jae Heo, Chan-Gun Lee, Milhan Kim, and Gaeul Jeong (Chung-Ang University, South Korea; LG Electronics, South Korea) Finding the appropriate developer for a bug report, so called `Bug Triage', is one of the bottlenecks in the bug resolution process. To address this problem, many approaches have proposed various automatic bug triage techniques in recent studies. We argue that most previous studies focused on open source projects only and did not consider deep learning techniques. In this paper, we propose to use Convolutional Neural Network and word embedding to build an automatic bug triager. The results of the experiments applied to both industrial and open source projects reveal benefits of the automatic approach and suggest co-operation of human and automatic triagers. Our experience in integrating and operating the proposed system in an industrial development environment is also reported. @InProceedings{ESEC/FSE17p926, author = {Sun-Ro Lee and Min-Jae Heo and Chan-Gun Lee and Milhan Kim and Gaeul Jeong}, title = {Applying Deep Learning Based Automatic Bug Triager to Industrial Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {926--931}, doi = {}, year = {2017}, } |
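The bug-triage entry above combines word embeddings with a convolutional neural network to predict an assignee from a bug report. A minimal, hypothetical Keras sketch of such a classifier is given below; the vocabulary size, filter counts, and all other hyper-parameters are assumptions for illustration, not the configuration used in the paper.

    # Minimal sketch (assumed hyper-parameters) of a CNN-based bug triager:
    # embed report tokens, convolve, pool, and classify into developer IDs.
    from tensorflow.keras import layers, models

    VOCAB_SIZE = 20000      # assumed vocabulary of report tokens
    NUM_DEVELOPERS = 150    # assumed number of candidate assignees

    model = models.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.Conv1D(128, kernel_size=5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_DEVELOPERS, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Training (omitted) would call model.fit on integer-encoded, padded report
    # token sequences and the corresponding developer indices.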
|
Hermann, Ben |
ESEC/FSE '17: "CodeMatch: Obfuscation Won't ..."
CodeMatch: Obfuscation Won't Conceal Your Repackaged App
Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, and Mira Mezini (TU Darmstadt, Germany) An established way to steal the income of app developers, or to trick users into installing malware, is the creation of repackaged apps. These are clones of – typically – successful apps. To conceal their nature, they are often obfuscated by their creators. But, given that it is a common best practice to obfuscate apps, a trivial identification of repackaged apps is not possible. The problem is further intensified by the prevalent usage of libraries. In many apps, the size of the overall code base is basically determined by the used libraries. Therefore, two apps, where the obfuscated code bases are very similar, do not have to be repackages of each other. To reliably detect repackaged apps, we propose a two-step approach which first focuses on the identification and removal of the library code in obfuscated apps. This approach – LibDetect – relies on code representations which abstract over several parts of the underlying bytecode to be resilient against certain obfuscation techniques. Using this approach, we are able to identify on average 70% more used libraries per app than previous approaches. After the removal of an app’s library code, we then fuzzy hash the most abstract representation of the remaining app code to ensure that we can identify repackaged apps even if very advanced obfuscation techniques are used. This makes it possible to identify repackaged apps. Using our approach, we found that ≈ 15% of all apps in Android app stores are repackages. @InProceedings{ESEC/FSE17p638, author = {Leonid Glanz and Sven Amann and Michael Eichberg and Michael Reif and Ben Hermann and Johannes Lerch and Mira Mezini}, title = {CodeMatch: Obfuscation Won't Conceal Your Repackaged App}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {638--648}, doi = {}, year = {2017}, } Info |
|
Hicks, Michael |
ESEC/FSE '17: "Counterexample-Guided Approach ..."
Counterexample-Guided Approach to Finding Numerical Invariants
ThanhVu Nguyen, Timos Antonopoulos, Andrew Ruef, and Michael Hicks (University of Nebraska-Lincoln, USA; Yale University, USA; University of Maryland, USA) Numerical invariants, e.g., relationships among numerical variables in a program, represent a useful class of properties to analyze programs. General polynomial invariants represent more complex numerical relations, but they are often required in many scientific and engineering applications. We present NumInv, a tool that implements a counterexample-guided invariant generation (CEGIR) technique to automatically discover numerical invariants, which are polynomial equality and inequality relations among numerical variables. This CEGIR technique infers candidate invariants from program traces and then checks them against the program source code using the KLEE test-input generation tool. If the invariants are incorrect, KLEE returns counterexample traces, which help the dynamic inference obtain better results. Existing CEGIR approaches often require sound invariants; NumInv, however, sacrifices soundness and produces results that KLEE cannot refute within certain time bounds. This design and the use of KLEE as a verifier allow NumInv to discover useful and important numerical invariants for many challenging programs. Preliminary results show that NumInv generates required invariants for understanding and verifying correctness of programs involving complex arithmetic. We also show that NumInv discovers polynomial invariants that capture precise complexity bounds of programs used to benchmark existing static complexity analysis techniques. Finally, we show that NumInv performs competitively compared to state-of-the-art numerical invariant analysis tools. @InProceedings{ESEC/FSE17p605, author = {ThanhVu Nguyen and Timos Antonopoulos and Andrew Ruef and Michael Hicks}, title = {Counterexample-Guided Approach to Finding Numerical Invariants}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {605--615}, doi = {}, year = {2017}, } |
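NumInv's counterexample-guided loop, as described above, alternates between inferring candidate relations from traces and asking a checker (KLEE in the paper) to refute them. The sketch below shows only the general shape of that loop with placeholder infer/check functions; it is a schematic of CEGIR in general under assumed interfaces, not the NumInv implementation.

    # Minimal schematic of counterexample-guided invariant generation (CEGIR).
    # infer_candidates and check_with_verifier are placeholders standing in for
    # dynamic inference over traces and a KLEE-like checker, respectively.

    def cegir(program, initial_traces, infer_candidates, check_with_verifier,
              max_rounds=10):
        traces = list(initial_traces)
        invariants = []
        for _ in range(max_rounds):
            candidates = infer_candidates(traces)          # e.g., polynomial eq/ineq templates
            if not candidates:
                break
            surviving, new_traces = [], []
            for inv in candidates:
                counterexample = check_with_verifier(program, inv)
                if counterexample is None:                 # not refuted within the time bound
                    surviving.append(inv)
                else:
                    new_traces.append(counterexample)      # feed refutations back to inference
            invariants = surviving
            if not new_traces:                             # nothing new learned; stop
                break
            traces.extend(new_traces)
        return invariants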
|
Hili, Nicolas |
ESEC/FSE '17: "Model-Level, Platform-Independent ..."
Model-Level, Platform-Independent Debugging in the Context of the Model-Driven Development of Real-Time Systems
Mojtaba Bagherzadeh, Nicolas Hili, and Juergen Dingel (Queen's University, Canada) Providing proper support for debugging models at model-level is one of the main barriers to a broader adoption of Model Driven Development (MDD). In this paper, we focus on the use of MDD for the development of real-time embedded systems (RTE). We introduce a new platform-independent approach to implement model-level debuggers. We describe how to realize support for model-level debugging entirely in terms of the modeling language and show how to implement this support in terms of a model-to-model transformation. Key advantages of the approach over existing work are that (1) it does not require a program debugger for the code generated from the model, and that (2) any changes to, e.g., the code generator, the target language, or the hardware platform leave the debugger completely unaffected. We also describe an implementation of the approach in the context of Papyrus-RT, an open source MDD tool based on the modeling language UML-RT. We summarize the results of the use of our model-based debugger on several use cases to determine its overhead in terms of size and performance. Despite being a prototype, the performance overhead is in the order of microseconds, while the size overhead is comparable with that of GDB, the GNU Debugger. @InProceedings{ESEC/FSE17p419, author = {Mojtaba Bagherzadeh and Nicolas Hili and Juergen Dingel}, title = {Model-Level, Platform-Independent Debugging in the Context of the Model-Driven Development of Real-Time Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {419--430}, doi = {}, year = {2017}, } Video Info Artifacts Functional |
|
Hilton, Michael |
ESEC/FSE '17: "Trade-Offs in Continuous Integration: ..."
Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility
Michael Hilton, Nicholas Nelson, Timothy Tunnell, Darko Marinov, and Danny Dig (Oregon State University, USA; University of Illinois at Urbana-Champaign, USA) Continuous integration (CI) systems automate the compilation, building, and testing of software. Despite CI being a widely used activity in software engineering, we do not know what motivates developers to use CI, and what barriers and unmet needs they face. Without such knowledge, developers make easily avoidable errors, tool builders invest in the wrong direction, and researchers miss opportunities for improving the practice of CI. We present a qualitative study of the barriers and needs developers face when using CI. We conduct semi-structured interviews with developers from different industries and development scales. We triangulate our findings by running two surveys. We find that developers face trade-offs between speed and certainty (Assurance), between better access and information security (Security), and between more configuration options and greater ease of use (Flexibility). We present implications of these trade-offs for developers, tool builders, and researchers. @InProceedings{ESEC/FSE17p197, author = {Michael Hilton and Nicholas Nelson and Timothy Tunnell and Darko Marinov and Danny Dig}, title = {Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {197--207}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
Hoek, André van der |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Support for Iteration on Code Search
Lee Martie, André van der Hoek, and Thomas Kwak (University of California at Irvine, USA) Sometimes, when programmers use a search engine they know more or less what they need. Other times, programmers use the search engine to look around and generate possible ideas for the programming problem they are working on. The key insight we explore in this paper is that the results found in the latter case tend to serve as inspiration or triggers for the next queries issued. We introduce two search engines, CodeExchange and CodeLikeThis, both of which are specifically designed to enable the user to directly leverage the results in formulating the next query. CodeExchange does this with a set of four features supporting the programmer to use characteristics of the results to find other code with or without those characteristics. CodeLikeThis supports simply selecting an entire result to find code that is analogous, to some degree, to that result. We evaluated how these approaches were used along with two approaches not explicitly supporting iteration, a baseline and Google, in a user study among 24 developers. We find that search engines that support using results to form the next query can improve the programmers’ search experience and different approaches to iteration can provide better experiences depending on the task. @InProceedings{ESEC/FSE17p774, author = {Lee Martie and André van der Hoek and Thomas Kwak}, title = {Understanding the Impact of Support for Iteration on Code Search}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {774--785}, doi = {}, year = {2017}, } |
|
Hoffmann, Henry |
ESEC/FSE '17: "Automated Control of Multiple ..."
Automated Control of Multiple Software Goals using Multiple Actuators
Martina Maggio, Alessandro Vittorio Papadopoulos, Antonio Filieri, and Henry Hoffmann (Lund University, Sweden; Mälardalen University, Sweden; Imperial College London, UK; University of Chicago, USA) Modern software should satisfy multiple goals simultaneously: it should provide predictable performance, be robust to failures, handle peak loads and deal seamlessly with unexpected conditions and changes in the execution environment. For this to happen, software designs should account for the possibility of runtime changes and provide formal guarantees of the software's behavior. Control theory is one of the possible design drivers for runtime adaptation, but adopting control theoretic principles often requires additional, specialized knowledge. To overcome this limitation, automated methodologies have been proposed to extract the necessary information from experimental data and design a control system for runtime adaptation. These proposals, however, only process one goal at a time, creating a chain of controllers. In this paper, we propose and evaluate the first automated strategy that takes into account multiple goals without separating them into multiple control strategies. Avoiding the separation allows us to tackle a larger class of problems and provide stronger guarantees. We test our methodology's generality with three case studies that demonstrate its broad applicability in meeting performance, reliability, quality, security, and energy goals despite environmental or requirements changes. @InProceedings{ESEC/FSE17p373, author = {Martina Maggio and Alessandro Vittorio Papadopoulos and Antonio Filieri and Henry Hoffmann}, title = {Automated Control of Multiple Software Goals using Multiple Actuators}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {373--384}, doi = {}, year = {2017}, } Info |
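The control-theoretic approach above adjusts several actuators at once to track several goals at once. As a toy illustration of multi-input multi-output feedback only (not the paper's automatically synthesized controller), the sketch below applies a fixed gain matrix to the vector of goal errors each control period; the goals, gains, and actuator ranges are made-up assumptions.

    # Toy sketch of a multi-goal, multi-actuator proportional controller:
    # u(k+1) = u(k) + K * (goal - measurement). All constants are illustrative.
    import numpy as np

    goals = np.array([30.0, 0.95])          # e.g., target latency (ms) and target reliability
    K = np.array([[-0.02, 0.5],             # assumed gain matrix mapping goal errors to
                  [0.01, -0.3]])            # changes in two actuators (e.g., threads, quality level)
    actuators = np.array([4.0, 0.8])        # current actuator settings

    def control_step(measurement):
        """One control period: move the actuators proportionally to the goal errors."""
        global actuators
        error = goals - np.asarray(measurement, dtype=float)
        actuators = actuators + K @ error
        actuators = np.clip(actuators, [1.0, 0.1], [64.0, 1.0])   # assumed actuator ranges
        return actuators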
|
Hofmeister, Johannes |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,'' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Holmes, Reid |
ESEC/FSE '17: "Measuring the Cost of Regression ..."
Measuring the Cost of Regression Testing in Practice: A Study of Java Projects using Continuous Integration
Adriaan Labuschagne, Laura Inozemtseva, and Reid Holmes (University of Waterloo, Canada; University of British Columbia, Canada) Software defects cost time and money to diagnose and fix. Consequently, developers use a variety of techniques to avoid introducing defects into their systems. However, these techniques have costs of their own; the benefit of using a technique must outweigh the cost of applying it. In this paper we investigate the costs and benefits of automated regression testing in practice. Specifically, we studied 61 projects that use Travis CI, a cloud-based continuous integration tool, in order to examine real test failures that were encountered by the developers of those projects. We determined how the developers resolved the failures they encountered and used this information to classify the failures as being caused by a flaky test, by a bug in the system under test, or by a broken or obsolete test. We consider that test failures caused by bugs represent a benefit of the test suite, while failures caused by broken or obsolete tests represent a test suite maintenance cost. We found that 18% of test suite executions fail and that 13% of these failures are flaky. Of the non-flaky failures, only 74% were caused by a bug in the system under test; the remaining 26% were due to incorrect or obsolete tests. In addition, we found that, in the failed builds, only 0.38% of the test case executions failed and 64% of failed builds contained more than one failed test. Our findings contribute to a wider understanding of the unforeseen costs that can impact the overall cost effectiveness of regression testing in practice. They can also inform research into test case selection techniques, as we have provided an approximate empirical bound on the practical value that could be extracted from such techniques. This value appears to be large, as the 61 systems under study contained nearly 3 million lines of test code and yet over 99% of test case executions could have been eliminated with a perfect oracle. @InProceedings{ESEC/FSE17p821, author = {Adriaan Labuschagne and Laura Inozemtseva and Reid Holmes}, title = {Measuring the Cost of Regression Testing in Practice: A Study of Java Projects using Continuous Integration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {821--830}, doi = {}, year = {2017}, } Info |
|
Huijgens, Hennie |
ESEC/FSE '17-IND: "Strong Agile Metrics: Mining ..."
Strong Agile Metrics: Mining Log Data to Determine Predictive Power of Software Metrics for Continuous Delivery Teams
Hennie Huijgens, Robert Lamping, Dick Stevens, Hartger Rothengatter, Georgios Gousios, and Daniele Romano (Delft University of Technology, Netherlands; ING Bank, Netherlands; CGI, Netherlands) ING Bank, a large Netherlands-based internationally operating bank, implemented a fully automated continuous delivery pipeline for its software engineering activities in more than 300 teams that perform more than 2500 deployments to production each month on more than 750 different applications. Our objective is to examine how strong metrics for agile (Scrum) DevOps teams can be set in an iterative fashion. We perform an exploratory case study that focuses on the classification of software metrics based on their predictive power, in which we analyze log data derived from two initial sources within this pipeline. We analyzed a subset of 16 metrics from 59 squads. We identified two lagging metrics and assessed four leading metrics to be strong. @InProceedings{ESEC/FSE17p866, author = {Hennie Huijgens and Robert Lamping and Dick Stevens and Hartger Rothengatter and Georgios Gousios and Daniele Romano}, title = {Strong Agile Metrics: Mining Log Data to Determine Predictive Power of Software Metrics for Continuous Delivery Teams}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--871}, doi = {}, year = {2017}, } |
|
Iannacone, Jake |
ESEC/FSE '17: "Understanding Misunderstandings ..."
Understanding Misunderstandings in Source Code
Dan Gopstein, Jake Iannacone, Yu Yan, Lois DeLong, Yanyan Zhuang, Martin K.-C. Yeh, and Justin Cappos (New York University, USA; Pennsylvania State University, USA; University of Colorado at Colorado Springs, USA) Humans often mistake the meaning of source code, and so misjudge a program's true behavior. These mistakes can be caused by extremely small, isolated patterns in code, which can lead to significant runtime errors. These patterns are used in large, popular software projects and even recommended in style guides. To identify code patterns that may confuse programmers we extracted a preliminary set of `atoms of confusion' from known confusing code. We show empirically in an experiment with 73 participants that these code patterns can lead to a significantly increased rate of misunderstanding versus equivalent code without the patterns. We then go on to take larger confusing programs and measure (in an experiment with 43 participants) the impact, in terms of programmer confusion, of removing these confusing patterns. All of our instruments, analysis code, and data are publicly available online for replication, experimentation, and feedback. @InProceedings{ESEC/FSE17p129, author = {Dan Gopstein and Jake Iannacone and Yu Yan and Lois DeLong and Yanyan Zhuang and Martin K.-C. Yeh and Justin Cappos}, title = {Understanding Misunderstandings in Source Code}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {129--139}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
Inozemtseva, Laura |
ESEC/FSE '17: "Measuring the Cost of Regression ..."
Measuring the Cost of Regression Testing in Practice: A Study of Java Projects using Continuous Integration
Adriaan Labuschagne, Laura Inozemtseva, and Reid Holmes (University of Waterloo, Canada; University of British Columbia, Canada) Software defects cost time and money to diagnose and fix. Consequently, developers use a variety of techniques to avoid introducing defects into their systems. However, these techniques have costs of their own; the benefit of using a technique must outweigh the cost of applying it. In this paper we investigate the costs and benefits of automated regression testing in practice. Specifically, we studied 61 projects that use Travis CI, a cloud-based continuous integration tool, in order to examine real test failures that were encountered by the developers of those projects. We determined how the developers resolved the failures they encountered and used this information to classify the failures as being caused by a flaky test, by a bug in the system under test, or by a broken or obsolete test. We consider that test failures caused by bugs represent a benefit of the test suite, while failures caused by broken or obsolete tests represent a test suite maintenance cost. We found that 18% of test suite executions fail and that 13% of these failures are flaky. Of the non-flaky failures, only 74% were caused by a bug in the system under test; the remaining 26% were due to incorrect or obsolete tests. In addition, we found that, in the failed builds, only 0.38% of the test case executions failed and 64% of failed builds contained more than one failed test. Our findings contribute to a wider understanding of the unforeseen costs that can impact the overall cost effectiveness of regression testing in practice. They can also inform research into test case selection techniques, as we have provided an approximate empirical bound on the practical value that could be extracted from such techniques. This value appears to be large, as the 61 systems under study contained nearly 3 million lines of test code and yet over 99% of test case executions could have been eliminated with a perfect oracle. @InProceedings{ESEC/FSE17p821, author = {Adriaan Labuschagne and Laura Inozemtseva and Reid Holmes}, title = {Measuring the Cost of Regression Testing in Practice: A Study of Java Projects using Continuous Integration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {821--830}, doi = {}, year = {2017}, } Info |
|
Islam, Syed |
ESEC/FSE '17: "Generalized Observational ..."
Generalized Observational Slicing for Tree-Represented Modelling Languages
Nicolas E. Gold, David Binkley, Mark Harman, Syed Islam, Jens Krinke, and Shin Yoo (University College London, UK; Loyola University Maryland, USA; University of East London, UK; KAIST, South Korea) Model-driven software engineering raises the abstraction level making complex systems easier to understand than if written in textual code. Nevertheless, large complicated software systems can have large models, motivating the need for slicing techniques that reduce the size of a model. We present a generalization of observation-based slicing that allows the criterion to be defined using a variety of kinds of observable behavior and does not require any complex dependence analysis. We apply our implementation of generalized observational slicing for tree-structured representations to Simulink models. The resulting slice might be the subset of the original model responsible for an observed failure or simply the sub-model semantically related to a classic slicing criterion. Unlike its predecessors, the algorithm is also capable of slicing embedded Stateflow state machines. A study of nine real-world models drawn from four different application domains demonstrates the effectiveness of our approach at dramatically reducing Simulink model sizes for realistic observation scenarios: for 9 out of 20 cases, the resulting model has fewer than 25% of the original model's elements. @InProceedings{ESEC/FSE17p547, author = {Nicolas E. Gold and David Binkley and Mark Harman and Syed Islam and Jens Krinke and Shin Yoo}, title = {Generalized Observational Slicing for Tree-Represented Modelling Languages}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {547--558}, doi = {}, year = {2017}, } |
|
Ivanov, Vladimir |
ESEC/FSE '17-IND: "What Do Software Engineers ..."
What Do Software Engineers Care About? Gaps between Research and Practice
Vladimir Ivanov, Alan Rogers, Giancarlo Succi, Jooyong Yi, and Vasilii Zorin (Innopolis University, Russia) It is a cliché to say that there is a gap between research and practice. As interest in the practical impact of research has been growing, the gap between research and practice is expected to narrow. However, our study reveals that there still seems to be a wide gap. We survey software engineers about what they care about when developing software. We then compare our survey results with the research topics of the papers recently published in ICSE/FSE. We found the following discrepancy: while software engineers care more about software development productivity than about software quality, papers on research areas closely related to software productivity--such as software development process management and software development techniques--are published significantly less often than papers on software verification and validation, which account for more than half of the publications. We also found that software engineers are in great need of techniques for accurate effort estimation, and that they are not necessarily knowledgeable about techniques they can use to meet their needs. @InProceedings{ESEC/FSE17p890, author = {Vladimir Ivanov and Alan Rogers and Giancarlo Succi and Jooyong Yi and Vasilii Zorin}, title = {What Do Software Engineers Care About? Gaps between Research and Practice}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {890--895}, doi = {}, year = {2017}, } |
|
Jabbarvand, Reyhaneh |
ESEC/FSE '17: "µDroid: An Energy-Aware Mutation ..."
µDroid: An Energy-Aware Mutation Testing Framework for Android
Reyhaneh Jabbarvand and Sam Malek (University of California at Irvine, USA) The rising popularity of mobile apps deployed on battery-constrained devices underlines the need for effectively evaluating their energy properties. However, currently there is a lack of testing tools for evaluating the energy properties of apps. As a result, for energy testing, developers are relying on tests intended for evaluating the functional correctness of apps. Such tests may not be adequate for revealing energy defects and inefficiencies in apps. This paper presents an energy-aware mutation testing framework, called μDROID, that can be used by developers to assess the adequacy of their test suite for revealing energy-related defects. μDROID implements fifty energy-aware mutation operators and relies on a novel, automatic oracle to determine if a mutant can be killed by a test. Our evaluation on real-world Android apps shows the ability of proposed mutation operators for evaluating the utility of tests in revealing energy defects. Moreover, our automated oracle can detect whether tests kill the energy mutants with an overall accuracy of 94%, thereby making it possible to apply μDROID automatically. @InProceedings{ESEC/FSE17p208, author = {Reyhaneh Jabbarvand and Sam Malek}, title = {µDroid: An Energy-Aware Mutation Testing Framework for Android}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {208--219}, doi = {}, year = {2017}, } ESEC/FSE '17: "PATDroid: Permission-Aware ..." PATDroid: Permission-Aware GUI Testing of Android Alireza Sadeghi, Reyhaneh Jabbarvand, and Sam Malek (University of California at Irvine, USA) Recent introduction of a dynamic permission system in Android, allowing the users to grant and revoke permissions after the installation of an app, has made it harder to properly test apps. Since an app's behavior may change depending on the granted permissions, it needs to be tested under a wide range of permission combinations. At the state-of-the-art, in the absence of any automated tool support, a developer needs to either manually determine the interaction of tests and app permissions, or exhaustively re-execute tests for all possible permission combinations, thereby increasing the time and resources required to test apps. This paper presents an automated approach, called PATDroid, for efficiently testing an Android app while taking the impact of permissions on its behavior into account. PATDroid performs a hybrid program analysis on both an app under test and its test suite to determine which tests should be executed on what permission combinations. Our experimental results show that PATDroid significantly reduces the testing effort, yet achieves comparable code coverage and fault detection capability as exhaustively testing an app under all permission combinations. @InProceedings{ESEC/FSE17p220, author = {Alireza Sadeghi and Reyhaneh Jabbarvand and Sam Malek}, title = {PATDroid: Permission-Aware GUI Testing of Android}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {220--232}, doi = {}, year = {2017}, } Info Artifacts Functional |
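A minimal sketch of the energy-aware mutation-testing loop described in the µDroid abstract above: apply an energy-related mutation operator, run each test while collecting a power trace, and let a trace-distance oracle decide whether the test kills the mutant. The single operator, the synthetic traces, and the threshold are illustrative assumptions, not the tool's fifty operators or its actual oracle.

```python
# Energy-aware mutation testing sketch; operator, traces, and oracle are toy assumptions.
import re

def drop_wakelock_release(source):
    # One hypothetical energy-aware operator: delete wakeLock.release() calls,
    # which typically keeps the CPU awake longer and raises energy use.
    return re.sub(r".*wakeLock\.release\(\);\n", "", source)

def power_trace(source, test):
    # Stand-in for profiling the app under a test; here we simulate that a
    # missing release() keeps accruing energy after the test's workload ends.
    base = test["trace"]
    if "wakeLock.release()" not in source:
        return base + [base[-1]] * len(base)
    return base

def killed(original, mutant, test, threshold=0.2):
    # Oracle: the test kills the mutant if the power traces differ "enough".
    a, b = power_trace(original, test), power_trace(mutant, test)
    return abs(sum(a) - sum(b)) / max(sum(a), 1e-9) > threshold

original = "acquireWork();\nwakeLock.release();\n"
mutant = drop_wakelock_release(original)
suite = [{"name": "scroll_feed", "trace": [1.0, 1.2, 0.9]},
         {"name": "idle", "trace": [0.1, 0.1]}]

score = sum(killed(original, mutant, t) for t in suite) / len(suite)
print(f"mutation score for this mutant: {score:.2f}")
```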
|
Jaffe, Alan |
ESEC/FSE '17-SRC: "Suggesting Meaningful Variable ..."
Suggesting Meaningful Variable Names for Decompiled Code: A Machine Translation Approach
Alan Jaffe (Carnegie Mellon University, USA) Decompiled code lacks meaningful variable names. We used statistical machine translation to suggest variable names that are natural given the context. This technique has previously been successfully applied to obfuscated JavaScript code, but decompiled C code poses unique challenges in constructing an aligned corpus and selecting the best translation from among several candidates. @InProceedings{ESEC/FSE17p1050, author = {Alan Jaffe}, title = {Suggesting Meaningful Variable Names for Decompiled Code: A Machine Translation Approach}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1050--1052}, doi = {}, year = {2017}, } |
|
Jeong, Gaeul |
ESEC/FSE '17-IND: "Applying Deep Learning Based ..."
Applying Deep Learning Based Automatic Bug Triager to Industrial Projects
Sun-Ro Lee, Min-Jae Heo, Chan-Gun Lee, Milhan Kim, and Gaeul Jeong (Chung-Ang University, South Korea; LG Electronics, South Korea) Finding the appropriate developer for a bug report, so called `Bug Triage', is one of the bottlenecks in the bug resolution process. To address this problem, many approaches have proposed various automatic bug triage techniques in recent studies. We argue that most previous studies focused on open source projects only and did not consider deep learning techniques. In this paper, we propose to use Convolutional Neural Network and word embedding to build an automatic bug triager. The results of the experiments applied to both industrial and open source projects reveal benefits of the automatic approach and suggest co-operation of human and automatic triagers. Our experience in integrating and operating the proposed system in an industrial development environment is also reported. @InProceedings{ESEC/FSE17p926, author = {Sun-Ro Lee and Min-Jae Heo and Chan-Gun Lee and Milhan Kim and Gaeul Jeong}, title = {Applying Deep Learning Based Automatic Bug Triager to Industrial Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {926--931}, doi = {}, year = {2017}, } |
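A minimal sketch of the kind of model the abstract describes: word embeddings feeding a convolutional text classifier that maps a bug report to a candidate developer. The framework (tf.keras), vocabulary size, layer sizes, and the random toy data are placeholder assumptions, not the configuration reported in the paper.

```python
# CNN-plus-word-embedding bug triager sketch (assumed Keras stack, toy data).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAX_LEN, DEVELOPERS = 5000, 200, 30   # assumed sizes

model = tf.keras.Sequential([
    layers.Embedding(VOCAB, 64),                    # word embeddings
    layers.Conv1D(128, 5, activation="relu"),       # n-gram features over the report
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(DEVELOPERS, activation="softmax"), # one class per developer
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy stand-in for tokenized bug-report texts and their assignee labels.
reports = np.random.randint(0, VOCAB, size=(256, MAX_LEN))
assignees = np.random.randint(0, DEVELOPERS, size=(256,))
model.fit(reports, assignees, epochs=1, batch_size=32, verbose=0)

# Recommend the top-3 candidate developers for a new report.
probs = model.predict(reports[:1], verbose=0)[0]
print(np.argsort(probs)[::-1][:3])
```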
|
Jermaine, Chris |
ESEC/FSE '17: "Bayesian Specification Learning ..."
Bayesian Specification Learning for Finding API Usage Errors
Vijayaraghavan Murali, Swarat Chaudhuri, and Chris Jermaine (Rice University, USA) We present a Bayesian framework for learning probabilistic specifications from large, unstructured code corpora, and then using these specifications to statically detect anomalous, hence likely buggy, program behavior. Our key insight is to build a statistical model that correlates all specifications hidden inside a corpus with the syntax and observed behavior of programs that implement these specifications. During the analysis of a particular program, this model is conditioned into a posterior distribution that prioritizes specifications that are relevant to the program. The problem of finding anomalies is now framed quantitatively, as a problem of computing a distance between a "reference distribution" over program behaviors that our model expects from the program, and the distribution over behaviors that the program actually produces. We implement our ideas in a system, called Salento, for finding anomalous API usage in Android programs. Salento learns specifications using a combination of a topic model and a neural network model. Our encouraging experimental results show that the system can automatically discover subtle errors in Android applications in the wild, and has high precision and recall compared to competing probabilistic approaches. @InProceedings{ESEC/FSE17p151, author = {Vijayaraghavan Murali and Swarat Chaudhuri and Chris Jermaine}, title = {Bayesian Specification Learning for Finding API Usage Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {151--162}, doi = {}, year = {2017}, } |
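A minimal sketch of the "distance between behavior distributions" idea in the abstract above: compare a reference distribution over API calls, taken from behaviors the learned specifications expect, with the distribution a program actually produces, and use the divergence as an anomaly score. The call sequences, unigram model, smoothing, and KL divergence are illustrative assumptions, not Salento's topic-model-plus-neural-network formulation.

```python
# Anomaly scoring as a distance between call distributions (illustrative only).
import math
from collections import Counter

def distribution(sequences, alpha=1e-3):
    counts = Counter(call for seq in sequences for call in seq)
    vocab = set(counts)
    total = sum(counts.values()) + alpha * len(vocab)
    return {c: (counts[c] + alpha) / total for c in vocab}, vocab

def kl_divergence(p, q, vocab, floor=1e-9):
    return sum(p.get(c, floor) * math.log(p.get(c, floor) / q.get(c, floor))
               for c in vocab)

# Reference: behaviors the specifications expect around a (hypothetical) Bluetooth API.
expected = [["getAdapter", "enable", "startDiscovery", "cancelDiscovery"],
            ["getAdapter", "isEnabled", "startDiscovery", "cancelDiscovery"]]
# Observed: the program under analysis never cancels discovery.
observed = [["getAdapter", "enable", "startDiscovery"]]

p, vocab_p = distribution(expected)
q, _ = distribution(observed)
print(f"anomaly score: {kl_divergence(p, q, vocab_p):.3f}")  # larger = more surprising
```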
|
Jha, Somesh |
ESEC/FSE '17: "Cimplifier: Automatically ..."
Cimplifier: Automatically Debloating Containers
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel (University of Wisconsin-Madison, USA; Tala Security, USA; Colorado State University, USA; Pennsylvania State University, USA) Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packed together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of a significant amount of unnecessary components, within a single container. These practices needlessly increase the container size—sometimes by orders of magnitude. They also decrease the overall security, as each included component—necessary or not—may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to a reduction in image size of up to 95%, and processes even large containers in under thirty seconds. @InProceedings{ESEC/FSE17p476, author = {Vaibhav Rastogi and Drew Davidson and Lorenzo De Carli and Somesh Jha and Patrick McDaniel}, title = {Cimplifier: Automatically Debloating Containers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {476--486}, doi = {}, year = {2017}, } |
|
Jürjens, Jan |
ESEC/FSE '17-DEMO: "Model-Based Privacy and Security ..."
Model-Based Privacy and Security Analysis with CARiSMA
Amir Shayan Ahmadian, Sven Peldszus, Qusai Ramadan, and Jan Jürjens (University of Koblenz-Landau, Germany; Fraunhofer ISST, Germany) We present CARiSMA, a tool originally designed to support model-based security analysis of IT systems. In our recent work, we added several new functionalities to CARiSMA to support the privacy of personal data. Moreover, we introduced a mechanism to assist system designers in performing a CARiSMA analysis by automatically initializing an appropriate CARiSMA analysis concerning security and privacy requirements. The motivation for our work is Article 25 of Regulation (EU) 2016/679, which requires that appropriate technical and organizational controls be implemented to ensure that, by default, the processing of personal data complies with the principles on processing of personal data. This implies that IT systems must initially be analyzed to verify whether such principles are respected. System models allow the system developers to handle the complexity of systems and to focus on key aspects such as privacy and security. CARiSMA is available at http://carisma.umlsec.de and our screencast at https://youtu.be/b5zeHig3ARw. @InProceedings{ESEC/FSE17p989, author = {Amir Shayan Ahmadian and Sven Peldszus and Qusai Ramadan and Jan Jürjens}, title = {Model-Based Privacy and Security Analysis with CARiSMA}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {989--993}, doi = {}, year = {2017}, } Video Info |
|
Käfer, Verena |
ESEC/FSE '17-DOC: "Summarizing Software Engineering ..."
Summarizing Software Engineering Communication Artifacts from Different Sources
Verena Käfer (University of Stuttgart, Germany) During software development, developers communicate a lot and with many different people. Communication is an important factor, to the point that communication failures are seen as the causes of productivity losses or even project failures. To communicate with each other, software developers use many different tools like mailing lists, forums, issue trackers or chats. Even in a short time span, a lot of information artifacts can arise through these channels, which can be very time consuming to get through after a long vacation or for new members of the team. This paper describes a research plan for an approach which can summarize different communication sources into one big summary using and improving existing text summarization approaches. The resulting tool would have the potential to decrease the effort needed for sense-making and comprehension of communication, as well as the time needed for locating and using information from the communication sources. This reduction in effort will result in a significant increase in the productivity of software development companies. @InProceedings{ESEC/FSE17p1038, author = {Verena Käfer}, title = {Summarizing Software Engineering Communication Artifacts from Different Sources}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1038--1041}, doi = {}, year = {2017}, } |
|
Kamath, Amita Ajith |
ESEC/FSE '17: "ARTINALI: Dynamic Invariant ..."
ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security
Maryam Raiyat Aliabadi, Amita Ajith Kamath, Julien Gascon-Samson, and Karthik Pattabiraman (University of British Columbia, Canada; National Institute of Technology Karnataka, India) Cyber-Physical Systems (CPSes) are being widely deployed in security critical scenarios such as smart homes and medical devices. Unfortunately, the connectedness of these systems and their relative lack of security measures makes them ripe targets for attacks. Specification-based Intrusion Detection Systems (IDS) have been shown to be effective for securing CPSs. Unfortunately, deriving invariants for capturing the specifications of CPS systems is a tedious and error-prone process. Therefore, it is important to dynamically monitor the CPS system to learn its common behaviors and formulate invariants for detecting security attacks. Existing techniques for invariant mining only incorporate data and events, but not time. However, time is central to most CPS systems, and hence incorporating time in addition to data and events, is essential for achieving low false positives and false negatives. This paper proposes ARTINALI, which mines dynamic system properties by incorporating time as a first-class property of the system. We build ARTINALI-based Intrusion Detection Systems (IDSes) for two CPSes, namely smart meters and smart medical devices, and measure their efficacy. We find that the ARTINALI-based IDSes significantly reduce the ratio of false positives and false negatives by 16 to 48% (average 30.75%) and 89 to 95% (average 93.4%) respectively over other dynamic invariant detection tools. @InProceedings{ESEC/FSE17p349, author = {Maryam Raiyat Aliabadi and Amita Ajith Kamath and Julien Gascon-Samson and Karthik Pattabiraman}, title = {ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {349--361}, doi = {}, year = {2017}, } |
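A minimal sketch of treating time as a first-class property when mining invariants, as the abstract above advocates: learn bounds of the form "event B follows event A within at most T seconds" from a timestamped trace, then flag executions that break the learned bound. The trace format, the single invariant template, and the jitter margin are illustrative assumptions, not ARTINALI's actual invariant classes.

```python
# Mining and checking a simple time-aware invariant (illustrative template only).
from collections import defaultdict

def mine_follows_within(trace, margin=1.2):
    # trace: list of (timestamp_seconds, event_name), sorted by time
    gaps = defaultdict(list)
    for (t1, a), (t2, b) in zip(trace, trace[1:]):
        gaps[(a, b)].append(t2 - t1)
    # Bound each observed pair by the worst case seen, padded to tolerate jitter.
    return {pair: max(g) * margin for pair, g in gaps.items()}

def violations(trace, invariants):
    out = []
    for (t1, a), (t2, b) in zip(trace, trace[1:]):
        bound = invariants.get((a, b))
        if bound is not None and (t2 - t1) > bound:
            out.append((a, b, t2 - t1, bound))
    return out

training = [(0.0, "read_sensor"), (0.5, "log_value"), (1.0, "read_sensor"),
            (1.6, "log_value"), (2.0, "read_sensor"), (2.4, "log_value")]
invariants = mine_follows_within(training)

# A delayed log_value (e.g., data tampered with in flight) breaks the timing invariant.
suspect = [(0.0, "read_sensor"), (3.0, "log_value")]
print(violations(suspect, invariants))
```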
|
Kamp, Marius |
ESEC/FSE '17: "More Accurate Recommendations ..."
More Accurate Recommendations for Method-Level Changes
Georg Dotzler, Marius Kamp, Patrick Kreutzer, and Michael Philippsen (Friedrich-Alexander University Erlangen-Nürnberg, Germany) During the life span of large software projects, developers often apply the same code changes to different code locations in slight variations. Since the application of these changes to all locations is time-consuming and error-prone, tools exist that learn change patterns from input examples, search for possible pattern applications, and generate corresponding recommendations. In many cases, the generated recommendations are syntactically or semantically wrong due to code movements in the input examples. Thus, they are of low accuracy and developers cannot directly copy them into their projects without adjustments. We present the Accurate REcommendation System (ARES) that achieves a higher accuracy than other tools because its algorithms take care of code movements when creating patterns and recommendations. On average, the recommendations by ARES have an accuracy of 96% with respect to code changes that developers have manually performed in commits of source code archives. At the same time ARES achieves precision and recall values that are on par with other tools. @InProceedings{ESEC/FSE17p798, author = {Georg Dotzler and Marius Kamp and Patrick Kreutzer and Michael Philippsen}, title = {More Accurate Recommendations for Method-Level Changes}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {798--808}, doi = {}, year = {2017}, } Info |
|
Kardas, Geylani |
ESEC/FSE '17-DEMO: "A Tool for Automated Reasoning ..."
A Tool for Automated Reasoning about Traces Based on Configurable Formal Semantics
Ferhat Erata, Arda Goknil, Bedir Tekinerdogan, and Geylani Kardas (Wageningen University, Netherlands; University of Luxembourg, Luxembourg; Ege University, Turkey) We present Tarski, a tool for specifying configurable trace semantics to facilitate automated reasoning about traces. Software development projects require that various types of traces be modeled between and within development artifacts. For any given artifact (e.g., requirements, architecture models and source code), Tarski allows the user to specify new trace types and their configurable semantics, while, using the semantics, it automatically infers new traces based on existing traces provided by the user, and checks the consistency of traces. It has been evaluated on three industrial case studies in the automotive domain (https://modelwriter.github.io/Tarski/). @InProceedings{ESEC/FSE17p959, author = {Ferhat Erata and Arda Goknil and Bedir Tekinerdogan and Geylani Kardas}, title = {A Tool for Automated Reasoning about Traces Based on Configurable Formal Semantics}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {959--963}, doi = {}, year = {2017}, } Video Info |
|
Karkare, Amey |
ESEC/FSE '17: "A Feasibility Study of Using ..."
A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments
Jooyong Yi, Umair Z. Ahmed, Amey Karkare, Shin Hwei Tan, and Abhik Roychoudhury (Innopolis University, Russia; IIT Kanpur, India; National University of Singapore, Singapore) Despite the fact that intelligent tutoring systems for programming (ITSP) education have long attracted interest, their widespread use has been hindered by the difficulty of generating personalized feedback automatically. Meanwhile, automated program repair (APR) is an emerging new technology that automatically fixes software bugs, and it has been shown that APR can fix the bugs of large real-world software. In this paper, we study the feasibility of marrying intelligent programming tutoring and APR. We perform our feasibility study with four state-of-the-art APR tools (GenProg, AE, Angelix, and Prophet), and 661 programs written by the students taking an introductory programming course. We found that when APR tools are used out of the box, only about 30% of the programs in our dataset are repaired. This low repair rate is largely due to the student programs often being significantly incorrect — in contrast, professional software for which APR was successfully applied typically fails only a small portion of tests. To bridge this gap, we adopt a new repair policy in APR akin to the hint generation policy employed in the existing ITSP. This new repair policy admits partial repairs that address part of the failing tests, which results in an 84% improvement in repair rate. We also performed a user study with 263 novice students and 37 graders, and identified an understudied problem: while novice students do not seem to know how to effectively make use of generated repairs as hints, the graders do seem to gain benefits from repairs. @InProceedings{ESEC/FSE17p740, author = {Jooyong Yi and Umair Z. Ahmed and Amey Karkare and Shin Hwei Tan and Abhik Roychoudhury}, title = {A Feasibility Study of Using Automated Program Repair for Introductory Programming Assignments}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {740--751}, doi = {}, year = {2017}, } Info Artifacts Functional |
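A minimal sketch of the "partial repair" policy described above: rather than requiring a candidate patch to pass the whole test suite, accept the candidate that fixes the most failing tests and hand it to the student as a hint. The buggy program, the patches, and the tests below are toy assumptions, not the actual search spaces of GenProg, AE, Angelix, or Prophet.

```python
# Partial-repair policy sketch over a toy absolute-value exercise.
tests = [((3,), 3), ((0,), 0), ((-4,), 4), ((-1,), 1)]   # abs(x) reference tests

candidates = {
    "return x":                 lambda x: x,                   # original buggy program
    "return -x":                lambda x: -x,                  # fixes negatives, breaks positives
    "return x if x > 0 else 0": lambda x: x if x > 0 else 0,   # another partial attempt
}

def passed(fn):
    return sum(fn(*args) == expected for args, expected in tests)

baseline = passed(candidates["return x"])
name, fn = max(candidates.items(), key=lambda kv: passed(kv[1]))
if passed(fn) == len(tests):
    verdict = "full repair"
elif passed(fn) > baseline:
    verdict = "partial repair (usable as a hint)"
else:
    verdict = "no repair found"
print(name, "->", verdict)   # "return -x -> partial repair (usable as a hint)"
```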
|
Kästner, Christian |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,'' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Kehrer, Timo |
ESEC/FSE '17: "Modeling and Verification ..."
Modeling and Verification of Evolving Cyber-Physical Spaces
Christos Tsigkanos, Timo Kehrer, and Carlo Ghezzi (Politecnico di Milano, Italy) We increasingly live in cyber-physical spaces -- spaces that are both physical and digital, and where the two aspects are intertwined. Such spaces are highly dynamic and typically undergo continuous change. Software engineering can have a profound impact in this domain, by defining suitable modeling and specification notations as well as supporting design-time formal verification. In this paper, we present a methodology and a technical framework which support modeling of evolving cyber-physical spaces and reasoning about their spatio-temporal properties. We utilize a discrete, graph-based formalism for modeling cyber-physical spaces as well as primitives of change, giving rise to a reactive system consisting of rewriting rules with both local and global application conditions. Formal reasoning facilities are implemented adopting logic-based specification of properties and according model checking procedures, in both spatial and temporal fragments. We evaluate our approach using a case study of a disaster scenario in a smart city. @InProceedings{ESEC/FSE17p38, author = {Christos Tsigkanos and Timo Kehrer and Carlo Ghezzi}, title = {Modeling and Verification of Evolving Cyber-Physical Spaces}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {38--48}, doi = {}, year = {2017}, } |
|
Khare, Rohit |
ESEC/FSE '17-INV: "Reflections on the REST Architectural ..."
Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)
Roy T. Fielding, Richard N. Taylor, Justin R. Erenkrantz, Michael M. Gorlick, Jim Whitehead, Rohit Khare, and Peyman Oreizy (Adobe, USA; University of California at Irvine, USA; Bloomberg, USA; University of California at Santa Cruz, USA; Google, USA; Dynamic Variable, USA) Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work. @InProceedings{ESEC/FSE17p4, author = {Roy T. Fielding and Richard N. Taylor and Justin R. Erenkrantz and Michael M. Gorlick and Jim Whitehead and Rohit Khare and Peyman Oreizy}, title = {Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {4--14}, doi = {}, year = {2017}, } |
|
Kim, Milhan |
ESEC/FSE '17-IND: "Applying Deep Learning Based ..."
Applying Deep Learning Based Automatic Bug Triager to Industrial Projects
Sun-Ro Lee, Min-Jae Heo, Chan-Gun Lee, Milhan Kim, and Gaeul Jeong (Chung-Ang University, South Korea; LG Electronics, South Korea) Finding the appropriate developer for a bug report, so called `Bug Triage', is one of the bottlenecks in the bug resolution process. To address this problem, many approaches have proposed various automatic bug triage techniques in recent studies. We argue that most previous studies focused on open source projects only and did not consider deep learning techniques. In this paper, we propose to use Convolutional Neural Network and word embedding to build an automatic bug triager. The results of the experiments applied to both industrial and open source projects reveal benefits of the automatic approach and suggest co-operation of human and automatic triagers. Our experience in integrating and operating the proposed system in an industrial development environment is also reported. @InProceedings{ESEC/FSE17p926, author = {Sun-Ro Lee and Min-Jae Heo and Chan-Gun Lee and Milhan Kim and Gaeul Jeong}, title = {Applying Deep Learning Based Automatic Bug Triager to Industrial Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {926--931}, doi = {}, year = {2017}, } |
|
Knüppel, Alexander |
ESEC/FSE '17: "Is There a Mismatch between ..."
Is There a Mismatch between Real-World Feature Models and Product-Line Research?
Alexander Knüppel, Thomas Thüm, Stephan Mennicke, Jens Meinicke, and Ina Schaefer (TU Braunschweig, Germany; University of Magdeburg, Germany) Feature modeling has emerged as the de-facto standard to compactly capture the variability of a software product line. Multiple feature modeling languages have been proposed that evolved over the last decades to manage industrial-size product lines. However, less expressive languages, solely permitting require and exclude constraints, are permanently and carelessly used in product-line research. We address the problem whether those less expressive languages are sufficient for industrial product lines. We developed an algorithm to eliminate complex cross-tree constraints in a feature model, enabling the combination of tools and algorithms working with different feature model dialects in a plug-and-play manner. However, the scope of our algorithm is limited. Our evaluation on large feature models, including the Linux kernel, gives evidence that require and exclude constraints are not sufficient to express real-world feature models. Hence, we promote that research on feature models needs to consider arbitrary propositional formulas as cross-tree constraints prospectively. @InProceedings{ESEC/FSE17p291, author = {Alexander Knüppel and Thomas Thüm and Stephan Mennicke and Jens Meinicke and Ina Schaefer}, title = {Is There a Mismatch between Real-World Feature Models and Product-Line Research?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {291--302}, doi = {}, year = {2017}, } Info Artifacts Reusable |
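A minimal sketch of the point made in the abstract above: "requires" and "excludes" cross-tree constraints map to simple propositional implications, while a constraint such as "ssl or compression" already needs an arbitrary formula. The tiny feature set and the brute-force enumeration of valid configurations are illustrative assumptions, not the paper's elimination algorithm.

```python
# Cross-tree constraints as propositional formulas over features (toy feature model).
from itertools import product

features = ["ssl", "compression", "logging", "debug"]

constraints = [
    ("compression requires logging", lambda c: (not c["compression"]) or c["logging"]),
    ("debug excludes ssl",           lambda c: not (c["debug"] and c["ssl"])),
    # A complex constraint that requires/excludes alone cannot express:
    ("ssl or compression",           lambda c: c["ssl"] or c["compression"]),
]

def valid_configurations():
    for bits in product([False, True], repeat=len(features)):
        config = dict(zip(features, bits))
        if all(check(config) for _, check in constraints):
            yield config

configs = list(valid_configurations())
print(len(configs), "valid configurations out of", 2 ** len(features))
for c in configs[:3]:
    print({f for f, on in c.items() if on})
```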
|
Kögel, Stefan |
ESEC/FSE '17-DOC: "Recommender System for Model ..."
Recommender System for Model Driven Software Development
Stefan Kögel (University of Ulm, Germany) Models are key artifacts in model driven software engineering, similar to source code in traditional software engineering. Integrated development environments help users while writing source code, e.g. with typed auto completions, quick fixes, or automatic refactorings. Similar integrated features are rare for modeling IDEs. The above source code IDE features can be seen as a recommender system. A recommender system for model driven software engineering can combine data from different sources in order to infer a list of relevant and actionable model changes in real time. These recommendations can speed up working on models by automating repetitive tasks and preventing errors when the changes are atypical for the changed models. Recommendations can be based on common model transformations that are taken from the literature or learned from models in version control systems. Further information can be taken from instance- to meta-model relationships, modeling related artifacts (e.g. correctness constraints), and versions histories of models under version control. We created a prototype recommender that analyses the change history of a single model. We computed its accuracy via cross-validation and found that it was between 0.43 and 0.82 for models from an open source project. In order to have a bigger data set for the evaluation and the learning of model transformation, we also mined repositories from Eclipse projects for Ecore meta models and their versions. We found 4374 meta models with 17249 versions. 244 of these meta models were changed at least ten times and are candidates for learning common model transformations. We plan to evaluate our recommender system in two ways: (1) In off-line evaluations with data sets of models from the literature, created by us, or taken from industry partners. (2) In on-line user studies with participants from academia and industry, performed as case studies and controlled experiments. @InProceedings{ESEC/FSE17p1026, author = {Stefan Kögel}, title = {Recommender System for Model Driven Software Development}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1026--1029}, doi = {}, year = {2017}, } |
|
Kohli, Ayush |
ESEC/FSE '17-SRC: "DecisionDroid: A Supervised ..."
DecisionDroid: A Supervised Learning-Based System to Identify Cloned Android Applications
Ayush Kohli (Southern Illinois University at Carbondale, USA) This study presents DecisionDroid, a supervised learning-based system to identify cloned Android app pairs. DecisionDroid is trained using a manually verified, diverse dataset of 12,000 Android app pairs. Across one hundred runs of ten-fold cross-validation, DecisionDroid achieved 97.9% precision, 98.3% recall, and 98.4% accuracy. @InProceedings{ESEC/FSE17p1059, author = {Ayush Kohli}, title = {DecisionDroid: A Supervised Learning-Based System to Identify Cloned Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1059--1061}, doi = {}, year = {2017}, } |
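A minimal sketch of the evaluation setup described above: featurize app pairs, train a supervised classifier, and estimate precision and recall with repeated ten-fold cross-validation. The random features, the RandomForest model, and the five repetitions are placeholder assumptions, not DecisionDroid's actual feature set, learner, or the hundred repetitions reported.

```python
# Repeated ten-fold cross-validation for a pair classifier (toy features and labels).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 12))           # stand-in pair features (e.g., resource,
y = (X[:, 0] + X[:, 3] > 0).astype(int)   # code, and metadata similarities)

cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)
scores = cross_validate(RandomForestClassifier(n_estimators=100, random_state=0),
                        X, y, cv=cv, scoring=["precision", "recall", "accuracy"])

for metric in ("precision", "recall", "accuracy"):
    print(metric, round(scores[f"test_{metric}"].mean(), 3))
```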
|
König, Florian |
ESEC/FSE '17-DEMO: "From Scenario Modeling to ..."
From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology
Joel Greenyer, Daniel Gritzner, Florian König, Jannik Dahlke, Jianwei Shi, and Eric Wete (Leibniz Universität Hannover, Germany) Software-intensive systems often consist of cooperating reactive components. In mobile and reconfigurable systems, their topology changes at run-time, which influences how the components must cooperate. The Scenario Modeling Language (SML) offers a formal approach for specifying the reactive behavior of such systems in a way that aligns with how humans conceive and communicate behavioral requirements. Simulation and formal checks can find specification flaws early. We present a framework for Scenario-based Programming (SBP) that reflects the concepts of SML in Java and makes the scenario modeling approach available for programming. SBP code can also be generated from SML and extended with platform-specific code, thus streamlining the transition from design to implementation. A car-to-x communication system serves as an example. Demo video and artifact: http://scenariotools.org/esecfse-2017-tool-demo/ @InProceedings{ESEC/FSE17p974, author = {Joel Greenyer and Daniel Gritzner and Florian König and Jannik Dahlke and Jianwei Shi and Eric Wete}, title = {From Scenario Modeling to Scenario Programming for Reactive Systems with Dynamic Topology}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {974--978}, doi = {}, year = {2017}, } Video Info |
|
Kreutzer, Patrick |
ESEC/FSE '17: "More Accurate Recommendations ..."
More Accurate Recommendations for Method-Level Changes
Georg Dotzler, Marius Kamp, Patrick Kreutzer, and Michael Philippsen (Friedrich-Alexander University Erlangen-Nürnberg, Germany) During the life span of large software projects, developers often apply the same code changes to different code locations in slight variations. Since the application of these changes to all locations is time-consuming and error-prone, tools exist that learn change patterns from input examples, search for possible pattern applications, and generate corresponding recommendations. In many cases, the generated recommendations are syntactically or semantically wrong due to code movements in the input examples. Thus, they are of low accuracy and developers cannot directly copy them into their projects without adjustments. We present the Accurate REcommendation System (ARES) that achieves a higher accuracy than other tools because its algorithms take care of code movements when creating patterns and recommendations. On average, the recommendations by ARES have an accuracy of 96% with respect to code changes that developers have manually performed in commits of source code archives. At the same time ARES achieves precision and recall values that are on par with other tools. @InProceedings{ESEC/FSE17p798, author = {Georg Dotzler and Marius Kamp and Patrick Kreutzer and Michael Philippsen}, title = {More Accurate Recommendations for Method-Level Changes}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {798--808}, doi = {}, year = {2017}, } Info |
|
Krinke, Jens |
ESEC/FSE '17: "Generalized Observational ..."
Generalized Observational Slicing for Tree-Represented Modelling Languages
Nicolas E. Gold, David Binkley, Mark Harman, Syed Islam, Jens Krinke, and Shin Yoo (University College London, UK; Loyola University Maryland, USA; University of East London, UK; KAIST, South Korea) Model-driven software engineering raises the abstraction level making complex systems easier to understand than if written in textual code. Nevertheless, large complicated software systems can have large models, motivating the need for slicing techniques that reduce the size of a model. We present a generalization of observation-based slicing that allows the criterion to be defined using a variety of kinds of observable behavior and does not require any complex dependence analysis. We apply our implementation of generalized observational slicing for tree-structured representations to Simulink models. The resulting slice might be the subset of the original model responsible for an observed failure or simply the sub-model semantically related to a classic slicing criterion. Unlike its predecessors, the algorithm is also capable of slicing embedded Stateflow state machines. A study of nine real-world models drawn from four different application domains demonstrates the effectiveness of our approach at dramatically reducing Simulink model sizes for realistic observation scenarios: for 9 out of 20 cases, the resulting model has fewer than 25% of the original model's elements. @InProceedings{ESEC/FSE17p547, author = {Nicolas E. Gold and David Binkley and Mark Harman and Syed Islam and Jens Krinke and Shin Yoo}, title = {Generalized Observational Slicing for Tree-Represented Modelling Languages}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {547--558}, doi = {}, year = {2017}, } |
|
Krishnamurthi, Shriram |
ESEC/FSE '17: "The Power of "Why" ..."
The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance
Tim Nelson, Natasha Danas, Daniel J. Dougherty, and Shriram Krishnamurthi (Brown University, USA; Worcester Polytechnic Institute, USA) Scenario-finding tools like the Alloy Analyzer are widely used in numerous concrete domains like security, network analysis, UML analysis, and so on. They can help to verify properties and, more generally, aid in exploring a system's behavior. While scenario finders are valuable for their ability to produce concrete examples, individual scenarios only give insight into what is possible, leaving the user to make their own conclusions about what might be necessary. This paper enriches scenario finding by allowing users to ask ``why?'' and ``why not?'' questions about the examples they are given. We show how to distinguish parts of an example that cannot be consistently removed (or changed) from those that merely reflect underconstraint in the specification. In the former case we show how to determine which elements of the specification and which other components of the example together explain the presence of such facts. This paper formalizes the act of computing provenance in scenario-finding. We present Amalgam, an extension of the popular Alloy scenario-finder, which implements these foundations and provides interactive exploration of examples. We also evaluate Amalgam's algorithmics on a variety of both textbook and real-world examples. @InProceedings{ESEC/FSE17p106, author = {Tim Nelson and Natasha Danas and Daniel J. Dougherty and Shriram Krishnamurthi}, title = {The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {106--116}, doi = {}, year = {2017}, } Info Artifacts Reusable Best-Paper Award |
|
Kusano, Markus |
ESEC/FSE '17: "Thread-Modular Static Analysis ..."
Thread-Modular Static Analysis for Relaxed Memory Models
Markus Kusano and Chao Wang (Virginia Tech, USA; University of Southern California, USA) We propose a memory-model-aware static program analysis method for accurately analyzing the behavior of concurrent software running on processors with weak consistency models such as x86-TSO, SPARC-PSO, and SPARC-RMO. At the center of our method is a unified framework for deciding the feasibility of inter-thread interferences to avoid propagating spurious data flows during static analysis and thus boost the performance of the static analyzer. We formulate the checking of interference feasibility as a set of Datalog rules which are both efficiently solvable and general enough to capture a range of hardware-level memory models. Compared to existing techniques, our method can significantly reduce the number of bogus alarms as well as unsound proofs. We implemented the method and evaluated it on a large set of multithreaded C programs. Our experiments show the method significantly outperforms state-of-the-art techniques in terms of accuracy with only moderate runtime overhead. @InProceedings{ESEC/FSE17p337, author = {Markus Kusano and Chao Wang}, title = {Thread-Modular Static Analysis for Relaxed Memory Models}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {337--348}, doi = {}, year = {2017}, } |
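A minimal, illustrative sketch of phrasing feasibility checks as Datalog rules solved by naive fixpoint iteration, in the spirit of deciding which inter-thread interferences to propagate. The relations and the single rule below are toy assumptions, not the paper's actual rule set for x86-TSO, SPARC-PSO, or SPARC-RMO.

```python
# Naive Datalog fixpoint over toy interference-feasibility facts and one rule.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def match(atom_terms, fact_terms, binding):
    b = dict(binding)
    for t, f in zip(atom_terms, fact_terms):
        if is_var(t):
            if t in b and b[t] != f:
                return None
            b[t] = f
        elif t != f:
            return None
    return b

def solve(rules, facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for (head_pred, head_terms), body in rules:
            bindings = [{}]
            for pred, terms in body:
                bindings = [b2
                            for b in bindings
                            for p, ft in facts if p == pred
                            for b2 in [match(terms, ft, b)] if b2 is not None]
            for b in bindings:
                fact = (head_pred, tuple(b.get(t, t) for t in head_terms))
                if fact not in facts:
                    facts.add(fact)
                    changed = True
    return facts

# write(W, Thread, Var) and read(R, Thread, Var): accesses to shared variables.
facts = [("write", ("w1", "t1", "x")), ("write", ("w2", "t2", "x")),
         ("read",  ("r1", "t1", "x"))]
facts += [("neq", (a, b)) for a in ("t1", "t2") for b in ("t1", "t2") if a != b]

# Illustrative rule: an interference from write W into read R on the same
# variable is only feasible when the two accesses come from different threads.
rules = [(("feasible", ("W", "R")),
          [("write", ("W", "TW", "V")),
           ("read",  ("R", "TR", "V")),
           ("neq",   ("TW", "TR"))])]

print(sorted(f for f in solve(rules, facts) if f[0] == "feasible"))
# [('feasible', ('w2', 'r1'))]  -- w1 -> r1 is intra-thread, so it is filtered out
```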
|
Kuvent, Aviv |
ESEC/FSE '17: "A Symbolic Justice Violations ..."
A Symbolic Justice Violations Transition System for Unrealizable GR(1) Specifications
Aviv Kuvent, Shahar Maoz, and Jan Oliver Ringert (Tel Aviv University, Israel) One of the main challenges of reactive synthesis, an automated procedure to obtain a correct-by-construction reactive system, is to deal with unrealizable specifications. Existing approaches to deal with unrealizability, in the context of GR(1), an expressive assume-guarantee fragment of LTL that enables efficient synthesis, include the generation of concrete counter-strategies and the computation of an unrealizable core. Although correct, such approaches produce large and complicated counter-strategies, often containing thousands of states. This hinders their use by engineers. In this work we present the Justice Violations Transition System (JVTS), a novel symbolic representation of counter-strategies for GR(1). The JVTS is much smaller and simpler than its corresponding concrete counter-strategy. Moreover, it is annotated with invariants that explain how the counter-strategy forces the system to violate the specification. We compute the JVTS symbolically, and thus more efficiently, without the expensive enumeration of concrete states. Finally, we provide the JVTS with an on-demand interactive concrete and symbolic play. We implemented our work, validated its correctness, and evaluated it on 14 unrealizable specifications of autonomous Lego robots as well as on benchmarks from the literature. The evaluation shows not only that the JVTS is in most cases much smaller than the corresponding concrete counter-strategy, but also that its computation is faster. @InProceedings{ESEC/FSE17p362, author = {Aviv Kuvent and Shahar Maoz and Jan Oliver Ringert}, title = {A Symbolic Justice Violations Transition System for Unrealizable GR(1) Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {362--372}, doi = {}, year = {2017}, } Info |
|
Kwak, Thomas |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Support for Iteration on Code Search
Lee Martie, André van der Hoek, and Thomas Kwak (University of California at Irvine, USA) Sometimes, when programmers use a search engine they know more or less what they need. Other times, programmers use the search engine to look around and generate possible ideas for the programming problem they are working on. The key insight we explore in this paper is that the results found in the latter case tend to serve as inspiration or triggers for the next queries issued. We introduce two search engines, CodeExchange and CodeLikeThis, both of which are specifically designed to enable the user to directly leverage the results in formulating the next query. CodeExchange does this with a set of four features supporting the programmer to use characteristics of the results to find other code with or without those characteristics. CodeLikeThis supports simply selecting an entire result to find code that is analogous, to some degree, to that result. We evaluated how these approaches were used along with two approaches not explicitly supporting iteration, a baseline and Google, in a user study among 24 developers. We find that search engines that support using results to form the next query can improve the programmers’ search experience and different approaches to iteration can provide better experiences depending on the task. @InProceedings{ESEC/FSE17p774, author = {Lee Martie and André van der Hoek and Thomas Kwak}, title = {Understanding the Impact of Support for Iteration on Code Search}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {774--785}, doi = {}, year = {2017}, } |
|
Labuschagne, Adriaan |
ESEC/FSE '17: "Measuring the Cost of Regression ..."
Measuring the Cost of Regression Testing in Practice: A Study of Java Projects using Continuous Integration
Adriaan Labuschagne, Laura Inozemtseva, and Reid Holmes (University of Waterloo, Canada; University of British Columbia, Canada) Software defects cost time and money to diagnose and fix. Consequently, developers use a variety of techniques to avoid introducing defects into their systems. However, these techniques have costs of their own; the benefit of using a technique must outweigh the cost of applying it. In this paper we investigate the costs and benefits of automated regression testing in practice. Specifically, we studied 61 projects that use Travis CI, a cloud-based continuous integration tool, in order to examine real test failures that were encountered by the developers of those projects. We determined how the developers resolved the failures they encountered and used this information to classify the failures as being caused by a flaky test, by a bug in the system under test, or by a broken or obsolete test. We consider that test failures caused by bugs represent a benefit of the test suite, while failures caused by broken or obsolete tests represent a test suite maintenance cost. We found that 18% of test suite executions fail and that 13% of these failures are flaky. Of the non-flaky failures, only 74% were caused by a bug in the system under test; the remaining 26% were due to incorrect or obsolete tests. In addition, we found that, in the failed builds, only 0.38% of the test case executions failed and 64% of failed builds contained more than one failed test. Our findings contribute to a wider understanding of the unforeseen costs that can impact the overall cost effectiveness of regression testing in practice. They can also inform research into test case selection techniques, as we have provided an approximate empirical bound on the practical value that could be extracted from such techniques. This value appears to be large, as the 61 systems under study contained nearly 3 million lines of test code and yet over 99% of test case executions could have been eliminated with a perfect oracle. @InProceedings{ESEC/FSE17p821, author = {Adriaan Labuschagne and Laura Inozemtseva and Reid Holmes}, title = {Measuring the Cost of Regression Testing in Practice: A Study of Java Projects using Continuous Integration}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {821--830}, doi = {}, year = {2017}, } Info |
|
Lahtinen, Eric |
ESEC/FSE '17: "CodeCarbonCopy ..."
CodeCarbonCopy
Stelios Sidiroglou-Douskos, Eric Lahtinen, Anthony Eden, Fan Long, and Martin Rinard (Massachusetts Institute of Technology, USA) We present CodeCarbonCopy (CCC), a system for transferring code from a donor application into a recipient application. CCC starts with functionality identified by the developer to transfer into an insertion point (again identified by the developer) in the recipient. CCC uses paired executions of the donor and recipient on the same input file to obtain a translation between the data representation and name space of the recipient and the data representation and name space of the donor. It also implements a static analysis that identifies and removes irrelevant functionality useful in the donor but not in the recipient. We evaluate CCC on eight transfers between six applications. Our results show that CCC can successfully transfer donor functionality into recipient applications. @InProceedings{ESEC/FSE17p95, author = {Stelios Sidiroglou-Douskos and Eric Lahtinen and Anthony Eden and Fan Long and Martin Rinard}, title = {CodeCarbonCopy}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {95--105}, doi = {}, year = {2017}, } |
|
Lam, Wing |
ESEC/FSE '17-IND: "Record and Replay for Android: ..."
Record and Replay for Android: Are We There Yet in Industrial Cases?
Wing Lam, Zhengkai Wu, Dengfeng Li, Wenyu Wang, Haibing Zheng, Hui Luo, Peng Yan, Yuetang Deng, and Tao Xie (University of Illinois at Urbana-Champaign, USA; Tencent, China) Mobile applications, or apps for short, are gaining popularity. The input sources (e.g., touchscreen, sensors, transmitters) of the smart devices that host these apps enable the apps to offer a rich experience to the users, but these input sources pose testing complications to the developers (e.g., writing tests to accurately utilize multiple input sources together and be able to replay such tests at a later time). To alleviate these complications, researchers and practitioners in recent years have developed a variety of record-and-replay tools to support the testing expressiveness of smart devices. These tools allow developers to easily record and automate the replay of complicated usage scenarios of their app. Due to Android's large share of the smart-device market, numerous record-and-replay tools have been developed using a variety of techniques to test Android apps. To better understand the strengths and weaknesses of these tools, we present a comparison of popular record-and-replay tools from researchers and practitioners, by applying these tools to test three popular industrial apps downloaded from the Google Play store. Our comparison is based on three main metrics: (1) ability to reproduce common usage scenarios, (2) space overhead of traces created by the tools, and (3) robustness of traces created by the tools (when being replayed on devices with different resolutions). The results from our comparison show which record-and-replay tools may be the best for developers and identify future directions for improving these tools to better address testing complications of smart devices. @InProceedings{ESEC/FSE17p854, author = {Wing Lam and Zhengkai Wu and Dengfeng Li and Wenyu Wang and Haibing Zheng and Hui Luo and Peng Yan and Yuetang Deng and Tao Xie}, title = {Record and Replay for Android: Are We There Yet in Industrial Cases?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--859}, doi = {}, year = {2017}, } |
|
Lamping, Robert |
ESEC/FSE '17-IND: "Strong Agile Metrics: Mining ..."
Strong Agile Metrics: Mining Log Data to Determine Predictive Power of Software Metrics for Continuous Delivery Teams
Hennie Huijgens, Robert Lamping, Dick Stevens, Hartger Rothengatter, Georgios Gousios, and Daniele Romano (Delft University of Technology, Netherlands; ING Bank, Netherlands; CGI, Netherlands) ING Bank, a large Netherlands-based internationally operating bank, implemented a fully automated continuous delivery pipe-line for its software engineering activities in more than 300 teams, that perform more than 2500 deployments to production each month on more than 750 different applications. Our objective is to examine how strong metrics for agile (Scrum) DevOps teams can be set in an iterative fashion. We perform an exploratory case study that focuses on the classification based on predictive power of software metrics, in which we analyze log data derived from two initial sources within this pipeline. We analyzed a subset of 16 metrics from 59 squads. We identified two lagging metrics and assessed four leading metrics to be strong. @InProceedings{ESEC/FSE17p866, author = {Hennie Huijgens and Robert Lamping and Dick Stevens and Hartger Rothengatter and Georgios Gousios and Daniele Romano}, title = {Strong Agile Metrics: Mining Log Data to Determine Predictive Power of Software Metrics for Continuous Delivery Teams}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {866--871}, doi = {}, year = {2017}, } |
|
Le, Xuan-Bach D. |
ESEC/FSE '17-DEMO: "XSearch: A Domain-Specific ..."
XSearch: A Domain-Specific Cross-Language Relevant Question Retrieval Tool
Bowen Xu, Zhenchang Xing, Xin Xia, David Lo, and Xuan-Bach D. Le (Zhejiang University, China; Australian National University, Australia; University of British Columbia, Canada; Singapore Management University, Singapore) During the software development process, Chinese developers often seek solutions to the technical problems they encounter by searching relevant questions on Q&A sites. When developers fail to find solutions on Q&A sites in Chinese, they could translate their query and search on the English Q&A sites. However, Chinese developers who are non-native English speakers are often not comfortable asking or searching questions in English, as they do not know the proper translation of the Chinese technical words into the English technical words. Furthermore, manually formulating cross-language queries and determining the importance of query words is a tedious and time-consuming process. For the purpose of helping Chinese developers take advantage of the rich knowledge base of the English version of Stack Overflow and simplify the retrieval process, we propose an automated cross-language relevant question retrieval tool (XSearch) to retrieve relevant English questions on Stack Overflow for a given Chinese question. This tool can address the increasing need for developers to solve technical problems by retrieving cross-language relevant Q&A resources. Demo Tool Website: http://172.93.36.10:8080/XSearch Demo Video: https://goo.gl/h57sed @InProceedings{ESEC/FSE17p1009, author = {Bowen Xu and Zhenchang Xing and Xin Xia and David Lo and Xuan-Bach D. Le}, title = {XSearch: A Domain-Specific Cross-Language Relevant Question Retrieval Tool}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1009--1013}, doi = {}, year = {2017}, } Video ESEC/FSE '17: "S3: Syntax- and Semantic-Guided ..." S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples Xuan-Bach D. Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser (Singapore Management University, Singapore; IST Austria, Austria; Carnegie Mellon University, USA; Stellenbosch University, South Africa) A notable class of techniques for automatic program repair is known as semantics-based. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution, and then use program synthesis to construct new code that satisfies those inferred specifications. However, the obtained specifications are naturally incomplete, leaving the synthesis engine with a difficult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifications but that do not necessarily generalize. We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs. The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) An efficient enumeration-based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4. 
S3 can successfully and correctly fix at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs. @InProceedings{ESEC/FSE17p593, author = {Xuan-Bach D. Le and Duc-Hiep Chu and David Lo and Claire Le Goues and Willem Visser}, title = {S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {593--604}, doi = {}, year = {2017}, } |
|
Lee, Chan-Gun |
ESEC/FSE '17-IND: "Applying Deep Learning Based ..."
Applying Deep Learning Based Automatic Bug Triager to Industrial Projects
Sun-Ro Lee, Min-Jae Heo, Chan-Gun Lee, Milhan Kim, and Gaeul Jeong (Chung-Ang University, South Korea; LG Electronics, South Korea) Finding the appropriate developer for a bug report, so called `Bug Triage', is one of the bottlenecks in the bug resolution process. To address this problem, many approaches have proposed various automatic bug triage techniques in recent studies. We argue that most previous studies focused on open source projects only and did not consider deep learning techniques. In this paper, we propose to use Convolutional Neural Network and word embedding to build an automatic bug triager. The results of the experiments applied to both industrial and open source projects reveal benefits of the automatic approach and suggest co-operation of human and automatic triagers. Our experience in integrating and operating the proposed system in an industrial development environment is also reported. @InProceedings{ESEC/FSE17p926, author = {Sun-Ro Lee and Min-Jae Heo and Chan-Gun Lee and Milhan Kim and Gaeul Jeong}, title = {Applying Deep Learning Based Automatic Bug Triager to Industrial Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {926--931}, doi = {}, year = {2017}, } |
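The entry above builds its triager from word embeddings and a convolutional text classifier. A hedged PyTorch sketch of such a model; the vocabulary size, filter widths, and number of developer classes are illustrative placeholders, not the paper's configuration:

```python
# Minimal CNN-over-embeddings text classifier for bug triage (illustrative only).
import torch
import torch.nn as nn

class BugTriagerCNN(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, num_developers=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Convolutions over 3-, 4-, and 5-word windows, as in typical text CNNs.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, 128, kernel_size=k) for k in (3, 4, 5)]
        )
        self.classify = nn.Linear(128 * 3, num_developers)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, embed_dim, seq_len)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.classify(torch.cat(pooled, dim=1))   # scores per developer

model = BugTriagerCNN()
scores = model(torch.randint(0, 20000, (8, 200)))  # 8 reports, 200 tokens each
```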
|
Lee, Sun-Ro |
ESEC/FSE '17-IND: "Applying Deep Learning Based ..."
Applying Deep Learning Based Automatic Bug Triager to Industrial Projects
Sun-Ro Lee, Min-Jae Heo, Chan-Gun Lee, Milhan Kim, and Gaeul Jeong (Chung-Ang University, South Korea; LG Electronics, South Korea) Finding the appropriate developer for a bug report, so called `Bug Triage', is one of the bottlenecks in the bug resolution process. To address this problem, many approaches have proposed various automatic bug triage techniques in recent studies. We argue that most previous studies focused on open source projects only and did not consider deep learning techniques. In this paper, we propose to use Convolutional Neural Network and word embedding to build an automatic bug triager. The results of the experiments applied to both industrial and open source projects reveal benefits of the automatic approach and suggest co-operation of human and automatic triagers. Our experience in integrating and operating the proposed system in an industrial development environment is also reported. @InProceedings{ESEC/FSE17p926, author = {Sun-Ro Lee and Min-Jae Heo and Chan-Gun Lee and Milhan Kim and Gaeul Jeong}, title = {Applying Deep Learning Based Automatic Bug Triager to Industrial Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {926--931}, doi = {}, year = {2017}, } |
|
Lee, Wen-Chuan |
ESEC/FSE '17: "LAMP: Data Provenance for ..."
LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation
Shiqing Ma, Yousra Aafer, Zhaogui Xu, Wen-Chuan Lee, Juan Zhai, Yingqi Liu, and Xiangyu Zhang (Purdue University, USA; Nanjing University, China) Data provenance tracking determines the set of inputs related to a given output. It enables quality control and problem diagnosis in data engineering. Most existing techniques work by tracking program dependencies. They cannot quantitatively assess the importance of related inputs, which is critical to machine learning algorithms, in which an output tends to depend on a huge set of inputs while only some of them are of importance. In this paper, we propose LAMP, a provenance computation system for machine learning algorithms. Inspired by automatic differentiation (AD), LAMP quantifies the importance of an input for an output by computing the partial derivative. LAMP separates the original data processing and the more expensive derivative computation to different processes to achieve cost-effectiveness. In addition, it allows quantifying importance for inputs related to discrete behavior, such as control flow selection. The evaluation on a set of real world programs and data sets illustrates that LAMP produces more precise and succinct provenance than program dependence based techniques, with much less overhead. Our case studies demonstrate the potential of LAMP in problem diagnosis in data engineering. @InProceedings{ESEC/FSE17p786, author = {Shiqing Ma and Yousra Aafer and Zhaogui Xu and Wen-Chuan Lee and Juan Zhai and Yingqi Liu and Xiangyu Zhang}, title = {LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {786--797}, doi = {}, year = {2017}, } |
|
Le Goues, Claire |
ESEC/FSE '17: "S3: Syntax- and Semantic-Guided ..."
S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples
Xuan-Bach D. Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser (Singapore Management University, Singapore; IST Austria, Austria; Carnegie Mellon University, USA; Stellenbosch University, South Africa) A notable class of techniques for automatic program repair is known as semantics-based. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution, and then use program synthesis to construct new code that satisfies those inferred specifications. However, the obtained specifications are naturally incomplete, leaving the synthesis engine with a difficult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifications but that do not necessarily generalize. We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs. The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) An efficient enumeration-based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4. S3 can successfully and correctly fix at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs. @InProceedings{ESEC/FSE17p593, author = {Xuan-Bach D. Le and Duc-Hiep Chu and David Lo and Claire Le Goues and Willem Visser}, title = {S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {593--604}, doi = {}, year = {2017}, } |
|
Lerch, Johannes |
ESEC/FSE '17: "CodeMatch: Obfuscation Won't ..."
CodeMatch: Obfuscation Won't Conceal Your Repackaged App
Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, and Mira Mezini (TU Darmstadt, Germany) An established way to steal the income of app developers, or to trick users into installing malware, is the creation of repackaged apps. These are clones of – typically – successful apps. To conceal their nature, they are often obfuscated by their creators. But, given that it is a common best practice to obfuscate apps, a trivial identification of repackaged apps is not possible. The problem is further intensified by the prevalent usage of libraries. In many apps, the size of the overall code base is basically determined by the used libraries. Therefore, two apps, where the obfuscated code bases are very similar, do not have to be repackages of each other. To reliably detect repackaged apps, we propose a two-step approach which first focuses on the identification and removal of the library code in obfuscated apps. This approach – LibDetect – relies on code representations which abstract over several parts of the underlying bytecode to be resilient against certain obfuscation techniques. Using this approach, we are able to identify on average 70% more used libraries per app than previous approaches. After the removal of an app’s library code, we then fuzzy hash the most abstract representation of the remaining app code to ensure that we can identify repackaged apps even if very advanced obfuscation techniques are used. This makes it possible to identify repackaged apps. Using our approach, we found that ≈15% of all apps in Android app stores are repackages. @InProceedings{ESEC/FSE17p638, author = {Leonid Glanz and Sven Amann and Michael Eichberg and Michael Reif and Ben Hermann and Johannes Lerch and Mira Mezini}, title = {CodeMatch: Obfuscation Won't Conceal Your Repackaged App}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {638--648}, doi = {}, year = {2017}, } Info |
|
Li, Dengfeng |
ESEC/FSE '17-IND: "Record and Replay for Android: ..."
Record and Replay for Android: Are We There Yet in Industrial Cases?
Wing Lam, Zhengkai Wu, Dengfeng Li, Wenyu Wang, Haibing Zheng, Hui Luo, Peng Yan, Yuetang Deng, and Tao Xie (University of Illinois at Urbana-Champaign, USA; Tencent, China) Mobile applications, or apps for short, are gaining popularity. The input sources (e.g., touchscreen, sensors, transmitters) of the smart devices that host these apps enable the apps to offer a rich experience to the users, but these input sources pose testing complications to the developers (e.g., writing tests to accurately utilize multiple input sources together and be able to replay such tests at a later time). To alleviate these complications, researchers and practitioners in recent years have developed a variety of record-and-replay tools to support the testing expressiveness of smart devices. These tools allow developers to easily record and automate the replay of complicated usage scenarios of their app. Due to Android's large share of the smart-device market, numerous record-and-replay tools have been developed using a variety of techniques to test Android apps. To better understand the strengths and weaknesses of these tools, we present a comparison of popular record-and-replay tools from researchers and practitioners, by applying these tools to test three popular industrial apps downloaded from the Google Play store. Our comparison is based on three main metrics: (1) ability to reproduce common usage scenarios, (2) space overhead of traces created by the tools, and (3) robustness of traces created by the tools (when being replayed on devices with different resolutions). The results from our comparison show which record-and-replay tools may be the best for developers and identify future directions for improving these tools to better address testing complications of smart devices. @InProceedings{ESEC/FSE17p854, author = {Wing Lam and Zhengkai Wu and Dengfeng Li and Wenyu Wang and Haibing Zheng and Hui Luo and Peng Yan and Yuetang Deng and Tao Xie}, title = {Record and Replay for Android: Are We There Yet in Industrial Cases?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--859}, doi = {}, year = {2017}, } |
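Many of the replay tools compared in this entry ultimately drive the device through standard adb input injection. A minimal replay sketch using the stock `adb shell input` commands; the trace format, coordinates, and pacing are made up, and real tools also handle timing and resolution differences far more carefully:

```python
# Toy trace replay over adb (illustrative; assumes a connected device).
import subprocess, time

def replay(trace):
    for event in trace:
        if event["type"] == "tap":
            cmd = ["adb", "shell", "input", "tap", str(event["x"]), str(event["y"])]
        elif event["type"] == "swipe":
            cmd = ["adb", "shell", "input", "swipe",
                   str(event["x1"]), str(event["y1"]),
                   str(event["x2"]), str(event["y2"])]
        elif event["type"] == "text":
            cmd = ["adb", "shell", "input", "text", event["text"]]
        subprocess.run(cmd, check=True)
        time.sleep(event.get("delay", 0.5))   # keep the recorded pacing

replay([{"type": "tap", "x": 540, "y": 960},
        {"type": "text", "text": "hello"}])
```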
|
Li, Xiaohong |
ESEC/FSE '17: "Loopster: Static Loop Termination ..."
Loopster: Static Loop Termination Analysis
Xiaofei Xie, Bihuan Chen, Liang Zou, Shang-Wei Lin, Yang Liu, and Xiaohong Li (Tianjin University, China; Nanyang Technological University, Singapore) Loop termination is an important problem for proving the correctness of a system and ensuring that the system always reacts. Existing loop termination analysis techniques mainly depend on the synthesis of ranking functions, which is often expensive. In this paper, we present a novel approach, named Loopster, which performs an efficient static analysis to decide the termination for loops based on path termination analysis and path dependency reasoning. Loopster adopts a divide-and-conquer approach: (1) we extract individual paths from a target multi-path loop and analyze the termination of each path, (2) analyze the dependencies between each two paths, and then (3) determine the overall termination of the target loop based on the relations among paths. We evaluate Loopster by applying it on the loop termination competition benchmark and three real-world projects. The results show that Loopster is effective in a majority of loops with better accuracy and 20×+ performance improvement compared to the state-of-the-art tools. @InProceedings{ESEC/FSE17p84, author = {Xiaofei Xie and Bihuan Chen and Liang Zou and Shang-Wei Lin and Yang Liu and Xiaohong Li}, title = {Loopster: Static Loop Termination Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--94}, doi = {}, year = {2017}, } |
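To make the per-path reasoning concrete: for a single loop path whose guard is `x < bound` and whose update adds a constant `step` to `x`, that path terminates from every start value exactly when `step > 0`. A toy check of this special case, far simpler than Loopster's actual analysis:

```python
# Toy termination check for one affine loop path: `while x < bound: x += step`.
# The path terminates for every start value iff the update strictly increases x.
def affine_path_terminates(step: int) -> bool:
    return step > 0

assert affine_path_terminates(step=2)        # while x < n: x += 2  -> terminates
assert not affine_path_terminates(step=0)    # while x < n: x += 0  -> may loop forever
assert not affine_path_terminates(step=-1)   # while x < n: x -= 1  -> diverges when x < n
```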
|
Li, Yuekang |
ESEC/FSE '17: "Steelix: Program-State Based ..."
Steelix: Program-State Based Binary Fuzzing
Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu (Nanyang Technological University, Singapore; Fudan University, China) Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques suffer from the difficulty in exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches have been proposed to use heavy-weight program analysis to break through magic bytes comparisons, and hence are less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slow down of the execution speed. In particular, we use light-weight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs. @InProceedings{ESEC/FSE17p627, author = {Yuekang Li and Bihuan Chen and Mahinthan Chandramohan and Shang-Wei Lin and Yang Liu and Alwen Tiu}, title = {Steelix: Program-State Based Binary Fuzzing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {627--637}, doi = {}, year = {2017}, } |
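The "comparison progress" signal described above can be pictured with a magic-bytes check: instead of reporting only pass/fail, the instrumented comparison reports how many leading bytes already match, so the fuzzer knows which byte to mutate next. A toy sketch; the magic value and the mutation policy are illustrative, not Steelix's instrumentation:

```python
# Toy "comparison progress" fitness for a magic-bytes check at offset 0.
MAGIC = b"\x7fELF"

def comparison_progress(data: bytes) -> int:
    """Number of leading bytes of `data` that already match MAGIC."""
    progress = 0
    for got, want in zip(data[:len(MAGIC)], MAGIC):
        if got != want:
            break
        progress += 1
    return progress

def mutate_toward_magic(data: bytearray) -> bytearray:
    """Fix the first mismatching byte, mimicking the guided mutation step."""
    i = comparison_progress(bytes(data))
    if i < len(MAGIC):
        data[i] = MAGIC[i]
    return data

seed = bytearray(b"\x7fEXX")
print(comparison_progress(bytes(seed)))   # 2
print(mutate_toward_magic(seed))          # bytearray(b'\x7fELX')
```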
|
Liblit, Ben |
ESEC/FSE '17: "The Care and Feeding of Wild-Caught ..."
The Care and Feeding of Wild-Caught Mutants
David Bingham Brown, Michael Vaughn, Ben Liblit, and Thomas Reps (University of Wisconsin-Madison, USA) Mutation testing of a test suite and a program provides a way to measure the quality of the test suite. In essence, mutation testing is a form of sensitivity testing: by running mutated versions of the program against the test suite, mutation testing measures the suite’s sensitivity for detecting bugs that a programmer might introduce into the program. This paper introduces a technique to improve mutation testing that we call wild-caught mutants; it provides a method for creating potential faults that are more closely coupled with changes made by actual programmers. This technique allows the mutation tester to have more certainty that the test suite is sensitive to the kind of changes that have been observed to have been made by programmers in real-world cases. @InProceedings{ESEC/FSE17p511, author = {David Bingham Brown and Michael Vaughn and Ben Liblit and Thomas Reps}, title = {The Care and Feeding of Wild-Caught Mutants}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {511--522}, doi = {}, year = {2017}, } Video Info Artifacts Reusable |
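A hedged sketch of the harvesting idea behind wild-caught mutants: mine small before/after substitutions from real patches and re-introduce the "before" side as a mutant elsewhere. The diff parsing below is deliberately naive and only illustrates the concept, not the paper's tooling:

```python
# Harvest one-line substitutions from a unified diff and reuse them as mutations.
def harvest_substitutions(diff_text):
    """Yield (before, after) pairs from adjacent -/+ lines of a unified diff."""
    lines = diff_text.splitlines()
    for removed, added in zip(lines, lines[1:]):
        if removed.startswith("-") and added.startswith("+") \
           and not removed.startswith("---") and not added.startswith("+++"):
            yield removed[1:].strip(), added[1:].strip()

def apply_mutation(source, before, after):
    """Turn a fixing change around: re-introduce the old (buggy) code as a mutant."""
    return source.replace(after, before)

patch = """\
- if (i <= len)
+ if (i < len)
"""
ops = list(harvest_substitutions(patch))
mutant = apply_mutation("if (i < len) { ... }", *ops[0])   # -> "if (i <= len) { ... }"
```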
|
Lin, Shang-Wei |
ESEC/FSE '17: "Loopster: Static Loop Termination ..."
Loopster: Static Loop Termination Analysis
Xiaofei Xie, Bihuan Chen, Liang Zou, Shang-Wei Lin, Yang Liu, and Xiaohong Li (Tianjin University, China; Nanyang Technological University, Singapore) Loop termination is an important problem for proving the correctness of a system and ensuring that the system always reacts. Existing loop termination analysis techniques mainly depend on the synthesis of ranking functions, which is often expensive. In this paper, we present a novel approach, named Loopster, which performs an efficient static analysis to decide the termination for loops based on path termination analysis and path dependency reasoning. Loopster adopts a divide-and-conquer approach: (1) we extract individual paths from a target multi-path loop and analyze the termination of each path, (2) analyze the dependencies between each two paths, and then (3) determine the overall termination of the target loop based on the relations among paths. We evaluate Loopster by applying it on the loop termination competition benchmark and three real-world projects. The results show that Loopster is effective in a majority of loops with better accuracy and 20×+ performance improvement compared to the state-of-the-art tools. @InProceedings{ESEC/FSE17p84, author = {Xiaofei Xie and Bihuan Chen and Liang Zou and Shang-Wei Lin and Yang Liu and Xiaohong Li}, title = {Loopster: Static Loop Termination Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--94}, doi = {}, year = {2017}, } ESEC/FSE '17: "Steelix: Program-State Based ..." Steelix: Program-State Based Binary Fuzzing Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu (Nanyang Technological University, Singapore; Fudan University, China) Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques suffer from the difficulty in exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches have been proposed to use heavy-weight program analysis to break through magic bytes comparisons, and hence are less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slow down of the execution speed. In particular, we use light-weight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs. @InProceedings{ESEC/FSE17p627, author = {Yuekang Li and Bihuan Chen and Mahinthan Chandramohan and Shang-Wei Lin and Yang Liu and Alwen Tiu}, title = {Steelix: Program-State Based Binary Fuzzing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {627--637}, doi = {}, year = {2017}, } |
|
Linares-Vásquez, Mario |
ESEC/FSE '17: "Enabling Mutation Testing ..."
Enabling Mutation Testing for Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (Universidad de los Andes, Colombia; University of Lugano, Switzerland; College of William and Mary, USA; University of Sannio, Italy) Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identified a set of 38 mutation operators, and implemented an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated as compared to well-known mutation tools, and their capacity to represent real faults in Android apps. @InProceedings{ESEC/FSE17p233, author = {Mario Linares-Vásquez and Gabriele Bavota and Michele Tufano and Kevin Moran and Massimiliano Di Penta and Christopher Vendome and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Enabling Mutation Testing for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {233--244}, doi = {}, year = {2017}, } Info |
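An Android-specific mutation operator of the kind this taxonomy motivates can be as simple as a targeted source rewrite. The operator below (hiding a view the app makes visible) is an invented example for illustration, not one of the paper's 38 operators:

```python
# Illustrative Android-flavored mutation operator (not from the MDroid+ catalog):
# flip setVisibility(View.VISIBLE) to setVisibility(View.GONE) in Java source.
import re

def mutate_visibility(java_source: str):
    pattern = r"setVisibility\(\s*View\.VISIBLE\s*\)"
    mutants = []
    for match in re.finditer(pattern, java_source):
        mutated = (java_source[:match.start()]
                   + "setVisibility(View.GONE)"
                   + java_source[match.end():])
        mutants.append(mutated)           # one mutant per mutation point
    return mutants

src = "button.setVisibility(View.VISIBLE); label.setVisibility(View.VISIBLE);"
print(len(mutate_visibility(src)))        # 2 mutants, each flipping one call
```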
|
Liu, Yang |
ESEC/FSE '17: "Guided, Stochastic Model-Based ..."
Guided, Stochastic Model-Based GUI Testing of Android Apps
Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su (East China Normal University, China; Nanyang Technological University, Singapore; Shanghai Jiao Tong University, China; University of California at Davis, USA) Mobile apps are ubiquitous, operate in complex environments and are developed under the time-to-market pressure. Ensuring their correctness and reliability thus becomes an important challenge. This paper introduces Stoat, a novel guided approach to perform stochastic model-based testing on Android apps. Stoat operates in two phases: (1) Given an app as input, it uses dynamic analysis enhanced by a weighted UI exploration strategy and static analysis to reverse engineer a stochastic model of the app's GUI interactions; and (2) it adapts Gibbs sampling to iteratively mutate/refine the stochastic model and guides test generation from the mutated models toward achieving high code and model coverage and exhibiting diverse sequences. During testing, system-level events are randomly injected to further enhance the testing effectiveness. Stoat was evaluated on 93 open-source apps. The results show (1) the models produced by Stoat cover 17~31% more code than those by existing modeling tools; (2) Stoat detects 3X more unique crashes than two state-of-the-art testing tools, Monkey and Sapienz. Furthermore, Stoat tested 1661 most popular Google Play apps, and detected 2110 previously unknown and unique crashes. So far, 43 developers have responded that they are investigating our reports. 20 of reported crashes have been confirmed, and 8 already fixed. @InProceedings{ESEC/FSE17p245, author = {Ting Su and Guozhu Meng and Yuting Chen and Ke Wu and Weiming Yang and Yao Yao and Geguang Pu and Yang Liu and Zhendong Su}, title = {Guided, Stochastic Model-Based GUI Testing of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {245--256}, doi = {}, year = {2017}, } ESEC/FSE '17: "Loopster: Static Loop Termination ..." Loopster: Static Loop Termination Analysis Xiaofei Xie, Bihuan Chen, Liang Zou, Shang-Wei Lin, Yang Liu, and Xiaohong Li (Tianjin University, China; Nanyang Technological University, Singapore) Loop termination is an important problem for proving the correctness of a system and ensuring that the system always reacts. Existing loop termination analysis techniques mainly depend on the synthesis of ranking functions, which is often expensive. In this paper, we present a novel approach, named Loopster, which performs an efficient static analysis to decide the termination for loops based on path termination analysis and path dependency reasoning. Loopster adopts a divide-and-conquer approach: (1) we extract individual paths from a target multi-path loop and analyze the termination of each path, (2) analyze the dependencies between each two paths, and then (3) determine the overall termination of the target loop based on the relations among paths. We evaluate Loopster by applying it on the loop termination competition benchmark and three real-world projects. The results show that Loopster is effective in a majority of loops with better accuracy and 20×+ performance improvement compared to the state-of-the-art tools. 
@InProceedings{ESEC/FSE17p84, author = {Xiaofei Xie and Bihuan Chen and Liang Zou and Shang-Wei Lin and Yang Liu and Xiaohong Li}, title = {Loopster: Static Loop Termination Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {84--94}, doi = {}, year = {2017}, } ESEC/FSE '17: "Steelix: Program-State Based ..." Steelix: Program-State Based Binary Fuzzing Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu (Nanyang Technological University, Singapore; Fudan University, China) Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques suffer from the difficulty in exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches have been proposed to use heavy-weight program analysis to break through magic bytes comparisons, and hence are less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slow down of the execution speed. In particular, we use light-weight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs. @InProceedings{ESEC/FSE17p627, author = {Yuekang Li and Bihuan Chen and Mahinthan Chandramohan and Shang-Wei Lin and Yang Liu and Alwen Tiu}, title = {Steelix: Program-State Based Binary Fuzzing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {627--637}, doi = {}, year = {2017}, } |
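The first phase of Stoat (described in the entry above) yields a stochastic model over GUI events, and tests are then sampled from it. A toy weighted random walk over such a model; the screens, events, and weights are invented for illustration, and Stoat additionally refines the weights via Gibbs sampling:

```python
# Toy event-sequence sampling from a stochastic GUI model (illustrative only).
import random

model = {  # screen -> list of (event, next_screen, weight)
    "Main":    [("tap_search", "Search", 0.7), ("open_menu", "Menu", 0.3)],
    "Search":  [("type_query", "Results", 0.8), ("back", "Main", 0.2)],
    "Results": [("tap_item", "Detail", 0.6), ("back", "Search", 0.4)],
    "Menu":    [("back", "Main", 1.0)],
    "Detail":  [("back", "Results", 1.0)],
}

def sample_test(model, start="Main", length=6, seed=None):
    rng = random.Random(seed)
    screen, events = start, []
    for _ in range(length):
        actions = model[screen]
        weights = [w for _, _, w in actions]
        event, screen, _ = rng.choices(actions, weights=weights, k=1)[0]
        events.append(event)
    return events

print(sample_test(model, seed=42))   # one sampled GUI event sequence
```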
|
Liu, Yepang |
ESEC/FSE '17: "OASIS: Prioritizing Static ..."
OASIS: Prioritizing Static Analysis Warnings for Android Apps Based on App User Reviews
Lili Wei, Yepang Liu, and Shing-Chi Cheung (Hong Kong University of Science and Technology, China) Lint is a widely-used static analyzer for detecting bugs/issues in Android apps. However, it can generate many false warnings. One existing solution to this problem is to leverage project history data (e.g., bug fixing statistics) for warning prioritization. Unfortunately, such techniques are biased toward a project’s archived warnings and can easily miss new issues. Another weakness is that developers cannot readily relate the warnings to the impacts perceivable by users. To overcome these weaknesses, in this paper, we propose a semantics-aware approach, OASIS, to prioritizing Lint warnings by leveraging app user reviews. OASIS combines program analysis and NLP techniques to recover the intrinsic links between the Lint warnings for a given app and the user complaints on the app problems caused by the issues of concern. OASIS leverages the strength of such links to prioritize warnings. We evaluated OASIS on six popular and large-scale open-source Android apps. The results show that OASIS can effectively prioritize Lint warnings and help identify new issues that are previously-unknown to app developers. @InProceedings{ESEC/FSE17p672, author = {Lili Wei and Yepang Liu and Shing-Chi Cheung}, title = {OASIS: Prioritizing Static Analysis Warnings for Android Apps Based on App User Reviews}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {672--682}, doi = {}, year = {2017}, } |
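The NLP half of the linking step can be approximated with plain text similarity between warning messages and review sentences; OASIS itself combines this with program analysis. A hedged sketch using TF-IDF and cosine similarity (the warnings and reviews below are invented examples):

```python
# Toy review-to-warning linking by text similarity (not the OASIS pipeline).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

warnings = [
    "WakeLock not released: possible battery drain",
    "Bitmap decoded on UI thread: janky scrolling",
]
reviews = [
    "this app drains my battery overnight",
    "scrolling the photo list is really laggy and janky",
]

vec = TfidfVectorizer().fit(warnings + reviews)
sim = cosine_similarity(vec.transform(warnings), vec.transform(reviews))

# Prioritize warnings by the strength of their best-matching complaint.
ranked = sorted(range(len(warnings)), key=lambda i: sim[i].max(), reverse=True)
for i in ranked:
    print(f"{sim[i].max():.2f}  {warnings[i]}")
```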
|
Liu, Yingqi |
ESEC/FSE '17: "LAMP: Data Provenance for ..."
LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation
Shiqing Ma, Yousra Aafer, Zhaogui Xu, Wen-Chuan Lee, Juan Zhai, Yingqi Liu, and Xiangyu Zhang (Purdue University, USA; Nanjing University, China) Data provenance tracking determines the set of inputs related to a given output. It enables quality control and problem diagnosis in data engineering. Most existing techniques work by tracking program dependencies. They cannot quantitatively assess the importance of related inputs, which is critical to machine learning algorithms, in which an output tends to depend on a huge set of inputs while only some of them are of importance. In this paper, we propose LAMP, a provenance computation system for machine learning algorithms. Inspired by automatic differentiation (AD), LAMP quantifies the importance of an input for an output by computing the partial derivative. LAMP separates the original data processing and the more expensive derivative computation to different processes to achieve cost-effectiveness. In addition, it allows quantifying importance for inputs related to discrete behavior, such as control flow selection. The evaluation on a set of real world programs and data sets illustrates that LAMP produces more precise and succinct provenance than program dependence based techniques, with much less overhead. Our case studies demonstrate the potential of LAMP in problem diagnosis in data engineering. @InProceedings{ESEC/FSE17p786, author = {Shiqing Ma and Yousra Aafer and Zhaogui Xu and Wen-Chuan Lee and Juan Zhai and Yingqi Liu and Xiangyu Zhang}, title = {LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {786--797}, doi = {}, year = {2017}, } |
|
Liu, Yuefei |
ESEC/FSE '17: "Better Test Cases for Better ..."
Better Test Cases for Better Automated Program Repair
Jinqiu Yang, Alexey Zhikhartsev, Yuefei Liu, and Lin Tan (University of Waterloo, Canada) Automated generate-and-validate program repair techniques (G&V techniques) suffer from generating many overfitted patches due to incapabilities of test cases. Such overfitted patches are incorrect patches, which only make all given test cases pass, but fail to fix the bugs. In this work, we propose an overfitted patch detection framework named Opad (Overfitted PAtch Detection). Opad helps improve G&V techniques by enhancing existing test cases to filter out overfitted patches. To enhance test cases, Opad uses fuzz testing to generate new test cases, and employs two test oracles (crash and memory-safety) to enhance validity checking of automatically-generated patches. Opad also uses a novel metric (named O-measure) for deciding whether automatically-generated patches overfit. Evaluated on 45 bugs from 7 large systems (the same benchmark used by GenProg and SPR), Opad filters out 75.2% (321/427) overfitted patches generated by GenProg/AE, Kali, and SPR. In addition, Opad guides SPR to generate correct patches for one more bug (the original SPR generates correct patches for 11 bugs). Our analysis also shows that up to 40% of such automatically-generated test cases may further improve G&V techniques if empowered with better test oracles (in addition to crash and memory-safety oracles employed by Opad). @InProceedings{ESEC/FSE17p831, author = {Jinqiu Yang and Alexey Zhikhartsev and Yuefei Liu and Lin Tan}, title = {Better Test Cases for Better Automated Program Repair}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {831--841}, doi = {}, year = {2017}, } |
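The crash oracle behind this overfitting check can be pictured as a differential run: if a candidate patch crashes on fuzzed inputs that the original program handled, the patch is suspect. A minimal sketch under that assumption; the binaries and input corpus are placeholders, and Opad's actual O-measure and memory-safety oracle are richer:

```python
# Toy differential crash check for candidate patches (illustrative only).
import subprocess

def crashes(binary, input_file, timeout=5):
    try:
        proc = subprocess.run([binary, input_file], timeout=timeout,
                              capture_output=True)
        return proc.returncode < 0        # killed by a signal (e.g. SIGSEGV)
    except subprocess.TimeoutExpired:
        return False

def looks_overfitted(original_bin, patched_bin, fuzzed_inputs):
    """Flag the patch if it introduces crashes the original program did not have."""
    new_crashes = [f for f in fuzzed_inputs
                   if crashes(patched_bin, f) and not crashes(original_bin, f)]
    return len(new_crashes) > 0, new_crashes

# Hypothetical binaries and fuzzed inputs, purely for illustration.
overfit, evidence = looks_overfitted("./prog-orig", "./prog-patched",
                                     ["fuzz/input_001", "fuzz/input_002"])
```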
|
Livshits, Benjamin |
ESEC/FSE '17: "Toward Full Elasticity in ..."
Toward Full Elasticity in Distributed Static Analysis: The Case of Callgraph Analysis
Diego Garbervetsky, Edgardo Zoppi, and Benjamin Livshits (University of Buenos Aires, Argentina; Imperial College London, UK) In this paper we present the design and implementation of a distributed, whole-program static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. Our reliance on a cloud cluster provides a degree of elasticity for CPU, memory, and storage resources. To demonstrate the potential of our technique, we show how a typical call graph analysis can be implemented in a distributed setting. The vision that motivates this work is that every large-scale software repository such as GitHub, BitBucket, or Visual Studio Online will be able to perform static analysis on a large scale. We experimentally validate our implementation of the distributed call graph analysis using a combination of both synthetic and real benchmarks. To show scalability, we demonstrate how the analysis presented in this paper is able to handle inputs that are almost 10 million lines of code (LOC) in size, without running out of memory. Our results show that the analysis scales well in terms of memory pressure independently of the input size, as we add more virtual machines (VMs). As the number of worker VMs increases, we observe that the analysis time generally improves as well. Lastly, we demonstrate that querying the results can be performed with a median latency of 15 ms. @InProceedings{ESEC/FSE17p442, author = {Diego Garbervetsky and Edgardo Zoppi and Benjamin Livshits}, title = {Toward Full Elasticity in Distributed Static Analysis: The Case of Callgraph Analysis}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {442--453}, doi = {}, year = {2017}, } |
|
Llerena, Yamilet R. Serrano |
ESEC/FSE '17: "Probabilistic Model Checking ..."
Probabilistic Model Checking of Perturbed MDPs with Applications to Cloud Computing
Yamilet R. Serrano Llerena, Guoxin Su, and David S. Rosenblum (National University of Singapore, Singapore; University of Wollongong, Australia) Probabilistic model checking is a formal verification technique that has been applied successfully in a variety of domains, providing identification of system errors through quantitative verification of stochastic system models. One domain that can benefit from probabilistic model checking is cloud computing, which must provide highly reliable and secure computational and storage services to large numbers of mission-critical software systems. For real-world domains like cloud computing, external system factors and environmental changes must be estimated accurately in the form of probabilities in system models; inaccurate estimates for the model probabilities can lead to invalid verification results. To address the effects of uncertainty in probability estimates, in previous work we have developed a variety of techniques for perturbation analysis of discrete- and continuous-time Markov chains (DTMCs and CTMCs). These techniques determine the consequences of the uncertainty on verification of system properties. In this paper, we present the first approach for perturbation analysis of Markov decision processes (MDPs), a stochastic formalism that is especially popular due to the significant expressive power it provides through the combination of both probabilistic and nondeterministic choice. Our primary contribution is a novel technique for efficiently analyzing the effects of perturbations of model probabilities on verification of reachability properties of MDPs. The technique heuristically explores the space of adversaries of an MDP, which encode the different ways of resolving the MDP’s nondeterministic choices. We demonstrate the practical effectiveness of our approach by applying it to two case studies of cloud systems. @InProceedings{ESEC/FSE17p454, author = {Yamilet R. Serrano Llerena and Guoxin Su and David S. Rosenblum}, title = {Probabilistic Model Checking of Perturbed MDPs with Applications to Cloud Computing}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {454--464}, doi = {}, year = {2017}, } |
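For a single fixed adversary, an MDP reduces to a discrete-time Markov chain, and the reachability probability of interest solves a small linear system; the effect of a perturbed transition probability can then be gauged numerically. A toy sketch of that reduction and a finite-difference sensitivity estimate (the chain and the perturbed parameter are invented, and the paper's technique explores the adversary space rather than differencing numerically):

```python
# Toy reachability + perturbation sensitivity for a 4-state chain (illustrative).
import numpy as np

def reach_probability(p_fail):
    # States: 0 = running, 1 = degraded, 2 = goal (served), 3 = failed (absorbing).
    P = np.array([
        [0.0, p_fail, 1 - p_fail, 0.0],
        [0.0, 0.0,    0.6,        0.4],
        [0.0, 0.0,    1.0,        0.0],
        [0.0, 0.0,    0.0,        1.0],
    ])
    transient = [0, 1]                       # non-absorbing states
    A = np.eye(2) - P[np.ix_(transient, transient)]
    b = P[np.ix_(transient, [2])].ravel()    # one-step probability of hitting goal
    x = np.linalg.solve(A, b)                # x[s] = Pr[reach goal | start in s]
    return x[0]

nominal, eps = 0.1, 1e-4
sensitivity = (reach_probability(nominal + eps) - reach_probability(nominal)) / eps
print(reach_probability(nominal), sensitivity)   # 0.96 and roughly -0.4
```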
|
Lo, David |
ESEC/FSE '17-DEMO: "XSearch: A Domain-Specific ..."
XSearch: A Domain-Specific Cross-Language Relevant Question Retrieval Tool
Bowen Xu, Zhenchang Xing, Xin Xia, David Lo, and Xuan-Bach D. Le (Zhejiang University, China; Australian National University, Australia; University of British Columbia, Canada; Singapore Management University, Singapore) During the software development process, Chinese developers often seek solutions to the technical problems they encounter by searching relevant questions on Q&A sites. When developers fail to find solutions on Q&A sites in Chinese, they could translate their query and search on the English Q&A sites. However, Chinese developers who are non-native English speakers are often not comfortable asking or searching questions in English, as they do not know the proper translation of the Chinese technical words into the English technical words. Furthermore, manually formulating cross-language queries and determining the importance of query words is a tedious and time-consuming process. For the purpose of helping Chinese developers take advantage of the rich knowledge base of the English version of Stack Overflow and simplify the retrieval process, we propose an automated cross-language relevant question retrieval tool (XSearch) to retrieve relevant English questions on Stack Overflow for a given Chinese question. This tool can address the increasing need for developers to solve technical problems by retrieving cross-language relevant Q&A resources. Demo Tool Website: http://172.93.36.10:8080/XSearch Demo Video: https://goo.gl/h57sed @InProceedings{ESEC/FSE17p1009, author = {Bowen Xu and Zhenchang Xing and Xin Xia and David Lo and Xuan-Bach D. Le}, title = {XSearch: A Domain-Specific Cross-Language Relevant Question Retrieval Tool}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1009--1013}, doi = {}, year = {2017}, } Video ESEC/FSE '17: "S3: Syntax- and Semantic-Guided ..." S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples Xuan-Bach D. Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser (Singapore Management University, Singapore; IST Austria, Austria; Carnegie Mellon University, USA; Stellenbosch University, South Africa) A notable class of techniques for automatic program repair is known as semantics-based. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution, and then use program synthesis to construct new code that satisfies those inferred specifications. However, the obtained specifications are naturally incomplete, leaving the synthesis engine with a difficult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifications but that do not necessarily generalize. We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs. The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) An efficient enumeration-based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4.
S3 can successfully and correctly fix at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs. @InProceedings{ESEC/FSE17p593, author = {Xuan-Bach D. Le and Duc-Hiep Chu and David Lo and Claire Le Goues and Willem Visser}, title = {S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {593--604}, doi = {}, year = {2017}, } |
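The XSearch entry above formulates an English query from a Chinese question and retrieves candidate Stack Overflow posts. A hedged sketch of that retrieval core; the bilingual dictionary, the toy title corpus, and the weighting are all illustrative stand-ins for the tool's actual translation and term-weighting components:

```python
# Toy cross-language query formulation + retrieval (not XSearch itself).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

ZH_EN = {"空指针": "null pointer", "异常": "exception", "多线程": "multithreading"}

def formulate_query(chinese_terms):
    return " ".join(ZH_EN.get(t, t) for t in chinese_terms)

so_titles = [
    "What is a NullPointerException, and how do I fix it?",
    "How to avoid null pointer exception in Java",
    "Difference between processes and threads",
]

query = formulate_query(["空指针", "异常"])          # -> "null pointer exception"
vec = TfidfVectorizer().fit(so_titles + [query])
scores = cosine_similarity(vec.transform([query]), vec.transform(so_titles))[0]
best = max(range(len(so_titles)), key=lambda i: scores[i])
print(so_titles[best])                               # the Java null-pointer question
```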
|
Lo Iacono, Luigi |
ESEC/FSE '17-IND: "Guidelines for Adopting Frontend ..."
Guidelines for Adopting Frontend Architectures and Patterns in Microservices-Based Systems
Holger Harms, Collin Rogowski, and Luigi Lo Iacono (Funke Digital, Germany; inovex, Germany; Cologne University of Applied Sciences, Germany) Microservice-based systems enable the independent development, deployment, and scalability for separate system components of enterprise applications. A significant aspect during development is the microservice integration in frontends of web, mobile, and desktop applications. One challenge here is the selection of an adequate frontend architecture as well as suitable patterns that satisfy the application requirements. This paper analyses available strategies for organizing and implementing microservices frontends. These approaches are then evaluated based on a quality model and various prototypes of the same application implemented using the distinct approaches. The results of this analysis are generalized to a guideline that supports the selection of a suitable architecture. @InProceedings{ESEC/FSE17p902, author = {Holger Harms and Collin Rogowski and Luigi Lo Iacono}, title = {Guidelines for Adopting Frontend Architectures and Patterns in Microservices-Based Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {902--907}, doi = {}, year = {2017}, } |
|
Long, Fan |
ESEC/FSE '17: "Automatic Inference of Code ..."
Automatic Inference of Code Transforms for Patch Generation
Fan Long, Peter Amidon, and Martin Rinard (Massachusetts Institute of Technology, USA; University of California at San Diego, USA) We present a new system, Genesis, that processes human patches to automatically infer code transforms for automatic patch generation. We present results that characterize the effectiveness of the Genesis inference algorithms and the complete Genesis patch generation system working with real-world patches and defects collected from 372 Java projects. To the best of our knowledge, Genesis is the first system to automatically infer patch generation transforms or candidate patch search spaces from previous successful patches. @InProceedings{ESEC/FSE17p727, author = {Fan Long and Peter Amidon and Martin Rinard}, title = {Automatic Inference of Code Transforms for Patch Generation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {727--739}, doi = {}, year = {2017}, } Info Artifacts Functional ESEC/FSE '17: "CodeCarbonCopy ..." CodeCarbonCopy Stelios Sidiroglou-Douskos, Eric Lahtinen, Anthony Eden, Fan Long, and Martin Rinard (Massachusetts Institute of Technology, USA) We present CodeCarbonCopy (CCC), a system for transferring code from a donor application into a recipient application. CCC starts with functionality identified by the developer to transfer into an insertion point (again identified by the developer) in the recipient. CCC uses paired executions of the donor and recipient on the same input file to obtain a translation between the data representation and name space of the recipient and the data representation and name space of the donor. It also implements a static analysis that identifies and removes irrelevant functionality useful in the donor but not in the recipient. We evaluate CCC on eight transfers between six applications. Our results show that CCC can successfully transfer donor functionality into recipient applications. @InProceedings{ESEC/FSE17p95, author = {Stelios Sidiroglou-Douskos and Eric Lahtinen and Anthony Eden and Fan Long and Martin Rinard}, title = {CodeCarbonCopy}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {95--105}, doi = {}, year = {2017}, } |
|
Lu, Jing |
ESEC/FSE '17: "Detecting Missing Information ..."
Detecting Missing Information in Bug Descriptions
Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug descriptions quality by alerting reporters about missing EB and S2R at reporting time. @InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
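The discourse patterns mentioned above can be thought of as lightweight textual signals for Expected Behavior (EB) and Steps to Reproduce (S2R). A small sketch with a handful of hand-written regexes; these are illustrative only, not the 154 patterns the paper identifies empirically:

```python
# Toy presence check for EB and S2R content in a bug report.
import re

EB_PATTERNS = [
    r"\b(should|ought to|expected to)\b",
    r"\bexpected (behavior|result|output)\b",
    r"\binstead of\b",
]
S2R_PATTERNS = [
    r"\bsteps? to reproduce\b",
    r"^\s*\d+[\.\)]\s",            # numbered steps like "1. Open the app"
    r"\b(click|tap|open|navigate to|go to)\b",
]

def has_any(patterns, text):
    return any(re.search(p, text, flags=re.IGNORECASE | re.MULTILINE)
               for p in patterns)

report = """The dialog closes immediately instead of showing the results.
1. Open the search screen
2. Tap the filter icon"""
print("EB present:", has_any(EB_PATTERNS, report))    # True ("instead of")
print("S2R present:", has_any(S2R_PATTERNS, report))  # True (numbered steps, "Tap")
```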
|
Luo, Hui |
ESEC/FSE '17-IND: "Record and Replay for Android: ..."
Record and Replay for Android: Are We There Yet in Industrial Cases?
Wing Lam, Zhengkai Wu, Dengfeng Li, Wenyu Wang, Haibing Zheng, Hui Luo, Peng Yan, Yuetang Deng, and Tao Xie (University of Illinois at Urbana-Champaign, USA; Tencent, China) Mobile applications, or apps for short, are gaining popularity. The input sources (e.g., touchscreen, sensors, transmitters) of the smart devices that host these apps enable the apps to offer a rich experience to the users, but these input sources pose testing complications to the developers (e.g., writing tests to accurately utilize multiple input sources together and be able to replay such tests at a later time). To alleviate these complications, researchers and practitioners in recent years have developed a variety of record-and-replay tools to support the testing expressiveness of smart devices. These tools allow developers to easily record and automate the replay of complicated usage scenarios of their app. Due to Android's large share of the smart-device market, numerous record-and-replay tools have been developed using a variety of techniques to test Android apps. To better understand the strengths and weaknesses of these tools, we present a comparison of popular record-and-replay tools from researchers and practitioners, by applying these tools to test three popular industrial apps downloaded from the Google Play store. Our comparison is based on three main metrics: (1) ability to reproduce common usage scenarios, (2) space overhead of traces created by the tools, and (3) robustness of traces created by the tools (when being replayed on devices with different resolutions). The results from our comparison show which record-and-replay tools may be the best for developers and identify future directions for improving these tools to better address testing complications of smart devices. @InProceedings{ESEC/FSE17p854, author = {Wing Lam and Zhengkai Wu and Dengfeng Li and Wenyu Wang and Haibing Zheng and Hui Luo and Peng Yan and Yuetang Deng and Tao Xie}, title = {Record and Replay for Android: Are We There Yet in Industrial Cases?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {854--859}, doi = {}, year = {2017}, } |
|
Ma, Shiqing |
ESEC/FSE '17: "LAMP: Data Provenance for ..."
LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation
Shiqing Ma, Yousra Aafer, Zhaogui Xu, Wen-Chuan Lee, Juan Zhai, Yingqi Liu, and Xiangyu Zhang (Purdue University, USA; Nanjing University, China) Data provenance tracking determines the set of inputs related to a given output. It enables quality control and problem diagnosis in data engineering. Most existing techniques work by tracking program dependencies. They cannot quantitatively assess the importance of related inputs, which is critical to machine learning algorithms, in which an output tends to depend on a huge set of inputs while only some of them are of importance. In this paper, we propose LAMP, a provenance computation system for machine learning algorithms. Inspired by automatic differentiation (AD), LAMP quantifies the importance of an input for an output by computing the partial derivative. LAMP separates the original data processing and the more expensive derivative computation to different processes to achieve cost-effectiveness. In addition, it allows quantifying importance for inputs related to discrete behavior, such as control flow selection. The evaluation on a set of real world programs and data sets illustrates that LAMP produces more precise and succinct provenance than program dependence based techniques, with much less overhead. Our case studies demonstrate the potential of LAMP in problem diagnosis in data engineering. @InProceedings{ESEC/FSE17p786, author = {Shiqing Ma and Yousra Aafer and Zhaogui Xu and Wen-Chuan Lee and Juan Zhai and Yingqi Liu and Xiangyu Zhang}, title = {LAMP: Data Provenance for Graph Based Machine Learning Algorithms through Derivative Computation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {786--797}, doi = {}, year = {2017}, } |
|
Maggio, Martina |
ESEC/FSE '17: "Automated Control of Multiple ..."
Automated Control of Multiple Software Goals using Multiple Actuators
Martina Maggio, Alessandro Vittorio Papadopoulos, Antonio Filieri, and Henry Hoffmann (Lund University, Sweden; Mälardalen University, Sweden; Imperial College London, UK; University of Chicago, USA) Modern software should satisfy multiple goals simultaneously: it should provide predictable performance, be robust to failures, handle peak loads and deal seamlessly with unexpected conditions and changes in the execution environment. For this to happen, software designs should account for the possibility of runtime changes and provide formal guarantees of the software's behavior. Control theory is one of the possible design drivers for runtime adaptation, but adopting control theoretic principles often requires additional, specialized knowledge. To overcome this limitation, automated methodologies have been proposed to extract the necessary information from experimental data and design a control system for runtime adaptation. These proposals, however, only process one goal at a time, creating a chain of controllers. In this paper, we propose and evaluate the first automated strategy that takes into account multiple goals without separating them into multiple control strategies. Avoiding the separation allows us to tackle a larger class of problems and provide stronger guarantees. We test our methodology's generality with three case studies that demonstrate its broad applicability in meeting performance, reliability, quality, security, and energy goals despite environmental or requirements changes. @InProceedings{ESEC/FSE17p373, author = {Martina Maggio and Alessandro Vittorio Papadopoulos and Antonio Filieri and Henry Hoffmann}, title = {Automated Control of Multiple Software Goals using Multiple Actuators}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {373--384}, doi = {}, year = {2017}, } Info |
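For contrast with the multi-goal controller synthesized in the entry above, the classical single-goal setup is one measured goal driving one actuator. A toy PI controller of that simpler kind (the gains, setpoint, and actuator bounds are invented); chaining several such loops is exactly the per-goal design the paper's combined controller avoids:

```python
# Toy single-goal PI controller: adjust a worker count to track a latency setpoint.
class PIController:
    def __init__(self, kp, ki, setpoint, lo, hi):
        self.kp, self.ki, self.setpoint = kp, ki, setpoint
        self.lo, self.hi = lo, hi
        self.integral = 0.0

    def next_actuation(self, measured):
        error = measured - self.setpoint          # positive when latency is too high
        self.integral += error
        u = self.kp * error + self.ki * self.integral
        return max(self.lo, min(self.hi, round(u)))   # clamp to actuator range

ctrl = PIController(kp=0.1, ki=0.05, setpoint=100.0, lo=1, hi=32)
for latency_ms in (180.0, 150.0, 120.0, 105.0):       # measurements per control period
    print("threads ->", ctrl.next_actuation(latency_ms))
```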
|
Makarov, Vladimir |
ESEC/FSE '17-DEMO: "QEMU-Based Framework for Non-intrusive ..."
QEMU-Based Framework for Non-intrusive Virtual Machine Instrumentation and Introspection
Pavel Dovgalyuk, Natalia Fursova, Ivan Vasiliev, and Vladimir Makarov (Novgorod State University, Russia) This paper presents a framework based on the QEMU emulator. Our framework provides a set of multi-platform analysis tools for virtual machines and a mechanism for creating instrumentation and analysis tools. Our framework is based on a lightweight approach to dynamic analysis of binary code executed in virtual machines. This approach is non-intrusive and provides system-wide analysis capabilities. It does not require loading any guest agents or the source code of the OS. Therefore it may be applied to ROM-based guest systems and enables the use of record/replay of the system execution. We use the application binary interface (ABI) of the platform to be analyzed for creating introspection tools. These tools recover the part of kernel-level information related to the system calls executed on the guest machine. @InProceedings{ESEC/FSE17p944, author = {Pavel Dovgalyuk and Natalia Fursova and Ivan Vasiliev and Vladimir Makarov}, title = {QEMU-Based Framework for Non-intrusive Virtual Machine Instrumentation and Introspection}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {944--948}, doi = {}, year = {2017}, } Info |
|
Malek, Sam |
ESEC/FSE '17: "µDroid: An Energy-Aware Mutation ..."
µDroid: An Energy-Aware Mutation Testing Framework for Android
Reyhaneh Jabbarvand and Sam Malek (University of California at Irvine, USA) The rising popularity of mobile apps deployed on battery-constrained devices underlines the need for effectively evaluating their energy properties. However, currently there is a lack of testing tools for evaluating the energy properties of apps. As a result, for energy testing, developers are relying on tests intended for evaluating the functional correctness of apps. Such tests may not be adequate for revealing energy defects and inefficiencies in apps. This paper presents an energy-aware mutation testing framework, called μDROID, that can be used by developers to assess the adequacy of their test suite for revealing energy-related defects. μDROID implements fifty energy-aware mutation operators and relies on a novel, automatic oracle to determine if a mutant can be killed by a test. Our evaluation on real-world Android apps shows the ability of proposed mutation operators for evaluating the utility of tests in revealing energy defects. Moreover, our automated oracle can detect whether tests kill the energy mutants with an overall accuracy of 94%, thereby making it possible to apply μDROID automatically. @InProceedings{ESEC/FSE17p208, author = {Reyhaneh Jabbarvand and Sam Malek}, title = {µDroid: An Energy-Aware Mutation Testing Framework for Android}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {208--219}, doi = {}, year = {2017}, } ESEC/FSE '17: "Automatic Generation of Inter-Component ..." Automatic Generation of Inter-Component Communication Exploits for Android Applications Joshua Garcia, Mahmoud Hammad, Negar Ghorbani, and Sam Malek (University of California at Irvine, USA) Although a wide variety of approaches identify vulnerabilities in Android apps, none attempt to determine exploitability of those vulnerabilities. Exploitability can aid in reducing false positives of vulnerability analysis, and can help engineers triage bugs. Specifically, one of the main attack vectors of Android apps is their inter-component communication interface, where apps may receive messages called Intents. In this paper, we provide the first approach for automatically generating exploits for Android apps, called LetterBomb, relying on a combined path-sensitive symbolic execution-based static analysis, and the use of software instrumentation and test oracles. We run LetterBomb on 10,000 Android apps from Google Play, where we identify 181 exploits from 835 vulnerable apps. Compared to a state-of-the-art detection approach for three ICC-based vulnerabilities, LetterBomb obtains 33%-60% more vulnerabilities at a 6.66 to 7 times faster speed. @InProceedings{ESEC/FSE17p661, author = {Joshua Garcia and Mahmoud Hammad and Negar Ghorbani and Sam Malek}, title = {Automatic Generation of Inter-Component Communication Exploits for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {661--671}, doi = {}, year = {2017}, } Info ESEC/FSE '17: "PATDroid: Permission-Aware ..." PATDroid: Permission-Aware GUI Testing of Android Alireza Sadeghi, Reyhaneh Jabbarvand, and Sam Malek (University of California at Irvine, USA) Recent introduction of a dynamic permission system in Android, allowing the users to grant and revoke permissions after the installation of an app, has made it harder to properly test apps. Since an app's behavior may change depending on the granted permissions, it needs to be tested under a wide range of permission combinations. 
Currently, in the absence of any automated tool support, a developer needs to either manually determine the interaction of tests and app permissions, or exhaustively re-execute tests for all possible permission combinations, thereby increasing the time and resources required to test apps. This paper presents an automated approach, called PATDroid, for efficiently testing an Android app while taking the impact of permissions on its behavior into account. PATDroid performs a hybrid program analysis on both an app under test and its test suite to determine which tests should be executed on what permission combinations. Our experimental results show that PATDroid significantly reduces the testing effort, yet achieves code coverage and fault detection capability comparable to exhaustively testing an app under all permission combinations. @InProceedings{ESEC/FSE17p220, author = {Alireza Sadeghi and Reyhaneh Jabbarvand and Sam Malek}, title = {PATDroid: Permission-Aware GUI Testing of Android}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {220--232}, doi = {}, year = {2017}, } Info Artifacts Functional |
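To make the selection idea concrete, here is a minimal Python sketch of the test/permission pairing that PATDroid's hybrid analysis enables. The per-test permission sets below are hypothetical placeholders for the analysis output; this illustrates only the combination pruning, not the tool's implementation.

    from itertools import product

    # Hypothetical output of a program analysis: for each test, the app
    # permissions that can influence the code it exercises.
    relevant_permissions = {
        "testTakePhoto":   {"CAMERA", "WRITE_EXTERNAL_STORAGE"},
        "testShowMap":     {"ACCESS_FINE_LOCATION"},
        "testAboutDialog": set(),  # permission-independent test
    }

    ALL_PERMISSIONS = {"CAMERA", "WRITE_EXTERNAL_STORAGE", "ACCESS_FINE_LOCATION"}

    def combinations_to_run(test):
        """Enumerate only grant/revoke combinations of the permissions that
        matter for this test; all other permissions can stay at a default."""
        perms = sorted(relevant_permissions[test])
        for granted in product([True, False], repeat=len(perms)):
            yield dict(zip(perms, granted))

    for test in relevant_permissions:
        combos = list(combinations_to_run(test))
        print(f"{test}: {len(combos)} run(s) instead of {2 ** len(ALL_PERMISSIONS)}")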
|
Mani, Senthil |
ESEC/FSE '17-IND: "Natural Language Querying ..."
Natural Language Querying in SAP-ERP Platform
Diptikalyan Saha, Neelamadhav Gantayat, Senthil Mani, and Barry Mitchell (IBM Research, India; IBM, USA) With the omnipresence of mobile devices coupled with recent advances in automatic speech recognition capabilities, there has been a growing demand for natural language query (NLQ) interface to retrieve information from the knowledge bases. Business users particularly find this useful as NLQ interface enables them to ask questions without the knowledge of the query language or the data schema. In this paper, we apply an existing research technology called ``ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores'' in the industry domain of SAP-ERP systems. The goal is to enable users to query SAP-ERP data using natural language. We present the challenges and their solutions of such a technology transfer. We present the effectiveness of the natural language query interface on a set of questions given by a set of SAP practitioners. @InProceedings{ESEC/FSE17p878, author = {Diptikalyan Saha and Neelamadhav Gantayat and Senthil Mani and Barry Mitchell}, title = {Natural Language Querying in SAP-ERP Platform}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {878--883}, doi = {}, year = {2017}, } |
|
Maoz, Shahar |
ESEC/FSE '17: "A Symbolic Justice Violations ..."
A Symbolic Justice Violations Transition System for Unrealizable GR(1) Specifications
Aviv Kuvent, Shahar Maoz, and Jan Oliver Ringert (Tel Aviv University, Israel) One of the main challenges of reactive synthesis, an automated procedure to obtain a correct-by-construction reactive system, is to deal with unrealizable specifications. Existing approaches to deal with unrealizability, in the context of GR(1), an expressive assume-guarantee fragment of LTL that enables efficient synthesis, include the generation of concrete counter-strategies and the computation of an unrealizable core. Although correct, such approaches produce large and complicated counter-strategies, often containing thousands of states. This hinders their use by engineers. In this work we present the Justice Violations Transition System (JVTS), a novel symbolic representation of counter-strategies for GR(1). The JVTS is much smaller and simpler than its corresponding concrete counter-strategy. Moreover, it is annotated with invariants that explain how the counter-strategy forces the system to violate the specification. We compute the JVTS symbolically, and thus more efficiently, without the expensive enumeration of concrete states. Finally, we provide the JVTS with an on-demand interactive concrete and symbolic play. We implemented our work, validated its correctness, and evaluated it on 14 unrealizable specifications of autonomous Lego robots as well as on benchmarks from the literature. The evaluation shows not only that the JVTS is in most cases much smaller than the corresponding concrete counter-strategy, but also that its computation is faster. @InProceedings{ESEC/FSE17p362, author = {Aviv Kuvent and Shahar Maoz and Jan Oliver Ringert}, title = {A Symbolic Justice Violations Transition System for Unrealizable GR(1) Specifications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {362--372}, doi = {}, year = {2017}, } Info |
|
Marcus, Andrian |
ESEC/FSE '17: "Detecting Missing Information ..."
Detecting Missing Information in Bug Descriptions
Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug descriptions quality by alerting reporters about missing EB and S2R at reporting time. @InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
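As an illustration of how discourse patterns can flag missing content at reporting time, the sketch below checks a description against a few invented regular expressions standing in for the paper's 154 patterns; it is not the authors' detector.

    import re

    # Illustrative stand-ins for discourse patterns; the paper's actual
    # patterns are not reproduced here.
    EB_PATTERNS = [
        r"\bshould\b", r"\bexpected\b", r"\bis supposed to\b", r"\binstead of\b",
    ]
    S2R_PATTERNS = [
        r"\bsteps to reproduce\b", r"^\s*\d+[\.\)]\s", r"\bclick(ing)? on\b",
        r"\bthen\b.*\bthen\b",
    ]

    def has_any(patterns, text):
        return any(re.search(p, text, re.IGNORECASE | re.MULTILINE) for p in patterns)

    def check_report(description):
        """Return the ingredients that appear to be missing from a bug description."""
        missing = []
        if not has_any(EB_PATTERNS, description):
            missing.append("Expected Behavior (EB)")
        if not has_any(S2R_PATTERNS, description):
            missing.append("Steps to Reproduce (S2R)")
        return missing

    report = "The app crashes when I open the settings screen."
    print(check_report(report))  # both EB and S2R look absent -> warn the reporter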
|
Mariani, Leonardo |
ESEC/FSE '17-DEMO: "VART: A Tool for the Automatic ..."
VART: A Tool for the Automatic Detection of Regression Faults
Fabrizio Pastore and Leonardo Mariani (University of Milano-Bicocca, Italy) In this paper we present VART, a tool for automatically revealing regression faults missed by regression test suites. Interestingly, VART is not limited to faults causing crashing or exceptions, but can reveal faults that cause the violation of application-specific correctness properties. VART achieves this goal by combining static and dynamic program analysis. @InProceedings{ESEC/FSE17p964, author = {Fabrizio Pastore and Leonardo Mariani}, title = {VART: A Tool for the Automatic Detection of Regression Faults}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {964--968}, doi = {}, year = {2017}, } Video Info ESEC/FSE '17: "BDCI: Behavioral Driven Conflict ..." BDCI: Behavioral Driven Conflict Identification Fabrizio Pastore, Leonardo Mariani, and Daniela Micucci (University of Milano-Bicocca, Italy) Source Code Management (SCM) systems support software evolution by providing features, such as version control, branching, and conflict detection. Despite the presence of these features, support to parallel software development is often limited. SCM systems can only address a subset of the conflicts that might be introduced by developers when concurrently working on multiple parallel branches. In fact, SCM systems can detect textual conflicts, which are generated by the concurrent modification of the same program locations, but they are unable to detect higher-order conflicts, which are generated by the concurrent modification of different program locations that generate program misbehaviors once merged. Higher-order conflicts are painful to detect and expensive to fix because they might be originated by the interference of apparently unrelated changes. In this paper we present Behavioral Driven Conflict Identification (BDCI), a novel approach to conflict detection. BDCI moves the analysis of conflicts from the source code level to the level of program behavior by generating and comparing behavioral models. The analysis based on behavioral models can reveal interfering changes as soon as they are introduced in the SCM system, even if they do not introduce any textual conflict. To evaluate the effectiveness and the cost of the proposed approach, we developed BDCIf, a specific instance of BDCI dedicated to the detection of higher-order conflicts related to the functional behavior of a program. The evidence collected by analyzing multiple versions of Git and Redis suggests that BDCIf can effectively detect higher-order conflicts and report how changes might interfere. @InProceedings{ESEC/FSE17p570, author = {Fabrizio Pastore and Leonardo Mariani and Daniela Micucci}, title = {BDCI: Behavioral Driven Conflict Identification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {570--581}, doi = {}, year = {2017}, } Info Artifacts Functional |
|
Marin, Victor J. |
ESEC/FSE '17-DEMO: "ARCC: Assistant for Repetitive ..."
ARCC: Assistant for Repetitive Code Comprehension
Wilberto Z. Nunez, Victor J. Marin, and Carlos R. Rivero (Rochester Institute of Technology, USA) As software projects evolve, carefully understanding the behavior of a program is mandatory before making any change. Repetitive code snippets also tend to appear throughout the codebase, and developers have to understand similar semantics multiple times. Building on this observation, we present Arcc: an Assistant for Repetitive Code Comprehension. The tool, implemented as an Eclipse plugin, assists developers in leveraging knowledge of a program to understand other programs containing a subset of the semantics in the former. Arcc differs from existing approaches in that it uses an extensible knowledge base of recurrent semantic code snippets, instead of heuristics or salient features, to summarize the behavior of a program. Given a program, we detect the occurrences of such snippets. Developers can create strategies as combinations of the snippets found and search for strategy occurrences in their workspace. Arcc highlights the source code related to every snippet and their interleaving, assisting in getting an intuition of similar programs. Finally, Arcc underlines potential common errors associated with the snippets, assisting in detecting overlooked problems. https://youtube.com/playlist?list=PLmizZtBESdPHDyKXKHMXj13r2pBCKzIoA @InProceedings{ESEC/FSE17p999, author = {Wilberto Z. Nunez and Victor J. Marin and Carlos R. Rivero}, title = {ARCC: Assistant for Repetitive Code Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {999--1003}, doi = {}, year = {2017}, } Info |
|
Marinov, Darko |
ESEC/FSE '17: "Trade-Offs in Continuous Integration: ..."
Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility
Michael Hilton, Nicholas Nelson, Timothy Tunnell, Darko Marinov, and Danny Dig (Oregon State University, USA; University of Illinois at Urbana-Champaign, USA) Continuous integration (CI) systems automate the compilation, building, and testing of software. Despite CI being a widely used activity in software engineering, we do not know what motivates developers to use CI, and what barriers and unmet needs they face. Without such knowledge, developers make easily avoidable errors, tool builders invest in the wrong direction, and researchers miss opportunities for improving the practice of CI. We present a qualitative study of the barriers and needs developers face when using CI. We conduct semi-structured interviews with developers from different industries and development scales. We triangulate our findings by running two surveys. We find that developers face trade-offs between speed and certainty (Assurance), between better access and information security (Security), and between more configuration options and greater ease of use (Flexibility). We present implications of these trade-offs for developers, tool builders, and researchers. @InProceedings{ESEC/FSE17p197, author = {Michael Hilton and Nicholas Nelson and Timothy Tunnell and Darko Marinov and Danny Dig}, title = {Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {197--207}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
Martie, Lee |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Support for Iteration on Code Search
Lee Martie, André van der Hoek, and Thomas Kwak (University of California at Irvine, USA) Sometimes, when programmers use a search engine they know more or less what they need. Other times, programmers use the search engine to look around and generate possible ideas for the programming problem they are working on. The key insight we explore in this paper is that the results found in the latter case tend to serve as inspiration or triggers for the next queries issued. We introduce two search engines, CodeExchange and CodeLikeThis, both of which are specifically designed to enable the user to directly leverage the results in formulating the next query. CodeExchange does this with a set of four features supporting the programmer to use characteristics of the results to find other code with or without those characteristics. CodeLikeThis supports simply selecting an entire result to find code that is analogous, to some degree, to that result. We evaluated how these approaches were used along with two approaches not explicitly supporting iteration, a baseline and Google, in a user study among 24 developers. We find that search engines that support using results to form the next query can improve the programmers’ search experience and different approaches to iteration can provide better experiences depending on the task. @InProceedings{ESEC/FSE17p774, author = {Lee Martie and André van der Hoek and Thomas Kwak}, title = {Understanding the Impact of Support for Iteration on Code Search}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {774--785}, doi = {}, year = {2017}, } |
|
Martínez-Fernández, Silverio |
ESEC/FSE '17-IND: "Reference Architectures and ..."
Reference Architectures and Scrum: Friends or Foes?
Matthias Galster, Samuil Angelov, Silverio Martínez-Fernández, and Dan Tofan (University of Canterbury, New Zealand; Fontys University of Applied Sciences, Netherlands; Fraunhofer IESE, Germany) Software reference architectures provide templates and guidelines for designing systems in a particular domain. Companies use them to achieve interoperability of (parts of) their software, standardization, and faster development. In contrast to system-specific software architectures that "emerge" during development, reference architectures dictate significant parts of the software design early on. Agile software development frameworks (such as Scrum) acknowledge changing software requirements and the need to adapt the software design accordingly. In this paper, we present lessons learned about how reference architectures interact with Scrum (the most frequently used agile process framework). These lessons are based on observing software development projects in five companies. We found that reference architectures can support good practice in Scrum: They provide enough design upfront without too much effort, reduce documentation activities, facilitate knowledge sharing, and contribute to "architectural thinking" of developers. However, reference architectures can impose risks or even threats to the success of Scrum (e.g., to self-organizing and motivated teams). @InProceedings{ESEC/FSE17p896, author = {Matthias Galster and Samuil Angelov and Silverio Martínez-Fernández and Dan Tofan}, title = {Reference Architectures and Scrum: Friends or Foes?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {896--901}, doi = {}, year = {2017}, } |
|
Matinnejad, Reza |
ESEC/FSE '17-IND: "Automated Testing of Hybrid ..."
Automated Testing of Hybrid Simulink/Stateflow Controllers: Industrial Case Studies
Reza Matinnejad, Shiva Nejati, and Lionel C. Briand (University of Luxembourg, Luxembourg) We present the results of applying our approach for testing Simulink controllers to one public and one proprietary model, both industrial. Our approach combines explorative and exploitative search algorithms to visualize the controller behavior over its input space and to identify test scenarios in the controller input space that violate or are likely to violate the controller requirements. The engineers' feedback shows that our approach is easy to use in practice and gives them confidence about the behavior of their models. @InProceedings{ESEC/FSE17p938, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand}, title = {Automated Testing of Hybrid Simulink/Stateflow Controllers: Industrial Case Studies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {938--943}, doi = {}, year = {2017}, } |
|
McDaniel, Patrick |
ESEC/FSE '17: "Cimplifier: Automatically ..."
Cimplifier: Automatically Debloating Containers
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel (University of Wisconsin-Madison, USA; Tala Security, USA; Colorado State University, USA; Pennsylvania State University, USA) Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packed together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of significant amount of unnecessary components, within a single container. These practices needlessly increase the container size—sometimes by orders of magnitude. They also decrease the overall security, as each included component—necessary or not—may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to reduction in image size of up to 95%, and processes even large containers in under thirty seconds. @InProceedings{ESEC/FSE17p476, author = {Vaibhav Rastogi and Drew Davidson and Lorenzo De Carli and Somesh Jha and Patrick McDaniel}, title = {Cimplifier: Automatically Debloating Containers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {476--486}, doi = {}, year = {2017}, } |
|
Meinicke, Jens |
ESEC/FSE '17: "Is There a Mismatch between ..."
Is There a Mismatch between Real-World Feature Models and Product-Line Research?
Alexander Knüppel, Thomas Thüm, Stephan Mennicke, Jens Meinicke, and Ina Schaefer (TU Braunschweig, Germany; University of Magdeburg, Germany) Feature modeling has emerged as the de-facto standard to compactly capture the variability of a software product line. Multiple feature modeling languages have been proposed that evolved over the last decades to manage industrial-size product lines. However, less expressive languages, solely permitting require and exclude constraints, are permanently and carelessly used in product-line research. We address the problem whether those less expressive languages are sufficient for industrial product lines. We developed an algorithm to eliminate complex cross-tree constraints in a feature model, enabling the combination of tools and algorithms working with different feature model dialects in a plug-and-play manner. However, the scope of our algorithm is limited. Our evaluation on large feature models, including the Linux kernel, gives evidence that require and exclude constraints are not sufficient to express real-world feature models. Hence, we promote that research on feature models needs to consider arbitrary propositional formulas as cross-tree constraints prospectively. @InProceedings{ESEC/FSE17p291, author = {Alexander Knüppel and Thomas Thüm and Stephan Mennicke and Jens Meinicke and Ina Schaefer}, title = {Is There a Mismatch between Real-World Feature Models and Product-Line Research?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {291--302}, doi = {}, year = {2017}, } Info Artifacts Reusable |
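The expressiveness gap the paper reports can be demonstrated directly: a brute-force check (sketched below, with hypothetical features A and B) shows that a simple or-constraint has no equivalent conjunction of require/exclude constraints. This is only an illustration of the argument, not the paper's elimination algorithm.

    from itertools import combinations, product

    def requires(x, y):
        return lambda v: (not v[x]) or v[y]

    def excludes(x, y):
        return lambda v: not (v[x] and v[y])

    # All simple cross-tree constraints over two features A and B.
    basic = [requires("A", "B"), requires("B", "A"), excludes("A", "B")]

    def truth_table(formula):
        return tuple(formula({"A": a, "B": b}) for a, b in product([False, True], repeat=2))

    # Target cross-tree constraint "A or B" (at least one of two optional features).
    target = truth_table(lambda v: v["A"] or v["B"])

    expressible = any(
        truth_table(lambda v, c=combo: all(f(v) for f in c)) == target
        for r in range(len(basic) + 1)
        for combo in combinations(basic, r)
    )
    print("'A or B' expressible with requires/excludes only:", expressible)  # False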
|
Meliou, Alexandra |
ESEC/FSE '17: "Fairness Testing: Testing ..."
Fairness Testing: Testing Software for Discrimination
Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou (University of Massachusetts at Amherst, USA) This paper defines software fairness and discrimination and develops a testing-based method for measuring if and how much software discriminates, focusing on causality in discriminatory behavior. Evidence of software discrimination has been found in modern software systems that recommend criminal sentences, grant access to financial products, and determine who is allowed to participate in promotions. Our approach, Themis, generates efficient test suites to measure discrimination. Given a schema describing valid system inputs, Themis generates discrimination tests automatically and does not require an oracle. We evaluate Themis on 20 software systems, 12 of which come from prior work with explicit focus on avoiding discrimination. We find that (1) Themis is effective at discovering software discrimination, (2) state-of-the-art techniques for removing discrimination from algorithms fail in many situations, at times discriminating against as much as 98% of an input subdomain, (3) Themis optimizations are effective at producing efficient test suites for measuring discrimination, and (4) Themis is more efficient on systems that exhibit more discrimination. We thus demonstrate that fairness testing is a critical aspect of the software development cycle in domains with possible discrimination and provide initial tools for measuring software discrimination. @InProceedings{ESEC/FSE17p498, author = {Sainyam Galhotra and Yuriy Brun and Alexandra Meliou}, title = {Fairness Testing: Testing Software for Discrimination}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {498--510}, doi = {}, year = {2017}, } Info Best-Paper Award |
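The causal flavor of Themis's tests can be sketched as follows: sample inputs from the schema, flip only the protected attribute, and count how often the decision changes. The decide function and schema below are deliberately biased toy stand-ins, not the systems studied in the paper.

    import random

    # Hypothetical decision procedure under test; in practice this would wrap
    # the real software system (e.g., a loan-approval service).
    def decide(applicant):
        return applicant["income"] > 40000 and applicant["gender"] != "female"  # deliberately biased

    SCHEMA = {
        "income": [20000, 40000, 60000, 80000],
        "gender": ["male", "female", "nonbinary"],
    }
    PROTECTED = "gender"

    def causal_discrimination_rate(samples=1000, seed=0):
        """Fraction of sampled inputs whose decision flips when only the
        protected attribute is changed (no oracle needed)."""
        rng = random.Random(seed)
        flips = 0
        for _ in range(samples):
            base = {k: rng.choice(vs) for k, vs in SCHEMA.items()}
            outcome = decide(base)
            for alt in SCHEMA[PROTECTED]:
                if alt != base[PROTECTED] and decide({**base, PROTECTED: alt}) != outcome:
                    flips += 1
                    break
        return flips / samples

    print(f"causal discrimination: {causal_discrimination_rate():.1%}")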
|
Meng, Guozhu |
ESEC/FSE '17: "Guided, Stochastic Model-Based ..."
Guided, Stochastic Model-Based GUI Testing of Android Apps
Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su (East China Normal University, China; Nanyang Technological University, Singapore; Shanghai Jiao Tong University, China; University of California at Davis, USA) Mobile apps are ubiquitous, operate in complex environments and are developed under the time-to-market pressure. Ensuring their correctness and reliability thus becomes an important challenge. This paper introduces Stoat, a novel guided approach to perform stochastic model-based testing on Android apps. Stoat operates in two phases: (1) Given an app as input, it uses dynamic analysis enhanced by a weighted UI exploration strategy and static analysis to reverse engineer a stochastic model of the app's GUI interactions; and (2) it adapts Gibbs sampling to iteratively mutate/refine the stochastic model and guides test generation from the mutated models toward achieving high code and model coverage and exhibiting diverse sequences. During testing, system-level events are randomly injected to further enhance the testing effectiveness. Stoat was evaluated on 93 open-source apps. The results show (1) the models produced by Stoat cover 17~31% more code than those by existing modeling tools; (2) Stoat detects 3X more unique crashes than two state-of-the-art testing tools, Monkey and Sapienz. Furthermore, Stoat tested 1661 most popular Google Play apps, and detected 2110 previously unknown and unique crashes. So far, 43 developers have responded that they are investigating our reports. 20 of reported crashes have been confirmed, and 8 already fixed. @InProceedings{ESEC/FSE17p245, author = {Ting Su and Guozhu Meng and Yuting Chen and Ke Wu and Weiming Yang and Yao Yao and Geguang Pu and Yang Liu and Zhendong Su}, title = {Guided, Stochastic Model-Based GUI Testing of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {245--256}, doi = {}, year = {2017}, } |
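A minimal sketch of the model-based generation step: sample event sequences by a weighted walk over a stochastic GUI model. The model and weights below are toy values; Stoat's Gibbs-sampling refinement, coverage feedback, and on-device execution are omitted.

    import random

    # Toy stochastic GUI model: state -> [(event, next_state, probability)].
    MODEL = {
        "Main":    [("tap_search", "Search", 0.5), ("open_menu", "Menu", 0.3),
                    ("rotate", "Main", 0.2)],
        "Search":  [("type_query", "Results", 0.7), ("back", "Main", 0.3)],
        "Results": [("tap_item", "Detail", 0.6), ("back", "Search", 0.4)],
        "Detail":  [("back", "Results", 1.0)],
        "Menu":    [("back", "Main", 1.0)],
    }

    def generate_test(length=8, start="Main", rng=random.Random(42)):
        """Sample one event sequence by walking the model according to its weights."""
        state, events = start, []
        for _ in range(length):
            choices = MODEL[state]
            event, state, _ = rng.choices(choices, weights=[p for *_, p in choices])[0]
            events.append(event)
        return events

    for i in range(3):
        print(f"test {i}: {generate_test()}")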
|
Mennicke, Stephan |
ESEC/FSE '17: "Is There a Mismatch between ..."
Is There a Mismatch between Real-World Feature Models and Product-Line Research?
Alexander Knüppel, Thomas Thüm, Stephan Mennicke, Jens Meinicke, and Ina Schaefer (TU Braunschweig, Germany; University of Magdeburg, Germany) Feature modeling has emerged as the de-facto standard to compactly capture the variability of a software product line. Multiple feature modeling languages have been proposed that evolved over the last decades to manage industrial-size product lines. However, less expressive languages, solely permitting require and exclude constraints, are permanently and carelessly used in product-line research. We address the problem whether those less expressive languages are sufficient for industrial product lines. We developed an algorithm to eliminate complex cross-tree constraints in a feature model, enabling the combination of tools and algorithms working with different feature model dialects in a plug-and-play manner. However, the scope of our algorithm is limited. Our evaluation on large feature models, including the Linux kernel, gives evidence that require and exclude constraints are not sufficient to express real-world feature models. Hence, we promote that research on feature models needs to consider arbitrary propositional formulas as cross-tree constraints prospectively. @InProceedings{ESEC/FSE17p291, author = {Alexander Knüppel and Thomas Thüm and Stephan Mennicke and Jens Meinicke and Ina Schaefer}, title = {Is There a Mismatch between Real-World Feature Models and Product-Line Research?}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {291--302}, doi = {}, year = {2017}, } Info Artifacts Reusable |
|
Menzies, Tim |
ESEC/FSE '17: "Revisiting Unsupervised Learning ..."
Revisiting Unsupervised Learning for Defect Prediction
Wei Fu and Tim Menzies (North Carolina State University, USA) Collecting quality data from software projects can be time-consuming and expensive. Hence, some researchers explore “unsupervised” approaches to quality prediction that do not require labelled data. An alternate technique is to use “supervised” approaches that learn models from project data labelled with, say, “defective” or “not-defective”. Most researchers use these supervised models since, it is argued, they can exploit more knowledge of the projects. At FSE’16, Yang et al. reported startling results where unsupervised defect predictors outperformed supervised predictors for effort-aware just-in-time defect prediction. If confirmed, these results would lead to a dramatic simplification of a seemingly complex task (data mining) that is widely explored in the software engineering literature. This paper repeats and refutes those results as follows. (1) There is much variability in the efficacy of the Yang et al. predictors so even with their approach, some supervised data is required to prune weaker predictors away. (2) Their findings were grouped across N projects. When we repeat their analysis on a project-by-project basis, supervised predictors are seen to work better. Even though this paper rejects the specific conclusions of Yang et al., we still endorse their general goal. In our experiments, supervised predictors did not perform outstandingly better than unsupervised ones for effort-aware just-in-time defect prediction. Hence, there may indeed be some combination of unsupervised learners that achieves comparable performance to supervised ones. We therefore encourage others to work in this promising area. @InProceedings{ESEC/FSE17p72, author = {Wei Fu and Tim Menzies}, title = {Revisiting Unsupervised Learning for Defect Prediction}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {72--83}, doi = {}, year = {2017}, } ESEC/FSE '17: "Using Bad Learners to Find ..." Using Bad Learners to Find Good Configurations Vivek Nair, Tim Menzies, Norbert Siegmund, and Sven Apel (North Carolina State University, USA; Bauhaus-University Weimar, Germany; University of Passau, Germany) Finding the optimally performing configuration of a software system for a given setting is often challenging. Recent approaches address this challenge by learning performance models based on a sample set of configurations. However, building an accurate performance model can be very expensive (and is often infeasible in practice). The central insight of this paper is that exact performance values (e.g., the response time of a software system) are not required to rank configurations and to identify the optimal one. As shown by our experiments, performance models that are cheap to learn but inaccurate (with respect to the difference between actual and predicted performance) can still be used to rank configurations and hence find the optimal configuration. This novel rank-based approach allows us to significantly reduce the cost (in terms of the number of measurements of sample configurations) as well as the time required to build performance models. We evaluate our approach with 21 scenarios based on 9 software systems and demonstrate that our approach is beneficial in 16 scenarios; for the remaining 5 scenarios, an accurate model can be built by using very few samples anyway, without the need for a rank-based approach. 
@InProceedings{ESEC/FSE17p257, author = {Vivek Nair and Tim Menzies and Norbert Siegmund and Sven Apel}, title = {Using Bad Learners to Find Good Configurations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2017}, } ESEC/FSE '17: "Easy over Hard: A Case Study ..." Easy over Hard: A Case Study on Deep Learning Wei Fu and Tim Menzies (North Carolina State University, USA) While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (to weeks) to train the model. Such long training time limits the ability of (a) a researcher to test the stability of their conclusion via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work. For example, recently, deep learning was used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that applying a very simple optimizer called DE to fine-tune SVM can achieve similar (and sometimes better) results. The DE approach terminated in 10 minutes, i.e., 84 times faster than the deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives. @InProceedings{ESEC/FSE17p49, author = {Wei Fu and Tim Menzies}, title = {Easy over Hard: A Case Study on Deep Learning}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {49--60}, doi = {}, year = {2017}, } |
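The "simpler and faster alternative" advocated above can be approximated in a few lines: differential evolution tuning an SVM's hyperparameters via cross-validation. The dataset and parameter ranges below are placeholders, not the Stack Overflow task used in the paper.

    from scipy.optimize import differential_evolution
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_breast_cancer(return_X_y=True)  # stand-in for an SE analytics dataset

    def negated_cv_f1(params):
        c, gamma = params
        clf = SVC(C=10 ** c, gamma=10 ** gamma)
        # Minimize the negated mean F1 so DE maximizes predictive performance.
        return -cross_val_score(clf, X, y, cv=3, scoring="f1").mean()

    # DE explores log10(C) in [-2, 3] and log10(gamma) in [-5, 1].
    result = differential_evolution(negated_cv_f1, bounds=[(-2, 3), (-5, 1)],
                                    maxiter=10, popsize=10, seed=1)
    print("best log10(C), log10(gamma):", result.x, "F1:", -result.fun)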
|
Mezini, Mira |
ESEC/FSE '17: "CodeMatch: Obfuscation Won't ..."
CodeMatch: Obfuscation Won't Conceal Your Repackaged App
Leonid Glanz, Sven Amann, Michael Eichberg, Michael Reif, Ben Hermann, Johannes Lerch, and Mira Mezini (TU Darmstadt, Germany) An established way to steal the income of app developers, or to trick users into installing malware, is the creation of repackaged apps. These are clones of – typically – successful apps. To conceal their nature, they are often obfuscated by their creators. But, given that it is a common best practice to obfuscate apps, a trivial identification of repackaged apps is not possible. The problem is further intensified by the prevalent usage of libraries. In many apps, the size of the overall code base is basically determined by the used libraries. Therefore, two apps, where the obfuscated code bases are very similar, do not have to be repackages of each other. To reliably detect repackaged apps, we propose a two step approach which first focuses on the identification and removal of the library code in obfuscated apps. This approach – LibDetect – relies on code representations which abstract over several parts of the underlying bytecode to be resilient against certain obfuscation techniques. Using this approach, we are able to identify on average 70% more used libraries per app than previous approaches. After the removal of an app’s library code, we then fuzzy hash the most abstract representation of the remaining app code to ensure that we can identify repackaged apps even if very advanced obfuscation techniques are used. This makes it possible to identify repackaged apps. Using our approach, we found that ≈ 15% of all apps in Android app stores are repackages @InProceedings{ESEC/FSE17p638, author = {Leonid Glanz and Sven Amann and Michael Eichberg and Michael Reif and Ben Hermann and Johannes Lerch and Mira Mezini}, title = {CodeMatch: Obfuscation Won't Conceal Your Repackaged App}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {638--648}, doi = {}, year = {2017}, } Info |
|
Micucci, Daniela |
ESEC/FSE '17: "BDCI: Behavioral Driven Conflict ..."
BDCI: Behavioral Driven Conflict Identification
Fabrizio Pastore, Leonardo Mariani, and Daniela Micucci (University of Milano-Bicocca, Italy) Source Code Management (SCM) systems support software evolution by providing features, such as version control, branching, and conflict detection. Despite the presence of these features, support to parallel software development is often limited. SCM systems can only address a subset of the conflicts that might be introduced by developers when concurrently working on multiple parallel branches. In fact, SCM systems can detect textual conflicts, which are generated by the concurrent modification of the same program locations, but they are unable to detect higher-order conflicts, which are generated by the concurrent modification of different program locations that generate program misbehaviors once merged. Higher-order conflicts are painful to detect and expensive to fix because they might be originated by the interference of apparently unrelated changes. In this paper we present Behavioral Driven Conflict Identification (BDCI), a novel approach to conflict detection. BDCI moves the analysis of conflicts from the source code level to the level of program behavior by generating and comparing behavioral models. The analysis based on behavioral models can reveal interfering changes as soon as they are introduced in the SCM system, even if they do not introduce any textual conflict. To evaluate the effectiveness and the cost of the proposed approach, we developed BDCIf, a specific instance of BDCI dedicated to the detection of higher-order conflicts related to the functional behavior of a program. The evidence collected by analyzing multiple versions of Git and Redis suggests that BDCIf can effectively detect higher-order conflicts and report how changes might interfere. @InProceedings{ESEC/FSE17p570, author = {Fabrizio Pastore and Leonardo Mariani and Daniela Micucci}, title = {BDCI: Behavioral Driven Conflict Identification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {570--581}, doi = {}, year = {2017}, } Info Artifacts Functional |
|
Milicevic, Aleksandar |
ESEC/FSE '17-IND: "File-Level vs. Module-Level ..."
File-Level vs. Module-Level Regression Test Selection for .NET
Marko Vasic, Zuhair Parvez, Aleksandar Milicevic, and Milos Gligoric (University of Texas at Austin, USA; Microsoft, USA) Regression testing is used to check the correctness of evolving software. With the adoption of Agile development methodology, the number of tests and software revisions has dramatically increased, and hence so has the cost of regression testing. Researchers have proposed regression test selection (RTS) techniques that optimize regression testing by skipping tests that are not impacted by recent program changes. Ekstazi is one such state-of-the-art technique; Ekstazi is implemented for the Java programming language and has been adopted by several companies and open-source projects. We report on our experience implementing and evaluating Ekstazi#, an Ekstazi-like tool for .NET. We describe the key challenges of bringing the Ekstazi idea to the .NET platform. We evaluate Ekstazi# on 11 open-source projects, as well as an internal Microsoft project substantially larger than each of the open-source projects. Finally, we compare Ekstazi# to an incremental build system (also developed at Microsoft), which, out of the box, provides module-level dependency tracking and skipping tasks (including test execution) whenever dependencies of a task do not change between the current and the last successful build. Ekstazi# on average reduced regression testing time by 43.70% for the open-source projects and by 65.26% for the Microsoft project (the latter is in addition to the savings provided by incremental builds). @InProceedings{ESEC/FSE17p848, author = {Marko Vasic and Zuhair Parvez and Aleksandar Milicevic and Milos Gligoric}, title = {File-Level vs. Module-Level Regression Test Selection for .NET}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {848--853}, doi = {}, year = {2017}, } ESEC/FSE '17: "Regression Test Selection ..." Regression Test Selection Across JVM Boundaries Ahmet Celik, Marko Vasic, Aleksandar Milicevic, and Milos Gligoric (University of Texas at Austin, USA; Microsoft, USA) Modern software development processes recommend that changes be integrated into the main development line of a project multiple times a day. Before a new revision may be integrated, developers practice regression testing to ensure that the latest changes do not break any previously established functionality. The cost of regression testing is high, due to an increase in the number of revisions that are introduced per day, as well as the number of tests developers write per revision. Regression test selection (RTS) optimizes regression testing by skipping tests that are not affected by recent project changes. Existing dynamic RTS techniques support only projects written in a single programming language, which is unfortunate given that an open-source project is on average written in several programming languages. We present the first dynamic RTS technique that does not stop at predefined language boundaries. Our technique dynamically detects, at the operating system level, all file artifacts a test depends on. Our technique is, hence, oblivious to the specific means the test uses to actually access the files: be it through spawning a new process, invoking a system call, invoking a library written in a different language, invoking a library that spawns a process which makes a system call, etc. We also provide a set of extension points which allow for a smooth integration with testing frameworks and build systems. 
We implemented our technique in a tool called RTSLinux as a loadable Linux kernel module and evaluated it on 21 Java projects that escape JVM by spawning new processes or invoking native code, totaling 2,050,791 lines of code. Our results show that RTSLinux, on average, skips 74.17% of tests and saves 52.83% of test execution time compared to executing all tests. @InProceedings{ESEC/FSE17p809, author = {Ahmet Celik and Marko Vasic and Aleksandar Milicevic and Milos Gligoric}, title = {Regression Test Selection Across JVM Boundaries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {809--820}, doi = {}, year = {2017}, } |
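The file-level RTS idea shared by Ekstazi, Ekstazi#, and RTSLinux can be sketched language-agnostically: remember checksums of the files each test depended on, and rerun a test only when one of them changes. The per-test dependency map below is assumed to come from a previous instrumented run; this is not the tools' implementation.

    import hashlib
    import json
    import pathlib

    def checksum(path):
        return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

    def select_tests(dependencies, snapshot_file="rts_snapshot.json"):
        """Rerun a test iff any file it depended on last time changed (or it is new)."""
        try:
            old = json.loads(pathlib.Path(snapshot_file).read_text())
        except FileNotFoundError:
            old = {}
        selected, new_snapshot = [], {}
        for test, files in dependencies.items():
            current = {f: checksum(f) for f in files}
            new_snapshot[test] = current
            if old.get(test) != current:
                selected.append(test)
        pathlib.Path(snapshot_file).write_text(json.dumps(new_snapshot, indent=2))
        return selected

    # Create two tiny files so the example is self-contained.
    pathlib.Path("src").mkdir(exist_ok=True)
    pathlib.Path("src/parser.py").write_text("def parse(s): return s.split()\n")
    pathlib.Path("config.ini").write_text("[db]\nhost=localhost\n")

    # Hypothetical per-test dependencies collected during an instrumented run.
    deps = {"test_parser": ["src/parser.py"], "test_config": ["config.ini"]}
    print("tests to rerun:", select_tests(deps))  # empty on a second, unchanged run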
|
Mills, Chris |
ESEC/FSE '17-SRC: "Automating Traceability Link ..."
Automating Traceability Link Recovery through Classification
Chris Mills (Florida State University, USA) Traceability Link Recovery (TLR) is an important software engineering task in which a stakeholder establishes links between related items in two sets of software artifacts. Most existing approaches leverage Information Retrieval (IR) techniques, and formulate the TLR task as a retrieval problem, where pairs of similar artifacts are retrieved and presented to a user. These approaches still require significant human effort, as a stakeholder needs to manually inspect the list of recommendations and decide which ones are true links and which ones are false. In this work, we aim to automate TLR by re-imagining it as a binary classification problem. More specifically, our machine learning classification approach is able to automatically classify each link in the set of all potential links as either valid or invalid, therefore circumventing the substantial human effort required by existing techniques. @InProceedings{ESEC/FSE17p1068, author = {Chris Mills}, title = {Automating Traceability Link Recovery through Classification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1068--1070}, doi = {}, year = {2017}, } |
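Recasting TLR as binary classification can be sketched with off-the-shelf components: compute a couple of similarity features per requirement/code pair and train a classifier on labeled pairs. The corpus and features below are toy placeholders, not the author's actual feature set.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics.pairwise import cosine_similarity

    # Toy corpus: (requirement text, source-artifact text, is_valid_link)
    pairs = [
        ("user can reset password via email", "class PasswordResetService sends reset email", 1),
        ("user can reset password via email", "class ReportRenderer draws pdf charts", 0),
        ("system exports monthly report as pdf", "class ReportRenderer draws pdf charts", 1),
        ("system exports monthly report as pdf", "class LoginController validates credentials", 0),
        ("user logs in with username and password", "class LoginController validates credentials", 1),
        ("user logs in with username and password", "class ReportRenderer draws pdf charts", 0),
    ]

    vec = TfidfVectorizer().fit([t for p in pairs for t in p[:2]])

    def features(req, src):
        r, s = vec.transform([req]), vec.transform([src])
        # One IR-style feature (cosine similarity) plus simple lexical overlap.
        overlap = len(set(req.split()) & set(src.lower().split()))
        return [cosine_similarity(r, s)[0, 0], overlap]

    X = np.array([features(r, s) for r, s, _ in pairs])
    y = np.array([label for *_, label in pairs])
    clf = LogisticRegression().fit(X, y)

    candidate = ("user can reset password via email",
                 "class PasswordResetService sends reset email")
    print("valid link?", bool(clf.predict([features(*candidate)])[0]))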
|
Mitchell, Barry |
ESEC/FSE '17-IND: "Natural Language Querying ..."
Natural Language Querying in SAP-ERP Platform
Diptikalyan Saha, Neelamadhav Gantayat, Senthil Mani, and Barry Mitchell (IBM Research, India; IBM, USA) With the omnipresence of mobile devices coupled with recent advances in automatic speech recognition capabilities, there has been a growing demand for natural language query (NLQ) interface to retrieve information from the knowledge bases. Business users particularly find this useful as NLQ interface enables them to ask questions without the knowledge of the query language or the data schema. In this paper, we apply an existing research technology called ``ATHENA: An Ontology-Driven System for Natural Language Querying over Relational Data Stores'' in the industry domain of SAP-ERP systems. The goal is to enable users to query SAP-ERP data using natural language. We present the challenges and their solutions of such a technology transfer. We present the effectiveness of the natural language query interface on a set of questions given by a set of SAP practitioners. @InProceedings{ESEC/FSE17p878, author = {Diptikalyan Saha and Neelamadhav Gantayat and Senthil Mani and Barry Mitchell}, title = {Natural Language Querying in SAP-ERP Platform}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {878--883}, doi = {}, year = {2017}, } |
|
Mockus, Audris |
ESEC/FSE '17: "On the Scalability of Linux ..."
On the Scalability of Linux Kernel Maintainers' Work
Minghui Zhou, Qingying Chen, Audris Mockus, and Fengguang Wu (Peking University, China; University of Tennessee, USA; Intel, China) Open source software ecosystems evolve ways to balance the workload among groups of participants ranging from core groups to peripheral groups. As ecosystems grow, it is not clear whether the mechanisms that previously made them work will continue to be relevant or whether new mechanisms will need to evolve. The impact of failure for critical ecosystems such as Linux is enormous, yet the understanding of why they function and are effective is limited. We, therefore, aim to understand how the Linux kernel sustains its growth, how to characterize the workload of maintainers, and whether or not the existing mechanisms are scalable. We quantify maintainers’ work through the files that are maintained, and the change activity and the numbers of contributors in those files. We find systematic differences among modules; these differences are stable over time, which suggests that certain architectural features, commercial interests, or module-specific practices lead to distinct sustainable equilibria. We find that most of the modules have not grown appreciably over the last decade; most growth has been absorbed by a few modules. We also find that the effort per maintainer does not increase, even though the community has hypothesized that required effort might increase. However, the distribution of work among maintainers is highly unbalanced, suggesting that a few maintainers may experience increasing workload. We find that the practice of assigning multiple maintainers to a file yields only a power of 1/2 increase in productivity. We expect that our proposed framework to quantify maintainer practices will help clarify the factors that allow rapidly growing ecosystems to be sustainable. @InProceedings{ESEC/FSE17p27, author = {Minghui Zhou and Qingying Chen and Audris Mockus and Fengguang Wu}, title = {On the Scalability of Linux Kernel Maintainers' Work}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {27--37}, doi = {}, year = {2017}, } Info |
|
Mongiovi, Melina |
ESEC/FSE '17: "Understanding the Impact of ..."
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa, Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez (PUC-Rio, Brazil; Federal University of Campina Grande, Brazil; Federal University of Alagoas, Brazil) Code smells in a program represent indications of structural quality problems, which can be addressed by software refactoring. However, refactoring intends to achieve different goals in practice, and its application may not reduce smelly structures. Developers may neglect or end up creating new code smells through refactoring. Unfortunately, little has been reported about the beneficial and harmful effects of refactoring on code smells. This paper reports a longitudinal study intended to address this gap. We analyze how often commonly-used refactoring types affect the density of 13 types of code smells along the version histories of 23 projects. Our findings are based on the analysis of 16,566 refactorings distributed in 10 different types. Even though 79.4% of the refactorings touched smelly elements, 57% did not reduce their occurrences. Surprisingly, only 9.7% of refactorings removed smells, while 33.3% induced the introduction of new ones. More than 95% of such refactoring-induced smells were not removed in successive commits, which suggest refactorings tend to more frequently introduce long-living smells instead of eliminating existing ones. We also characterized and quantified typical refactoring-smell patterns, and observed that harmful patterns are frequent, including: (i) approximately 30% of the Move Method and Pull Up Method refactorings induced the emergence of God Class, and (ii) the Extract Superclass refactoring creates the smell Speculative Generality in 68% of the cases. @InProceedings{ESEC/FSE17p465, author = {Diego Cedrim and Alessandro Garcia and Melina Mongiovi and Rohit Gheyi and Leonardo Sousa and Rafael de Mello and Baldoino Fonseca and Márcio Ribeiro and Alexander Chávez}, title = {Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23 Software Projects}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {465--475}, doi = {}, year = {2017}, } Info |
|
Moran, Kevin |
ESEC/FSE '17: "Enabling Mutation Testing ..."
Enabling Mutation Testing for Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (Universidad de los Andes, Colombia; University of Lugano, Switzerland; College of William and Mary, USA; University of Sannio, Italy) Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identify a set of 38 mutation operators and implement an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated as compared to well-known mutation tools, and their capacity to represent real faults in Android apps. @InProceedings{ESEC/FSE17p233, author = {Mario Linares-Vásquez and Gabriele Bavota and Michele Tufano and Kevin Moran and Massimiliano Di Penta and Christopher Vendome and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Enabling Mutation Testing for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {233--244}, doi = {}, year = {2017}, } Info |
|
Moreno, Laura |
ESEC/FSE '17: "Detecting Missing Information ..."
Detecting Missing Information in Bug Descriptions
Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug descriptions quality by alerting reporters about missing EB and S2R at reporting time. @InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
|
Mujahid, Suhaib |
ESEC/FSE '17: "Why Do Developers Use Trivial ..."
Why Do Developers Use Trivial Packages? An Empirical Case Study on npm
Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad Shihab (Concordia University, Canada) Code reuse is traditionally seen as good practice. Recent trends have pushed the concept of code reuse to an extreme, by using packages that implement simple and trivial tasks, which we call `trivial packages'. A recent incident where a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix made it imperative to question the growing use of trivial packages. Therefore, in this paper, we mine more than 230,000 npm packages and 38,000 JavaScript applications in order to study the prevalence of trivial packages. We found that trivial packages are common and are increasing in popularity, making up 16.8% of the studied npm packages. We performed a survey with 88 Node.js developers who use trivial packages to understand the reasons and drawbacks of their use. Our survey revealed that trivial packages are used because they are perceived to be well implemented and tested pieces of code. However, developers are concerned about maintaining and the risks of breakages due to the extra dependencies trivial packages introduce. To objectively verify the survey results, we empirically validate the most cited reason and drawback and find that, contrary to developers' beliefs, only 45.2% of trivial packages even have tests. However, trivial packages appear to be `deployment tested' and to have similar test, usage and community interest as non-trivial packages. On the other hand, we found that 11.5% of the studied trivial packages have more than 20 dependencies. Hence, developers should be careful about which trivial packages they decide to use. @InProceedings{ESEC/FSE17p385, author = {Rabe Abdalkareem and Olivier Nourry and Sultan Wehaibi and Suhaib Mujahid and Emad Shihab}, title = {Why Do Developers Use Trivial Packages? An Empirical Case Study on npm}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {385--395}, doi = {}, year = {2017}, } ESEC/FSE '17-SRC: "Detecting Wearable App Permission ..." Detecting Wearable App Permission Mismatches: A Case Study on Android Wear Suhaib Mujahid (Concordia University, Canada) Wearable devices are becoming increasingly popular. These wearable devices run what is known as wearable apps. Wearable apps are packaged with handheld apps, that must be installed on the accompanying handheld device (e.g., phone). Given that wearable apps are tightly coupled with the handheld apps, any wearable permission must also be requested in the handheld version of the app on the Android Wear platform. However, in some cases, the wearable apps may request permissions that do not exist in the handheld app, resulting in a permission mismatch, and causing the wearable app to error or crash. In this paper, we propose a technique to detect wear app permission mismatches. We perform a case study on 2,409 free Android Wear apps and find that 73 released wearable apps suffer from the permission mismatch problem. @InProceedings{ESEC/FSE17p1065, author = {Suhaib Mujahid}, title = {Detecting Wearable App Permission Mismatches: A Case Study on Android Wear}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1065--1067}, doi = {}, year = {2017}, } |
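The mismatch check described in the second abstract reduces to a set difference over the permissions declared in the wearable and handheld manifests. The sketch below uses toy manifest snippets in place of real unpacked APKs and is not the paper's detection tool.

    import xml.etree.ElementTree as ET

    ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

    def requested_permissions(manifest_xml):
        root = ET.fromstring(manifest_xml)
        return {e.get(f"{ANDROID_NS}name") for e in root.iter("uses-permission")}

    # Toy manifests standing in for the wearable and handheld AndroidManifest.xml files.
    WEAR = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
      <uses-permission android:name="android.permission.BODY_SENSORS"/>
      <uses-permission android:name="android.permission.INTERNET"/>
    </manifest>"""
    HANDHELD = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
      <uses-permission android:name="android.permission.INTERNET"/>
    </manifest>"""

    missing = requested_permissions(WEAR) - requested_permissions(HANDHELD)
    print("permissions missing from the handheld app:", sorted(missing))
    # -> ['android.permission.BODY_SENSORS']  (a mismatch that could break the wear app)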
|
Murali, Vijayaraghavan |
ESEC/FSE '17: "Bayesian Specification Learning ..."
Bayesian Specification Learning for Finding API Usage Errors
Vijayaraghavan Murali, Swarat Chaudhuri, and Chris Jermaine (Rice University, USA) We present a Bayesian framework for learning probabilistic specifications from large, unstructured code corpora, and then using these specifications to statically detect anomalous, hence likely buggy, program behavior. Our key insight is to build a statistical model that correlates all specifications hidden inside a corpus with the syntax and observed behavior of programs that implement these specifications. During the analysis of a particular program, this model is conditioned into a posterior distribution that prioritizes specifications that are relevant to the program. The problem of finding anomalies is now framed quantitatively, as a problem of computing a distance between a "reference distribution" over program behaviors that our model expects from the program, and the distribution over behaviors that the program actually produces. We implement our ideas in a system, called Salento, for finding anomalous API usage in Android programs. Salento learns specifications using a combination of a topic model and a neural network model. Our encouraging experimental results show that the system can automatically discover subtle errors in Android applications in the wild, and has high precision and recall compared to competing probabilistic approaches. @InProceedings{ESEC/FSE17p151, author = {Vijayaraghavan Murali and Swarat Chaudhuri and Chris Jermaine}, title = {Bayesian Specification Learning for Finding API Usage Errors}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {151--162}, doi = {}, year = {2017}, } |
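The "distance between a reference distribution and the observed distribution" framing can be illustrated with a toy KL-divergence computation over behavior frequencies; the distributions below are invented, and Salento's topic-model and neural-network machinery is not reproduced.

    import math

    def kl_divergence(p, q, eps=1e-9):
        """D_KL(p || q) over a shared support of observed behaviors."""
        keys = set(p) | set(q)
        return sum(p.get(k, 0) * math.log((p.get(k, 0) + eps) / (q.get(k, eps) + eps))
                   for k in keys if p.get(k, 0) > 0)

    # Reference distribution the learned model expects for this program (toy numbers).
    expected = {"open->read->close": 0.6, "open->write->close": 0.35, "open->close": 0.05}
    # Distribution of behaviors the analyzed program actually exhibits.
    actual   = {"open->read": 0.7, "open->read->close": 0.3}

    score = kl_divergence(actual, expected)
    print(f"anomaly score: {score:.2f}")  # larger score -> more likely an API usage error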
|
Musa, Betim |
ESEC/FSE '17: "Craig vs. Newton in Software ..."
Craig vs. Newton in Software Model Checking
Daniel Dietsch, Matthias Heizmann, Betim Musa, Alexander Nutz, and Andreas Podelski (University of Freiburg, Germany) Ever since the seminal work on SLAM and BLAST, software model checking with counterexample-guided abstraction refinement (CEGAR) has been an active topic of research. The crucial procedure here is to analyze a sequence of program statements (the counterexample) to find building blocks for the overall proof of the program. We can distinguish two approaches (which we name Craig and Newton) to implement the procedure. The historically first approach, Newton (named after the tool from the SLAM toolkit), is based on symbolic execution. The second approach, Craig, is based on Craig interpolation. It was widely believed that Craig is substantially more effective than Newton. In fact, 12 out of the 15 CEGAR-based tools in SV-COMP are based on Craig. Advances in software model checkers based on Craig, however, can go only in lockstep with advances in SMT solvers with Craig interpolation. It may be time to revisit Newton and ask whether Newton can be as effective as Craig. We have implemented a total of 11 variants of Craig and Newton in two different state-of-the-art software model checking tools and present the outcome of our experimental comparison. @InProceedings{ESEC/FSE17p487, author = {Daniel Dietsch and Matthias Heizmann and Betim Musa and Alexander Nutz and Andreas Podelski}, title = {Craig vs. Newton in Software Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {487--497}, doi = {}, year = {2017}, } |
|
Musuvathi, Madanlal |
ESEC/FSE '17-IND: "Static Analysis for Optimizing ..."
Static Analysis for Optimizing Big Data Queries
Diego Garbervetsky, Zvonimir Pavlinovic, Michael Barnett, Madanlal Musuvathi, Todd Mytkowicz, and Edgardo Zoppi (University of Buenos Aires, Argentina; New York University, USA; Microsoft Research, USA) Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand-coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect what columns of an input table are never used or pass through a UDO unchanged. This information can be used to significantly improve execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. @InProceedings{ESEC/FSE17p932, author = {Diego Garbervetsky and Zvonimir Pavlinovic and Michael Barnett and Madanlal Musuvathi and Todd Mytkowicz and Edgardo Zoppi}, title = {Static Analysis for Optimizing Big Data Queries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {932--937}, doi = {}, year = {2017}, } |
|
Myers, Margaret |
ESEC/FSE '17: "Finding Near-Optimal Configurations ..."
Finding Near-Optimal Configurations in Product Lines by Random Sampling
Jeho Oh, Don Batory, Margaret Myers, and Norbert Siegmund (University of Texas at Austin, USA; Bauhaus-University Weimar, Germany) Software Product Lines (SPLs) are highly configurable systems. This raises the challenge of finding optimally performing configurations for an anticipated workload. As SPL configuration spaces are huge, it is infeasible to benchmark all configurations to find an optimal one. Prior work focused on building performance models to predict and optimize SPL configurations. Instead, we randomly sample and recursively search a configuration space directly to find near-optimal configurations without constructing a prediction model. Our algorithms are simpler and have higher accuracy and efficiency. @InProceedings{ESEC/FSE17p61, author = {Jeho Oh and Don Batory and Margaret Myers and Norbert Siegmund}, title = {Finding Near-Optimal Configurations in Product Lines by Random Sampling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {61--71}, doi = {}, year = {2017}, } |
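As a rough illustration of the sampling idea (not the authors' recursive search algorithm), a configuration can be chosen by benchmarking only a random subset of the space; `benchmark` is a stand-in for an actual workload measurement and the list-based configuration space is an assumption:

    import random

    def near_optimal_by_sampling(configurations, benchmark, budget=100):
        # Benchmark only a random sample of the huge configuration space and
        # keep the best-performing configuration seen (lower is better).
        sample = random.sample(configurations, min(budget, len(configurations)))
        return min(sample, key=benchmark)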
|
Mytkowicz, Todd |
ESEC/FSE '17-IND: "Static Analysis for Optimizing ..."
Static Analysis for Optimizing Big Data Queries
Diego Garbervetsky, Zvonimir Pavlinovic, Michael Barnett, Madanlal Musuvathi, Todd Mytkowicz, and Edgardo Zoppi (University of Buenos Aires, Argentina; New York University, USA; Microsoft Research, USA) Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand-coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect what columns of an input table are never used or pass through a UDO unchanged. This information can be used to significantly improve execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. @InProceedings{ESEC/FSE17p932, author = {Diego Garbervetsky and Zvonimir Pavlinovic and Michael Barnett and Madanlal Musuvathi and Todd Mytkowicz and Edgardo Zoppi}, title = {Static Analysis for Optimizing Big Data Queries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {932--937}, doi = {}, year = {2017}, } |
|
Nagarakatte, Santosh |
ESEC/FSE '17: "A Fast Causal Profiler for ..."
A Fast Causal Profiler for Task Parallel Programs
Adarsh Yoga and Santosh Nagarakatte (Rutgers University, USA) This paper proposes TASKPROF, a profiler that identifies parallelism bottlenecks in task parallel programs. It leverages the structure of a task parallel execution to perform fine-grained attribution of work to various parts of the program. TASKPROF’s use of hardware performance counters to perform fine-grained measurements minimizes perturbation. TASKPROF’s profile execution runs in parallel using multi-cores. TASKPROF’s causal profile enables users to estimate improvements in parallelism when a region of code is optimized even when concrete optimizations are not yet known. We have used TASKPROF to isolate parallelism bottlenecks in twenty-three applications that use the Intel Threading Building Blocks library. We have designed parallelization techniques in five applications to increase parallelism by an order of magnitude using TASKPROF. Our user study indicates that developers are able to isolate performance bottlenecks with ease using TASKPROF. @InProceedings{ESEC/FSE17p15, author = {Adarsh Yoga and Santosh Nagarakatte}, title = {A Fast Causal Profiler for Task Parallel Programs}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {15--26}, doi = {}, year = {2017}, } Artifacts Functional |
|
Nair, Vivek |
ESEC/FSE '17: "Using Bad Learners to Find ..."
Using Bad Learners to Find Good Configurations
Vivek Nair, Tim Menzies, Norbert Siegmund, and Sven Apel (North Carolina State University, USA; Bauhaus-University Weimar, Germany; University of Passau, Germany) Finding the optimally performing configuration of a software system for a given setting is often challenging. Recent approaches address this challenge by learning performance models based on a sample set of configurations. However, building an accurate performance model can be very expensive (and is often infeasible in practice). The central insight of this paper is that exact performance values (e.g., the response time of a software system) are not required to rank configurations and to identify the optimal one. As shown by our experiments, performance models that are cheap to learn but inaccurate (with respect to the difference between actual and predicted performance) can still be used to rank configurations and hence find the optimal configuration. This novel rank-based approach allows us to significantly reduce the cost (in terms of the number of measurements of sample configurations) as well as the time required to build performance models. We evaluate our approach with 21 scenarios based on 9 software systems and demonstrate that our approach is beneficial in 16 scenarios; for the remaining 5 scenarios, an accurate model can be built by using very few samples anyway, without the need for a rank-based approach. @InProceedings{ESEC/FSE17p257, author = {Vivek Nair and Tim Menzies and Norbert Siegmund and Sven Apel}, title = {Using Bad Learners to Find Good Configurations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {257--267}, doi = {}, year = {2017}, } |
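A minimal sketch of the rank-based idea, assuming numeric feature vectors for configurations and a stand-in `measure` benchmark; the decision tree is only an example of a cheap, inaccurate learner and this is not the authors' evaluation setup:

    import random
    from sklearn.tree import DecisionTreeRegressor

    def rank_based_pick(configurations, measure, train_size=20, top_k=10):
        # Train a cheap model on a handful of measured configurations, use its
        # (possibly inaccurate) predictions only to *rank* the rest, and then
        # measure just the top-ranked candidates.
        train = random.sample(configurations, min(train_size, len(configurations)))
        model = DecisionTreeRegressor().fit(train, [measure(c) for c in train])
        ranked = sorted(configurations, key=lambda c: model.predict([c])[0])
        return min(ranked[:top_k], key=measure)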
|
Nam, Jaechang |
ESEC/FSE '17: "QTEP: Quality-Aware Test Case ..."
QTEP: Quality-Aware Test Case Prioritization
Song Wang, Jaechang Nam, and Lin Tan (University of Waterloo, Canada) Test case prioritization (TCP) is a practical activity in software testing for exposing faults earlier. Researchers have proposed many TCP techniques to reorder test cases. Among them, coverage-based TCPs have been widely investigated. Specifically, coverage-based TCP approaches leverage coverage information between source code and test cases, i.e., static code coverage and dynamic code coverage, to schedule test cases. Existing coverage-based TCP techniques mainly focus on maximizing coverage but often do not consider the likely distribution of faults in source code. However, software faults are often not equally distributed in source code, e.g., around 80% of faults are located in about 20% of the source code. Intuitively, test cases that cover the faulty source code should have higher priorities, since they are more likely to find faults. In this paper, we present a quality-aware test case prioritization technique, QTEP, to address the limitation of existing coverage-based TCP algorithms. In QTEP, we leverage code inspection techniques, i.e., a typical statistical defect prediction model and a typical static bug finder, to detect fault-prone source code and then adapt existing coverage-based TCP algorithms by considering the weighted source code in terms of fault-proneness. Our evaluation with 16 variant QTEP techniques on 33 different versions of 7 open-source Java projects shows that QTEP could improve existing coverage-based TCP techniques for both regression and new test cases. Specifically, the improvement of the best variant of QTEP for regression test cases could be up to 15.0% and on average 7.6%, and for all test cases (both regression and new test cases), the improvement could be up to 10.0% and on average 5.0%. @InProceedings{ESEC/FSE17p523, author = {Song Wang and Jaechang Nam and Lin Tan}, title = {QTEP: Quality-Aware Test Case Prioritization}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {523--534}, doi = {}, year = {2017}, } Info |
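The weighting intuition can be sketched as follows, assuming a coverage map and per-unit fault-proneness scores from a defect predictor or static bug finder are already available; this illustrates only the idea of quality-aware weighting, not QTEP's actual algorithm:

    def quality_aware_prioritize(tests, coverage, fault_proneness):
        # coverage[test]: set of code units (e.g., methods) covered by the test.
        # fault_proneness[unit]: score from a defect predictor or static bug
        # finder (both inputs are assumptions made for this sketch).
        def weight(test):
            return sum(fault_proneness.get(unit, 0.0) for unit in coverage[test])
        # Run tests that cover the most fault-prone code first.
        return sorted(tests, key=weight, reverse=True)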
|
Nejati, Shiva |
ESEC/FSE '17-IND: "Automated Testing of Hybrid ..."
Automated Testing of Hybrid Simulink/Stateflow Controllers: Industrial Case Studies
Reza Matinnejad, Shiva Nejati, and Lionel C. Briand (University of Luxembourg, Luxembourg) We present the results of applying our approach for testing Simulink controllers to one public and one proprietary model, both industrial. Our approach combines explorative and exploitative search algorithms to visualize the controller behavior over its input space and to identify test scenarios in the controller input space that violate or are likely to violate the controller requirements. The engineers' feedback shows that our approach is easy to use in practice and gives them confidence about the behavior of their models. @InProceedings{ESEC/FSE17p938, author = {Reza Matinnejad and Shiva Nejati and Lionel C. Briand}, title = {Automated Testing of Hybrid Simulink/Stateflow Controllers: Industrial Case Studies}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {938--943}, doi = {}, year = {2017}, } |
|
Nelson, Nicholas |
ESEC/FSE '17: "Trade-Offs in Continuous Integration: ..."
Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility
Michael Hilton, Nicholas Nelson, Timothy Tunnell, Darko Marinov, and Danny Dig (Oregon State University, USA; University of Illinois at Urbana-Champaign, USA) Continuous integration (CI) systems automate the compilation, building, and testing of software. Despite CI being a widely used activity in software engineering, we do not know what motivates developers to use CI, and what barriers and unmet needs they face. Without such knowledge, developers make easily avoidable errors, tool builders invest in the wrong direction, and researchers miss opportunities for improving the practice of CI. We present a qualitative study of the barriers and needs developers face when using CI. We conduct semi-structured interviews with developers from different industries and development scales. We triangulate our findings by running two surveys. We find that developers face trade-offs between speed and certainty (Assurance), between better access and information security (Security), and between more configuration options and greater ease of use (Flexibility). We present implications of these trade-offs for developers, tool builders, and researchers. @InProceedings{ESEC/FSE17p197, author = {Michael Hilton and Nicholas Nelson and Timothy Tunnell and Darko Marinov and Danny Dig}, title = {Trade-Offs in Continuous Integration: Assurance, Security, and Flexibility}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {197--207}, doi = {}, year = {2017}, } Info Best-Paper Award |
|
Nelson, Tim |
ESEC/FSE '17: "The Power of "Why" ..."
The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance
Tim Nelson, Natasha Danas, Daniel J. Dougherty, and Shriram Krishnamurthi (Brown University, USA; Worcester Polytechnic Institute, USA) Scenario-finding tools like the Alloy Analyzer are widely used in numerous concrete domains like security, network analysis, UML analysis, and so on. They can help to verify properties and, more generally, aid in exploring a system's behavior. While scenario finders are valuable for their ability to produce concrete examples, individual scenarios only give insight into what is possible, leaving the user to make their own conclusions about what might be necessary. This paper enriches scenario finding by allowing users to ask ``why?'' and ``why not?'' questions about the examples they are given. We show how to distinguish parts of an example that cannot be consistently removed (or changed) from those that merely reflect underconstraint in the specification. In the former case we show how to determine which elements of the specification and which other components of the example together explain the presence of such facts. This paper formalizes the act of computing provenance in scenario-finding. We present Amalgam, an extension of the popular Alloy scenario-finder, which implements these foundations and provides interactive exploration of examples. We also evaluate Amalgam's algorithmics on a variety of both textbook and real-world examples. @InProceedings{ESEC/FSE17p106, author = {Tim Nelson and Natasha Danas and Daniel J. Dougherty and Shriram Krishnamurthi}, title = {The Power of "Why" and "Why Not": Enriching Scenario Exploration with Provenance}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {106--116}, doi = {}, year = {2017}, } Info Artifacts Reusable Best-Paper Award |
|
Ng, Vincent |
ESEC/FSE '17: "Detecting Missing Information ..."
Detecting Missing Information in Bug Descriptions
Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, Massimiliano Di Penta, Andrian Marcus, Gabriele Bavota, and Vincent Ng (University of Texas at Dallas, USA; University of Sannio, Italy; Colorado State University, USA; University of Lugano, Switzerland) Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while such information is considered extremely useful, reporters often miss these pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug description quality by alerting reporters about missing EB and S2R at reporting time. @InProceedings{ESEC/FSE17p396, author = {Oscar Chaparro and Jing Lu and Fiorella Zampetti and Laura Moreno and Massimiliano Di Penta and Andrian Marcus and Gabriele Bavota and Vincent Ng}, title = {Detecting Missing Information in Bug Descriptions}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {396--407}, doi = {}, year = {2017}, } |
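A minimal sketch of pattern-based detection of missing EB and S2R in a bug description; the regular expressions below are hypothetical examples and do not reproduce the 154 discourse patterns derived in the paper:

    import re

    # Hypothetical example patterns, not the patterns mined in the study.
    EB_PATTERNS = [r"\bexpected (behavior|result)\b", r"\bshould (be|have|not)\b"]
    S2R_PATTERNS = [r"\bsteps to reproduce\b", r"^\s*\d+[.)]\s", r"\bthen click\b"]

    def missing_sections(report_text):
        # Report EB/S2R as missing when no pattern of that kind matches.
        text = report_text.lower()
        missing = []
        if not any(re.search(p, text, re.MULTILINE) for p in EB_PATTERNS):
            missing.append("EB")
        if not any(re.search(p, text, re.MULTILINE) for p in S2R_PATTERNS):
            missing.append("S2R")
        return missing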
|
Nguyen, ThanhVu |
ESEC/FSE '17: "Counterexample-Guided Approach ..."
Counterexample-Guided Approach to Finding Numerical Invariants
ThanhVu Nguyen, Timos Antonopoulos, Andrew Ruef, and Michael Hicks (University of Nebraska-Lincoln, USA; Yale University, USA; University of Maryland, USA) Numerical invariants, e.g., relationships among numerical variables in a program, represent a useful class of properties to analyze programs. General polynomial invariants represent more complex numerical relations, but they are often required in many scientific and engineering applications. We present NumInv, a tool that implements a counterexample-guided invariant generation (CEGIR) technique to automatically discover numerical invariants, which are polynomial equality and inequality relations among numerical variables. This CEGIR technique infers candidate invariants from program traces and then checks them against the program source code using the KLEE test-input generation tool. If the invariants are incorrect, KLEE returns counterexample traces, which help the dynamic inference obtain better results. Existing CEGIR approaches often require sound invariants; NumInv, however, sacrifices soundness and produces results that KLEE cannot refute within certain time bounds. This design and the use of KLEE as a verifier allow NumInv to discover useful and important numerical invariants for many challenging programs. Preliminary results show that NumInv generates required invariants for understanding and verifying correctness of programs involving complex arithmetic. We also show that NumInv discovers polynomial invariants that capture precise complexity bounds of programs used to benchmark existing static complexity analysis techniques. Finally, we show that NumInv performs competitively compared to state-of-the-art numerical invariant analysis tools. @InProceedings{ESEC/FSE17p605, author = {ThanhVu Nguyen and Timos Antonopoulos and Andrew Ruef and Michael Hicks}, title = {Counterexample-Guided Approach to Finding Numerical Invariants}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {605--615}, doi = {}, year = {2017}, } |
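The CEGIR loop described above can be sketched as follows; `infer`, `check`, and the trace representation are hypothetical call-backs standing in for NumInv's dynamic inference and its use of KLEE, not NumInv's actual interfaces:

    def cegir(initial_traces, infer, check, max_iters=10):
        # infer(traces) -> candidate invariants learned from execution traces.
        # check(cands)  -> counterexample traces for refuted candidates
        #                  (empty when nothing could be refuted in the budget).
        traces, candidates = list(initial_traces), []
        for _ in range(max_iters):
            candidates = infer(traces)
            counterexamples = check(candidates)
            if not counterexamples:
                break                       # accept what the checker cannot refute
            traces.extend(counterexamples)  # refine inference with the new traces
        return candidates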
|
Nigar, Natasha |
ESEC/FSE '17-DOC: "Model-Based Dynamic Software ..."
Model-Based Dynamic Software Project Scheduling
Natasha Nigar (University of Birmingham, UK) Software project scheduling, under uncertain and dynamic environments, is one of the most important challenges in software engineering. Recent studies addressed this challenge in both static and dynamic scenarios for small and medium-sized software projects. The increasing trend towards cloud-based software solutions (large-scale software projects) demands agility not only for sustainable maintenance but also for on-time and within-budget completion. Therefore, this paper formulates the software project scheduling problem (SPSP) as an optimization problem under uncertainty and dynamics for the hybrid scRUmP software model. In this regard, a mathematical model is constructed with five objectives: project duration, task fragmentation, robustness, cost, and stability. @InProceedings{ESEC/FSE17p1042, author = {Natasha Nigar}, title = {Model-Based Dynamic Software Project Scheduling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1042--1045}, doi = {}, year = {2017}, } |
|
Nourry, Olivier |
ESEC/FSE '17: "Why Do Developers Use Trivial ..."
Why Do Developers Use Trivial Packages? An Empirical Case Study on npm
Rabe Abdalkareem, Olivier Nourry, Sultan Wehaibi, Suhaib Mujahid, and Emad Shihab (Concordia University, Canada) Code reuse is traditionally seen as good practice. Recent trends have pushed the concept of code reuse to an extreme, by using packages that implement simple and trivial tasks, which we call `trivial packages'. A recent incident where a trivial package led to the breakdown of some of the most popular web applications such as Facebook and Netflix made it imperative to question the growing use of trivial packages. Therefore, in this paper, we mine more than 230,000 npm packages and 38,000 JavaScript applications in order to study the prevalence of trivial packages. We found that trivial packages are common and are increasing in popularity, making up 16.8% of the studied npm packages. We performed a survey with 88 Node.js developers who use trivial packages to understand the reasons and drawbacks of their use. Our survey revealed that trivial packages are used because they are perceived to be well implemented and tested pieces of code. However, developers are concerned about maintaining and the risks of breakages due to the extra dependencies trivial packages introduce. To objectively verify the survey results, we empirically validate the most cited reason and drawback and find that, contrary to developers' beliefs, only 45.2% of trivial packages even have tests. However, trivial packages appear to be `deployment tested' and to have similar test, usage and community interest as non-trivial packages. On the other hand, we found that 11.5% of the studied trivial packages have more than 20 dependencies. Hence, developers should be careful about which trivial packages they decide to use. @InProceedings{ESEC/FSE17p385, author = {Rabe Abdalkareem and Olivier Nourry and Sultan Wehaibi and Suhaib Mujahid and Emad Shihab}, title = {Why Do Developers Use Trivial Packages? An Empirical Case Study on npm}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {385--395}, doi = {}, year = {2017}, } |
|
Nunez, Wilberto Z. |
ESEC/FSE '17-DEMO: "ARCC: Assistant for Repetitive ..."
ARCC: Assistant for Repetitive Code Comprehension
Wilberto Z. Nunez, Victor J. Marin, and Carlos R. Rivero (Rochester Institute of Technology, USA) As software projects evolve, carefully understanding the behavior of a program is mandatory before making any change. Repetitive code snippets also tend to appear throughout the codebase, and developers have to understand similar semantics multiple times. Building on this observation, we present Arcc: an Assistant for Repetitive Code Comprehension. The tool, implemented as an Eclipse plugin, assists developers in leveraging knowledge of a program to understand other programs containing a subset of the semantics in the former. Arcc differs from existing approaches in that it uses an extensible knowledge base of recurrent semantic code snippets, instead of heuristics or salient features, to summarize the behavior of a program. Given a program, we detect the occurrences of such snippets. Developers can create strategies as combinations of the snippets found and search for strategy occurrences in their workspace. Arcc highlights the source code related to every snippet and their interleaving, assisting in getting an intuition of similar programs. Finally, Arcc underlines potential common errors associated with the snippets, assisting in detecting overlooked problems. https://youtube.com/playlist?list=PLmizZtBESdPHDyKXKHMXj13r2pBCKzIoA @InProceedings{ESEC/FSE17p999, author = {Wilberto Z. Nunez and Victor J. Marin and Carlos R. Rivero}, title = {ARCC: Assistant for Repetitive Code Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {999--1003}, doi = {}, year = {2017}, } Info |
|
Nuseibeh, Bashar |
ESEC/FSE '17: "On Evidence Preservation Requirements ..."
On Evidence Preservation Requirements for Forensic-Ready Systems
Dalal Alrajeh, Liliana Pasquale, and Bashar Nuseibeh (Imperial College London, UK; University College Dublin, Ireland; Open University, UK; Lero, Ireland) Forensic readiness denotes the capability of a system to support digital forensic investigations of potential, known incidents by preserving in advance data that could serve as evidence explaining how an incident occurred. Given the increasing rate at which (potentially criminal) incidents occur, designing software systems that are forensic-ready can facilitate and reduce the costs of digital forensic investigations. However, to date, little or no attention has been given to how forensic-ready software systems can be designed systematically. In this paper we propose to explicitly represent evidence preservation requirements prescribing preservation of the minimal amount of data that would be relevant to a future digital investigation. We formalise evidence preservation requirements and propose an approach for synthesising specifications for systems to meet these requirements. We present our prototype implementation—based on a satisfiability solver and a logic-based learner—which we use to evaluate our approach, applying it to two digital forensic corpora. Our evaluation suggests that our approach preserves relevant data that could support hypotheses of potential incidents. Moreover, it enables significant reduction in the volume of data that would need to be examined during an investigation. @InProceedings{ESEC/FSE17p559, author = {Dalal Alrajeh and Liliana Pasquale and Bashar Nuseibeh}, title = {On Evidence Preservation Requirements for Forensic-Ready Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {559--569}, doi = {}, year = {2017}, } |
|
Nutz, Alexander |
ESEC/FSE '17: "Craig vs. Newton in Software ..."
Craig vs. Newton in Software Model Checking
Daniel Dietsch, Matthias Heizmann, Betim Musa, Alexander Nutz, and Andreas Podelski (University of Freiburg, Germany) Ever since the seminal work on SLAM and BLAST, software model checking with counterexample-guided abstraction refinement (CEGAR) has been an active topic of research. The crucial procedure here is to analyze a sequence of program statements (the counterexample) to find building blocks for the overall proof of the program. We can distinguish two approaches (which we name Craig and Newton) to implement the procedure. The historically first approach, Newton (named after the tool from the SLAM toolkit), is based on symbolic execution. The second approach, Craig, is based on Craig interpolation. It was widely believed that Craig is substantially more effective than Newton. In fact, 12 out of the 15 CEGAR-based tools in SV-COMP are based on Craig. Advances in software model checkers based on Craig, however, can go only in lockstep with advances in SMT solvers with Craig interpolation. It may be time to revisit Newton and ask whether Newton can be as effective as Craig. We have implemented a total of 11 variants of Craig and Newton in two different state-of-the-art software model checking tools and present the outcome of our experimental comparison. @InProceedings{ESEC/FSE17p487, author = {Daniel Dietsch and Matthias Heizmann and Betim Musa and Alexander Nutz and Andreas Podelski}, title = {Craig vs. Newton in Software Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {487--497}, doi = {}, year = {2017}, } |
|
Oh, Jeho |
ESEC/FSE '17: "Finding Near-Optimal Configurations ..."
Finding Near-Optimal Configurations in Product Lines by Random Sampling
Jeho Oh, Don Batory, Margaret Myers, and Norbert Siegmund (University of Texas at Austin, USA; Bauhaus-University Weimar, Germany) Software Product Lines (SPLs) are highly configurable systems. This raises the challenge of finding optimally performing configurations for an anticipated workload. As SPL configuration spaces are huge, it is infeasible to benchmark all configurations to find an optimal one. Prior work focused on building performance models to predict and optimize SPL configurations. Instead, we randomly sample and recursively search a configuration space directly to find near-optimal configurations without constructing a prediction model. Our algorithms are simpler and have higher accuracy and efficiency. @InProceedings{ESEC/FSE17p61, author = {Jeho Oh and Don Batory and Margaret Myers and Norbert Siegmund}, title = {Finding Near-Optimal Configurations in Product Lines by Random Sampling}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {61--71}, doi = {}, year = {2017}, } |
|
Oreizy, Peyman |
ESEC/FSE '17-INV: "Reflections on the REST Architectural ..."
Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)
Roy T. Fielding, Richard N. Taylor, Justin R. Erenkrantz, Michael M. Gorlick, Jim Whitehead, Rohit Khare, and Peyman Oreizy (Adobe, USA; University of California at Irvine, USA; Bloomberg, USA; University of California at Santa Cruz, USA; Google, USA; Dynamic Variable, USA) Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work. @InProceedings{ESEC/FSE17p4, author = {Roy T. Fielding and Richard N. Taylor and Justin R. Erenkrantz and Michael M. Gorlick and Jim Whitehead and Rohit Khare and Peyman Oreizy}, title = {Reflections on the REST Architectural Style and "Principled Design of the Modern Web Architecture" (Impact Paper Award)}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {4--14}, doi = {}, year = {2017}, } |
|
Papadopoulos, Alessandro Vittorio |
ESEC/FSE '17: "Automated Control of Multiple ..."
Automated Control of Multiple Software Goals using Multiple Actuators
Martina Maggio, Alessandro Vittorio Papadopoulos, Antonio Filieri, and Henry Hoffmann (Lund University, Sweden; Mälardalen University, Sweden; Imperial College London, UK; University of Chicago, USA) Modern software should satisfy multiple goals simultaneously: it should provide predictable performance, be robust to failures, handle peak loads and deal seamlessly with unexpected conditions and changes in the execution environment. For this to happen, software designs should account for the possibility of runtime changes and provide formal guarantees of the software's behavior. Control theory is one of the possible design drivers for runtime adaptation, but adopting control theoretic principles often requires additional, specialized knowledge. To overcome this limitation, automated methodologies have been proposed to extract the necessary information from experimental data and design a control system for runtime adaptation. These proposals, however, only process one goal at a time, creating a chain of controllers. In this paper, we propose and evaluate the first automated strategy that takes into account multiple goals without separating them into multiple control strategies. Avoiding the separation allows us to tackle a larger class of problems and provide stronger guarantees. We test our methodology's generality with three case studies that demonstrate its broad applicability in meeting performance, reliability, quality, security, and energy goals despite environmental or requirements changes. @InProceedings{ESEC/FSE17p373, author = {Martina Maggio and Alessandro Vittorio Papadopoulos and Antonio Filieri and Henry Hoffmann}, title = {Automated Control of Multiple Software Goals using Multiple Actuators}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {373--384}, doi = {}, year = {2017}, } Info |
|
Parnin, Chris |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,'' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Parvez, Zuhair |
ESEC/FSE '17-IND: "File-Level vs. Module-Level ..."
File-Level vs. Module-Level Regression Test Selection for .NET
Marko Vasic, Zuhair Parvez, Aleksandar Milicevic, and Milos Gligoric (University of Texas at Austin, USA; Microsoft, USA) Regression testing is used to check the correctness of evolving software. With the adoption of Agile development methodology, the number of tests and software revisions has dramatically increased, and hence so has the cost of regression testing. Researchers proposed regression test selection (RTS) techniques that optimize regression testing by skipping tests that are not impacted by recent program changes. Ekstazi is one such state-of-the-art technique; Ekstazi is implemented for the Java programming language and has been adopted by several companies and open-source projects. We report on our experience implementing and evaluating Ekstazi#, an Ekstazi-like tool for .NET. We describe the key challenges of bringing the Ekstazi idea to the .NET platform. We evaluate Ekstazi# on 11 open-source projects, as well as an internal Microsoft project substantially larger than each of the open-source projects. Finally, we compare Ekstazi# to an incremental build system (also developed at Microsoft), which, out of the box, provides module-level dependency tracking and skips tasks (including test execution) whenever dependencies of a task do not change between the current and the last successful build. Ekstazi# on average reduced regression testing time by 43.70% for the open-source projects and by 65.26% for the Microsoft project (the latter is in addition to the savings provided by incremental builds). @InProceedings{ESEC/FSE17p848, author = {Marko Vasic and Zuhair Parvez and Aleksandar Milicevic and Milos Gligoric}, title = {File-Level vs. Module-Level Regression Test Selection for .NET}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {848--853}, doi = {}, year = {2017}, } |
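A minimal sketch of the file-level selection idea (checksum the files each test depends on and re-run a test only when one of them has changed); the `test_deps` mapping and the state-file format are assumptions for illustration and do not reflect Ekstazi#'s actual implementation:

    import hashlib, json, os

    def file_checksum(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def select_tests(test_deps, state_file="rts_state.json"):
        # test_deps: test name -> list of files the test depends on.
        old = json.load(open(state_file)) if os.path.exists(state_file) else {}
        new, selected = {}, []
        for test, files in test_deps.items():
            new[test] = {f: file_checksum(f) for f in files}
            if new[test] != old.get(test):
                selected.append(test)       # a dependency changed (or the test is new)
        with open(state_file, "w") as f:
            json.dump(new, f)               # remember checksums for the next run
        return selected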
|
Pashchenko, Ivan |
ESEC/FSE '17-SRC: "FOSS Version Differentiation ..."
FOSS Version Differentiation as a Benchmark for Static Analysis Security Testing Tools
Ivan Pashchenko (University of Trento, Italy) We propose a novel methodology that allows automatic construction of benchmarks for Static Analysis Security Testing (SAST) tools based on real-world software projects by differencing vulnerable and fixed versions in FOSS repositories. The methodology allows us to evaluate ``actual'' performance of SAST tools (without unrelated alarms). To test our approach, we benchmarked 7 SAST tools (although we report only results for the two best tools), against 70 revisions of four major versions of Apache Tomcat with 62 distinct CVEs as the source of ground truth vulnerabilities. @InProceedings{ESEC/FSE17p1056, author = {Ivan Pashchenko}, title = {FOSS Version Differentiation as a Benchmark for Static Analysis Security Testing Tools}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {1056--1058}, doi = {}, year = {2017}, } |
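The differencing idea can be illustrated with a small sketch that marks the lines changed or removed by a security fix as ground-truth locations; the real methodology diffs whole repository revisions around a CVE fix, so this per-file version is only an approximation:

    import difflib

    def ground_truth_lines(vulnerable_src, fixed_src):
        # Lines of the vulnerable revision that were replaced or deleted by the
        # fix are taken as locations a SAST tool should flag (1-based numbers).
        matcher = difflib.SequenceMatcher(None, vulnerable_src.splitlines(),
                                          fixed_src.splitlines())
        lines = []
        for op, i1, i2, _, _ in matcher.get_opcodes():
            if op in ("replace", "delete"):
                lines.extend(range(i1 + 1, i2 + 1))
        return lines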
|
Pasquale, Liliana |
ESEC/FSE '17: "On Evidence Preservation Requirements ..."
On Evidence Preservation Requirements for Forensic-Ready Systems
Dalal Alrajeh, Liliana Pasquale, and Bashar Nuseibeh (Imperial College London, UK; University College Dublin, Ireland; Open University, UK; Lero, Ireland) Forensic readiness denotes the capability of a system to support digital forensic investigations of potential, known incidents by preserving in advance data that could serve as evidence explaining how an incident occurred. Given the increasing rate at which (potentially criminal) incidents occur, designing software systems that are forensic-ready can facilitate and reduce the costs of digital forensic investigations. However, to date, little or no attention has been given to how forensic-ready software systems can be designed systematically. In this paper we propose to explicitly represent evidence preservation requirements prescribing preservation of the minimal amount of data that would be relevant to a future digital investigation. We formalise evidence preservation requirements and propose an approach for synthesising specifications for systems to meet these requirements. We present our prototype implementation—based on a satisfiability solver and a logic-based learner—which we use to evaluate our approach, applying it to two digital forensic corpora. Our evaluation suggests that our approach preserves relevant data that could support hypotheses of potential incidents. Moreover, it enables significant reduction in the volume of data that would need to be examined during an investigation. @InProceedings{ESEC/FSE17p559, author = {Dalal Alrajeh and Liliana Pasquale and Bashar Nuseibeh}, title = {On Evidence Preservation Requirements for Forensic-Ready Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {559--569}, doi = {}, year = {2017}, } |
|
Pastore, Fabrizio |
ESEC/FSE '17-DEMO: "VART: A Tool for the Automatic ..."
VART: A Tool for the Automatic Detection of Regression Faults
Fabrizio Pastore and Leonardo Mariani (University of Milano-Bicocca, Italy) In this paper we present VART, a tool for automatically revealing regression faults missed by regression test suites. Interestingly, VART is not limited to faults that cause crashes or exceptions, but can reveal faults that cause the violation of application-specific correctness properties. VART achieves this goal by combining static and dynamic program analysis. @InProceedings{ESEC/FSE17p964, author = {Fabrizio Pastore and Leonardo Mariani}, title = {VART: A Tool for the Automatic Detection of Regression Faults}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {964--968}, doi = {}, year = {2017}, } Video Info ESEC/FSE '17: "BDCI: Behavioral Driven Conflict ..." BDCI: Behavioral Driven Conflict Identification Fabrizio Pastore, Leonardo Mariani, and Daniela Micucci (University of Milano-Bicocca, Italy) Source Code Management (SCM) systems support software evolution by providing features, such as version control, branching, and conflict detection. Despite the presence of these features, support for parallel software development is often limited. SCM systems can only address a subset of the conflicts that might be introduced by developers when concurrently working on multiple parallel branches. In fact, SCM systems can detect textual conflicts, which are generated by the concurrent modification of the same program locations, but they are unable to detect higher-order conflicts, which are generated by the concurrent modification of different program locations that generate program misbehaviors once merged. Higher-order conflicts are painful to detect and expensive to fix because they might originate from the interference of apparently unrelated changes. In this paper we present Behavioral Driven Conflict Identification (BDCI), a novel approach to conflict detection. BDCI moves the analysis of conflicts from the source code level to the level of program behavior by generating and comparing behavioral models. The analysis based on behavioral models can reveal interfering changes as soon as they are introduced in the SCM system, even if they do not introduce any textual conflict. To evaluate the effectiveness and the cost of the proposed approach, we developed BDCIf, a specific instance of BDCI dedicated to the detection of higher-order conflicts related to the functional behavior of a program. The evidence collected by analyzing multiple versions of Git and Redis suggests that BDCIf can effectively detect higher-order conflicts and report how changes might interfere. @InProceedings{ESEC/FSE17p570, author = {Fabrizio Pastore and Leonardo Mariani and Daniela Micucci}, title = {BDCI: Behavioral Driven Conflict Identification}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {570--581}, doi = {}, year = {2017}, } Info Artifacts Functional |
|
Pattabiraman, Karthik |
ESEC/FSE '17: "ARTINALI: Dynamic Invariant ..."
ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security
Maryam Raiyat Aliabadi, Amita Ajith Kamath, Julien Gascon-Samson, and Karthik Pattabiraman (University of British Columbia, Canada; National Institute of Technology Karnataka, India) Cyber-Physical Systems (CPSes) are being widely deployed in security critical scenarios such as smart homes and medical devices. Unfortunately, the connectedness of these systems and their relative lack of security measures makes them ripe targets for attacks. Specification-based Intrusion Detection Systems (IDS) have been shown to be effective for securing CPSs. Unfortunately, deriving invariants for capturing the specifications of CPS systems is a tedious and error-prone process. Therefore, it is important to dynamically monitor the CPS system to learn its common behaviors and formulate invariants for detecting security attacks. Existing techniques for invariant mining only incorporate data and events, but not time. However, time is central to most CPS systems, and hence incorporating time in addition to data and events, is essential for achieving low false positives and false negatives. This paper proposes ARTINALI, which mines dynamic system properties by incorporating time as a first-class property of the system. We build ARTINALI-based Intrusion Detection Systems (IDSes) for two CPSes, namely smart meters and smart medical devices, and measure their efficacy. We find that the ARTINALI-based IDSes significantly reduce the ratio of false positives and false negatives by 16 to 48% (average 30.75%) and 89 to 95% (average 93.4%) respectively over other dynamic invariant detection tools. @InProceedings{ESEC/FSE17p349, author = {Maryam Raiyat Aliabadi and Amita Ajith Kamath and Julien Gascon-Samson and Karthik Pattabiraman}, title = {ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {349--361}, doi = {}, year = {2017}, } |
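The idea of treating time as a first-class property can be sketched with a toy miner that learns timing bounds between consecutive events and flags violations; this is only an illustration of time-aware invariants under an assumed (event, timestamp) trace format, not ARTINALI's mining algorithm:

    from collections import defaultdict

    def learn_time_bounds(traces):
        # A trace is a list of (event, timestamp) pairs. Learn the min/max gap
        # observed between each pair of consecutive events in training traces.
        bounds = defaultdict(lambda: [float("inf"), 0.0])
        for trace in traces:
            for (e1, t1), (e2, t2) in zip(trace, trace[1:]):
                lo, hi = bounds[(e1, e2)]
                bounds[(e1, e2)] = [min(lo, t2 - t1), max(hi, t2 - t1)]
        return dict(bounds)

    def violations(trace, bounds):
        # Flag consecutive events whose gap falls outside the learned bounds,
        # a possible sign of anomalous (e.g., attack-induced) behavior.
        return [(e1, e2, t2 - t1)
                for (e1, t1), (e2, t2) in zip(trace, trace[1:])
                if (e1, e2) in bounds
                and not (bounds[(e1, e2)][0] <= t2 - t1 <= bounds[(e1, e2)][1])]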
|
Pavlinovic, Zvonimir |
ESEC/FSE '17-IND: "Static Analysis for Optimizing ..."
Static Analysis for Optimizing Big Data Queries
Diego Garbervetsky, Zvonimir Pavlinovic, Michael Barnett, Madanlal Musuvathi, Todd Mytkowicz, and Edgardo Zoppi (University of Buenos Aires, Argentina; New York University, USA; Microsoft Research, USA) Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand-coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect what columns of an input table are never used or pass through a UDO unchanged. This information can be used to significantly improve execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. @InProceedings{ESEC/FSE17p932, author = {Diego Garbervetsky and Zvonimir Pavlinovic and Michael Barnett and Madanlal Musuvathi and Todd Mytkowicz and Edgardo Zoppi}, title = {Static Analysis for Optimizing Big Data Queries}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {932--937}, doi = {}, year = {2017}, } |
|
Peitek, Norman |
ESEC/FSE '17: "Measuring Neural Efficiency ..."
Measuring Neural Efficiency of Program Comprehension
Janet Siegmund, Norman Peitek, Chris Parnin, Sven Apel, Johannes Hofmeister, Christian Kästner, Andrew Begel, Anja Bethmann, and André Brechmann (University of Passau, Germany; Leibniz Institute for Neurobiology, Germany; North Carolina State University, USA; Carnegie Mellon University, USA; Microsoft Research, USA) Most modern software programs cannot be understood in their entirety by a single programmer. Instead, programmers must rely on a set of cognitive processes that aid in seeking, filtering, and shaping relevant information for a given programming task. Several theories have been proposed to explain these processes, such as ``beacons,'' for locating relevant code, and ``plans,'' for encoding cognitive models. However, these theories are decades old and lack validation with modern cognitive-neuroscience methods. In this paper, we report on a study using functional magnetic resonance imaging (fMRI) with 11 participants who performed program comprehension tasks. We manipulated experimental conditions related to beacons and layout to isolate specific cognitive processes related to bottom-up comprehension and comprehension based on semantic cues. We found evidence of semantic chunking during bottom-up comprehension and lower activation of brain areas during comprehension based on semantic cues, confirming that beacons ease comprehension. @InProceedings{ESEC/FSE17p140, author = {Janet Siegmund and Norman Peitek and Chris Parnin and Sven Apel and Johannes Hofmeister and Christian Kästner and Andrew Begel and Anja Bethmann and André Brechmann}, title = {Measuring Neural Efficiency of Program Comprehension}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {140--150}, doi = {}, year = {2017}, } Info |
|
Peldszus, Sven |
ESEC/FSE '17-DEMO: "Model-Based Privacy and Security ..."
Model-Based Privacy and Security Analysis with CARiSMA
Amir Shayan Ahmadian, Sven Peldszus, Qusai Ramadan, and Jan Jürjens (University of Koblenz-Landau, Germany; Fraunhofer ISST, Germany) We present CARiSMA, a tool originally designed to support model-based security analysis of IT systems. In our recent work, we added several new functionalities to CARiSMA to support the privacy of personal data. Moreover, we introduced a mechanism to assist system designers in performing a CARiSMA analysis by automatically initializing an appropriate CARiSMA analysis concerning security and privacy requirements. The motivation for our work is Article 25 of Regulation (EU) 2016/679, which requires that appropriate technical and organizational controls be implemented to ensure that, by default, the processing of personal data complies with the principles on processing of personal data. This implies that initially IT systems must be analyzed to verify if such principles are respected. System models allow the system developers to handle the complexity of systems and to focus on key aspects such as privacy and security. CARiSMA is available at http://carisma.umlsec.de and our screen cast at https://youtu.be/b5zeHig3ARw. @InProceedings{ESEC/FSE17p989, author = {Amir Shayan Ahmadian and Sven Peldszus and Qusai Ramadan and Jan Jürjens}, title = {Model-Based Privacy and Security Analysis with CARiSMA}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {989--993}, doi = {}, year = {2017}, } Video Info |
|
Pezzè, Mauro |
ESEC/FSE '17: "Reproducing Concurrency Failures ..."
Reproducing Concurrency Failures from Crash Stacks
Francesco A. Bianchi, Mauro Pezzè, and Valerio Terragni (University of Lugano, Switzerland) Reproducing field failures is the first essential step for understanding, localizing and removing faults. Reproducing concurrency field failures is hard due to the need to synthesize a test code jointly with a thread interleaving that induces the failure in the presence of limited information from the field. Current techniques for reproducing concurrency failures focus on identifying failure-inducing interleavings, leaving largely open the problem of synthesizing the test code that manifests such interleavings. In this paper, we present ConCrash, a technique to automatically generate test codes that reproduce concurrency failures that violate thread-safety from crash stacks, which commonly summarize the conditions of field failures. ConCrash efficiently explores the huge space of possible test codes to identify a failure-inducing one by using a suitable set of search pruning strategies. Combined with existing techniques for exploring interleavings, ConCrash automatically reproduces a given concurrency failure that violates the thread-safety of a class by identifying both a failure-inducing test code and corresponding interleaving. In the paper, we define the ConCrash approach, present a prototype implementation of ConCrash, and discuss the experimental results that we obtained on a known set of ten field failures that witness the effectiveness of the approach. @InProceedings{ESEC/FSE17p705, author = {Francesco A. Bianchi and Mauro Pezzè and Valerio Terragni}, title = {Reproducing Concurrency Failures from Crash Stacks}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {705--716}, doi = {}, year = {2017}, } |
|
Philippsen, Michael |
ESEC/FSE '17: "More Accurate Recommendations ..."
More Accurate Recommendations for Method-Level Changes
Georg Dotzler, Marius Kamp, Patrick Kreutzer, and Michael Philippsen (Friedrich-Alexander University Erlangen-Nürnberg, Germany) During the life span of large software projects, developers often apply the same code changes to different code locations in slight variations. Since the application of these changes to all locations is time-consuming and error-prone, tools exist that learn change patterns from input examples, search for possible pattern applications, and generate corresponding recommendations. In many cases, the generated recommendations are syntactically or semantically wrong due to code movements in the input examples. Thus, they are of low accuracy and developers cannot directly copy them into their projects without adjustments. We present the Accurate REcommendation System (ARES) that achieves a higher accuracy than other tools because its algorithms take care of code movements when creating patterns and recommendations. On average, the recommendations by ARES have an accuracy of 96% with respect to code changes that developers have manually performed in commits of source code archives. At the same time ARES achieves precision and recall values that are on par with other tools. @InProceedings{ESEC/FSE17p798, author = {Georg Dotzler and Marius Kamp and Patrick Kreutzer and Michael Philippsen}, title = {More Accurate Recommendations for Method-Level Changes}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {798--808}, doi = {}, year = {2017}, } Info |
|
Phipathananunth, Chadarat |
ESEC/FSE '17-DEMO: "Cherry-Picking of Code Commits ..."
Cherry-Picking of Code Commits in Long-Running, Multi-release Software
Panuchart Bunyakiati and Chadarat Phipathananunth (University of the Thai Chamber of Commerce, Thailand) This paper presents Tartarian, a tool that supports maintenance of software with long-running, multi-release branches in distributed version control systems. When new maintenance code, such as bug fixes and code improvements, is committed into a branch, it is likely that such code can be applied or reused in some other branches. To do so, a developer may manually identify a commit and cherry-pick it. Tartarian can support this activity by providing commit hashtags, which the developer uses as metadata to specify their intentions when committing the code. With these tags, Tartarian uses a dependency graph, which represents the dependency constraints of the branches, and a Branch Identifier, which matches the commit hashtags against the dependency graph, to identify the applicable branches for the commits. Using Tartarian, developers may be able to maintain software with multiple releases more efficiently. A video demo of Tartarian is available at www.github.com/tartarian. @InProceedings{ESEC/FSE17p994, author = {Panuchart Bunyakiati and Chadarat Phipathananunth}, title = {Cherry-Picking of Code Commits in Long-Running, Multi-release Software}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {994--998}, doi = {}, year = {2017}, } Info |
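The matching step can be sketched roughly as follows; the hashtag format and the shape of the dependency graph are assumptions made for illustration and are not Tartarian's actual metadata. The returned branches are then candidates onto which the developer would cherry-pick the commit (e.g., with git cherry-pick).

    def applicable_branches(commit_hashtags, dependency_graph):
        # commit_hashtags: tags attached to a commit, e.g. ["#release-2.x"].
        # dependency_graph: branch -> branches that depend on it.
        # A commit tagged for a branch is also proposed for every branch that
        # (transitively) depends on that branch.
        result, work = set(), [t.lstrip("#") for t in commit_hashtags]
        while work:
            branch = work.pop()
            if branch not in result:
                result.add(branch)
                work.extend(dependency_graph.get(branch, []))
        return sorted(result)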
|
Podelski, Andreas |
ESEC/FSE '17: "Craig vs. Newton in Software ..."
Craig vs. Newton in Software Model Checking
Daniel Dietsch, Matthias Heizmann, Betim Musa, Alexander Nutz, and Andreas Podelski (University of Freiburg, Germany) Ever since the seminal work on SLAM and BLAST, software model checking with counterexample-guided abstraction refinement (CEGAR) has been an active topic of research. The crucial procedure here is to analyze a sequence of program statements (the counterexample) to find building blocks for the overall proof of the program. We can distinguish two approaches (which we name Craig and Newton) to implement the procedure. The historically first approach, Newton (named after the tool from the SLAM toolkit), is based on symbolic execution. The second approach, Craig, is based on Craig interpolation. It was widely believed that Craig is substantially more effective than Newton. In fact, 12 out of the 15 CEGAR-based tools in SV-COMP are based on Craig. Advances in software model checkers based on Craig, however, can go only in lockstep with advances in SMT solvers with Craig interpolation. It may be time to revisit Newton and ask whether Newton can be as effective as Craig. We have implemented a total of 11 variants of Craig and Newton in two different state-of-the-art software model checking tools and present the outcome of our experimental comparison. @InProceedings{ESEC/FSE17p487, author = {Daniel Dietsch and Matthias Heizmann and Betim Musa and Alexander Nutz and Andreas Podelski}, title = {Craig vs. Newton in Software Model Checking}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {487--497}, doi = {}, year = {2017}, } |
|
Podgurski, Andy |
ESEC/FSE '17-DEMO: "RunDroid: Recovering Execution ..."
RunDroid: Recovering Execution Call Graphs for Android Applications
Yujie Yuan, Lihua Xu, Xusheng Xiao, Andy Podgurski, and Huibiao Zhu (East China Normal University, China; Case Western Reserve University, USA) Fault localization is a well-received technique for helping developers to identify faulty statements of a program. Research has shown that the coverage of faulty statements and their predecessors in the program dependence graph is important for effective fault localization. However, app executions in Android are split into segments across different components, i.e., methods, threads, and processes, which poses challenges for traditional program dependence computation and in turn renders fault localization less effective. We present RunDroid, a tool for recovering the dynamic call graphs of app executions in Android, assisting existing tools for more precise program dependence computation. For each execution, RunDroid captures and recovers method calls not only from the application layer, but also between applications and the Android framework. Moreover, to deal with the widely adopted multi-threaded communications in Android applications, RunDroid also captures method calls that are split among threads. Demo: https://github.com/MiJack/RunDroid Video: https://youtu.be/EM7TJbE-Oaw @InProceedings{ESEC/FSE17p949, author = {Yujie Yuan and Lihua Xu and Xusheng Xiao and Andy Podgurski and Huibiao Zhu}, title = {RunDroid: Recovering Execution Call Graphs for Android Applications}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {949--953}, doi = {}, year = {2017}, } Video Info |
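The core reconstruction step, rebuilding a dynamic call graph from logged method enter/exit events with a separate call stack per thread, can be sketched as follows. The event format and method names here are hypothetical and the sketch ignores Android-specific concerns such as framework callbacks and inter-process calls; it is not RunDroid's actual log format.

# Illustrative sketch: rebuild a dynamic call graph from a stream of
# (thread_id, event, method) records, keeping one call stack per thread.
from collections import defaultdict

events = [
    (1, "enter", "Activity.onCreate"),
    (1, "enter", "Db.open"),
    (1, "exit",  "Db.open"),
    (1, "exit",  "Activity.onCreate"),
    (2, "enter", "Worker.run"),
    (2, "enter", "Http.get"),
    (2, "exit",  "Http.get"),
    (2, "exit",  "Worker.run"),
]

stacks = defaultdict(list)          # thread id -> current call stack
call_graph = defaultdict(set)       # caller -> set of callees

for tid, kind, method in events:
    stack = stacks[tid]
    if kind == "enter":
        if stack:                    # record caller -> callee edge
            call_graph[stack[-1]].add(method)
        stack.append(method)
    else:                            # "exit"
        stack.pop()

for caller, callees in call_graph.items():
    print(caller, "->", sorted(callees))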
|
Poggi, Giovanni |
ESEC/FSE '17: "Automatically Analyzing Groups ..."
Automatically Analyzing Groups of Crashes for Finding Correlations
Marco Castelluccio, Carlo Sansone, Luisa Verdoliva, and Giovanni Poggi (Federico II University of Naples, Italy; Mozilla, UK) We devised an algorithm, inspired by contrast-set mining algorithms such as STUCCO, to automatically find statistically significant properties (correlations) in crash groups. Many earlier works focused on improving the clustering of crashes but, to the best of our knowledge, the problem of automatically describing properties of a cluster of crashes has so far been unexplored. This means developers currently spend a fair amount of time analyzing the groups themselves, which in turn means that a) they are not spending their time actually developing a fix for the crash; and b) they might miss something in their exploration of the crash data (there is a large number of attributes in crash reports and it is hard and error-prone to manually analyze everything). Our algorithm helps developers and release managers understand crash reports more easily and in an automated way, helping to pinpoint the root cause of the crash. The tool implementing the algorithm has been deployed on Mozilla's crash reporting service. @InProceedings{ESEC/FSE17p717, author = {Marco Castelluccio and Carlo Sansone and Luisa Verdoliva and Giovanni Poggi}, title = {Automatically Analyzing Groups of Crashes for Finding Correlations}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {717--726}, doi = {}, year = {2017}, } |
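The contrast-set idea behind this kind of analysis can be illustrated with a toy sketch: flag attribute values whose frequency in a crash group differs markedly from their frequency in the remaining crashes. The attributes, data, and threshold below are invented for illustration; the paper's algorithm additionally applies proper statistical significance testing in the STUCCO style.

# Toy sketch of the contrast-set idea (not the paper's algorithm).
from collections import Counter

group = [  # crashes in the group under analysis (hypothetical attributes)
    {"gpu": "VendorA", "os": "10"}, {"gpu": "VendorA", "os": "10"},
    {"gpu": "VendorA", "os": "8"},
]
rest = [   # all other crashes
    {"gpu": "VendorB", "os": "10"}, {"gpu": "VendorA", "os": "10"},
    {"gpu": "VendorB", "os": "8"}, {"gpu": "VendorB", "os": "10"},
]

def frequencies(crashes):
    counts = Counter((k, v) for c in crashes for k, v in c.items())
    return {key: n / len(crashes) for key, n in counts.items()}

MIN_DIFFERENCE = 0.4  # arbitrary threshold for this sketch
f_group, f_rest = frequencies(group), frequencies(rest)
for (attr, value), freq in f_group.items():
    diff = freq - f_rest.get((attr, value), 0.0)
    if diff >= MIN_DIFFERENCE:
        print(f"{attr}={value}: {freq:.0%} in group vs {f_rest.get((attr, value), 0.0):.0%} elsewhere")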
|
Politano, Mariano |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired by dynamic logic, for the description of dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models reduces to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries, and further details are available at: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Ponzio, Pablo |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired by dynamic logic, for the description of dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models reduces to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries, and further details are available at: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Poshyvanyk, Denys |
ESEC/FSE '17: "Enabling Mutation Testing ..."
Enabling Mutation Testing for Android Apps
Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, and Denys Poshyvanyk (Universidad de los Andes, Colombia; University of Lugano, Switzerland; College of William and Mary, USA; University of Sannio, Italy) Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires “traditional” operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped into 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identify a set of 38 mutation operators and implement an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of the stillborn/trivial mutants they generate compared to well-known mutation tools, and their capacity to represent real faults in Android apps. @InProceedings{ESEC/FSE17p233, author = {Mario Linares-Vásquez and Gabriele Bavota and Michele Tufano and Kevin Moran and Massimiliano Di Penta and Christopher Vendome and Carlos Bernal-Cárdenas and Denys Poshyvanyk}, title = {Enabling Mutation Testing for Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {233--244}, doi = {}, year = {2017}, } Info |
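To make the notion of a mutation operator concrete, here is a deliberately generic toy operator that seeds mutants by flipping relational operators in a source snippet. It is not one of MDroid+'s Android-specific operators (those target, e.g., resources, intents, and lifecycle code); the snippet and operator table are invented for illustration.

# Toy mutation operator sketch: generate mutants by flipping relational operators.
import re

SOURCE = "if (retries < MAX_RETRIES && status <= 0) { retry(); }"
FLIPS = {"<=": ">", ">=": "<", "<": ">=", ">": "<="}
# Alternation lists two-character operators first so "<=" is not matched as "<".
PATTERN = re.compile("|".join(re.escape(op) for op in FLIPS))

def mutants(source):
    """Yield one mutant per operator occurrence, each with a single flip applied."""
    for m in PATTERN.finditer(source):
        yield source[:m.start()] + FLIPS[m.group()] + source[m.end():]

for i, mutant in enumerate(mutants(SOURCE), 1):
    print(f"mutant {i}: {mutant}")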
|
Pu, Geguang |
ESEC/FSE '17: "Guided, Stochastic Model-Based ..."
Guided, Stochastic Model-Based GUI Testing of Android Apps
Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su (East China Normal University, China; Nanyang Technological University, Singapore; Shanghai Jiao Tong University, China; University of California at Davis, USA) Mobile apps are ubiquitous, operate in complex environments, and are developed under time-to-market pressure. Ensuring their correctness and reliability thus becomes an important challenge. This paper introduces Stoat, a novel guided approach to perform stochastic model-based testing on Android apps. Stoat operates in two phases: (1) Given an app as input, it uses dynamic analysis enhanced by a weighted UI exploration strategy and static analysis to reverse engineer a stochastic model of the app's GUI interactions; and (2) it adapts Gibbs sampling to iteratively mutate/refine the stochastic model and guides test generation from the mutated models toward achieving high code and model coverage and exhibiting diverse sequences. During testing, system-level events are randomly injected to further enhance the testing effectiveness. Stoat was evaluated on 93 open-source apps. The results show (1) the models produced by Stoat cover 17-31% more code than those by existing modeling tools; (2) Stoat detects 3X more unique crashes than two state-of-the-art testing tools, Monkey and Sapienz. Furthermore, Stoat tested the 1,661 most popular Google Play apps and detected 2,110 previously unknown and unique crashes. So far, 43 developers have responded that they are investigating our reports. 20 of the reported crashes have been confirmed, and 8 have already been fixed. @InProceedings{ESEC/FSE17p245, author = {Ting Su and Guozhu Meng and Yuting Chen and Ke Wu and Weiming Yang and Yao Yao and Geguang Pu and Yang Liu and Zhendong Su}, title = {Guided, Stochastic Model-Based GUI Testing of Android Apps}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {245--256}, doi = {}, year = {2017}, } |
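The second phase, iteratively mutating a stochastic model and keeping mutations that improve a coverage objective, can be sketched in a greatly simplified form. The event names, weight representation, and scoring function below are invented stand-ins; Stoat's actual model, sampling scheme, and coverage measurement are far richer than this accept-if-better loop.

# Greatly simplified sketch of phase two: perturb the event weights of a
# stochastic GUI model and keep mutations that improve a coverage-style score
# of the test sequences sampled from the model.
import random

random.seed(0)
model = {"click_ok": 1.0, "scroll": 1.0, "rotate": 1.0, "back": 1.0}  # event weights

def sample_sequence(weights, length=8):
    events, probs = zip(*weights.items())
    return random.choices(events, weights=probs, k=length)

def score(sequences):
    # Stand-in objective: reward event diversity across the sampled test sequences.
    return len({e for seq in sequences for e in seq})

best = score([sample_sequence(model) for _ in range(20)])
for _ in range(50):
    mutated = dict(model)
    event = random.choice(list(mutated))
    mutated[event] *= random.uniform(0.5, 2.0)        # perturb one weight
    candidate = score([sample_sequence(mutated) for _ in range(20)])
    if candidate >= best:                              # keep improving mutations
        model, best = mutated, candidate

print("final weights:", {e: round(w, 2) for e, w in model.items()}, "score:", best)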
|
Rajamani, Sriram |
ESEC/FSE '17: "A Compiler and Verifier for ..."
A Compiler and Verifier for Page Access Oblivious Computation
Rohit Sinha, Sriram Rajamani, and Sanjit A. Seshia (University of California at Berkeley, USA; Microsoft Research, India) Trusted hardware primitives such as Intel's SGX instructions provide applications with a protected address space, called an enclave, for trusted code and data. However, building enclaves that preserve confidentiality of sensitive data continues to be a challenge. The developer must not only avoid leaking secrets via the enclave's outputs but also prevent leaks via side channels induced by interactions with the untrusted platform. Recent attacks have demonstrated that simply observing the page faults incurred during an enclave's execution can reveal its secrets if the enclave makes data accesses or control flow decisions based on secret values. To address this problem, a developer needs compilers to automatically produce confidential programs, and verification tools to certify the absence of secret-dependent page access patterns (a property that we formalize as page-access obliviousness). To that end, we implement an efficient compiler for a type and memory-safe language, a compiler pass that enforces page-access obliviousness with low runtime overheads, and an automatic, modular verifier that certifies page-access obliviousness at the machine-code level, thus removing the compiler from our trusted computing base. We evaluate this toolchain on several machine learning algorithms and image processing routines that we run within SGX enclaves. @InProceedings{ESEC/FSE17p649, author = {Rohit Sinha and Sriram Rajamani and Sanjit A. Seshia}, title = {A Compiler and Verifier for Page Access Oblivious Computation}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {649--660}, doi = {}, year = {2017}, } |
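The property being enforced, that memory accesses do not depend on secret values, can be illustrated with a toy example of a secret-dependent lookup and an oblivious rewrite that touches every element regardless of the secret. This Python sketch only illustrates the access-pattern idea; the paper's compiler pass and verifier operate on enclave machine code, and Python itself gives no timing or paging guarantees.

# Toy illustration: a secret-dependent lookup versus an oblivious rewrite that
# touches every element regardless of the secret index.
table = [10, 20, 30, 40]

def leaky_lookup(secret_index):
    # The accessed element (and thus the touched page) depends directly on the secret.
    return table[secret_index]

def oblivious_lookup(secret_index):
    # Scan the whole table and arithmetically select the wanted entry, so the
    # sequence of memory accesses is the same for every secret value.
    result = 0
    for i, value in enumerate(table):
        match = int(i == secret_index)
        result = result * (1 - match) + value * match
    return result

assert oblivious_lookup(2) == leaky_lookup(2) == 30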
|
Ramadan, Qusai |
ESEC/FSE '17-DEMO: "Model-Based Privacy and Security ..."
Model-Based Privacy and Security Analysis with CARiSMA
Amir Shayan Ahmadian, Sven Peldszus, Qusai Ramadan, and Jan Jürjens (University of Koblenz-Landau, Germany; Fraunhofer ISST, Germany) We present CARiSMA, a tool originally designed to support model-based security analysis of IT systems. In our recent work, we added several new functionalities to CARiSMA to support the privacy of personal data. Moreover, we introduced a mechanism to assist system designers in performing a CARiSMA analysis by automatically initializing an appropriate CARiSMA analysis concerning security and privacy requirements. The motivation for our work is Article 25 of Regulation (EU) 2016/679, which requires that appropriate technical and organizational controls be implemented to ensure that, by default, the processing of personal data complies with the principles on processing of personal data. This implies that IT systems must first be analyzed to verify whether such principles are respected. System models allow the system developers to handle the complexity of systems and to focus on key aspects such as privacy and security. CARiSMA is available at http://carisma.umlsec.de and our screencast at https://youtu.be/b5zeHig3ARw. @InProceedings{ESEC/FSE17p989, author = {Amir Shayan Ahmadian and Sven Peldszus and Qusai Ramadan and Jan Jürjens}, title = {Model-Based Privacy and Security Analysis with CARiSMA}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {989--993}, doi = {}, year = {2017}, } Video Info |
|
Rastogi, Vaibhav |
ESEC/FSE '17: "Cimplifier: Automatically ..."
Cimplifier: Automatically Debloating Containers
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel (University of Wisconsin-Madison, USA; Tala Security, USA; Colorado State University, USA; Pennsylvania State University, USA) Application containers, such as those provided by Docker, have recently gained popularity as a solution for agile and seamless software deployment. These light-weight virtualization environments run applications that are packaged together with their resources and configuration information, and thus can be deployed across various software platforms. Unfortunately, the ease with which containers can be created is oftentimes a double-edged sword, encouraging the packaging of logically distinct applications, and the inclusion of a significant amount of unnecessary components, within a single container. These practices needlessly increase the container size, sometimes by orders of magnitude. They also decrease the overall security, as each included component, necessary or not, may bring in security issues of its own, and there is no isolation between multiple applications packaged within the same container image. We propose algorithms and a tool called Cimplifier, which address these concerns: given a container and simple user-defined constraints, our tool partitions it into simpler containers, which (i) are isolated from each other, only communicating as necessary, and (ii) only include enough resources to perform their functionality. Our evaluation on real-world containers demonstrates that Cimplifier preserves the original functionality, leads to a reduction in image size of up to 95%, and processes even large containers in under thirty seconds. @InProceedings{ESEC/FSE17p476, author = {Vaibhav Rastogi and Drew Davidson and Lorenzo De Carli and Somesh Jha and Patrick McDaniel}, title = {Cimplifier: Automatically Debloating Containers}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {476--486}, doi = {}, year = {2017}, } |
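The partitioning idea can be sketched in a simplified form: group a container's files by the processes observed to access them, so that each resulting container only ships the resources its own application needs, with jointly used files set aside as shared resources. The trace data below is made up, and Cimplifier's actual algorithms also honor user-defined constraints and handle inter-partition communication.

# Simplified sketch of the partitioning idea (not Cimplifier itself).
from collections import defaultdict

# Hypothetical runtime trace: process -> files it accessed.
accesses = {
    "nginx":  {"/etc/nginx/nginx.conf", "/usr/sbin/nginx", "/var/log/nginx/access.log"},
    "python": {"/usr/bin/python3", "/app/server.py", "/var/log/nginx/access.log"},
}

partitions = defaultdict(set)   # one partition (future container) per process
shared = set()                  # files needed by more than one partition

for proc, files in accesses.items():
    for f in files:
        owners = [p for p, fs in accesses.items() if f in fs]
        if len(owners) > 1:
            shared.add(f)       # must stay reachable from several containers (e.g., a shared volume)
        else:
            partitions[proc].add(f)

for proc, files in partitions.items():
    print(proc, "container:", sorted(files))
print("shared resources:", sorted(shared))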
|
Ráth, István |
ESEC/FSE '17-DEMO: "The MONDO Collaboration Framework: ..."
The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems
Csaba Debreceni, Gábor Bergmann, Márton Búr, István Ráth, and Dániel Varró (Budapest University of Technology and Economics, Hungary; MTA-BME Lendület Research Group on Cyber-Physical Systems, Hungary; McGill University, Canada) Model-based systems engineering of critical cyber-physical systems necessitates effective collaboration between different stakeholders while still providing secure protection of the intellectual property of all involved parties. While engineering artifacts are frequently stored in version control repositories, secure access control is limited to file-level strategies in most existing frameworks, where models are split into multiple fragments with all-or-nothing permissions, which becomes a scalability and usability bottleneck in the case of complex industrial models. In this paper, we introduce the MONDO Collaboration Framework, which provides rule-based fine-grained model-level secure access control, property-based locking, and automated model merge, integrated over existing version control systems such as Subversion (SVN) for storage and version control. Our framework simultaneously supports offline collaboration (asynchronous checkout-modify-commit) on top of off-the-shelf modeling tools and online scenarios (GoogleDocs-style short transactions) by offering a web-based modeling frontend. Screencast Demo: https://youtu.be/Ix3CgmsYIU0 @InProceedings{ESEC/FSE17p984, author = {Csaba Debreceni and Gábor Bergmann and Márton Búr and István Ráth and Dániel Varró}, title = {The MONDO Collaboration Framework: Secure Collaborative Modeling over Existing Version Control Systems}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {984--988}, doi = {}, year = {2017}, } |
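Rule-based, element-level read filtering, the contrast to all-or-nothing file permissions, can be sketched minimally: each collaborator only sees the model elements their rules allow. The element attributes, roles, and rules below are hypothetical and much simpler than the MONDO framework's rule language and write-access handling.

# Minimal sketch of rule-based, element-level read filtering (not the MONDO implementation).
model = [
    {"id": 1, "type": "Component", "confidential": False},
    {"id": 2, "type": "Component", "confidential": True},
    {"id": 3, "type": "Requirement", "confidential": False},
]

rules = {   # hypothetical per-role read rules over model elements
    "supplier":   lambda e: e["type"] == "Component" and not e["confidential"],
    "integrator": lambda e: True,
}

def readable_view(role):
    """Return the subset of the model that the given role is allowed to read."""
    return [e for e in model if rules[role](e)]

print("supplier sees:", [e["id"] for e in readable_view("supplier")])      # [1]
print("integrator sees:", [e["id"] for e in readable_view("integrator")])  # [1, 2, 3]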
|
Raverta, Fernando |
ESEC/FSE '17-DEMO: "DynAlloy Analyzer: A Tool ..."
DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour
Germán Regis, César Cornejo, Simón Gutiérrez Brida, Mariano Politano, Fernando Raverta, Pablo Ponzio, Nazareno Aguirre, Juan Pablo Galeotti, and Marcelo Frias (Universidad Nacional de Río Cuarto, Argentina; Universidad Nacional de Córdoba, Argentina; University of Buenos Aires, Argentina; Buenos Aires Institute of Technology, Argentina) We describe DynAlloy Analyzer, a tool that extends Alloy Analyzer with support for dynamic elements in Alloy models. The tool builds upon Alloy Analyzer in a way that makes it fully compatible with Alloy models, and extends their syntax with a particular idiom, inspired by dynamic logic, for the description of dynamic behaviours, understood as sequences of states over standard Alloy models, in terms of programs. The syntax is broad enough to accommodate abstract dynamic behaviours, e.g., using nondeterministic choice and finite unbounded iteration, as well as more concrete ones, using standard sequential programming constructions. The analysis of DynAlloy models reduces to the analysis of Alloy models, through an optimized translation that often makes the analysis more efficient than that of typical ad-hoc constructions to capture dynamism in Alloy. Tool screencast, binaries, and further details are available at: http://dc.exa.unrc.edu.ar/tools/dynalloy @InProceedings{ESEC/FSE17p969, author = {Germán Regis and César Cornejo and Simón Gutiérrez Brida and Mariano Politano and Fernando Raverta and Pablo Ponzio and Nazareno Aguirre and Juan Pablo Galeotti and Marcelo Frias}, title = {DynAlloy Analyzer: A Tool for the Specification and Analysis of Alloy Models with Dynamic Behaviour}, booktitle = {Proc.\ ESEC/FSE}, publisher = {ACM}, pages = {969--973}, doi = {}, year = {2017}, } Video Info |
|
Ray, Baishakhi | ESEC/FSE '17: "Automatically Diagnosing and ..." |