
2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER), March 2-6, 2015, Montréal, Canada

SANER 2015 – Proceedings


Frontmatter

Cover

Title Page

Message from the Chairs
Welcome to SANER 2015, the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering, in Montreal, Quebec, Canada! SANER is the premier research conference on the theory and practice of recovering information from existing software and systems. It merges the Working Conference on Reverse Engineering (WCRE) series, the premier conference series on recovering information from existing software and systems, with the European Conference on Software Maintenance and Reengineering (CSMR) series, the premier European conference series on the maintenance, reengineering, and evolution of software systems. SANER 2015 follows the highly successful IEEE CSMR-WCRE Software Evolution Week, held in Antwerp, Belgium, in 2014.
Committees


Keynotes

On Whose Shoulders? (Keynote)
Jane Cleland-Huang
(DePaul University, USA)
Bernard of Chartres (via Sir Isaac Newton) reminded us that all progress is achieved “on the shoulders of giants” – that our greatest discoveries and innovations build upon the inspirations, triumphs, and foundational truths established by those who have gone before us. However, in our field of Software Engineering, as new ideas are transmitted at the speed of light, rather than the speed of Bernard’s horse, innovations are typically achieved as we, the ordinary people, exchange ideas, deliver incremental improvements, and offer the occasional truly novel idea to advance our field. In this fast-paced environment it is particularly important for us to take the time to build a strong foundation for our knowledge – keeping audit trails of our experiments, sharing our datasets, releasing the code we used to run our experiments, and generally making our work transparent and reproducible, so that we no longer depend on giants to further the field. Instead our successes are a collective effort from our community. Unfortunately, this degree of openness comes with its own challenges. In this talk, Dr. Cleland-Huang will explore some of the success stories in our field and discuss ways to deal with the psychological, philosophical, and practical barriers that impede open collaboration.
Checkpoint Alpha (Keynote)
Boris Debić
(Google, USA)
We’re into the 15th year of the 21st century, roughly 1/7th of the way through it: we have spent one day of the 21st century’s week. It is time to pause, look at what has changed since the start of the century, and perhaps even make an informed proposal as to how we could best align our engineering and research efforts for the rest of it. Large-scale computing (LSC) and computing everywhere in the form of mobile devices are having an irreversible and permanent impact on both the way we engineer our systems and, more importantly, the way we deploy and use these new capabilities. Let me illustrate LSC with some Google numbers: 3.5 billion searches daily, over 60 trillion web documents from 230 million domains, all powered by over 2.5 GW of renewable energy sources (466 MW in the US signed in 2014 alone; 300 MW of wind and 80 MW of solar already signed up in the US for 2015). The way these new capabilities, from all of the tech industry, are transforming society is undeniable and profound, from the ubiquity of the Internet of Things to the Collaborative Commons. The pace of change in this first day of the new century is by many measures more significant than the preceding ~65 years of computing engineering history combined (one smartphone has roughly 100,000 times the computing power of all of NASA in the late 1960s during the Apollo missions). However, even with all these advances we are still running into fundamental software engineering limits. Whereas our hardware capabilities followed the exponentials of Moore’s law, our engineering was perfected linearly at best. And, what is truly disconcerting, we are sometimes actively working on fragmenting our knowledge and collaboration infrastructure: the Internet. In this talk I will call attention to some of the engineering pain points ahead of us, and to the opportunities to further accelerate change. I will argue that our research investment should share some of the common goals identified in the talk.
Ultimately, I will challenge us to rethink the way we engineer interactions with end users and to design a socially responsible infrastructure. We have some days to go in this century, and at this point we should probably start to form an idea of how we want to spend the weekend.

Main Research

Information Retrieval

Modeling the Evolution of Development Topics using Dynamic Topic Models
Jiajun Hu, Xiaobing Sun, David Lo, and Bin Li
(Yangzhou University, China; Nanjing University, China; Singapore Management University, Singapore)
As the development of a software project progresses, its complexity grows accordingly, making it difficult to understand and maintain. During software maintenance and evolution, software developers and stakeholders constantly shift their focus between different tasks and topics. They need to investigate software repositories (e.g., revision control systems) to learn what tasks have recently been worked on and how much effort has been devoted to them. For example, if an important new feature request is received, a substantial portion of the work that developers perform ought to be relevant to the addition of the incoming feature. If this does not happen, project managers might wonder what kind of work developers are currently doing. Several topic analysis tools based on Latent Dirichlet Allocation (LDA) have been proposed to analyze information stored in software repositories to model software evolution, thus helping software stakeholders be aware of the focus of development efforts at various times during software evolution. Previous LDA-based topic analysis tools can capture either changes in the strengths of various development topics over time (i.e., strength evolution) or changes in the content of existing topics over time (i.e., content evolution). Unfortunately, none of the existing techniques can capture both strength and content evolution. In this paper, we use Dynamic Topic Models (DTM) to analyze commit messages within a project’s lifetime to capture both strength and content evolution simultaneously. We evaluate our approach by conducting a case study on the commit messages of two well-known open source software systems, jEdit and PostgreSQL. The results show that our approach captures not only how the strengths of various development topics change over time, but also how the content of each topic (i.e., the words that form the topic) changes over time.
Compared with existing topic analysis approaches, our approach can provide a more complete and valuable view of software evolution to help developers better understand the evolution of their projects.
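The distinction between strength and content evolution can be illustrated with a deliberately simplified sketch. This is not the paper's DTM, which jointly learns topics and their drifting word distributions; here a topic is just a hand-picked word set, and the commit messages and topic words are hypothetical:

```python
from collections import Counter

def topic_strength(slices, topic_words):
    """Fraction of commits per time slice that mention any topic word."""
    strengths = []
    for commits in slices:
        hits = sum(1 for msg in commits
                   if topic_words & set(msg.lower().split()))
        strengths.append(hits / len(commits))
    return strengths

def top_words(slices, topic_words, k=2):
    """Most frequent topic words per slice (a crude proxy for content drift)."""
    return [[w for w, _ in Counter(
                w for msg in commits for w in msg.lower().split()
                if w in topic_words).most_common(k)]
            for commits in slices]

# Two hypothetical time slices of commit messages.
slices = [
    ["fix crash in parser", "parser refactor", "update docs"],
    ["add gui dialog", "gui layout fix", "parser cleanup"],
]
ui_topic = {"gui", "dialog", "layout"}
print(topic_strength(slices, ui_topic))  # strength of the UI topic rises in the second slice
print(top_words(slices, ui_topic))       # and its dominant words shift
```

A real DTM would infer both the topics and their per-slice word distributions from the corpus rather than rely on a fixed word set.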
Understanding Developers' Natural Language Queries with Interactive Clarification
Shihai Jiang, Liwei Shen, Xin Peng, Zhaojin Lv, and Wenyun Zhao
(Fudan University, China)
When performing software maintenance tasks, developers often need to acquire background knowledge from information distributed across different software repositories, such as source code, version control systems, and bug tracking systems. An effective way to help developers acquire such knowledge is to provide an integrated knowledge base and allow them to ask questions using natural language. Existing approaches cannot well support natural language questions that involve a series of conceptual relationships and are phrased in a flexible way. In this paper, we propose an interactive approach for understanding developers' natural language queries. The approach can understand a developer's natural language questions phrased in different ways by generating a set of ranked, human-readable candidate questions and getting feedback from the developer. Based on the candidate question confirmed by the developer, the approach then synthesizes an answer by constructing and executing a structural query against the knowledge base. We have implemented a tool following the proposed approach and conducted a user study using the tool. The results show that our approach can help developers get the desired answers more easily and accurately.

APIs and Patterns

Mining Multi-level API Usage Patterns
Mohamed Aymen Saied, Omar Benomar, Hani Abdeen, and Houari Sahraoui
(Université de Montréal, Canada)
Software developers need to cope with the complexity of Application Programming Interfaces (APIs) of external libraries or frameworks. However, typical APIs provide several thousands of methods to their client programs, and such large APIs are difficult to learn and use. An API method is generally used within client programs along with other methods of the API of interest. Despite this, co-usage relationships between API methods are often not documented. We propose a technique for mining Multi-Level API Usage Patterns (MLUP) to exhibit the co-usage relationships between methods of the API of interest across interfering usage scenarios. We detect multi-level usage patterns as distinct groups of API methods, where each group is uniformly used across varying client programs, independently of usage contexts. We evaluated our technique on four APIs, each with up to 22 client programs. For all the studied APIs, our technique was able to detect usage patterns that are, almost all, highly consistent and highly cohesive across a considerable variability of client programs.
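As a rough illustration of co-usage mining (a sketch only; the paper's MLUP technique detects multi-level patterns across usage contexts, and the client data below is invented), one can group API methods that are called by exactly the same set of client programs:

```python
from collections import defaultdict

def co_usage_groups(client_usages):
    """client_usages: {client name: set of API methods it calls}.
    Groups methods used by exactly the same set of clients; groups of
    size > 1 are crude stand-ins for usage patterns."""
    by_clients = defaultdict(set)
    for m in set().union(*client_usages.values()):
        users = frozenset(c for c, used in client_usages.items() if m in used)
        by_clients[users].add(m)
    return sorted(sorted(g) for g in by_clients.values() if len(g) > 1)

# Hypothetical clients of a small file API.
clients = {
    "app1": {"open", "read", "close", "log"},
    "app2": {"open", "read", "close"},
    "app3": {"open", "close", "stat"},
}
print(co_usage_groups(clients))  # 'open' and 'close' are co-used by all three clients
```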
An Observational Study on API Usage Constraints and Their Documentation
Mohamed Aymen Saied, Houari Sahraoui, and Bruno Dufour
(Université de Montréal, Canada)
Nowadays, APIs represent the most common form of reuse when developing software. However, the benefits of reuse depend greatly on the ability of client application developers to use the APIs correctly. In this paper, we present an observational study on API usage constraints and their documentation. To conduct the study on a large number of APIs, we implemented and validated strategies to automatically detect four types of usage constraints in existing APIs. We observed that some of the constraint types are frequent and that, for three types, they are generally not documented. Surprisingly, the absence of documentation is, in general, specific to the constraints and not due to the non-documenting habits of developers.
Improving Pattern Tracking with a Language-Aware Tree Differencing Algorithm
Nicolas Palix, Jean-Rémy Falleri, and Julia Lawall
(University of Grenoble, France; LaBRI, France; University of Bordeaux, France; INRIA, France)
Tracking code fragments of interest is important in monitoring a software project over multiple versions. Various approaches, including our previous work on Herodotos, exploit the notion of Longest Common Subsequence, as computed by readily available tools such as GNU Diff, to map corresponding code fragments. Nevertheless, efficient code-differencing algorithms are typically line-based or word-based, and thus do not report changes at the level of language constructs. Furthermore, they identify only additions and removals, not the moving of a block of code from one part of a file to another. Code fragments of interest that fall within the added and removed regions of code have to be manually correlated across versions, which is tedious and error-prone. When studying a very large code base over a long time, the number of manual correlations can become an obstacle to the success of a study. In this paper, we investigate the effect of replacing the line-based algorithm currently used by Herodotos with tree matching, as provided by the algorithm of the differencing tool GumTree. In contrast to the line-based approach, the tree-based approach does not require any manual correlations, but it incurs a high execution time. To address the problem, we propose a hybrid strategy that combines the best of both approaches.
Measuring the Quality of Design Pattern Detection Results
Shouzheng Yang, Ayesha Manzer, and Vassilios Tzerpos
(York University, Canada)
Detecting design patterns in large software systems is a common reverse engineering task that can help the comprehension process of the system's design. While several design pattern detection tools presented in the literature are capable of detecting design patterns automatically, evaluating these detection results is usually done in a manual and subjective fashion. Differences in design pattern definitions, as well as pattern instance counting and presenting, exacerbate the difficulty of evaluating design pattern detection results. In this paper, we present a novel approach to evaluating and comparing design pattern detection results. Our approach, called MoRe, introduces a novel way to present design pattern instances in a uniform fashion. Based on this characterization of design pattern instances, we propose four measures for design pattern detection evaluation that convey a concise assessment of the quality of the results produced by a given detection method. We have implemented these measures, and present case studies that showcase their usefulness.

Analysis of Programming Languages

Are PHP Applications Ready for Hack?
Laleh Eshkevari, Fabien Dos Santos, James R. Cordy, and Giuliano Antoniol
(Polytechnique Montréal, Canada; Polytech Montpellier, France; Queen's University, Canada)
PHP is by far the most popular Web scripting language, accounting for more than 80% of existing websites. PHP is dynamically typed, which means that variables take on the type of the objects they are assigned, and may change type as execution proceeds. While some type changes are likely not harmful, others involving function calls and global variables may be more difficult to understand and can be the source of many bugs. Hack, a new PHP variant endorsed by Facebook, attempts to address this problem by adding static typing to PHP variables, which limits them to a single consistent type throughout execution. This paper defines an empirical taxonomy of PHP type changes along three dimensions: the complexity or burden imposed to understand the type change; whether or not the change is potentially harmful; and the actual types changed. We apply static and dynamic analyses to three widely used Web applications coded in PHP (WordPress, Drupal, and phpBB) to investigate (1) to what extent developers really use dynamic typing, (2) what kinds of type changes are actually encountered, and (3) how difficult it might be to refactor the code to avoid type changes, and thus meet the constraints of Hack's static typing. We report evidence that dynamic typing is actually a relatively uncommon practice in production PHP programs, and that most dynamic type changes are simple representational changes, such as between strings and integers. We observe that most PHP type changes in these programs are relatively simple, and that the largest proportion of them are easy to refactor to consistent static typing using simple local renaming transformations. Overall, the paper casts doubt on the usefulness of dynamic typing in PHP, and indicates that for many production applications, conversion to Hack's static typing may not be very difficult.
Does JavaScript Software Embrace Classes?
Leonardo Humberto Silva, Miguel Ramos, Marco Tulio Valente, Alexandre Bergel, and Nicolas Anquetil
(Federal Institute of Northern Minas Gerais, Brazil; Federal University of Minas Gerais, Brazil; University of Chile, Chile; INRIA, France)
JavaScript is the de facto programming language for the Web. It is used to implement mail clients, office applications, and IDEs that can weigh in at hundreds of thousands of lines of code. The language itself is prototype-based, but to master the complexity of their applications, practitioners commonly rely on informal class abstractions. This practice has never been the target of empirical research in JavaScript. Yet, understanding it is key to adequately tuning programming environments and structuring libraries such that they are accessible to programmers. In this paper we report on a large and in-depth study of how class emulation is employed in JavaScript applications. We propose a strategy to statically detect class-based abstractions in the source code of JavaScript systems. We applied this strategy to a dataset of 50 popular JavaScript applications available on GitHub. We found four types of JavaScript software: class-free (systems that make no use of classes), class-aware (systems that use classes, but marginally), class-friendly (systems that make relevant use of classes), and class-oriented (systems that have most of their data structures implemented as classes). The systems in these categories represent, respectively, 26%, 36%, 30%, and 8% of the systems we studied.
Evolution Analysis for Accessibility Excessiveness in Java
Kazuo Kobori, Makoto Matsushita, and Katsuro Inoue
(NTT DATA, Japan; Osaka University, Japan)
In Java programs, access modifiers are used to control the accessibility of fields and methods from other objects. Choosing appropriate access modifiers is one of the key factors in improving program quality and reducing potential vulnerability. In our previous work, we presented a static analysis method named Accessibility Excessiveness (AE) detection for each field and method in a Java program. We also developed an AE analysis tool named ModiChecker that analyzes each field and method of the input Java programs and reports their excessiveness. In this paper, we apply ModiChecker to several OSS repositories to investigate the evolution of AE across versions, and we identify transitions in AE status as well as differences in the amount of AE change between major and minor version releases. We also propose when source code should be evaluated with AE analysis.
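The core idea of AE detection can be sketched as follows (a hypothetical simplification; ModiChecker itself analyzes real Java declarations and accesses): compare each member's declared access modifier with the narrowest modifier its observed accesses actually require.

```python
# Java access levels from narrowest to widest.
LEVELS = ["private", "default", "protected", "public"]

def required_level(accesses):
    """accesses: list of access kinds observed for a member, each one of
    'same_class', 'same_package', 'subclass', or 'other'.
    Returns the narrowest modifier that still permits all of them."""
    widen = {"same_class": 0, "same_package": 1, "subclass": 2, "other": 3}
    need = 0
    for a in accesses:
        need = max(need, widen[a])
    return LEVELS[need]

def is_excessive(declared, accesses):
    """A member is 'accessibility excessive' if it is declared wider
    than its accesses require."""
    return LEVELS.index(declared) > LEVELS.index(required_level(accesses))

# A public field only ever touched inside its own class is excessive:
print(is_excessive("public", ["same_class", "same_class"]))  # True
```

Note that Java's `protected` also implies package access; this sketch flattens that subtlety into a simple ordering.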
A Software Quality Model for RPG
Gergely Ladányi, Zoltán Tóth, Rudolf Ferenc, and Tibor Keresztesi
(University of Szeged, Hungary; R&R Software, Hungary)
The IBM i mainframe was designed to manage business applications for which reliability and quality are a matter of national security. The RPG programming language is the most frequently used one on this platform. The maintainability of source code has a big influence on development costs, which is probably why it is one of the most observed and evaluated quality characteristics of all. To improve, or at least preserve, the maintainability of software, it is necessary to evaluate it regularly. In this study we present a quality model, based on the ISO/IEC 25010 international standard, for evaluating the maintainability of software systems written in RPG. As an evaluation step of the quality model, we present a case study in which we explain how we integrated the quality model, as a continuous quality monitoring tool, into the business processes of a mid-size software company with more than twenty years of experience in developing RPG applications.

On Crashes and Traces

JCHARMING: A Bug Reproduction Approach using Crash Traces and Directed Model Checking
Mathieu Nayrolles, Abdelwahab Hamou-Lhadj, Sofiène Tahar, and Alf Larsson
(Concordia University, Canada; Ericsson, Sweden)
Due to their inherent complexity, software systems are bound to be released with bugs. These bugs manifest themselves on clients' computers, causing crashes and undesired behaviors. Field crashes, in particular, are challenging to understand and fix, as the information provided by the impacted customers is often scarce and inaccurate. To address this issue, there is a need for ways to automatically reproduce the crash in a lab environment in order to fully understand its root causes. Crash reproduction is also an important step towards developing adequate patches. In this paper, we propose a novel crash reproduction approach, called JCHARMING (Java CrasH Automatic Reproduction by directed Model checkING). JCHARMING uses crash traces and model checking to identify the program statements needed to reproduce a crash. Our approach takes advantage of the completeness provided by model checking while ignoring unneeded system states by means of information found in crash traces combined with static slices. We show the effectiveness of JCHARMING by applying it to seven different open source programs totaling more than one million lines of code spread across around 7,000 classes. Overall, JCHARMING was able to reproduce 85% of the submitted bugs.
Towards a Common Metamodel for Traces of High Performance Computing Systems to Enable Software Analysis Tasks
Luay Alawneh, Abdelwahab Hamou-Lhadj, and Jameleddine Hassine
(Jordan University of Science and Technology, Jordan; Concordia University, Canada; KFUPM, Saudi Arabia)
Several tools exist for analyzing traces generated from HPC (High Performance Computing) applications, used by software engineers for debugging and other maintenance tasks. These tools, however, use different formats to represent HPC traces, which hinders interoperability and data exchange. At present, there is no standard metamodel that represents HPC trace concepts and their relations. In this paper, we argue that the lack of a common metamodel is a serious impediment to effective analysis for this class of software systems. We aim to fill this void by presenting MTF2 (MPI Trace Format 2), a metamodel for representing HPC system traces. MTF2 is built with expressiveness and scalability in mind. Scalability, an important requirement when working with large traces, is achieved by adopting graph theory concepts to compact large traces. We show through a case study that a trace represented in MTF2 can be on average 49% smaller than a trace represented in a format that does not consider compaction.
Automated Extraction of Failure Reproduction Steps from User Interaction Traces
Tobias Roehm, Stefan Nosovic, and Bernd Bruegge
(TU München, Germany)
Bug reports submitted by users and crash reports collected by crash reporting tools often lack information about reproduction steps, i.e., the steps necessary to reproduce a failure. Hence, developers have difficulty reproducing field failures and might not be able to fix all reported bugs. We present an approach to automatically extract failure reproduction steps from user interaction traces. We capture interactions between a user and a WIMP GUI using a capture/replay tool. Then, we extract the minimal, failure-inducing subsequence of the captured interaction traces. We use three algorithms to perform this extraction: Delta Debugging, Sequential Pattern Mining, and a combination of both. Delta Debugging automatically replays subsequences of an interaction trace to identify the minimal, failure-inducing subsequence. Sequential Pattern Mining identifies the common subsequence in interaction traces inducing the same failure. We evaluated our approach in a case study. We injected four bugs into the code of a mail client application, collected interaction traces of five participants trying to find these bugs, and applied the extraction algorithms. Delta Debugging extracted the minimal, failure-inducing interaction subsequence in 90% of all cases. Sequential Pattern Mining produced failure-inducing interaction sequences in 75% of all cases and removed on average 93% of unnecessary interactions, potentially enabling manual analysis by developers. The two algorithms complement each other, as they are applicable in different contexts and can be combined to improve performance.
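The minimization step of Delta Debugging can be sketched with Zeller's classic ddmin scheme (a simplified version, not the paper's exact implementation; the `fails` predicate stands in for replaying a candidate trace against the instrumented application, and the interaction trace below is invented):

```python
def ddmin(trace, fails, n=2):
    """Shrink `trace` to a smaller subsequence for which `fails` still holds.
    Assumes fails(trace) is True for the initial trace."""
    while len(trace) >= 2:
        chunk = max(len(trace) // n, 1)
        subsets = [trace[i:i + chunk] for i in range(0, len(trace), chunk)]
        reduced = False
        for i in range(len(subsets)):
            # Remove one subset and check whether the failure persists.
            complement = [e for j, s in enumerate(subsets) if j != i for e in s]
            if fails(complement):
                trace, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(trace):
                break          # already at finest granularity
            n = min(n * 2, len(trace))  # try finer-grained subsets
    return trace

# Toy failure model: the app crashes whenever the trace contains both a
# "copy" and a "paste" interaction.
def fails(t):
    return "copy" in t and "paste" in t

print(ddmin(["open", "copy", "scroll", "paste", "close"], fails))
# -> ['copy', 'paste']
```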
Misery Loves Company: CrowdStacking Traces to Aid Problem Detection
Tommaso Dal Sasso, Andrea Mocci, and Michele Lanza
(University of Lugano, Switzerland)
During software development, exceptions are by no means exceptional: Programmers repeatedly try and test their code to ensure that it works as expected. While doing so, runtime exceptions are raised, pointing out various issues, such as inappropriate usage of an API, convoluted code, as well as defects. Such failures result in stack traces, lists composed of the sequence of method invocations that led to the interruption of the program. Stack traces are useful to debug source code, and if shared also enhance the quality of bug reports. However, they are handled manually and individually, while we argue that they can be leveraged automatically and collectively to enable what we call crowdstacking, the automated collection of stack traces on the scale of a whole development community. We present our crowdstacking approach, supported by ShoreLine Reporter, a tool which seamlessly collects stack traces during program development and execution and stores them on a central repository. We illustrate how thousands of stack traces stemming from the IDEs of several developers can be leveraged to identify common hot spots in the code that are involved in failures, using this knowledge to retrieve relevant and related bug reports and to provide an effective, instant context of the problem to the developer.
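The hot-spot idea can be illustrated with a toy sketch (ShoreLine Reporter's actual pipeline is more involved, and the trace data here is hypothetical): pool stack traces from many developers and rank the methods that recur across distinct failures.

```python
from collections import Counter

def hot_spots(traces, top=3):
    """traces: list of stack traces, each a list of frames (innermost first).
    Counts each frame at most once per trace so a single deep recursion
    cannot dominate the ranking."""
    counts = Counter(frame for t in traces for frame in set(t))
    return counts.most_common(top)

# Hypothetical stack traces pooled from several developers' IDEs.
traces = [
    ["Parser.read", "Lexer.next", "Main.run"],
    ["Parser.read", "Cache.get", "Main.run"],
    ["Editor.save", "Parser.read"],
]
print(hot_spots(traces, top=2))  # Parser.read appears in all three traces
```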

Code Reviews

Who Should Review My Code? A File Location-Based Code-Reviewer Recommendation Approach for Modern Code Review
Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Raula Gaikovina Kula, Norihiro Yoshida, Hajimu Iida, and Kenichi Matsumoto
(NAIST, Japan; Osaka University, Japan; Nagoya University, Japan)
Software code review is an inspection of a code change by an independent third-party developer in order to identify and fix defects before integration. Effectively performed code review can improve overall software quality. In recent years, Modern Code Review (MCR), a lightweight and tool-based code inspection, has been widely adopted in both proprietary and open-source software systems. Finding appropriate code-reviewers in MCR is a necessary step of reviewing a code change. However, little is known about the difficulty of finding code-reviewers in distributed software development and its impact on reviewing time. In this paper, we investigate the impact that the code-reviewer assignment problem has on reviewing time. We find that reviews affected by the code-reviewer assignment problem take 12 days longer to approve a code change. To help developers find appropriate code-reviewers, we propose RevFinder, a file location-based code-reviewer recommendation approach. We leverage the similarity of previously reviewed file paths to recommend an appropriate code-reviewer. The intuition is that files located in similar file paths would be managed and reviewed by similarly experienced code-reviewers. Through an empirical evaluation on a case study of 42,045 reviews from the Android Open Source Project (AOSP), OpenStack, Qt, and LibreOffice projects, we find that RevFinder correctly recommended code-reviewers for 79% of reviews within its top-10 recommendations. RevFinder also correctly recommended the code-reviewers with a median rank of 4. The overall ranking of RevFinder is 3 times better than that of a baseline approach. We believe that RevFinder could be applied to MCR to help developers find appropriate code-reviewers and speed up the overall code review process.
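The file-path intuition behind RevFinder can be sketched as follows (a hypothetical simplification; the actual approach combines several string-similarity functions over review histories, while this sketch scores reviewers by shared path-prefix length only, and all reviewer names and paths are invented):

```python
def path_similarity(p1, p2):
    """Fraction of leading path components two file paths share."""
    a, b = p1.split("/"), p2.split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))

def recommend(new_files, history):
    """history: list of (reviewer, previously reviewed file path).
    Ranks reviewers by accumulated path similarity to the new review's files."""
    scores = {}
    for reviewer, reviewed in history:
        for f in new_files:
            scores[reviewer] = scores.get(reviewer, 0.0) + path_similarity(reviewed, f)
    return sorted(scores, key=scores.get, reverse=True)

history = [
    ("alice", "ui/widgets/button.cpp"),
    ("alice", "ui/widgets/label.cpp"),
    ("bob", "net/socket.cpp"),
]
print(recommend(["ui/widgets/menu.cpp"], history))  # alice ranked first
```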
Code Review: Veni, ViDI, Vici
Yuriy Tymchuk, Andrea Mocci, and Michele Lanza
(University of Lugano, Switzerland)
Modern software development sees code review as a crucial part of the process, because not only does it facilitate the sharing of knowledge about the system at hand, but it may also lead to the early detection of defects, ultimately improving the quality of the produced software. Although supported by numerous approaches and tools, code review is still in its infancy, and indeed researchers have pointed out a number of shortcomings in the state of the art. We present a critical analysis of the state of the art of code review tools and techniques, extracting a set of desired features that code review tools should possess. We then present our vision and initial implementation of a novel code review approach named Visual Design Inspection (ViDI), illustrated through a set of usage scenarios. ViDI is based on a combination of visualization techniques, design heuristics, and static code analysis techniques.
Would Static Analysis Tools Help Developers with Code Reviews?
Sebastiano Panichella, Venera Arnaoudova, Massimiliano Di Penta, and Giuliano Antoniol
(University of Zurich, Switzerland; Polytechnique Montréal, Canada; University of Sannio, Italy)
Code reviews have been conducted for decades in software projects, with the aim of improving code quality from many different points of view. During code reviews, developers are supported by checklists, coding standards, and, possibly, various kinds of static analysis tools. This paper investigates whether warnings highlighted by static analysis tools are taken care of during code reviews and whether there are kinds of warnings that tend to be removed more than others. Results of a study conducted by mining the Gerrit repository of six Java open source projects indicate that the density of warnings varies only slightly after each review. The overall percentage of warnings removed during reviews is slightly higher than what previous studies found for the overall project evolution history. However, when looking (quantitatively and qualitatively) at specific categories of warnings, we found that during code reviews developers focus on certain kinds of problems. For such categories of warnings, the removal percentage tends to be very high, often above 50% and sometimes up to 100%. Examples are warnings in the imports, regular expressions, and type resolution categories. In conclusion, while a broad warning detection might produce far too many false positives, enforcing the removal of certain warnings prior to patch submission could reduce the amount of effort required during the code review process.
Do Code Review Practices Impact Design Quality? A Case Study of the Qt, VTK, and ITK Projects
Rodrigo Morales, Shane McIntosh, and Foutse Khomh
(Polytechnique Montréal, Canada; Queen's University, Canada)
Code review is the process of having other team members examine changes to a software system in order to evaluate their technical content and quality. A lightweight variant of this practice, often referred to as Modern Code Review (MCR), is widely adopted by software organizations today. Previous studies have established a relation between the practice of code review and the occurrence of post-release bugs. While this prior work studies the impact of code review practices on software release quality, it is still unclear what impact code review practices have on software design quality. Therefore, using the occurrence of 7 different types of anti-patterns (i.e., poor solutions to design and implementation problems) as a proxy for software design quality, we set out to investigate the relationship between code review practices and software design quality. Through a case study of the Qt, VTK, and ITK open source projects, we find that software components with low review coverage or low review participation are often more prone to the occurrence of anti-patterns than components with more active code review practices.

Searching and Cloning

Scaling up Evaluation of Code Search Tools through Developer Usage Metrics
Kostadin Damevski, David C. Shepherd, and Lori Pollock
(Virginia State University, USA; ABB, USA; University of Delaware, USA)
Code search is a fundamental part of program understanding and software maintenance and thus researchers have developed many techniques to improve its performance, such as corpora preprocessing and query reformulation. Unfortunately, to date, evaluations of code search techniques have largely been in lab settings, while scaling and transitioning to effective practical use demands more empirical feedback from the field. This paper addresses that need by studying metrics based on automatically-gathered anonymous field data from code searches to infer user satisfaction. We describe techniques for addressing important concerns, such as how privacy is retained and how the overhead on the interactive system is minimized. We perform controlled user and field studies which identify metrics that correlate with user satisfaction, enabling the future evaluation of search tools through anonymous usage data. In comparing our metrics to similar metrics used in Internet search we observe differences in the relationship of some of the metrics to user satisfaction. As we further explore the data, we also present a predictive multi-metric model that achieves accuracy of over 70% in determining query satisfaction.
Optimized Feature Selection towards Functional and Non-functional Requirements in Software Product Lines
Xiaoli Lian and Li Zhang
(Beihang University, China)
As an important research issue in software product lines, feature selection is extensively studied. Besides the basic functional requirements (FRs), non-functional requirements (NFRs) are also critical during feature selection. Some NFRs have numerical constraints, while others do not. Without clear criteria, the latter are simply expected to be as good as possible. However, most existing selection methods ignore the combination of constrained and unconstrained NFRs and FRs. Meanwhile, the complex constraints and dependencies among features are perpetual challenges for feature selection. To this end, this paper proposes a multi-objective optimization algorithm, IVEA, to optimize the selection of features with NFRs and FRs by considering the relations among these features. In particular, we first propose a two-dimensional fitness function. One dimension optimizes the NFRs without quantitative constraints. The other assures that the selected features satisfy the FRs and conform to the relations among features. Second, we propose a violation-dominance principle, which guides the optimization under FRs and the relations among features. We conducted comprehensive experiments on two feature models of different sizes to evaluate IVEA against state-of-the-art multi-objective optimization algorithms, including IBEAHD, IBEAɛ+, NSGA-II, and SPEA2. The results show that IVEA significantly outperforms the above baselines in NFR optimization. Meanwhile, our algorithm needs less time to generate a solution that meets the FRs and the constraints on NFRs and fully conforms to the feature model.
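The violation-dominance idea described in the abstract can be sketched as follows. This is a hypothetical illustration, not IVEA's implementation: the toy feature model, constraint predicates, and objective functions are invented here. A candidate with fewer violated FR/feature-model constraints dominates outright; only when both candidates are feasible does Pareto dominance over the unconstrained NFR objectives decide.

```python
# Hypothetical sketch of violation dominance: candidate feature selections
# are compared first by how many constraints they violate, and only when
# both are feasible by Pareto dominance over NFR objectives to be maximized.

def violations(selection, constraints):
    """Count constraints (each a predicate over the selection) that fail."""
    return sum(1 for c in constraints if not c(selection))

def pareto_dominates(a_objs, b_objs):
    """True if a is at least as good on every objective and better on one."""
    return all(x >= y for x, y in zip(a_objs, b_objs)) and \
           any(x > y for x, y in zip(a_objs, b_objs))

def violation_dominates(a, b, constraints, objectives):
    va, vb = violations(a, constraints), violations(b, constraints)
    if va != vb:            # fewer violated constraints wins outright
        return va < vb
    if va > 0:              # both infeasible to the same degree: no preference
        return False
    return pareto_dominates([f(a) for f in objectives],
                            [f(b) for f in objectives])

# Toy model: feature 0 requires feature 1; objectives reward features 2 and 3.
constraints = [lambda s: s[1] if s[0] else True]
objectives = [lambda s: s[2], lambda s: s[3]]
feasible = (1, 1, 1, 0)
infeasible = (1, 0, 1, 1)
```

Under this ordering, `feasible` dominates `infeasible` regardless of its weaker objective values, which is what steers the search toward selections that respect the feature model first.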
Threshold-Free Code Clone Detection for a Large-Scale Heterogeneous Java Repository
Iman Keivanloo, Feng Zhang, and Ying Zou
(Queen's University, Canada)
Code clones are unavoidable entities in software ecosystems. A variety of clone-detection algorithms are available for finding code clones. For Type-3 clone detection at method granularity (i.e., similar methods with changes in statements), the dissimilarity threshold is one of the possible configuration parameters. Existing approaches use a single threshold to detect Type-3 clones across a repository. However, our study shows that detecting Type-3 clones at method granularity on a large-scale heterogeneous repository often requires multiple thresholds. We find that the performance of clone detection improves if different thresholds are selected for various groups of clones in a heterogeneous repository (i.e., various applications). In this paper, we propose a threshold-free approach to detect Type-3 clones at method granularity across a large number of applications. Our approach uses an unsupervised learning algorithm, i.e., k-means, to determine true and false clones. We use a clone benchmark with 330,840 tagged clones from 24,824 open source Java projects for our study. We observe that our approach improves performance significantly, by 12% in terms of F-measure. Furthermore, our threshold-free approach eliminates practitioners' concerns about possible misconfiguration of Type-3 clone detection tools.
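The threshold-free idea can be illustrated with a toy sketch (a minimal 1-D k-means over invented scores, not the authors' implementation): clone-pair dissimilarity scores are clustered into two groups, and the cluster with the lower mean dissimilarity is treated as the true clones, replacing a single hand-tuned threshold with a boundary learned from the repository itself.

```python
# Illustrative sketch: 1-D k-means with k=2 over clone-pair dissimilarity
# scores; the cluster with the lower mean is labeled "true clone".

def kmeans_1d(values, iterations=50):
    c_lo, c_hi = min(values), max(values)          # initial centroids
    for _ in range(iterations):
        lo = [v for v in values if abs(v - c_lo) <= abs(v - c_hi)]
        hi = [v for v in values if abs(v - c_lo) > abs(v - c_hi)]
        c_lo = sum(lo) / len(lo) if lo else c_lo
        c_hi = sum(hi) / len(hi) if hi else c_hi
    return c_lo, c_hi

def label_clones(dissimilarities):
    c_true, c_false = kmeans_1d(dissimilarities)
    return [abs(d - c_true) <= abs(d - c_false) for d in dissimilarities]

# Two obvious groups: near-identical pairs (~0.1) and unrelated pairs (~0.8).
scores = [0.05, 0.10, 0.12, 0.75, 0.80, 0.85]
labels = label_clones(scores)
```

In a heterogeneous repository the decision boundary adapts per group of clones, which is exactly what a single global threshold cannot do.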
Detecting Duplicate Bug Reports with Software Engineering Domain Knowledge
Karan Aggarwal, Tanner Rutgers, Finbarr Timbers, Abram Hindle, Russ Greiner, and Eleni Stroulia
(University of Alberta, Canada)
In previous work by Alipour et al., a methodology was proposed for detecting duplicate bug reports by comparing the textual content of bug reports to subject-specific contextual material, namely lists of software-engineering terms, such as non-functional requirements and architecture keywords. When a bug report contains a word in these word-list contexts, the bug report is considered to be associated with that context and this information tends to improve bug-deduplication methods. In this paper, we propose a method to partially automate the extraction of contextual word lists from software-engineering literature. Evaluating this software-literature context method on real-world bug reports produces useful results that indicate this semi-automated method has the potential to substantially decrease the manual effort used in contextual bug deduplication while suffering only a minor loss in accuracy.
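The contextual word-list idea can be sketched as follows; the word lists and reports below are illustrative placeholders, not the lists extracted in the paper. A bug report is tagged with each context whose word list it mentions, and shared context tags between two reports serve as an extra signal for duplicate detection.

```python
# Hedged sketch of context features for bug deduplication. The CONTEXTS
# word lists are invented for illustration only.

import re

CONTEXTS = {
    "performance": {"slow", "latency", "throughput", "memory"},
    "security": {"overflow", "injection", "permission", "leak"},
    "ui": {"button", "dialog", "layout", "render"},
}

def context_features(report_text):
    """Contexts whose word list intersects the report's words."""
    words = set(re.findall(r"[a-z]+", report_text.lower()))
    return {ctx for ctx, wordlist in CONTEXTS.items() if words & wordlist}

def shared_contexts(report_a, report_b):
    """Contexts two reports have in common -- a cheap duplicate signal."""
    return context_features(report_a) & context_features(report_b)

a = "App becomes slow and memory usage grows after opening the dialog"
b = "High latency and memory leak when the settings dialog is shown"
```

Semi-automating the extraction of such word lists from software-engineering literature, as the paper proposes, replaces the manual curation step while keeping this matching machinery unchanged.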

Change Impact Analysis

Impact Analysis Based on a Global Hierarchical Object Graph
Marwan Abi-Antoun, Yibin Wang, Ebrahim Khalaj, Andrew Giang, and Václav Rajlich
(Wayne State University, USA)
During impact analysis on object-oriented code, statically extracting dependencies is often complicated by subclassing, programming to interfaces, aliasing, and collections, among others. When a tool recommends a large number of types or does not rank its recommendations, it may lead developers to explore more irrelevant code. We propose to mine and rank dependencies based on a global, hierarchical points-to graph that is extracted using abstract interpretation. A previous whole-program static analysis interprets a program enriched with annotations that express hierarchy, and over-approximates all the objects that may be created at runtime and how they may communicate. In this paper, an analysis mines the hierarchy and the edges in the graph to extract and rank dependencies such as the most important classes related to a class, or the most important classes behind an interface. An evaluation using two case studies on two systems totaling 10,000 lines of code and five completed code modification tasks shows that following dependencies based on abstract interpretation achieves higher effectiveness compared to following dependencies extracted from the abstract syntax tree. As a result, developers explore less irrelevant code.
A Framework for Cost-Effective Dependence-Based Dynamic Impact Analysis
Haipeng Cai and Raul Santelices
(University of Notre Dame, USA)
Dynamic impact analysis can greatly assist developers with managing software changes by focusing their attention on the effects of potential changes relative to concrete program executions. While dependence-based dynamic impact analysis (DDIA) provides finer-grained results than traceability-based approaches, traditional DDIA techniques often produce imprecise results, incurring excessive costs and thus hindering their adoption in many practical situations. In this paper, we present the design and evaluation of a DDIA framework and its three new instances that offer not only much more precise impact sets but also flexible cost-effectiveness options to meet diverse application needs, such as different budgets and levels of detail of results. By exploiting both static dependencies and various kinds of dynamic information, including method-execution traces, statement coverage, and dynamic points-to data, our techniques achieve that goal at reasonable costs according to our experimental results. Our study also suggests that statement coverage generally has stronger effects on the precision and cost-effectiveness of DDIA than dynamic points-to data.
Circular Dependencies and Change-Proneness: An Empirical Study
Tosin Daniel Oyetoyan, Jens Dietrich, Jean-Rémy Falleri, and Kamil Jezek
(NTNU, Norway; Massey University, New Zealand; LaBRI, France; University of Bordeaux, France; University of West Bohemia, Czech Republic)
Advice that circular dependencies between programming artefacts should be avoided goes back to the earliest work on software design, and is well-established and rarely questioned. However, empirical studies have shown that real-world (Java) programs are riddled with circular dependencies between artefacts on different levels of abstraction and aggregation. It has been suggested that additional heuristics could be used to distinguish between bad and harmless cycles, for instance by relating them to the hierarchical structure of the packages within a program, or to violations of additional design principles. In this study, we explore this question further by analysing the relationship between different kinds of circular dependencies between Java classes and their change frequency. We find that (1) the presence of cycles can have a significant impact on the change proneness of the classes near these cycles, and (2) neither subtype knowledge nor the location of the cycle within the package containment tree is a suitable criterion to distinguish between critical and harmless cycles.
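The circular dependencies studied here can be located mechanically: classes in a non-trivial strongly connected component of the class dependency graph lie on a cycle. The sketch below (an assumed adjacency-list input, not the paper's tooling) uses Tarjan's algorithm, which finds all components in linear time.

```python
# Sketch: find classes that participate in circular dependencies by
# computing strongly connected components (Tarjan's algorithm).

def strongly_connected_components(graph):
    index, lowlink, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def visit(v):
        index[v] = lowlink[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:           # v is the root of an SCC
            scc = set()
            while True:
                w = stack.pop(); on_stack.discard(w); scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

def classes_on_cycles(graph):
    """Classes inside any SCC of size > 1, i.e., on a circular dependency."""
    return {v for scc in strongly_connected_components(graph)
            if len(scc) > 1 for v in scc}

# A <-> B form a cycle; C only depends on A and stays off any cycle.
deps = {"A": ["B"], "B": ["A"], "C": ["A"]}
```

The study's further step is then classifying such cycles, e.g., by subtype knowledge or package containment, rather than merely finding them.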
An Empirical Study of Work Fragmentation in Software Evolution Tasks
Heider Sanchez, Romain Robbes, and Victor M. Gonzalez
(University of Chile, Chile; ITAM, Mexico)
Information workers and software developers are exposed to work fragmentation, an interleaving of activities and interruptions during their normal work day. Small-scale observational studies have shown that this can be detrimental to their work. In this paper, we perform a large-scale study of this phenomenon for the particular case of software developers performing software evolution tasks. Our study is based on several thousand interaction traces collected by Mylyn for dozens of developers. We observe that work fragmentation is correlated with lower observed productivity at both the macro level (for entire sessions) and the micro level (around markers of work fragmentation); further, longer activity switches seem to strengthen the effect. These observations form the basis for subsequent studies investigating the phenomenon of work fragmentation.

SCAM at SANER

Library Functions Identification in Binary Code by Using Graph Isomorphism Testings
Jing Qiu, Xiaohong Su, and Peijun Ma
(Harbin Institute of Technology, China)
Library function identification is a key technique in reverse engineering. Discontinuity and polymorphism of inline and optimized library functions in binary code create a difficult challenge for library function identification. To solve this problem, a novel approach is developed to identify library functions. First, we introduce execution dependence graphs (EDGs) to describe the behavior characteristics of binary code. Then, by finding similar EDG subgraphs in target functions, we identify both full and inline library functions. Experimental results from the prototype tool show that the proposed method is not only capable of identifying inline functions but is also more efficient and precise than the current methods for identifying full library functions.
A Non-convex Abstract Domain for the Value Analysis of Binaries
Sven Mattsen, Arne Wichmann, and Sibylle Schupp
(TU Hamburg, Germany)
A challenge in sound reverse engineering of binary executables is to determine sets of possible targets for dynamic jumps. One technique to address this challenge is abstract interpretation, where singleton values in registers and memory locations are overapproximated to collections of possible values. With contemporary abstract interpretation techniques, convexity is usually enforced on these collections, which causes unacceptable loss of precision. We present a non-convex abstract domain, suitable for the analysis of binary executables. The domain is based on binary decision diagrams (BDDs) to allow an efficient representation of non-convex sets of integers. Non-convex sets are necessary to represent the results of jump table lookups and bitwise operations, which are more frequent in executables than in high-level code because of optimizing compilers. Our domain computes abstract bitwise and arithmetic operations precisely and loses precision only for division and multiplication. Because the operations are defined on the structure of the BDDs, they remain efficient even if executed on very large sets. In executables, conditional jumps require solving formulas built with negation and conjunction. We implement a constraint solver using the fast intersection and complementation of BDD-based sets. Our domain is implemented as a plug-in, called BDDStab, and integrated with the binary analysis framework Jakstab. We use Jakstab's k-set and interval domains to discuss the increase in precision for a selection of compiler-generated executables.
Precision vs. Scalability: Context Sensitive Analysis with Prefix Approximation
Raveendra Kumar Medicherla and Raghavan Komondoor
(Tata Consultancy Services, India; Indian Institute of Science, India)
Context sensitive inter-procedural dataflow analysis is a precise approach for static analysis of programs. It is very expensive in its full form. We propose a prefix approximation for context sensitive analysis, wherein a prefix of the full context stack is used to tag dataflow facts. Our technique, which is in contrast with suffix approximation that has been widely used in the literature, is designed to be more scalable when applied to programs with modular structure. We describe an instantiation of our technique in the setting of the classical call-strings approach for inter-procedural analysis. We analyzed several large enterprise programs using an implementation of our technique, and compared it with the fully context sensitive, context insensitive, as well as suffix-approximated variants of the call-strings approach. The precision of our technique was in general less than that of suffix approximation when measured on entire programs. However, the precision that it offered for outer-level procedures, which typically contain key business logic, was better, and its performance was much better.
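The prefix approximation can be sketched in a few lines; this is a hypothetical illustration of the tagging scheme, not the authors' analysis engine. Each dataflow fact is tagged with only the first k call sites from the entry point, so facts reached through deeply nested callees collapse into one context while outer-level procedures, which typically hold the key business logic, stay distinguished.

```python
# Hypothetical illustration of prefix approximation for call-strings:
# keep only the k-length prefix of each call string as the context tag.

def prefix_context(call_string, k):
    """Approximate a call string by its k-length prefix."""
    return tuple(call_string[:k])

def group_facts(tagged_facts, k):
    """Merge facts whose call strings share the same k-prefix."""
    grouped = {}
    for call_string, fact in tagged_facts:
        grouped.setdefault(prefix_context(call_string, k), set()).add(fact)
    return grouped

# Facts reaching the same outer call site through different deep callees
# are merged under prefix approximation with k = 2.
facts = [
    (("main", "handle", "parse"), "x=1"),
    (("main", "handle", "log"), "x=2"),
    (("main", "report"), "y=3"),
]
merged = group_facts(facts, 2)
```

Suffix approximation would instead keep the last k call sites, merging the outer contexts and distinguishing the inner ones, which is the trade-off the paper contrasts.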
MG++: Memory Graphs for Analyzing Dynamic Data Structures
Vineet Singh, Rajiv Gupta, and Iulian Neamtiu
(University of California at Riverside, USA)
Memory graphs are very useful in understanding the behavior of programs that use dynamically allocated data structures. We present a new memory graph representation, MG++, and a memory graph construction algorithm that greatly enhance the utility of memory graphs. First, in addition to capturing the shapes of dynamically-constructed data structures, MG++ also captures how they evolve as the program executes and records the source code statements that play a role in their evolution to assist in debugging. Second, MG++ captures the history of actions performed by the memory allocator. This is useful in debugging programs that internally manage storage or in cases where understanding program behavior requires examining memory allocator actions. Our binary instrumentation-based algorithm for MG++ construction does not rely on knowledge of memory allocator functions or on symbol table information. Our algorithm works for custom memory allocators as well as for in-program memory management. Experiments studying the time and space efficiency for real-world programs show that the MG++ representation is space-efficient and that the time overhead of the MG++ construction algorithm is practical. We show that MG++ is effective for fault location and for analyzing binaries to detect heap buffer overflow attacks.

Mining Software Repositories

SQA-Profiles: Rule-Based Activity Profiles for Continuous Integration Environments
Martin Brandtner, Sebastian C. Müller, Philipp Leitner, and Harald C. Gall
(University of Zurich, Switzerland)
Continuous Integration (CI) environments cope with the repeated integration of source code changes and provide rapid feedback about the status of a software project. However, as the integration cycles become shorter, the amount of data increases and the effort to find information in CI environments becomes substantial. In modern CI environments, the selection of measurements (e.g., build status, quality metrics) listed in a dashboard changes only with the intervention of a stakeholder (e.g., a project manager). In this paper, we address this shortcoming of static views with so-called Software Quality Assessment (SQA) profiles. SQA-Profiles are defined as rulesets and enable a dynamic composition of CI dashboards based on stakeholder activities in the tools of a CI environment (e.g., the version control system). We present a set of SQA-Profiles for project management committee (PMC) members: Bandleader, Integrator, Gatekeeper, and Onlooker. For this, we mined the commit and issue management activities of PMC members from 20 Apache projects. We implemented a framework to evaluate the performance of our rule-based SQA-Profiles in comparison to a machine learning approach. The results showed that project-independent SQA-Profiles can be used to automatically extract the profiles of PMC members with a precision of 0.92 and a recall of 0.78.
Cross-Project Build Co-change Prediction
Xin Xia, David Lo, Shane McIntosh, Emad Shihab, and Ahmed E. Hassan
(Zhejiang University, China; Singapore Management University, Singapore; Queen's University, Canada; Concordia University, Canada)
Build systems orchestrate how human-readable source code is translated into executable programs. In a software project, source code changes can induce changes in the build system (a.k.a. build co-changes). It is difficult for developers to identify when build co-changes are necessary due to the complexity of build systems. Prediction of build co-changes works well if there is a sufficient amount of training data to build a model. In practice, however, new projects have only a limited number of changes. Using training data from other projects to predict the build co-changes in a new project can help improve the performance of build co-change prediction. We refer to this problem as cross-project build co-change prediction. In this paper, we propose CroBuild, a novel cross-project build co-change prediction approach that iteratively learns new classifiers. CroBuild constructs an ensemble of classifiers by iteratively building classifiers and assigning them weights according to their prediction error rates. Given that only a small proportion of code changes are build co-changing, we also propose an imbalance-aware approach that learns a threshold boundary between those code changes that are build co-changing and those that are not, in order to construct classifiers in each iteration. To examine the benefits of CroBuild, we perform experiments on 4 large datasets, including Mozilla, Eclipse-core, Lucene, and Jazz, comprising a total of 50,884 changes. On average, across the 4 datasets, CroBuild achieves an F1-score of up to 0.408. We also compare CroBuild with other approaches, such as a basic model, AdaBoost proposed by Freund et al., and TrAdaBoost proposed by Dai et al. On average, across the 4 datasets, the CroBuild approach yields an improvement in F1-scores of 41.54%, 36.63%, and 36.97% over the basic model, AdaBoost, and TrAdaBoost, respectively.
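The error-rate-weighted ensemble idea can be sketched as follows. This is a simplified, hypothetical illustration (AdaBoost-style weighting over invented toy classifiers), not CroBuild's actual learner or its imbalance handling.

```python
# Sketch of an ensemble weighted by error rate: low-error classifiers get
# high weight; the ensemble predicts "build co-change" when the weighted
# vote exceeds half the total weight.

import math

def classifier_weight(error_rate, eps=1e-6):
    e = min(max(error_rate, eps), 1 - eps)
    return math.log((1 - e) / e)        # AdaBoost-style weight

def ensemble_predict(classifiers, change):
    """classifiers: list of (predict_fn, error_rate); predict_fn -> 0 or 1."""
    weights = [classifier_weight(err) for _, err in classifiers]
    vote = sum(w * fn(change) for (fn, _), w in zip(classifiers, weights))
    return int(vote > sum(weights) / 2)

clfs = [
    (lambda c: 1, 0.10),   # strong classifier votes "co-change"
    (lambda c: 0, 0.45),   # two near-random classifiers vote "no"
    (lambda c: 0, 0.45),
]
```

Here the single low-error classifier outweighs the two near-random ones, which is the behavior the iterative weighting is designed to produce.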
The Influence of App Churn on App Success and StackOverflow Discussions
Latifa Guerrouj, Shams Azad, and Peter C. Rigby
(Concordia University, Canada)
Gauging the success of software systems has been difficult in the past, as there was no uniform measure. With mobile Application (App) Stores, users rate each App according to a common rating scheme. In this paper, we study the impact of App churn on App success through the analysis of 154 free Android Apps that have a total of 1.2k releases. We provide a novel technique to extract the Android API elements that App developers change between releases. We find that high App churn leads to lower user ratings. For example, we find that, on average, per release, poorly rated Apps change 140 methods compared to the 82 methods changed by positively rated Apps. Our findings suggest that developers should not release new features at the expense of churn and user ratings. We also investigate the link between how frequently API classes and methods are changed by App developers and the amount of discussion of these code elements on StackOverflow. Our findings indicate that classes and methods that are changed frequently by App developers appear in more posts on StackOverflow. We add to the growing consensus that StackOverflow keeps up with the documentation needs of practitioners.
Beyond Support and Confidence: Exploring Interestingness Measures for Rule-Based Specification Mining
Tien-Duy B. Le and David Lo
(Singapore Management University, Singapore)
Numerous rule-based specification mining approaches have been proposed in the literature. Many of these approaches analyze a set of execution traces to discover interesting usage rules, e.g., whenever lock() is invoked, eventually unlock() is invoked. These techniques often generate and enumerate a set of candidate rules and compute interestingness scores for them. Rules whose interestingness scores are above a certain threshold are then output. In past studies, two well-known measures, namely support and confidence, are often used to compute these scores. However, aside from these two, many other interestingness measures have been proposed. It is thus unclear whether support and confidence are the best interestingness measures for specification mining. In this work, we perform an empirical study that investigates the utility of 38 interestingness measures in recovering correct specifications of classes from Java libraries. We used a ground truth dataset consisting of 683 rules and recorded execution traces produced by running the DaCapo test suite. We applied the 38 interestingness measures to identify correct rules from a pool of candidate rules. Our study highlights that many measures are on par with support and confidence. Some of the measures are even better than support or confidence, and at least one of the measures is statistically significantly better than the two. We also find that compositions of several measures with support statistically significantly outperform the composition of support and confidence. Our findings highlight the need to look beyond standard support and confidence to find interesting rules.
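The two baseline measures can be sketched for a temporal rule of the kind the abstract mentions, "whenever lock() is invoked, eventually unlock() is invoked". This is one common formulation (definitions of support vary across the literature), with invented toy traces: support counts traces where the rule fires and holds, relative to all traces; confidence is the fraction of firing traces in which it holds.

```python
# Hedged sketch: support and confidence of a "pre => eventually post"
# rule over a set of execution traces (one common formulation).

def rule_stats(traces, pre, post):
    fires = holds = 0
    for trace in traces:
        first = next((i for i, ev in enumerate(trace) if ev == pre), None)
        if first is None:
            continue                     # rule never fires in this trace
        fires += 1
        if post in trace[first + 1:]:    # post eventually follows pre
            holds += 1
    support = holds / len(traces)
    confidence = holds / fires if fires else 0.0
    return support, confidence

traces = [
    ["open", "lock", "write", "unlock", "close"],
    ["lock", "read"],                    # lock never released: violation
    ["open", "close"],                   # rule never fires here
]
support, confidence = rule_stats(traces, "lock", "unlock")
```

The study's other 37 measures (lift, conviction, leverage, etc.) are computed from the same underlying counts, which is why swapping the ranking measure is cheap to experiment with.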

On Code Changes

Untangling Fine-Grained Code Changes
Martín Dias, Alberto Bacchelli, Georgios Gousios, Damien Cassou, and Stéphane Ducasse
(INRIA, France; University of Lille, France; Delft University of Technology, Netherlands; Radboud University Nijmegen, Netherlands)
After working for some time, developers commit their code changes to a version control system. When doing so, research shows that they often bundle unrelated changes (e.g., a bug fix and a refactoring) in a single commit, thus creating a so-called tangled commit. Sharing tangled commits is problematic because it makes review, reversion, and integration of these commits harder and historical analyses of the project less reliable. Researchers have worked on untangling existing commits, i.e., finding which part of a commit relates to which task. In this paper, we contribute to this line of work in two ways: (1) a publicly available dataset of untangled code changes, created with the help of two developers who accurately split their code changes into self-contained tasks over a period of four months; (2) based on this dataset, we devise and assess EpiceaUntangler, an approach to help developers share untangled commits (a.k.a. atomic commits) by using fine-grained code change information. We further evaluate EpiceaUntangler by deploying it to 7 developers, who used it for 2 weeks. We recorded a median success rate of 91% and an average one of 75% in automatically creating clusters of untangled fine-grained code changes.
A Comprehensive and Scalable Method for Analyzing Fine-Grained Source Code Change Patterns
Masatomo Hashimoto, Akira Mori, and Tomonori Izumida
(RIKEN Advanced Institute for Computational Science, Japan; National Institute of Advanced Industrial Science and Technology, Japan)
This paper presents a comprehensive method for identifying fine-grained change patterns in the source code of large-scale software projects. Source code changes are computed by differencing abstract syntax trees of adjacent versions and are transferred to a set of logical statements called a factbase. A factbase contains information for tracking and relating source code entities across versions and can be used to integrate the analysis results of other tools, such as call graphs and control flows. Users can obtain a list of change pattern instances by querying the factbase. Experiments conducted on the Linux-2.6 kernel, which involve more than 4 billion facts, are reported to demonstrate the capability of the method.
Summarizing Evolutionary Trajectory by Grouping and Aggregating Relevant Code Changes
Qingtao Jiang, Xin Peng, Hai Wang, Zhenchang Xing, and Wenyun Zhao
(Fudan University, China; Nanyang Technological University, Singapore)
Over its lifecycle, a large-scale software system can go through many releases. Each release often involves hundreds or thousands of revisions committed by many developers over time. Many code changes are made in a systematic and collaborative way. However, such systematic and collaborative code changes are often undocumented and hidden in the evolution history of a software system. It is desirable to recover commonalities and associations among dispersed code changes in the evolutionary trajectory of a software system. In this paper, we present SETGA (Summarizing Evolutionary Trajectory by Grouping and Aggregation), an approach to summarizing historical commit records as trajectory patterns by grouping and aggregating relevant code changes committed over time. SETGA extracts change operations from a series of commit records in version control systems. It then groups the extracted change operations by their common properties along different dimensions, such as change operation types, developers, and change locations. After that, SETGA aggregates relevant change operation groups by mining various associations among them. The proposed approach has been implemented and applied to three open-source systems. The results show that SETGA can identify various types of trajectory patterns that are useful for software evolution management and quality assurance.
Identifying the Exact Fixing Actions of Static Rule Violation
Hayatou Oumarou, Nicolas Anquetil, Anne Etien, Stéphane Ducasse, and Kolyang Dina Taiwe
(University of Maroua, Cameroon; INRIA, France; University of Lille, France)
We study good programming practices expressed in rules and detected by static analysis checkers such as PMD or FindBugs. To understand how violations of these rules are corrected and whether this can be automated, we need to identify in the source code where they appear and how they were fixed. This presents some similarities with research on understanding software bugs, their causes, their fixes, and how they could be avoided. The traditional method to identify how a bug or a rule violation was fixed consists in finding the commit that contains the fix and identifying what was changed in that commit. If the commit is small, all the changed lines are ascribed to the fixing of the rule violation or the bug. However, commits are not always atomic, and several fixes and even enhancements can be mixed in a single one (a large commit). In this case, it is impossible to detect which modifications contribute to which fix. In this paper, we propose a method that precisely identifies the modifications related to the correction of a rule violation. The same method could be applied to bug fixes, provided there is a test illustrating the bug. We validate our solution on a real-world system and actual rules.

The Human Within

CloCom: Mining Existing Source Code for Automatic Comment Generation
Edmund Wong, Taiyue Liu, and Lin Tan
(University of Waterloo, Canada)
Code comments are an integral part of software development. They improve program comprehension and software maintainability. The lack of code comments is a common problem in the software industry. Therefore, it is beneficial to generate code comments automatically. In this paper, we propose a general approach to generate code comments automatically by analyzing existing software repositories. We apply code clone detection techniques to discover similar code segments and use the comments from some code segments to describe the other, similar code segments. We leverage natural language processing techniques to select relevant comment sentences. In our evaluation, we analyze 42 million lines of code from 1,005 open source projects from GitHub and use them to generate 359 code comments for 21 Java projects. We manually evaluate the generated code comments and find that only 23.7% of them are good. We report to the developers the good code comments whose code segments do not have an existing code comment. Amongst the reported code comments, seven have been confirmed by the developers as good and committable to the software repository, while the rest await developers' confirmation. Although our approach can generate good and committable comments, we still have to improve the yield and accuracy of the proposed approach before it can be used in practice with full automation.
amAssist: In-IDE Ambient Search of Online Programming Resources
Hongwei Li, Xuejiao Zhao, Zhenchang Xing, Lingfeng Bao, Xin Peng, Dongjing Gao, and Wenyun Zhao
(Fudan University, China; Jiangxi Normal University, China; Nanyang Technological University, Singapore; Zhejiang University, China)
Developers work in the IDE but search online resources in the web browser. The separation of the working and search contexts often causes the working context to be ignored during online search. Several tools have been proposed to integrate the web browser into the IDE so that developers can search and use online resources directly in the IDE. These tools enable only a shallow integration of the web browser and the IDE. Some tools allow the developer to augment search queries with program entities in the current snapshot of the code. In this paper, we present an in-IDE ambient search agent to bridge the separation of the developer's working context and search context. Our approach considers the developer's working context in the IDE as a time-series stream of programming events observed from the developer's interaction with the IDE over time. It supports a deeper integration of the working context in the entire search process, from query formulation and custom search to search results refinement and representation. We have implemented our ambient search agent and integrated it into the Eclipse IDE. We conducted a user study to evaluate our approach and the tool support. Our evaluation shows that our ambient search agent can better aid developers in searching and using online programming resources while working in the IDE.
Article Search
Reverse Engineering Time-Series Interaction Data from Screen-Captured Videos
Lingfeng Bao, Jing Li, Zhenchang Xing, Xinyu Wang, and Bo Zhou
(Zhejiang University, China; Nanyang Technological University, Singapore)
In recent years the amount of research on human aspects of software engineering has increased. Many studies use screen-capture software (e.g., Snagit) to record developers' behavior as they work on software development tasks. The recorded task videos capture direct information about which activities the developers carry out, with which content, and in which applications during the task. Such behavioral data can help researchers and practitioners understand and improve software engineering practices from a human perspective. However, extracting time-series interaction data (software usage and application content) from screen-captured videos requires manual transcribing and coding of the videos, which is tedious and error-prone. In this paper we present a computer-vision-based video scraping technique to automatically reverse-engineer time-series interaction data from screen-captured videos. We report the usefulness, effectiveness, and runtime performance of our video scraping technique in a case study of 29 hours of task videos from 20 developers working on two development tasks.
Article Search
Niche vs. Breadth: Calculating Expertise over Time through a Fine-Grained Analysis
Jose Ricardo da Silva Junior, Esteban Clua, Leonardo Murta, and Anita Sarma
(Federal Fluminense University, Brazil; University of Nebraska-Lincoln, USA)
Identifying expertise in a project is essential for task allocation, knowledge dissemination, and risk management, among other activities. However, keeping a detailed record of such expertise at the class and method levels is cumbersome due to project size, evolution, and team turnover. Existing approaches that automate this task have limitations in terms of the number and granularity of elements that can be analyzed and the analysis timeframe. In this paper, we introduce a novel technique to identify expertise for a given project, package, file, class, or method by considering not only the total number of edits that a developer has made, but also the spread of their changes in an artifact over time, and thereby the breadth of their expertise. We use Dominoes, our GPU-based approach for exploratory repository analysis, for expertise identification over any given granularity and time period with a short processing time. We evaluated our approach on Apache Derby and observed that granularity and time can have a significant influence on expertise identification.
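Scoring expertise by both edit count and temporal spread can be sketched as follows (the bucketing and the entropy-based weighting are assumptions for illustration, not the Dominoes implementation):

```python
import math
from collections import Counter

def expertise(edit_buckets, n_buckets=4):
    # edit_buckets: for each edit a developer made to an artifact,
    # the index of the time bucket (e.g. quarter of project history)
    # it falls into. The score weights the raw edit count by how
    # evenly the edits are spread over time (normalized entropy),
    # so sustained involvement outranks a single burst of changes.
    if not edit_buckets:
        return 0.0
    counts = Counter(edit_buckets)
    total = len(edit_buckets)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    spread = entropy / math.log(n_buckets)
    return total * (0.5 + 0.5 * spread)
```

Two developers with the same number of edits thus get different scores if one edited throughout the project and the other only once, early on.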
Article Search Info

Search, Touch, Tweet

Protecting Web Applications via Unicode Extension
Boze Zekan, Mark Shtern, and Vassilios Tzerpos
(York University, Canada)
Protecting web applications against security attacks, such as command injection, is an issue that has been attracting increasing attention as such attacks are becoming more prevalent. Taint tracking is an approach that achieves protection while offering significant maintenance benefits when implemented at the language library level. This allows the transparent re-engineering of legacy web applications without the need to modify their source code. Such an approach can be implemented at either the string or the character level. We propose a new phase in the evolution of character-level taint tracking. Our approach provides all the benefits of existing methods, but it is not limited to a specific programming language. It also has a broader range within computing environments as it can easily propagate taint information between networks, servers, clients, applications, documents and operating systems. Finally, it allows a character's taint status to easily be stored in databases, files, and even in filenames. This paper presents our novel character-level taint mechanism. It also describes the web application re-engineering framework we constructed to demonstrate the benefits of our approach. Experiments with a prototype implementation of the framework showcase its usefulness.
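A character-level taint mechanism can be illustrated with a minimal string wrapper that carries one taint flag per character (a simplification; the paper's approach encodes taint via a Unicode extension so that it propagates across applications, documents, and storage):

```python
class TaintedStr:
    # Minimal character-level taint tracking: each character carries
    # a taint flag that survives concatenation and slicing.
    def __init__(self, text, taint=None):
        self.text = text
        self.taint = list(taint) if taint is not None else [False] * len(text)

    @classmethod
    def from_user_input(cls, text):
        # Anything arriving from the outside is fully tainted.
        return cls(text, [True] * len(text))

    def __add__(self, other):
        return TaintedStr(self.text + other.text, self.taint + other.taint)

    def __getitem__(self, i):
        if isinstance(i, slice):
            return TaintedStr(self.text[i], self.taint[i])
        return TaintedStr(self.text[i], [self.taint[i]])

    def is_fully_trusted(self):
        return not any(self.taint)

# A query built from trusted SQL text and tainted user input: a sink
# (e.g. the database driver) can refuse the tainted characters.
query = TaintedStr("SELECT * FROM users WHERE name='") + \
        TaintedStr.from_user_input("alice' OR '1'='1") + TaintedStr("'")
```

The injected quote characters remain tainted even after concatenation, which is what lets a sink distinguish programmer-written SQL from attacker-controlled text.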
Article Search
A Search-Based Approach to Multi-view Clustering of Software Systems
Amir M. Saeidi, Jurriaan Hage, Ravi Khadka, and Slinger Jansen
(Utrecht University, Netherlands)
Unsupervised software clustering is the problem of automatically decomposing a software system into meaningful units. Some approaches rely solely on the structure of the system, such as the module dependency graph, to decompose the software system into cohesive groups of modules. Other techniques focus on the informal knowledge hidden within the source code itself to retrieve the modular architecture of the system. However, in the case of large systems, both techniques fail to produce decompositions that correspond to the actual architecture of the system. To overcome this problem, we propose a novel approach to clustering software systems that incorporates knowledge from different viewpoints of the system, such as the knowledge embedded within the source code as well as the structural dependencies within the system, to produce a clustering. In this setting, we adopt a search-based encoding of multi-view clustering and investigate two approaches to tackle this problem: one based on a linear combination of objectives into a single objective, the other a multi-objective approach to clustering. The two approaches are evaluated on a dataset comprising 10 substantial Java open source projects. Finally, we propose two techniques, based on interpolation and hierarchical clustering, to combine the different results obtained into a single result for the single-objective and multi-objective encodings, respectively.
Article Search
CEL: Touching Software Modeling in Essence
Remo Lemma, Michele Lanza, and Andrea Mocci
(University of Lugano, Switzerland)
Understanding a problem domain is a fundamental prerequisite for good software design. In object-oriented systems design, modeling is the fundamental first phase that focuses on identifying core concepts and their relations. How to properly support modeling is still an open problem, and existing approaches and tools can be very different in nature. On the one hand, lightweight ones, such as pen & paper/whiteboard or CRC cards, are informal and support well the creative aspects of modeling, but produce artifacts that are difficult to store, process and reuse as documentation. On the other hand, more constrained and semi-formal ones, like UML, produce storable and processable structured artifacts with defined semantics, but this comes at the expense of creativity. We believe there exists a middle ground to investigate that maximizes the good of both worlds, that is, by supporting software modeling closer to its essence, with minimal constraints on the developer's creativity and still producing reusable structured artifacts. We also claim that modeling can be best treated by using the emerging technology of touch-based tablets. We present a novel gesture-based modeling approach based on a minimal set of constructs, and Cel, an iPad application, for rapidly creating, manipulating, and storing language agnostic object-oriented software models, which can be exported as skeleton source code in any language of choice. We assess our approach through a controlled qualitative study.
Article Search
NIRMAL: Automatic Identification of Software Relevant Tweets Leveraging Language Model
Abhishek Sharma, Yuan Tian, and David Lo
(Singapore Management University, Singapore)
Twitter is one of the most widely used social media platforms today. It enables users to share and view short 140-character messages called “tweets”. About 284 million active users generate close to 500 million tweets per day. Such rapid generation of user-generated content in large magnitudes results in the problem of information overload. Users who are interested in information related to a particular domain have limited means to filter out irrelevant tweets and tend to get lost in the huge amount of data they encounter. A recent study by Singer et al. found that software developers use Twitter to stay aware of industry trends, to learn from others, and to network with other developers. However, Singer et al. also reported that developers often find Twitter streams to contain too much noise, which is a barrier to the adoption of Twitter. In this paper, to help developers cope with noise, we propose a novel approach named NIRMAL, which automatically identifies software-relevant tweets from a collection or stream of tweets. Our approach is based on language modeling, which learns a statistical model from a training corpus (i.e., a set of documents). We use a subset of posts from StackOverflow, a programming question and answer site, as a training corpus to learn a language model. A corpus of tweets was then used to test the effectiveness of the trained language model. The tweets were sorted based on the rank the model assigned to each of the individual tweets. The top 200 tweets were then manually analyzed to verify whether they are software related or not, and an accuracy score was calculated. The results show that decent accuracy scores can be achieved by various variants of NIRMAL, indicating that NIRMAL can effectively identify software-related tweets from a huge corpus of tweets.
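The ranking idea, scoring tweets by how likely they look under a language model trained on a software-specific corpus, can be sketched with a unigram model and add-one smoothing (the paper uses more elaborate n-gram models trained on StackOverflow; the tiny corpus here is purely illustrative):

```python
import math
from collections import Counter

def train_unigram(corpus):
    # corpus: documents from a software-specific source. Returns
    # (word counts, total tokens, vocabulary size).
    counts = Counter(w for doc in corpus for w in doc.lower().split())
    return counts, sum(counts.values()), len(counts)

def score(tweet, model):
    # Average log-probability per token with add-one smoothing;
    # higher means "more software-like".
    counts, total, vocab = model
    words = tweet.lower().split()
    if not words:
        return float("-inf")
    return sum(math.log((counts[w] + 1) / (total + vocab + 1))
               for w in words) / len(words)

model = train_unigram([
    "how to fix a null pointer exception in java",
    "git merge conflict resolution best practices",
    "unit testing with junit and mockito",
])
tweets = ["fix java exception with junit", "my cat loves sunny afternoons"]
ranked = sorted(tweets, key=lambda t: score(t, model), reverse=True)
```

Software-flavored tweets share vocabulary with the training corpus and rise to the top of the ranking.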
Article Search

Tool Demonstrations

A Static Code Analysis Tool for Control System Software
Sreeja Nair, Raoul Jetley, Anil Nair, and Stefan Hauck-Stattelmann
(ABB Research, India; ABB Research, Germany)
Latent errors in control system software can be hard to detect through traditional testing techniques. Such errors, if left undetected, could manifest themselves as failures during run-time that could be potentially catastrophic and very expensive to fix. In this paper, we present a static code analysis approach to detect potential sources of such run-time errors at compile time, thus ensuring easy identification, safe execution, and reduced debugging effort. In order to detect run-time errors, the control system application is first parsed to generate a set of abstract syntax trees, which in turn are used to derive the control flow graph for the application. A hybrid algorithm, based on abstract interpretation and traditional data flow analysis techniques, is used to check the control flow graph for type constraints, reachability, and liveness properties. Additionally, the abstract syntax trees are used to check for datatype mismatches and compliance violations. A proof-of-concept prototype is implemented to demonstrate how the approach can be used to analyze control applications developed using domain-specific languages such as those complying with the IEC 61131-3 standard.
Article Search
RbG: A Documentation Generator for Scientific and Engineering Software
Michael Moser, Josef Pichler, Günter Fleck, and Michael Witlatschil
(Software Competence Center Hagenberg, Austria; Siemens Transformers Austria, Austria)
This paper demonstrates RbG, a new tool intended for the generation of high-quality documentation from source code of scientific and engineering applications. RbG extracts mathematical formulae and decision tables from program statements by means of static code analysis and generates corresponding documentation in the Open Document Format or LaTeX. Annotations in source code comments are used to define the structure of the generated documents, include additional textual and graphical descriptions, and control extraction of formulae on a fine-grained level. Furthermore, RbG provides an interpreter to generate function plots for extracted formulae. In this tool demonstration we briefly introduce the tool and show its usage for different scenarios such as reverse engineering and re-documentation of legacy code and documentation generation during development and maintenance of software.
Article Search Video Info
Historef: A Tool for Edit History Refactoring
Shinpei Hayashi, Daiki Hoshino, Jumpei Matsuda, Motoshi Saeki, Takayuki Omori, and Katsuhisa Maruyama
(Tokyo Institute of Technology, Japan; Ritsumeikan University, Japan)
This paper presents Historef, a tool for automating edit history refactoring in the Eclipse IDE for Java programs. The aim of our history refactorings is to improve the understandability and/or usability of the history without changing its overall effect. Historef enables us to apply history refactorings to the recorded edit history in the middle of a developer's source code editing process. Using our integrated tool, developers can commit the refactored edits into the underlying SCM repository after applying edit history refactorings, making it easier to manage their changes based on the performed edits.
Article Search Info
ClonePacker: A Tool for Clone Set Visualization
Hiroaki Murakami, Yoshiki Higo, and Shinji Kusumoto
(Osaka University, Japan)
Programmers often copy and paste code fragments when they would like to reuse them. Although copy-and-paste operations enable rapid development of software systems, they introduce code clones. Some clones have negative impacts on software development. For example, if we modify a code fragment, we have to check whether its clones need the same modification. In this case, programmers often use tools that take a code fragment as input and return its clones as output. However, with such existing tools, programmers have to open a number of source code files and scroll up and down to browse the detected clones. In order to reduce the cost of browsing the detected clones, we developed ClonePacker, a tool that visualizes clones using Circle Packing. In an experiment with participants, we confirmed that participants using ClonePacker reported the locations of clones faster than with an existing tool.
Article Search Video Info
GiLA: GitHub Label Analyzer
Javier Luis Cánovas Izquierdo, Valerio Cosentino, Belén Rolandi, Alexandre Bergel, and Jordi Cabot
(AtlanMod, France; University of Chile, Chile)
Reporting bugs, asking for new features and in general giving any kind of feedback is a common way to contribute to an Open-Source Software (OSS) project. In GitHub, the largest code hosting service for OSS, this feedback is typically expressed as new issues for the project managed by an issue-tracking system available in each new project repository. Among other features, the issue tracker allows creating and assigning labels to issues with the goal of helping the project community to better classify and manage those issues (e.g., facilitating the identification of issues for top priority components or candidate developers that could solve them). Nevertheless, as the project grows a manual browsing of the project issues is no longer feasible. In this paper we present GiLA, a tool which generates a set of visualizations to facilitate the analysis of issues in a project depending on their label-based categorization. We believe our visualizations are useful to see the most popular labels (and their relationships) in a project, identify the most active community members for those labels and compare the typical issue evolution for each label category.
Article Search Video Info
SPCP-Miner: A Tool for Mining Code Clones That Are Important for Refactoring or Tracking
Manishankar Mondal, Chanchal K. Roy, and Kevin A. Schneider
(University of Saskatchewan, Canada)
Code cloning has both positive and negative impacts on software maintenance and evolution. Focusing on the issues related to code cloning, researchers suggest managing code clones through refactoring and tracking. However, it is impractical to refactor or track all clones in a software system. Thus, it is essential to identify which clones are important for refactoring and which are important for tracking. In this paper, we present SPCP-Miner, the first tool to automatically identify and rank the important refactoring and tracking candidates from the whole set of clones in a software system. SPCP-Miner implements the techniques that we used to conduct a large-scale empirical study on SPCP clones (i.e., clones that evolve following a Similarity Preserving Change Pattern, or SPCP). We believe that SPCP-Miner can help in better managing code clones by suggesting important clones for refactoring or tracking.
Article Search Video Info
TracerJD: Generic Trace-Based Dynamic Dependence Analysis with Fine-Grained Logging
Haipeng Cai and Raul Santelices
(University of Notre Dame, USA)
We present the design and implementation of TracerJD, a toolkit devoted to dynamic dependence analysis via fine-grained whole-program dependence tracing. TracerJD features a generic framework for efficient offline analysis of dynamic dependencies, including those due to exception-driven control flows. Underlying the framework is a hierarchical trace indexing scheme by which TracerJD maintains the relationships among execution events at multiple levels of granularity while capturing those events at runtime. Built on this framework, several application tools are provided as well, including a dynamic slicer and a performance profiler. These example applications also demonstrate the flexibility and ease with which a variety of client analyses can be built on top of the framework. We tested our toolkit on four Java subjects, for which the results suggest promising efficiency of TracerJD for practical use in various dependence-based tasks.
Article Search Video Info
Umple: A Framework for Model Driven Development of Object-Oriented Systems
Miguel A. Garzón, Hamoud Aljamaan, and Timothy C. Lethbridge
(University of Ottawa, Canada)
Huge benefits are gained when Model Driven Engineering (MDE) is adopted to develop software systems. However, it remains a challenge for software modelers to embrace the MDE approach. In this paper, we present Umple, a framework for model driven development of object-oriented systems that can be used to generate entire software systems (model driven forward engineering) or to recover models from existing software systems (model driven reverse engineering). Umple models are written using a friendly, human-readable modeling notation seamlessly integrated with algorithmic code. In other words, we present a model-is-the-code approach, where developers are more likely to maintain and evolve the code as the system matures simply because both model and code are integrated as aspects of the same system. Finally, we demonstrate how the framework can be used to elaborate solutions supporting different scenarios such as software modernization and program comprehension.
Article Search
Assessing the Bus Factor of Git Repositories
Valerio Cosentino, Javier Luis Cánovas Izquierdo, and Jordi Cabot
(AtlanMod, France)
Software development projects face many risks (requirements inflation, poor scheduling, technical problems, etc.). Underestimating those risks may endanger the project's success. One of the most critical risks is employee turnover, that is, the risk of key personnel leaving the project. A good indicator for evaluating this risk is the concentration of information in individual developers, popularly known as the bus factor (“the number of key developers who would need to be incapacitated, i.e., hit by a bus, to make a project unable to proceed”). Despite the simplicity of the concept, calculating the actual bus factor for a specific project can quickly turn into an error-prone and time-consuming activity as the size of the project and the development team increase. To help project managers assess the bus factor of their projects, in this paper we present a tool that, given a Git-based repository, automatically measures the bus factor for any file, directory, and branch in the repository and for the project itself. The tool can also simulate what would happen to the project (e.g., which files would become orphans) if one or more developers disappeared.
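A greedy approximation of the bus factor can be sketched as follows (the whole-file ownership model and the more-than-half threshold are simplifying assumptions, not necessarily the tool's exact algorithm):

```python
from collections import Counter

def bus_factor(file_authors, threshold=0.5):
    # file_authors: {filename: main developer}, derived e.g. from
    # git blame (a simplification; real metrics weight contributions
    # by lines or commits rather than whole-file ownership).
    # Greedily "hit" the developer who owns the most remaining files
    # until more than `threshold` of all files are orphaned; the
    # number of developers removed is the bus factor.
    if not file_authors:
        return 0
    remaining = dict(file_authors)
    removed = 0
    while len(remaining) >= (1 - threshold) * len(file_authors):
        top_dev, _ = Counter(remaining.values()).most_common(1)[0]
        remaining = {f: d for f, d in remaining.items() if d != top_dev}
        removed += 1
    return removed
```

A project where one developer owns every file has a bus factor of 1; spreading ownership raises it.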
Article Search Video

Industrial Research

Old Habits Die Hard: Why Refactoring for Understandability Does Not Give Immediate Benefits
Erik Ammerlaan, Wim Veninga, and Andy Zaidman
(Exact International Development, Netherlands; Delft University of Technology, Netherlands)
Depending on the context, the benefits of clean code with respect to understandability might be less obvious in the short term than is often claimed. In this study we evaluate whether a software system with legacy code in an industrial environment benefits from a “clean code” refactoring in terms of developer productivity. We observed both increases as well as decreases in understandability, showing that immediate increases in understandability are not always obvious. Our study suggests that refactoring code could result in a productivity penalty in the short term if the coding style becomes different from the style developers have grown attached to.
Article Search
Bash2py: A Bash to Python Translator
Ian J. Davis, Mike Wexler, Cheng Zhang, Richard C. Holt, and Theresa Weber
(University of Waterloo, Canada; Owl Computing Technologies, USA)
Shell scripting is the primary way for programmers to interact at a high level with operating systems. For decades, bash shell scripts have thus been used to accomplish various tasks. But Bash has a counter-intuitive syntax that is not well understood by modern programmers and is no longer adequately supported, making it difficult to maintain. Bash also suffers from poor performance, memory leakage problems, and limited functionality, which make continued dependence on it problematic. At the request of our industrial partner, we therefore developed a source-to-source translator, bash2py, which converts bash scripts into Python. Bash2py leverages the open-source Bash code and the internal parser employed by Bash to parse any bash script. However, bash2py re-implements the variable expansion that occurs in Bash to better generate correct Python code. Bash2py correctly converts most Bash into Python, but does require human intervention to handle constructs that cannot easily be automatically translated. In our experiments on real-world open source bash scripts, bash2py successfully translates 90% of the code. Feedback from our industrial partner confirms the usefulness of bash2py in practice.
Article Search
On Implementational Variations in Static Analysis Tools
Tukaram Muske and Prasad Bokil
(Tata Consultancy Services, India)
Static analysis tools are widely used in practice due to their ability to detect defects early in the software development life-cycle, while also proving the absence of defects of certain patterns. A large number of such tools exist, and they vary in several tool characteristics, such as the analysis techniques used, programming languages supported, verification checks performed, scalability, and performance. Many studies of these tools and their variations have been performed to improve analysis results or to identify the best tool among a set of available static analysis tools. We observe that these studies consider and compare only the aforementioned tool characteristics, while other implementational variations are usually ignored. In this paper, we study the implementational variations occurring among static analysis tools and experimentally demonstrate their impact on the tool characteristics and other analysis-related attributes. The aim of this paper is twofold: (a) to provide the studied implementational variations as design choices, along with their pros and cons, to the designers and developers of static analysis tools, and (b) to provide educational material for tool users so that analysis results are better understood.
Article Search
Tracking Known Security Vulnerabilities in Proprietary Software Systems
Mircea Cadariu, Eric Bouwers, Joost Visser, and Arie van Deursen
(Software Improvement Group, Netherlands; Delft University of Technology, Netherlands; Radboud University Nijmegen, Netherlands)
Known security vulnerabilities can be introduced in software systems as a result of being dependent upon third-party components. These documented software weaknesses are “hiding in plain sight” and represent low hanging fruit for attackers. In this paper we present the Vulnerability Alert Service (VAS), a tool-based process to track known vulnerabilities in software systems throughout their life cycle. We studied its usefulness in the context of external software product quality monitoring provided by the Software Improvement Group, a software advisory company based in Amsterdam, the Netherlands. Besides empirically assessing the usefulness of the VAS, we have also leveraged it to gain insight and report on the prevalence of third-party components with known security vulnerabilities in proprietary applications.
Article Search

Early Research Achievements

Evolution and Reuse

Trusting a Library: A Study of the Latency to Adopt the Latest Maven Release
Raula Gaikovina Kula, Daniel M. German, Takashi Ishio, and Katsuro Inoue
(Osaka University, Japan; University of Victoria, Canada)
With the popularity of open source library (re)use in both industrial and open source settings, `trust' plays a vital role in third-party library adoption. Trust involves the assumption of both functional and non-functional correctness. Even with the aid of dependency management build tools such as Maven and Gradle, researchers have still found a latency in trusting the latest release of a library. In this paper, we investigate the trust of OSS libraries. Our study of 6,374 systems in the Maven Super Repository suggests that 82% of systems trust and adopt the latest library release into existing systems. We uncover the impact of Maven on latent and trusted library adoptions.
Article Search
Evolution of Dynamic Feature Usage in PHP
Mark Hills
(East Carolina University, USA)
PHP includes a number of dynamic features that, if used, make it challenging for both programmers and tools to reason about programs. In this paper we examine how usage of these features has changed over time, looking at usage trends for three categories of dynamic features across the release histories of two popular open-source PHP systems, WordPress and MediaWiki. Our initial results suggest that, while features such as eval are being removed over time, more constrained dynamic features such as variable properties are becoming more common. We believe the results of this analysis provide useful insights for researchers and tool developers into the evolving use of dynamic features in real PHP programs.
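Measuring such trends boils down to counting dynamic-feature occurrences per release. A crude sketch using regular expressions (a real analysis, like the paper's, would parse the PHP source, since regexes miss context such as comments and string literals):

```python
import re

# Crude regex probes for three PHP dynamic features; the feature
# selection and patterns here are illustrative, not the paper's.
FEATURES = {
    "eval": re.compile(r"\beval\s*\("),
    "variable_variable": re.compile(r"\$\$\w+"),
    "variable_property": re.compile(r"->\$\w+"),
}

def feature_counts(php_source):
    # Count occurrences of each dynamic feature in one source text.
    return {name: len(rx.findall(php_source)) for name, rx in FEATURES.items()}

def usage_trend(releases):
    # releases: ordered (version, source_text) pairs; the resulting
    # per-release counts can be plotted to see features rise or fall.
    return [(version, feature_counts(src)) for version, src in releases]
```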
Article Search
Towards Incremental Model Slicing for Delta-Oriented Software Product Lines
Sascha Lity, Hauke Baller, and Ina Schaefer
(TU Braunschweig, Germany)
The analysis of today's software systems for supporting, e.g., testing, verification, or debugging is becoming more challenging due to their increasing complexity. Model slicing is a promising analysis technique to tackle this issue by abstracting from those parts not influencing the current point of interest. In the context of software product lines, applying model slicing separately for each variant is in general infeasible. Delta modeling exploits the explicit specification of commonality and variability within deltas and enables the reuse of artifacts and already obtained results to reduce modeling and analysis effort. In this paper, we propose a novel approach to incremental model slicing for delta-oriented software product lines. Based on the specification of model changes between variants by means of model regression deltas, we achieve an incremental adaptation of variant-specific dependency graphs as well as an incremental slice computation. The slice computation further allows for deriving differences between slices for the same point of interest, enhancing, e.g., change impact analysis. We provide details of our incremental approach, discuss its benefits, and present future work.
Article Search
Understanding Software Performance Regressions using Differential Flame Graphs
Cor-Paul Bezemer, Johan Pouwelse, and Brendan Gregg
(Delft University of Technology, Netherlands; Netflix, USA)
Flame graphs are rapidly gaining popularity in industry as a way to visualize performance profiles collected by stack-trace-based profilers. In some cases, for example during performance regression detection, profiles of different software versions have to be compared. Doing this manually using two or more flame graphs or textual profiles is tedious and error-prone. In this `Early Research Achievements'-track paper, we present our preliminary results on using differential flame graphs instead. Differential flame graphs visualize the differences between two performance profiles. In addition, we discuss which research fields we expect to benefit from using differential flame graphs. We have implemented our approach in an open source prototype called FlameGraphDiff, which is available on GitHub. FlameGraphDiff makes it easy to generate interactive differential flame graphs from two existing performance profiles. These graphs facilitate easy tracing of elements across the different graphs to ease the understanding of the (d)evolution of the performance of an application.
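The core computation behind a differential flame graph, per-stack sample deltas between two profiles, can be sketched as follows (a simplification of what FlameGraphDiff renders):

```python
def profile_diff(before, after):
    # before/after: {stack_trace_tuple: sample_count} collected by a
    # stack-sampling profiler for two software versions. A positive
    # delta means the stack got hotter in the new version (typically
    # drawn red in a differential flame graph), a negative one cooler.
    stacks = set(before) | set(after)
    return {s: after.get(s, 0) - before.get(s, 0) for s in stacks}

def regressions(diff, top=3):
    # Largest increases first: the likely performance regressions.
    return sorted((s for s in diff if diff[s] > 0),
                  key=lambda s: -diff[s])[:top]
```

Stacks absent from one profile simply contribute a count of zero on that side, so new and removed code paths show up naturally.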
Article Search

Text and Labeling

TextRank Based Search Term Identification for Software Change Tasks
Mohammad Masudur Rahman and Chanchal K. Roy
(University of Saskatchewan, Canada)
During maintenance, software developers deal with a number of software change requests. Each of those requests is generally written using natural language texts, and it involves one or more domain related concepts. A developer needs to map those concepts to exact source code locations within the project in order to implement the requested change. This mapping generally starts with a search within the project that requires one or more suitable search terms. Studies suggest that the developers often perform poorly in coming up with good search terms for a change task. In this paper, we propose and evaluate a novel TextRank-based technique that automatically identifies and suggests search terms for a software change task by analyzing its task description. Experiments with 349 change tasks from two subject systems and comparison with one of the latest and closely related state-of-the-art approaches show that our technique is highly promising in terms of suggestion accuracy, mean average precision and recall.
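The core of TextRank keyword extraction, words voting for co-occurring words via PageRank, can be sketched as follows (the paper's technique adds further filtering and weighting on top of this basic idea; window size and damping factor are illustrative defaults):

```python
import re

def textrank_terms(task_description, window=3, top=5, iters=30, d=0.85):
    # Words co-occurring within a sliding window vote for each other;
    # PageRank then redistributes the votes until scores stabilize,
    # and the highest-scoring words become candidate search terms.
    words = re.findall(r"[a-zA-Z]{3,}", task_description.lower())
    graph = {w: set() for w in words}
    for i in range(len(words)):
        for j in range(i + 1, min(i + window, len(words))):
            if words[i] != words[j]:
                graph[words[i]].add(words[j])
                graph[words[j]].add(words[i])
    score = {w: 1.0 for w in graph}
    for _ in range(iters):
        score = {w: (1 - d) + d * sum(score[n] / len(graph[n])
                                      for n in graph[w] if graph[n])
                 for w in graph}
    return sorted(score, key=score.get, reverse=True)[:top]
```

Words central to the change description (mentioned often, near many other content words) end up at the top of the suggestion list.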
Article Search
Query Expansion via Wordnet for Effective Code Search
Meili Lu, Xiaobing Sun, Shaowei Wang, David Lo, and Yucong Duan
(Yangzhou University, China; Nanjing University, China; Singapore Management University, Singapore; Hainan University, China)
Source code search plays an important role in software maintenance. The effectiveness of source code search relies not only on the search technique, but also on the quality of the query. In practice, software systems are large, so it is difficult for a developer to formulate an accurate query expressing what is really on her/his mind, especially when the maintainer and the original developer are not the same person. When a query performs poorly, it has to be reformulated. But the words used in a query may differ from words with similar semantics in the source code, i.e., the synonyms, which affects the accuracy of code search results. To address this issue, we propose an approach that extends a query with synonyms generated from WordNet. Our approach extracts natural language phrases from source code identifiers, matches expanded queries with these phrases, and sorts the search results. It allows developers to explore word usage in a piece of software, helps them quickly identify relevant program elements for investigation, and helps them quickly recognize alternative words for query reformulation. Our initial empirical study on search tasks performed on Rhino, the JavaScript/ECMAScript interpreter and compiler, shows that the synonyms used to expand the queries help recommend good alternative queries. Our approach also improves the precision and recall of Conquer, a state-of-the-art query expansion/reformulation technique, by 5% and 8%, respectively.
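The expansion step can be sketched as follows; a tiny hand-made synonym table stands in for WordNet here so the example is self-contained (a real implementation would query WordNet synsets, e.g., via a WordNet library):

```python
# Hypothetical synonym table; in the actual approach these pairs
# would come from WordNet.
SYNONYMS = {
    "remove": {"delete", "erase"},
    "error": {"fault", "bug"},
}

def expand_query(query, synonyms=SYNONYMS):
    # Keep each query word and add its synonyms, so a search for
    # "remove error" also matches code that says "delete bug".
    expanded = []
    for word in query.lower().split():
        expanded.append(word)
        expanded.extend(sorted(synonyms.get(word, ())))
    return expanded

def search(query, documents):
    # Rank documents by overlap with the expanded query terms.
    terms = set(expand_query(query))
    scored = [(len(terms & set(doc.lower().split())), doc) for doc in documents]
    return [doc for score, doc in sorted(scored, reverse=True) if score > 0]
```

Without expansion, "remove error" would miss a method documented as "delete bug"; with it, the match is found.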
Exploring the Use of Labels to Categorize Issues in Open-Source Software Projects
Jordi Cabot, Javier Luis Cánovas Izquierdo, Valerio Cosentino, and Belén Rolandi
(AtlanMod, France)
Reporting bugs, asking for new features, and in general giving any kind of feedback is a common way to contribute to an Open-Source Software (OSS) project. This feedback is generally reported in the form of new issues for the project, managed by so-called issue trackers. One of the features provided by most issue trackers is the possibility to define a set of labels/tags to classify the issues and, at least in theory, facilitate their management. Nevertheless, there is little empirical evidence that taking the time to categorize new issues has a beneficial impact on project evolution. In this paper we analyze a population of more than three million GitHub projects and give some insights into how labels are used in them. Our preliminary results reveal that, even though the label mechanism is scarcely used, using labels favors the resolution of issues. Our analysis also suggests that not all projects use labels in the same way (e.g., for some projects labels are only a way to prioritize issues, while others use them to signal issues' temporal evolution as they move along the development workflow). Further research is needed to precisely characterize these label "families" and to learn more about the ideal application scenarios for each of them.
Explore the Evolution of Development Topics via On-Line LDA
Jiajun Hu, Xiaobing Sun, and Bin Li
(Yangzhou University, China; Nanjing University, China)
Software repositories such as revision control systems and bug tracking systems are usually used to manage the changes of software projects. During software maintenance and evolution, software developers and stakeholders need to investigate these repositories to identify what tasks were worked on in a particular time interval and how much effort was devoted to them. A typical way of mining software repositories is to use topic analysis models, e.g., Latent Dirichlet Allocation (LDA), to identify and organize the underlying structure in software documents and thus understand the evolution of development topics. These previous LDA-based topic analysis models can capture either changes in the strength (popularity) of various development topics over time (i.e., strength evolution) or changes in the content (the words that form a topic) of existing topics over time (i.e., content evolution). Unfortunately, few techniques can capture both strength and content evolution simultaneously, yet both pieces of information are necessary for developers to fully understand how software evolves. In this paper, we propose a novel approach that analyzes the commit messages within a project's lifetime to capture both strength and content evolution simultaneously via Online Latent Dirichlet Allocation (On-Line LDA). Moreover, the proposed approach also provides an efficient way to detect emerging topics in a real development iteration when a new feature request arrives at a particular time, thus helping project stakeholders progress their projects smoothly.

Bugs and Violations

Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in Large Systems
Pavneet Singh Kochhar, Ferdian Thung, and David Lo
(Singapore Management University, Singapore)
During software maintenance, testing is a crucial activity to ensure the quality of program code as it evolves over time. With the increasing size and complexity of software, adequate software testing has become increasingly important. Code coverage is often used as a yardstick to gauge the comprehensiveness of test cases and the adequacy of testing. A test suite's quality is often measured by the number of bugs it can find (a.k.a. kill). Previous studies have analysed the quality of a test suite by its ability to kill mutants, i.e., artificially seeded faults. However, mutants do not necessarily represent real bugs. Moreover, many studies use small programs, which threatens the applicability of the results to large real-world systems. In this paper, we analyse two large software systems to measure the relationship between code coverage and effectiveness in killing real bugs from the software systems. We use Randoop, a random test generation tool, to generate test suites with varying levels of coverage and run them to analyse whether the test suites can kill each of the real bugs or not. In this preliminary study, we performed an experiment on 67 and 92 real bugs from Apache HTTPClient and Mozilla Rhino, respectively. Our experiment finds that there is indeed a statistically significant correlation between code coverage and bug-kill effectiveness. The strength of the correlation, however, differs between the two software systems. For HTTPClient, the correlation is moderate for both statement and branch coverage. For Rhino, the correlation is strong for both statement and branch coverage.
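The correlation analysis this kind of study relies on can be reproduced in miniature; the sketch below computes a Spearman rank correlation between per-suite coverage levels and bug-kill counts from scratch (the sample numbers in the usage are invented for illustration, not the study's data):

```python
def rank(values):
    """Assign average ranks (1-based); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

For example, perfectly monotone coverage/kill data yields a coefficient of 1.0, and inversely monotone data yields -1.0.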
Detection of Violation Causes in Reflexion Models
Sebastian Herold, Michael English, Jim Buckley, Steve Counsell, and Mel Ó Cinnéide
(Lero, Ireland; University of Limerick, Ireland; Brunel University, UK; University College Dublin, Ireland)
Reflexion Modelling is a well-understood technique to detect architectural violations that occur during software architecture erosion. Resolving these violations can be difficult when erosion has reached a critical level and the causes of the violations are interwoven and difficult to understand. This article outlines a novel technique to automatically detect typical causes of violations in reflexion models, based on the definition and detection of typical symptoms for these causes. Preliminary results show that the proposed technique can support software architects’ navigation through reflexion models of eroded systems to understand causes of violations and to systematically take actions against them.
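For readers unfamiliar with reflexion models, the core check — comparing extracted code-level dependencies against an architect's high-level model — can be sketched in a few lines. The module and file names below are hypothetical; the paper's contribution, detecting the *causes* behind such violations, builds on top of this basic check:

```python
def reflexion_check(allowed, code_deps, mapping):
    """Compare code-level dependencies against an architectural model.
    `mapping` assigns source files to architecture modules; `allowed`
    lists permitted module-to-module dependencies. Any cross-module
    code dependency whose module pair is not allowed is a violation
    (a divergence, in reflexion-model terms)."""
    violations = []
    for src, dst in code_deps:
        m_src, m_dst = mapping[src], mapping[dst]
        if m_src != m_dst and (m_src, m_dst) not in allowed:
            violations.append((src, dst, m_src, m_dst))
    return violations
```

Here a UI file calling the database layer directly, bypassing the core, would be flagged as a divergence.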
A Comparative Study on the Effectiveness of Part-of-Speech Tagging Techniques on Bug Reports
Yuan Tian and David Lo
(Singapore Management University, Singapore)
Many software artifacts are written in natural language or contain a substantial amount of natural language content. Thus, these artifacts can be analyzed using text analysis techniques from the natural language processing (NLP) community, e.g., part-of-speech (POS) tagging, which assigns POS tags (e.g., verb, noun, etc.) to the words in a sentence. In the literature, several studies have already applied POS tagging to software artifacts to recover important words in them, which are then used to automate various tasks, e.g., locating buggy files for a given bug report. Many POS tagging techniques have been proposed, but they are trained and evaluated on non-software-engineering corpora (documents). Thus, it is unknown whether they can correctly identify the POS of a word in a software artifact and which of them performs best. To fill this gap, in this work we investigate the effectiveness of seven POS taggers on bug reports. We randomly sample 100 bug reports from the Eclipse and Mozilla projects and create a text corpus that contains 21,713 words. We manually assign POS tags to these words and use them to evaluate the studied POS taggers. Our comparative study shows that state-of-the-art POS taggers achieve an accuracy of 83.6%-90.5% on bug reports, with the Stanford POS tagger and the TreeTagger achieving the highest accuracy on the sampled bug reports. Our findings show that researchers can use these POS taggers to analyze software artifacts if an accuracy of 80-90% is acceptable for their specific needs, and we recommend using the Stanford POS tagger or the TreeTagger.
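The evaluation protocol — token-level accuracy against a manually tagged gold standard, overall and broken down per tag — is straightforward to sketch. The tokens and tags in the usage below are invented examples using Penn Treebank-style labels:

```python
from collections import defaultdict

def tagger_accuracy(gold, predicted):
    """Overall token-level accuracy of a POS tagger against a gold
    standard: fraction of tokens whose predicted tag matches."""
    assert len(gold) == len(predicted)
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

def per_tag_accuracy(gold, predicted):
    """Accuracy broken down by gold tag, to reveal which categories
    a tagger systematically misses on software text."""
    totals, hits = defaultdict(int), defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        hits[g] += int(g == p)
    return {tag: hits[tag] / totals[tag] for tag in totals}
```

Running several taggers through the same pair of functions over the same gold corpus gives the head-to-head comparison the study reports.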

Static and Dynamic Analysis

Efficiently Identifying Object Production Sites
Alejandro Infante and Alexandre Bergel
(University of Chile, Chile)
Most programming environments are shipped with accurate memory profilers. Although efficient in their analyses, memory profilers traditionally output textual listing reports, thus reducing memory profile exploration to a set of textual pattern-matching operations. Memory blueprint visually reports the memory consumption of a program execution. A number of simple visual cues are provided to identify direct and indirect object production sites, key ingredients for efficiently addressing memory issues. Scalability is addressed by restricting the scope of interest, both in the call graph and in the considered classes. Memory blueprint has been implemented in the Pharo programming language and is available under the MIT license.
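The paper's tool targets Pharo, but the underlying idea — grouping live allocations by the source line that produced them — can be illustrated with Python's standard `tracemalloc` module. This is a rough textual analogue of the visual report, not the authors' implementation:

```python
import tracemalloc

def top_production_sites(workload, limit=3):
    """Run `workload` under allocation tracing and return statistics
    for the source lines (production sites) holding the most live
    memory afterwards."""
    tracemalloc.start()
    workload()
    snapshot = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Each Statistic aggregates the live blocks allocated at one file:line.
    return snapshot.statistics("lineno")[:limit]
```

Calling this with a workload that keeps its allocations alive yields per-line statistics exposing `size` and `count`, i.e., a listing of the heaviest object production sites.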
Where Was This SQL Query Executed? A Static Concept Location Approach
Csaba Nagy, Loup Meurice, and Anthony Cleve
(University of Namur, Belgium)
Concept location in software engineering is the process of identifying where a specific concept is implemented in the source code of a software system. It is a very common task performed by developers during development or maintenance, and many techniques have been studied by researchers to make it more efficient. However, most current techniques ignore the role of the database in the architecture of a system, even though it is also an important source of concepts and of dependencies among them. In this paper, we present a concept location technique for data-intensive systems, i.e., systems with at least one database server in their architecture that is intensively used by its clients. Specifically, we present a static technique for identifying the exact source code location from which a given SQL query was sent to the database. We evaluate our technique by collecting and locating SQL queries from testing scenarios of two open-source Java systems under active development. With our technique, we are able to successfully identify the source of most of these queries.
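At its simplest, static location of query execution points amounts to scanning the source for call sites of query-executing APIs. The toy locator below handles only JDBC-style calls with a single string literal; the paper's technique goes much further (e.g., queries assembled across methods), and the class and file names here are hypothetical:

```python
import re

# Simplified pattern: a JDBC-style execute call with one string literal.
EXEC_CALL = re.compile(r'\b(executeQuery|executeUpdate)\s*\(\s*(".*?")\s*\)')

def locate_queries(java_source, filename="Example.java"):
    """Return (file, line, query) triples for each statically visible
    SQL execution call site in the given Java source text."""
    locations = []
    for lineno, line in enumerate(java_source.splitlines(), start=1):
        for _call, literal in EXEC_CALL.findall(line):
            locations.append((filename, lineno, literal.strip('"')))
    return locations
```

Given a class whose third line executes `"SELECT * FROM users"`, the locator reports that file, line number, and query text.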
Taint Analysis of Manual Service Compositions using Cross-Application Call Graphs
Marc-André Laverdière, Bernhard J. Berger, and Ettore Merlo
(Tata Consultancy Services, Canada; Polytechnique Montréal, Canada; University of Bremen, Germany)
We propose an extension of the traditional call graph that incorporates edges representing control flow between web services, named the Cross-Application Call Graph (CACG). We introduce a construction algorithm for applications built on the JAX-WS standard and validate its effectiveness on sample applications from Apache CXF and JBossWS. Then, we demonstrate its applicability to taint analysis on a sample application of our own making. Our CACG construction algorithm accurately identifies service call targets 81.07% of the time on average. Our taint analysis obtains an F-measure of 95.60% on a benchmark. The use of a CACG, compared to a naive approach, improves the F-measure of taint analysis from 66.67% to 100.00% for our sample application.
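Conceptually, once cross-application edges link a service invocation to the remote endpoint it reaches, whole-system taint checking reduces to reachability over the combined graph. The node names below are hypothetical, and a real taint analysis also tracks data flow through parameters rather than plain call reachability:

```python
from collections import deque

def tainted_sinks(edges, sources, sinks):
    """Propagate taint over a cross-application call graph (CACG):
    `edges` mixes ordinary call edges with cross-application edges
    from a service invocation to the remote service's entry point.
    Returns the sinks reachable from any taint source."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    seen, queue = set(sources), deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen & set(sinks))
```

Without the cross-application edge, the analysis would stop at the client's service call and miss the tainted sink inside the remote service.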

Tutorials and Briefings

TXL Source Transformation in Practice
James R. Cordy
(Queen's University, Canada)
The TXL source transformation system is widely used in industry and academia for both research and production tasks involving source transformation and software analysis. While it is designed to be accessible to software practitioners, understanding how to use TXL effectively takes time and has a steep learning curve. This tutorial is designed to get you over the initial hump and rapidly move you from TXL novice to the skill level necessary to use it effectively in real applications. Consisting of one-hour lecture presentations, each followed by a one-hour practice session, this is a hands-on tutorial in which participants quickly learn the basics of using TXL effectively in their research or industrial practice.
Software Risk Management in Practice: Shed Light on Your Software Product
Jens Knodel, Matthias Naab, Eric Bouwers, and Joost Visser
(Fraunhofer IESE, Germany; Software Improvement Group, Netherlands; Radboud University Nijmegen, Netherlands)
You can’t control what you can’t measure. And you can’t decide if you are wandering around in the dark. Risk management in practice requires shedding light on the internals of the software product in order to make informed decisions. Thus, in practice, risk management has to be based on information about artifacts (documentation, code, and executables) in order to detect (potentially) critical issues. This tutorial presents experiences from industrial cases world-wide on qualitative and quantitative measurement of software products. We present our lessons learned as well as consolidated experiences from practice and provide a classification scheme of applicable measurement techniques. Participants of the tutorial receive an introduction to the techniques in theory and then apply them in practice in interactive exercises. This enables participants to learn how to shed light on the internals of their software and how to make risk management decisions efficiently and effectively.
Software Architecture Reconstruction: Why? What? How?
Mehdi Mirakhorli
(Rochester Institute of Technology, USA)
Software architecture reconstruction plays an increasingly essential role in software engineering tasks such as architecture renovation, program comprehension, and change impact analysis. Various methods have been developed that use a software system's implementation-level artifacts to recover the architecture of the software. This tutorial will answer three fundamental questions about software architecture recovery: Why? What? and How? Through several examples it articulates and synthesizes the technical forces and financial motivations that lead software companies to invest in software architecture recovery. It discusses “what” pieces of design knowledge can be recovered and, lastly, demonstrates a methodology, as well as the required tools, for answering “how” to reconstruct architecture from implementation artifacts.

Doctoral Symposium

SKilLed Communication for Toolchains
Timm Felden
(University of Stuttgart, Germany)
The creation of a program analysis toolchain involves design choices regarding intermediate representations (IRs). Good choices for an IR depend on the analyses performed by the toolchain. In academia, new analyses are developed frequently; therefore, a single best IR for a research-oriented toolchain does not exist. Thus, we describe our design of an intermediate representation that can be easily adapted to new requirements.
The Impact of Column-Orientation on the Quality of Class Inheritance Mapping Specifications
Martin Lorenz
(HPI, Germany)
Inheritance is a fundamental concept of object-oriented modeling. Persisting objects from an inheritance hierarchy in a relational database is not straightforward, because the concept of inheritance is not supported by relational data stores. An accepted solution is object-relational mapping strategies. The problem is that each strategy varies in terms of its non-functional characteristics, e.g., usability, maintainability, and efficiency. Software developers base the decision of which mapping strategy to choose on experience and best practices. Most of these best practices can be found in programming guides for object-relational mapping frameworks or in books and publications by experienced software architects. However, these best practices are based on experience with row-oriented database systems. With the advent of new database technologies, such as column stores, these best practices become obsolete. In my Ph.D. thesis I am investigating the influence of a database's data layout (row vs. column) on the non-functional characteristics of object-relational mapping strategies.
Improving the Integration Process of Large Software Systems
Yujuan Jiang
(Polytechnique Montréal, Canada)
Software integration is the software engineering activity in which code changes from different developers are combined into a consistent whole. While the advent of distributed version control systems has allowed distributed development to scale up substantially, the risk of integration conflicts (changes that do not go well together) and the time required to fix them have increased as well. In order to help practitioners deal with this paradox, this thesis aims to understand and improve the integration process of modern software organizations. We took the Linux kernel as our pilot case study, which is supported by a distributed version control system (Git) and a low-tech reviewing system (a mailing list). So far, we have (1) analyzed how to reconstruct the data of the integration process in a low-tech environment where reviews are stored in emails without an explicit link to version control commits, and (2) studied the characteristics of the Linux integration process. We found that commits developed by more mature developers and impacting fewer subsystems are more likely to be accepted. As a next step, we plan to build a model quantifying the integration effort.
Handling the Differential Evolution of Software Artefacts: A Framework for Consistency Management
Ildiko Pete and Dharini Balasubramaniam
(University of St. Andrews, UK)
Modern software systems are subject to frequent changes. Different artefacts of a system, such as requirements specifications, design documents and source code, often evolve at different times and become inconsistent with one another. This differential evolution poses problems for effective software maintenance and erodes trust in artefacts as accurate representations of the system. In this paper, we propose a holistic framework for managing the consistent co-evolution of software artefacts, incorporating: traceability creation and maintenance, change detection, impact analysis, consistency checking and change propagation.
Towards a Framework for Analysis, Transformation, and Manipulation of Makefiles
Doug Martin
(Queen's University, Canada)
Build systems are an integral part of the software development process, being responsible for turning source code into a deliverable product. They are, however, difficult to comprehend and maintain at times. Make, the most popular build language, is often cited as being difficult to debug. In this work, we propose a framework to analyze and manipulate Makefiles, and discover how the language is used in open source systems using existing software analysis techniques like source transformation and clone detection.
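A first step toward such a framework is extracting the rule structure of a Makefile. The simplified sketch below (it ignores variables, pattern rules, and line continuations) pulls out target-to-prerequisite edges, which a fuller analysis could then transform or inspect:

```python
import re

# A (very) simplified Makefile rule: "target: prereq1 prereq2".
# Variable assignments (":=") and tab-indented recipe lines are skipped.
RULE = re.compile(r"^([^\s:#][^:#]*):([^=].*)?$")

def parse_targets(makefile_text):
    """Extract a target -> prerequisites mapping from Makefile text."""
    deps = {}
    for line in makefile_text.splitlines():
        m = RULE.match(line)
        if m:
            target = m.group(1).strip()
            prereqs = (m.group(2) or "").split()
            deps[target] = prereqs
    return deps
```

The resulting mapping is effectively the build dependency graph, a natural input for the clone detection and transformation analyses the abstract mentions.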
Towards a Framework for Automatic Correction of Anti-patterns
Rodrigo Morales
(Polytechnique Montréal, Canada)
One of the biggest concerns in software maintenance is design quality; poor design hinders software maintenance and evolution. One way to improve design quality is to detect and correct anti-patterns (i.e., poor solutions to design and implementation problems), for example through refactoring. There are several approaches to detecting anti-patterns that rely on metrics and structural properties. However, finding a specific solution to remove anti-patterns is a challenging task, as candidate refactorings can conflict and their number can be very large, making the process costly. Hence, development teams often have to prioritize the refactorings to be applied to a system. In addition, refactoring is risky, since inexperienced developers can change the behaviour of a system when no comprehensive test suite is in place. Therefore, there is a need for tools that can automatically remove anti-patterns. We will apply meta-heuristics to propose a technique for automated refactoring that improves design quality.
Towards an Ontology-Based Context-Aware Meta-Model for the Software Domain
Mostafa Erfani
(Concordia University, Canada)
Over the last decade, contextual modeling has gained in importance due to the widespread introduction of ubiquitous computing. Common to these systems is that they integrate contextual information to improve situated cognition and awareness, as well as stakeholders' usage experience with these systems. While domains such as Web 3.0, which shares many commonalities with the software domain, have made context-awareness part of their solution space, the software domain still lacks the same rate of adoption. In our research, we introduce an ontology-based context-aware meta-model to capture and formalize different levels of context abstraction.
Investigating Modern Release Engineering Practices
Md Tajmilur Rahman
(Concordia University, Canada)
In my PhD research I focus on modern release engineering practices. First, I have quantified the time and effort involved in stabilizing a release. I found that, despite using rapid releases, the Chrome and Linux projects still have a period where they rush changes into a release. Second, developers typically isolate unrelated changes on branches. However, developers at major companies, such as Google and Facebook, commit all changes to a single branch. They isolate unrelated changes using feature flags, which allow them to disable work in progress. My goal is to empirically determine best practices for using flags and to identify dead code. Finally, I will develop tool support for managing feature flags.
