WCRE 2013 – Proceedings

Message from the Chairs
Welcome to the 20th Working Conference on Reverse Engineering (WCRE), to be held at the University of Koblenz-Landau, Germany, from October 14th to October 17th, 2013. This year, WCRE received a very good number of submissions. The 117 submissions we received – 97 research papers, 10 ERA papers, and 10 practice papers – made the task of selecting papers very challenging. All submissions went through a rigorous review process, in which each paper was reviewed in detail by at least three Program Committee members from academia and industry. All the decisions were open for discussion by the entire Program Committee before being finalized. At the end of this process, we selected 38 research papers, 4 ERA papers, and 7 practice papers, originating from 31 countries. The authors of the best papers from WCRE’13 will be invited to submit extended versions of their work to a special issue of the Empirical Software Engineering Journal, published by Springer.

Invited Papers

Genetic Programming for Reverse Engineering (Invited Paper)
Mark Harman, William B. Langdon, and Westley Weimer
(University College London, UK; University of Virginia, USA)
This paper overviews the application of Search Based Software Engineering (SBSE) to reverse engineering with a particular emphasis on the growing importance of recent developments in genetic programming and genetic improvement for reverse engineering. This includes work on SBSE for re-modularisation, refactoring, regression testing, syntax-preserving slicing and dependence analysis, concept assignment and feature location, bug fixing, and code migration. We also explore the possibilities for new directions in research using GP and GI for partial evaluation, amorphous slicing, and product lines, with a particular focus on code transplantation. This paper accompanies the keynote given by Mark Harman at the 20th Working Conference on Reverse Engineering (WCRE 2013).

The First Decade of GUI Ripping: Extensions, Applications, and Broader Impacts (Invited Paper)
Atif Memon, Ishan Banerjee, Bao N. Nguyen, and Bryan Robbins
(University of Maryland at College Park, USA)
This paper provides a retrospective examination of GUI Ripping (reverse engineering a workflow model of the graphical user interface of a software application), born a decade ago out of recognition of the severe need for improving the then largely manual state-of-the-practice of functional GUI testing. In these last 10 years, GUI ripping has turned out to be an enabler for much research, both within our group at Maryland and other groups. Researchers have found new and unique applications of GUI ripping, ranging from measuring human performance to re-engineering legacy user interfaces. GUI ripping has also enabled large-scale experimentation involving millions of test cases, thereby helping to understand the nature of GUI faults and characteristics of test cases to detect them. It has resulted in large multi-institutional Government-sponsored research projects on test automation and benchmarking. GUI ripping tools have been ported to many platforms, including Java AWT and Swing, iOS, Android, UNO, Microsoft Windows, and web. In essence, the technology has transformed the way researchers and practitioners think about the nature of GUI testing, no longer considered a manual activity; rather, thanks largely to GUI Ripping, automation has become the primary focus of current GUI testing techniques.

Main Research Track

Binary Reverse Engineering

Who Allocated My Memory? Detecting Custom Memory Allocators in C Binaries
Xi Chen, Asia Slowinska, and Herbert Bos
(VU University Amsterdam, Netherlands)
Many reversing techniques for data structures rely on the knowledge of memory allocation routines. Typically, they interpose on the system’s malloc and free functions, and track each chunk of memory thus allocated as a data structure. However, many performance-critical applications implement their own custom memory allocators. Examples include webservers, database management systems, and compilers like gcc and clang. As a result, current binary analysis techniques for tracking data structures fail on such binaries.
We present MemBrush, a new tool to detect memory allocation and deallocation functions in stripped binaries with high accuracy. We evaluated the technique on a large number of real world applications that use custom memory allocators. As we show, we can furnish existing reversing tools with detailed information about the memory management API, and as a result perform an analysis of the actual application specific data structures designed by the programmer. Our system uses dynamic analysis and detects memory allocation and deallocation routines by searching for functions that comply with a set of generic characteristics of allocators and deallocators.

MemPick: High-Level Data Structure Detection in C/C++ Binaries
Istvan Haller, Asia Slowinska, and Herbert Bos
(VU University Amsterdam, Netherlands)
Many existing techniques for reversing data structures in C/C++ binaries are limited to low-level programming constructs, such as individual variables or structs. Unfortunately, without detailed information about a program's pointer structures, forensics and reverse engineering are exceedingly hard. To fill this gap, we propose MemPick, a tool that detects and classifies high-level data structures used in stripped binaries. By analyzing how links between memory objects evolve throughout the program execution, it distinguishes between many commonly used data structures, such as singly- or doubly-linked lists, many types of trees (e.g., AVL, red-black trees, B-trees), and graphs. We evaluate the technique on 10 real world applications and 16 popular libraries. The results show that MemPick can identify the data structures with high accuracy.

Reconstructing Program Memory State from Multi-gigabyte Instruction Traces to Support Interactive Analysis
Brendan Cleary, Patrick Gorman, Eric Verbeek, Margaret-Anne Storey, Martin Salois, and Frederic Painchaud
(University of Victoria, Canada; Defence R&D Canada, Canada)
Exploitability analysis is the process of attempting to determine if a vulnerability in a program is exploitable. Fuzzing is a popular method of finding such vulnerabilities, in which a program is subjected to millions of generated program inputs until it crashes. Each program crash indicates a potential vulnerability that needs to be prioritized according to its potential for exploitation. The highest priority vulnerabilities need to be investigated by a security analyst by re-executing the program with the input that caused the crash while recording a trace of all executed assembly instructions and then performing analysis on the resulting trace. Recreating the entire memory state of the program at the time of the crash, or at any other point in the trace, is very important for helping the analyst build an understanding of the conditions that led to the crash. Unfortunately, tracing even a small program can create multi-million line trace files from which reconstructing memory state is a computationally intensive process and virtually impossible to do manually. In this paper we present an analysis of the problem of memory state reconstruction from very large execution traces. We report on a novel approach for reconstructing the entire memory state of a program from an execution trace that allows near real-time queries on the state of memory at any point in a program's execution trace. Finally we benchmark our approach showing storage and performance results in line with our theoretical calculations and demonstrate memory state query response times of less than 100ms for trace files up to 60 million lines.

Static Binary Rewriting without Supplemental Information: Overcoming the Tradeoff between Coverage and Correctness
Matthew Smithson, Khaled ElWazeer, Kapil Anand, Aparna Kotha, and Rajeev Barua
(University of Maryland at College Park, USA)
Binary rewriting is the process of transforming executables by maintaining the original binary’s functionality, while improving it in one or more metrics, such as energy use, memory use, security, or reliability. Although several technologies for rewriting binaries exist, static rewriting allows for arbitrarily complex transformations to be performed. Other technologies, such as dynamic or minimally-invasive rewriting, are limited in their transformation ability.
We have designed the first static binary rewriter that guarantees 100% code coverage without the need for relocation or symbolic information. A key challenge in static rewriting is content classification (i.e. deciding what portion of the code segment is code versus data). Our contributions are (i) handling portions of the code segment with uncertain classification by using speculative disassembly in case it was code, and retaining the original binary in case it was data; (ii) drastically limiting the number of possible speculative sequences using a new technique called binary characterization; and (iii) avoiding the need for relocation or symbolic information by using call translation at usage points of code pointers (i.e. indirect control transfers), rather than changing addresses at address creation points. Extensive evaluation using stripped binaries for the entire SPEC 2006 benchmark suite (with over 1.9 million lines of code) demonstrates the robustness of the scheme.

Bug Management

An Incremental Update Framework for Efficient Retrieval from Software Libraries for Bug Localization
Shivani Rao, Henry Medeiros, and Avinash Kak
(Purdue University, USA)
Information Retrieval (IR) based bug localization techniques use bug reports to query a software repository to retrieve relevant source files. These techniques index the source files in the software repository and train a model which is then queried for retrieval purposes. Much of the current research is focused on improving the retrieval effectiveness of these methods. However, little consideration has been given to the efficiency of such approaches for software repositories that are constantly evolving. As the software repository evolves, the index creation and model learning have to be repeated to ensure accuracy of retrieval for each new bug. In doing so, the query latency may be unreasonably high, and also, re-computing the index and the model for files that did not change is computationally redundant. We propose an incremental update framework to continuously update the index and the model using the changes made at each commit. We demonstrate that the same retrieval accuracy can be achieved but with a fraction of the time needed by current approaches. Our results are based on two basic IR modeling techniques: Vector Space Model (VSM) and Smoothed Unigram Model (SUM). The dataset we used in our validation experiments was created by tracking commit history of AspectJ and JodaTime software libraries over a span of 10 years.

Accurate Developer Recommendation for Bug Resolution
Xin Xia, David Lo, Xinyu Wang, and Bo Zhou
(Zhejiang University, China; Singapore Management University, Singapore)
Bug resolution refers to the activity that developers perform to diagnose, fix, test, and document bugs during software development and maintenance. It is a collaborative activity among developers who contribute their knowledge, ideas, and expertise to resolve bugs. Given a bug report, we would like to recommend the set of bug resolvers that could potentially contribute their knowledge to fix it. We refer to this problem as developer recommendation for bug resolution.
In this paper, we propose a new and accurate method named DevRec for the developer recommendation problem. DevRec is a composite method which performs two kinds of analysis: bug reports based analysis (BR-Based analysis), and developer based analysis (D-Based analysis). In the BR-Based analysis, we characterize a new bug report based on past bug reports that are similar to it. Appropriate developers of the new bug report are found by investigating the developers of similar bug reports appearing in the past. In the D-Based analysis, we compute the affinity of each developer to a bug report based on the characteristics of bug reports that have been fixed by the developer before. This affinity is then used to find a set of developers that are "close" to a new bug report.
We evaluate our solution on 5 large bug report datasets including GCC, OpenOffice, Mozilla, Netbeans, and Eclipse containing a total of 107,875 bug reports. We show that DevRec could achieve recall@5 and recall@10 scores of 0.4826-0.7989, and 0.6063-0.8924, respectively. We also compare DevRec with other state-of-the-art methods, such as Bugzie and DREX. The results show that DevRec on average improves recall@5 and recall@10 scores of Bugzie by 57.55% and 39.39%, respectively. DevRec also outperforms DREX by improving the average recall@5 and recall@10 scores by 165.38% and 89.36%, respectively.
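The recall@k scores quoted above can be illustrated with a small sketch. It assumes one common definition (the fraction of a bug report's actual resolvers that appear among the top k recommended developers, averaged over bug reports); the abstract does not spell out the exact formula, and the function names and developer names below are hypothetical.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items found in the top-k recommendations.

    Assumed definition for illustration; not taken from the paper.
    """
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(recommended[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(cases, k):
    """Average recall@k over (recommended, relevant) pairs, one per bug report."""
    return sum(recall_at_k(rec, rel, k) for rec, rel in cases) / len(cases)


# Hypothetical ranked developer list and actual resolvers for one bug report.
ranked = ["alice", "bob", "carol", "dave", "erin", "frank"]
resolvers = {"bob", "frank"}
print(recall_at_k(ranked, resolvers, 5))  # only "bob" is in the top 5
print(recall_at_k(ranked, resolvers, 6))  # both resolvers are in the top 6
```

Averaging per-report recall, as in `mean_recall_at_k`, gives every bug report equal weight regardless of how many resolvers it had.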

Has This Bug Been Reported?
Kaiping Liu, Hee Beng Kuan Tan, and Hongyu Zhang
(Nanyang Technological University, Singapore; Tsinghua University, China)
Bug reporting is essentially an uncoordinated process. The same bugs could be repeatedly reported because users or testers are unaware of previously reported bugs. As a result, extra time could be spent on bug triaging and fixing. In order to reduce redundant effort, it is important to provide bug reporters with the ability to search for previously reported bugs. The search functions provided by the existing bug tracking systems use relatively simple ranking functions, which often produce unsatisfactory results. In this paper, we adopt Ranking SVM, a Learning to Rank technique, to construct a ranking model for effective bug report search. We also propose to use the knowledge of Wikipedia to discover the semantic relations among words and documents. Given a user query, the constructed ranking model can search for relevant bug reports in a bug tracking system. Unlike related works on duplicate bug report detection, our approach retrieves existing bug reports based on short user queries, before the complete bug report is submitted. We perform evaluations on more than 16,340 Eclipse and Mozilla bug reports. The evaluation results show that the proposed approach can achieve better search results than the existing search functions provided by Bugzilla and Lucene. We believe our work can help users and testers locate potentially relevant bug reports more precisely.

Automatic Recovery of Root Causes from Bug-Fixing Changes
Ferdian Thung, David Lo, and Lingxiao Jiang
(Singapore Management University, Singapore)
What is the root cause of this failure? This question is often among the first few asked by software debuggers when they try to address issues raised by a bug report. The root cause is the set of erroneous lines of code that cause a chain of erroneous program states eventually leading to the failure. Bug tracking and source control systems only record the symptoms (e.g., bug reports) and treatments of a bug (e.g., committed changes that fix the bug), but not its root cause. Many treatments contain non-essential changes, which are intermingled with root causes. Reverse engineering the root cause of a bug can help to understand why the bug is introduced and help to detect and prevent other bugs of similar causes. The recovered root causes are also better ground truth for bug detection and localization studies.
In this work, we propose a combination of machine learning and code analysis techniques to identify root causes from the changes made to fix bugs. We evaluate the effectiveness of our approach based on a golden set (i.e., ground truth data) of manually recovered root causes of 200 bug reports from three open source projects. Our approach is able to achieve a precision, recall, and F-measure (i.e., the harmonic mean of precision and recall) of 76.42%, 71.88%, and 74.08% respectively. Compared with the work by Kawrykow and Robillard, our approach achieves a 60.83% improvement in F-measure.
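As a quick check of the harmonic-mean relationship mentioned above, the following sketch recomputes the F-measure from the reported precision and recall. The function name is illustrative, and the inputs are the rounded percentages quoted in the abstract.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
print(round(f_measure(76.42, 71.88), 2))  # 74.08
```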

Clones

Distilling Useful Clones by Contextual Differencing
Zhenchang Xing, Yinxing Xue, and Stanislaw Jarzabek
(Nanyang Technological University, Singapore; National University of Singapore, Singapore)
Clone detectors find similar code fragments and report large numbers of them for large systems. Textually similar clones may perform different computations, depending on the program context in which clones occur. Understanding these contextual differences is essential to distill useful clones for a specific maintenance task, such as refactoring. Manual analysis of contextual differences is time consuming and error-prone. To mitigate this problem, we present an automated approach to helping developers find and analyze contextual differences of clones. Our approach represents context of clones as program dependence graphs, and applies a graph differencing technique to identify required contextual differences of clones. We implemented a tool called CloneDifferentiator that identifies contextual differences of clones and allows developers to formulate queries to distill candidate clones that are useful for a given refactoring task. Two empirical studies show that CloneDifferentiator can reduce the efforts of post-detection analysis of clones for refactorings.

Effects of Cloned Code on Software Maintainability: A Replicated Developer Study
Debarshi Chatterji, Jeffrey C. Carver, Nicholas A. Kraft, and Jan Harder
(University of Alabama, USA; University of Bremen, Germany)
Code clones are a common occurrence in most software systems. Their presence is believed to have an effect on the maintenance process. Although these effects have been previously studied, there is not yet a conclusive result. This paper describes an extended replication of a controlled experiment (i.e. a strict replication with an additional task) that analyzes the effects of cloned bugs (i.e. bugs in cloned code) on the program comprehension of programmers. In the strict replication portion, the study participants attempted to isolate and fix two types of bugs, cloned and non-cloned, in one of two small systems. In the extension of the original study, we provided the participants with a clone report describing the location of all cloned code in the other system and asked them to again isolate and fix cloned and non-cloned bugs. The results of the original study showed that cloned bugs were not significantly more difficult to maintain than non-cloned bugs. Conversely, the results of the replication showed that it was significantly more difficult to correctly fix a cloned bug than a non-cloned bug. But, there was no significant difference in the amount of time required to fix a cloned bug vs. a non-cloned bug. Finally, the results of the study extension showed that programmers performed significantly better when given clone information than without clone information.

Human Studies

The Influence of Non-technical Factors on Code Review
Olga Baysal, Oleksii Kononenko, Reid Holmes, and Michael W. Godfrey
(University of Waterloo, Canada)
When submitting a patch, the primary concerns of individual developers are "How can I maximize the chances of my patch being approved, and minimize the time it takes for this to happen?" In principle, code review is a transparent process that aims to assess qualities of the patch by their technical merits and in a timely manner; however, in practice the execution of this process can be affected by a variety of factors, some of which are external to the technical content of the patch itself. In this paper, we describe an empirical study of the code review process for WebKit, a large, open source project; we replicate the impact of previously studied factors (such as patch size, priority, and component) and extend these studies by investigating organizational (the company) and personal dimensions (reviewer load and activity, patch writer experience) on code review response time and outcome. Our approach uses a reverse engineered model of the patch submission process and extracts key information from the issue tracking and code review systems. Our findings suggest that these non-technical factors can significantly impact code review outcomes.

Understanding Project Dissemination on a Social Coding Site
Jing Jiang, Li Zhang, and Lei Li
(Beihang University, China)
Popular social coding sites like GitHub and BitBucket are changing software development. Users follow some interesting developers, listen to their activities and find new projects. Social relationships between users are utilized to disseminate projects, attract contributors and increase the popularity. A deep understanding of project dissemination on social coding sites can provide important insights into questions of project diffusion characteristics and into the improvement of the popularity.
In this paper, we seek a deeper understanding of project dissemination in GitHub. We collect 2,665 projects and 272,874 events. Moreover, we crawl 747,107 developers and 2,234,845 social links to construct social graphs. We analyze topological characteristics and reciprocity of social graphs. We then study the speed and the range of project dissemination, and the role of social links. Our main observations are: (1) Social relationships are not reciprocal. (2) The popularity increases gradually for a long time. (3) Projects spread to users far away from their creators. (4) Social links play a notable role in project dissemination. These results can be leveraged to increase the popularity. Specifically, we suggest that project owners should (1) encourage experienced developers to choose some promising new developers, follow them in return and provide guidance; (2) promote projects for a long time; (3) advertise projects to a wide range of developers; and (4) fully utilize social relationships to advertise projects and attract contributors.

What Help Do Developers Seek, When and How?
Hongwei Li, Zhenchang Xing, Xin Peng, and Wenyun Zhao
(Fudan University, China; Nanyang Technological University, Singapore)
Software development often requires knowledge beyond what developers already possess. In such cases, developers have to seek help from different sources of information. As a metacognitive skill, help seeking influences software developers' efficiency and success in many situations. However, there has been little research to provide a systematic investigation of the general process of help seeking activities in software engineering and human and system factors affecting help seeking. This paper reports our empirical study aiming to fill this gap. Our study includes two human experiments, involving 24 developers and two typical software development tasks. Our study gathers empirical data that allows us to provide an in-depth analysis of help-seeking task structures, task strategies, information sources, process model, and developers' information needs and behaviors in seeking and using help information and in managing information during help seeking. Our study provides a detailed understanding of help seeking activities in software engineering, the challenges that software developers face, and the limitations of existing tool support. This can lead to the design and development of more efficient and usable help seeking support that helps developers become better help seekers.

Towards Understanding How Developers Spend Their Effort during Maintenance Activities
Zéphyrin Soh, Foutse Khomh, Yann-Gaël Guéhéneuc, and Giuliano Antoniol
(Polytechnique Montréal, Canada)
For many years, researchers and practitioners have strived to assess and improve the productivity of software development teams. One key step toward achieving this goal is the understanding of factors affecting the efficiency of developers performing development and maintenance activities. In this paper, we aim to understand how developers spend their effort during maintenance activities and study the factors affecting developers' effort. By knowing how developers spend their effort and which factors affect their effort, software organisations will be able to take the necessary steps to improve the efficiency of their developers, for example, by providing them with adequate program comprehension tools. For this preliminary study, we mine 2,408 developers' interaction histories and 3,395 patches from four open-source software projects (ECF, Mylyn, PDE, Eclipse Platform). We observe that usually, the complexity of the implementation required for a task does not reflect the effort spent by developers on the task. Most of the effort appears to be spent during the exploration of the program. On average, 62% of files explored during the implementation of a task are not significantly relevant to the final implementation of the task. Developers who explore a large number of files that are not significantly relevant to the solution to their task take a longer time to perform the task. We expect that the results of this study will pave the way for better program comprehension tools to guide developers during their explorations of software systems.

Re-documenting Legacy Code

Leveraging Specifications of Subcomponents to Mine Precise Specifications of Composite Components
Ziying Dai, Xiaoguang Mao, Yan Lei, and Liqian Chen
(National University of Defense Technology, China)
Specifications play an important role in many software engineering activities. Despite their usefulness, formal specifications are often unavailable in practice. Specification mining techniques try to automatically recover specifications from existing programs. Unfortunately, mined specifications are often overly general, which hampers their applications in the downstream analysis and testing. Nowadays, programmers develop software systems by utilizing existing components that usually have some available specifications. However, benefits of these available specifications are not explored by current specification miners. In this paper, we propose an approach to leverage available specifications of subcomponents to improve the precision of specifications of the composite component mined by state-based mining techniques. We monitor subcomponents against their specifications during the mining process and use states that are reached to construct abstract states of the composite component. Our approach makes subcomponents' states encoded within their specifications visible to their composite component, and improves the precision of mined specifications by effectively increasing the number of their states. The empirical evaluation shows that our approach can significantly improve the precision of mined specifications by removing erroneous behavior without noticeable loss of recall.

A Model-Driven Graph-Matching Approach for Design Pattern Detection
Mario Luca Bernardi, Marta Cimitile, and Giuseppe Antonio Di Lucca
(University of Sannio, Italy; Unitelma Sapienza University, Italy)
In this paper, an approach to automatically detect Design Patterns (DPs) in Object Oriented systems is presented. It links a system’s source code components to the roles they play in each pattern. DPs are modelled by high level structural properties (e.g. inheritance, dependency, invocation, delegation, type nesting and membership relationships) that are checked against the system structure and components. The proposed metamodel also allows DP variants to be defined, overriding the structural properties of existing DP models, to improve detection quality. The approach was validated on an open benchmark containing several open-source systems of increasing sizes. Moreover, for two other systems, the results have been compared with the ones from a similar approach existing in the literature. The results obtained on the analyzed systems, the identified variants and the efficiency and effectiveness of the approach are thoroughly presented and discussed.

Recommendation Systems

Automated Library Recommendation
Ferdian Thung, David Lo, and Julia Lawall
(Singapore Management University, Singapore; INRIA, France)
Many third party libraries are available to be downloaded and used. Using such libraries can reduce development time and make the developed software more reliable. However, developers are often unaware of suitable libraries to be used for their projects and thus they miss out on these benefits. To help developers better take advantage of the available libraries, we propose a new technique that automatically recommends libraries to developers. Our technique takes as input the set of libraries that an application currently uses, and recommends other libraries that are likely to be relevant.
We follow a hybrid approach that combines association rule mining and collaborative filtering. The association rule mining component recommends libraries based on a set of library usage patterns. The collaborative filtering component recommends libraries based on those that are used by other similar projects. We investigate the effectiveness of our hybrid approach on 500 software projects that use many third-party libraries. Our experiments show that our approach can recommend libraries with recall rate@5 of 0.852 and recall rate@10 of 0.894.

Automatic Discovery of Function Mappings between Similar Libraries
Cédric Teyton, Jean-Rémy Falleri, and Xavier Blanc
(University of Bordeaux, France)
Library migration is the process of replacing a third-party library with a competing one during software maintenance. The process of transforming a software source code to become compliant with a new library is cumbersome and error-prone. Indeed, developers have to understand a new Application Programming Interface (API) and search for the right replacements for the functions they use from the old library. As the two libraries are independent, the functions may have totally different structures and names, making the search for mappings very difficult. To assist the developers in this difficult task, we introduce an approach that analyzes source code changes from software projects that already underwent a given library migration to extract mappings between functions. We demonstrate the applicability of our approach on several library migrations performed on Java open source software projects.

Find Your Library Experts
Cédric Teyton, Jean-Rémy Falleri, Floréal Morandat, and Xavier Blanc
(University of Bordeaux, France)
Heavy usage of third-party libraries is almost mandatory in modern software systems. The knowledge of these libraries is generally scattered across the development team. When a development or a maintenance task involving specific libraries arises, finding the relevant experts would simplify its completion. However there is no automatic approach to identify these experts. In this article we propose LIBTIC, a search engine of library experts automatically populated by mining software repositories. We show that LIBTIC finds relevant experts of common Java libraries among the GitHub developers. We also illustrate its usefulness through a case study on the Apache HBase project where several maintenance and development use-cases are carried out.

Refactoring and Re-modularization

Towards Automatically Improving Package Structure while Respecting Original Design Decisions
Hani Abdeen, Houari Sahraoui, Osama Shata, Nicolas Anquetil, and Stéphane Ducasse
(Qatar University, Qatar; Université de Montréal, Canada; INRIA, France)
Recently, there has been important progress in applying search-based optimization techniques to the problem of software re-modularization. Yet, a major part of the existing body of work addresses the problem of modularizing software systems from scratch, regardless of the existing package structure. This paper presents a novel multi-objective optimization approach for improving existing package structure. The optimization approach aims at increasing the cohesion and reducing the coupling and cyclic connectivity of packages, while modifying the existing package organization as little as possible. Moreover, maintainers can specify several constraints to guide the optimization process with regard to extra design factors. To this end, we use the Non-Dominated Sorting Genetic Algorithm (NSGA-II). We evaluate the optimization approach through an experiment covering four real-world software systems. The results indicate the effectiveness of our optimization approach for improving existing package structure with very small modifications.

Heuristics for Discovering Architectural Violations
Cristiano Maffort, Marco Tulio Valente, Mariza Bigonha, Nicolas Anquetil, and André Hora
(UFMG, Brazil; INRIA, France)
Software architecture conformance is a key software quality control activity that aims to reveal the progressive gap normally observed between concrete and planned software architectures. In this paper, we present ArchLint, a lightweight approach for architecture conformance based on a combination of static and historical source code analysis. For this purpose, ArchLint relies on four heuristics for detecting both absences and divergences in source-code-based architectures. We applied ArchLint to an industrial-strength system and, as a result, detected 119 architectural violations, with an overall precision of 46.7% and a recall of 96.2% for divergences. We also evaluated ArchLint with four open-source systems used in an independent study on reflexion models. In this second study, ArchLint achieved precision results ranging from 57.1% to 89.4%.

Recommending Move Method Refactorings using Dependency Sets
Vitor Sales, Ricardo Terra, Luis Fernando Miranda, and Marco Tulio Valente
(UFMG, Brazil; UFSJ, Brazil)
Methods implemented in incorrect classes are common bad smells in object-oriented systems, especially in the case of systems maintained and evolved for years. To tackle this design flaw, we propose a novel approach that recommends Move Method refactorings based on the set of static dependencies established by a method. More specifically, our approach compares the similarity of the dependencies established by a source method with the dependencies established by the methods in possible target classes. We evaluated our approach using systems from a compiled version of the Qualitas Corpus. We report that our approach provides an average precision of 60.63% and an average recall of 81.07%. Such results are, respectively, 129% and 49% better than the results achieved by JDeodorant, a well-known move method recommendation system.
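
The comparison of dependency sets described in this abstract can be illustrated with a minimal sketch. It assumes the Jaccard coefficient as the similarity measure and a plain average over the methods of each class; the abstract names neither, so both are assumptions, and all class, method, and function names below are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two dependency sets (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def class_similarity(method_deps, class_methods, exclude=None):
    """Average similarity between one method's dependencies and those of
    the methods in a class, optionally skipping the method itself."""
    others = [deps for name, deps in class_methods.items() if name != exclude]
    if not others:
        return 0.0
    return sum(jaccard(method_deps, deps) for deps in others) / len(others)


def recommend_move(method, owner, system):
    """Return a target class whose methods are more similar to `method`
    than the methods of its current class, or None if there is none."""
    deps = system[owner][method]
    scores = {
        cls: class_similarity(deps, methods, exclude=method if cls == owner else None)
        for cls, methods in system.items()
    }
    best = max(scores, key=scores.get)
    return best if best != owner and scores[best] > scores[owner] else None


# Hypothetical toy system: class -> method -> set of types it depends on.
system = {
    "Order": {
        "total": {"Item", "Money"},
        "add": {"Item"},
        "exportCsv": {"File", "Writer"},
    },
    "CsvExporter": {
        "write": {"File", "Writer"},
        "open": {"File"},
    },
}

suggestion = recommend_move("exportCsv", "Order", system)
no_suggestion = recommend_move("total", "Order", system)
```

In this toy input, `exportCsv` shares no dependencies with the other `Order` methods but overlaps strongly with those of `CsvExporter`, so it is the only method for which a move is suggested.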

Do Developers Care about Code Smells? An Exploratory Survey
Aiko Yamashita and Leon Moonen
(Mesan, Norway; Simula Research Laboratory, Norway)
Code smells are a well-known metaphor to describe symptoms of code decay or other issues with code quality which can lead to a variety of maintenance problems. Even though code smell detection and removal have been well researched over the last decade, it remains open to debate whether or not code smells should be considered meaningful conceptualizations of code quality issues from the developer's perspective. To some extent, this question applies as well to the results provided by current code smell detection tools. Are code smells really important for developers? If they are not, is this due to the lack of relevance of the underlying concepts, due to the lack of awareness about code smells on the developers' side, or due to the lack of appropriate tools for code smell analysis or removal? In order to align and direct research efforts to address actual needs and problems of professional developers, we need to better understand the knowledge about, and interest in, code smells, together with their perceived criticality. This paper reports on the results obtained from an exploratory survey involving 85 professional software developers.

Security and Testing

LigRE: Reverse-Engineering of Control and Data Flow Models for Black-Box XSS Detection
Fabien Duchène, Sanjay Rawat, Jean-Luc Richier, and Roland Groz
(LIG, France; Ensimag, France)
Fuzz testing consists of automatically generating and sending malicious inputs to an application in order to hopefully trigger a vulnerability. In order to be efficient, the fuzzing should answer questions such as: Where to send a malicious value? Where to observe its effects? How to position the system in such states? Answering such questions is a matter of understanding the application precisely enough. Reverse-engineering is a possible way to gain this knowledge, especially in a black-box harness. In fact, given the complexity of modern web applications, automated black-box scanners alternately reverse-engineer and fuzz web applications to detect vulnerabilities. We present an approach, named LigRE, which improves the reverse engineering to guide the fuzzing. We adapt a method to automatically learn a control flow model of web applications, and annotate this model with inferred data flows. Afterwards, we generate slices of the model for guiding the scope of a fuzzer. Empirical experiments show that LigRE increases detection capabilities of Cross Site Scripting (XSS), a particular case of web command injection vulnerabilities.

Circe: A Grammar-Based Oracle for Testing Cross-Site Scripting in Web Applications
Andrea Avancini and Mariano Ceccato
(Fondazione Bruno Kessler, Italy)
Security is a crucial concern, especially for those applications, like web-based programs, that are constantly exposed to potentially malicious environments. Security testing aims at verifying the presence of security-related defects. Security tests consist of two major parts: the input values used to run the application, and the decision whether the actual output matches the expected output. The latter is known as the “oracle”. In this paper, we present a process to build a security oracle for testing cross-site scripting vulnerabilities in web applications. In the learning phase, we analyze web pages generated in safe conditions to learn a model of their syntactic structure. Then, in the testing phase, the model is used to classify new test cases either as “safe tests” or as “successful attacks”. This approach has been implemented in a tool, called Circe, and empirically assessed in classifying security test cases for two real-world open source web applications.

Capture-Replay vs. Programmable Web Testing: An Empirical Assessment during Test Case Evolution
Maurizio Leotta, Diego Clerissi, Filippo Ricca, and Paolo Tonella
(University of Genova, Italy; Fondazione Bruno Kessler, Italy)
There are several approaches for automated functional web testing and the choice among them depends on a number of factors, including the tools used for web testing and the costs associated with their adoption. In this paper, we present an empirical cost/benefit analysis of two different categories of automated functional web testing approaches: (1) capture-replay web testing (in particular, using Selenium IDE); and, (2) programmable web testing (using Selenium WebDriver). On a set of six web applications, we evaluated the costs of applying these testing approaches both when developing the initial test suites from scratch and when the test suites are maintained, upon the release of a new software version.
Results indicate that, on the one hand, the development of the test suites is more expensive in terms of time required (between 32% and 112%) when the programmable web testing approach is adopted, but on the other hand, test suite maintenance is less expensive when this approach is used (with a saving between 16% and 51%). We found that, in the majority of the cases, after a small number of releases (from one to three), the cumulative cost of programmable web testing becomes lower than the cost involved with capture-replay web testing and the cost saving gets amplified over the successive releases.

Software Maintenance

Clustering Static Analysis Defect Reports to Reduce Maintenance Costs
Zachary P. Fry and Westley Weimer
(University of Virginia, USA)
Static analysis tools facilitate software maintenance by automatically identifying bugs in source code. However, for large systems, these tools often produce an overwhelming number of defect reports. Many of these defect reports are conceptually similar, but addressing each report separately costs developer effort and increases the maintenance burden. We propose to automatically cluster machine-generated defect reports so that similar bugs can be triaged and potentially fixed in aggregate. Our approach leverages both syntactic and structural information available in static bug reports to accurately cluster related reports, thus expediting the maintenance process.
We evaluate our technique using 8,948 defect reports produced by the Coverity Static Analysis and FindBugs tools in both C and Java programs totaling over 14 million lines of code. We find that humans overwhelmingly agree that clusters of defect reports produced by our tool could be handled in aggregate, thus reducing developer maintenance effort. Additionally, we show that our tool is not only capable of producing perfectly accurate clusters, but can also significantly reduce the number of defect reports that have to be manually examined by developers. For instance, at a level of 90% accuracy, our technique can reduce the number of individually inspected defect reports by 21.33% while other multi-language tools fail to obtain more than a 2% reduction.

Lehman's Laws in Agile and Non-agile Projects
Kelley Duran, Gabbie Burns, and Paul Snell
(Rochester Institute of Technology, USA)
Software team leaders and managers must make decisions on what type of process model they will use for their projects. Recent work suggests the use of agile processes since they promote shorter development cycles, better collaboration, and process flexibility. Due to the many benefits of agile processes, many software organizations have shifted to using more agile process methodologies. However, there is limited research that studies how agile processes affect the evolution of a software system over time.
In this paper, we perform an empirical study to better understand the effects of using agile processes. We compare two open source projects, one of which uses a tailored agile process (i.e., Extreme Programming) and another that has no formal process methodology. In particular, we compare the two projects within the context of Lehman’s Laws for continuing growth, continuing change, increasing complexity, and conservation of familiarity. Our findings show that all four of the laws held true for the project that uses an agile process and that there are noticeable differences in the evolution of the two projects, many of which can be traced back to specific practices used by the agile team.

Inferring Extended Finite State Machine Models from Software Executions
Neil Walkinshaw, Ramsay Taylor, and John Derrick
(University of Leicester, UK; University of Sheffield, UK)
The ability to reverse-engineer models of software behaviour is valuable for a wide range of software maintenance, validation and verification tasks. Current reverse-engineering techniques focus either on control-specific behaviour (e.g. in the form of Finite State Machines), or data-specific behaviour (e.g. as pre/post-conditions or invariants). However, typical software behaviour is usually a product of the two; models must combine both aspects to fully represent the software's operation. Extended Finite State Machines (EFSMs) provide such a model. Although attempts have been made to infer EFSMs, these have been problematic. The models inferred by these techniques can be non-deterministic, and the inference algorithms can be inflexible and only applicable to traces with specific characteristics. This paper presents a novel EFSM inference technique that addresses the problems of inflexibility and non-determinism. It also adapts an experimental technique from the field of Machine Learning to evaluate EFSM inference techniques, and applies it to two open-source software projects.

Comparing and Combining Evolutionary Couplings from Interactions and Commits
Fasil Bantelay, Motahareh Bahrami Zanjani, and Huzefa Kagdi
(Wichita State University, USA)
The paper presents an approach to mine evolutionary couplings from a combination of interaction (e.g., Mylyn) and commit (e.g., CVS) histories. These evolutionary couplings are expressed at the file and method levels of granularity, and are applied to support the tasks of commit and interaction predictions. Although the topic of mining evolutionary couplings has been investigated previously, the empirical comparison and combination of the two types from interaction and commit histories have not been attempted. An empirical study on 3272 interactions and 5093 commits from Mylyn, an open source task management tool, was conducted. These interactions and commits were divided into training and testing sets to evaluate the combined, and individual, models. Precision and recall metrics were used to measure the performance of these models. The results show that combined models offer statistically significant increases in recall over the individual models for change predictions. At the file level, the combined models achieved a maximum recall improvement of 13% for commit prediction with a 2% maximum precision drop.
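
As a rough illustration of what mining file-level evolutionary couplings from two kinds of history can look like, the sketch below counts how often pairs of files appear together in the same transaction (a commit or an interaction) and keeps the pairs that reach a minimum support. The support threshold, the union used to combine the two sources, and all names are assumptions made for illustration; the abstract does not specify how the authors mine or combine couplings.

```python
from collections import Counter
from itertools import combinations


def mine_couplings(transactions, min_support=2):
    """Count co-occurring file pairs across transactions and keep the
    pairs seen together at least `min_support` times."""
    pair_counts = Counter()
    for changed in transactions:
        # Sort so that each unordered pair has one canonical form.
        for a, b in combinations(sorted(set(changed)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def combine(interaction_couplings, commit_couplings):
    """Combine both sources by taking the union of coupled pairs
    (an assumed combination strategy)."""
    return set(interaction_couplings) | set(commit_couplings)


# Hypothetical histories: each inner list is one commit or one interaction.
commits = [["A", "B"], ["A", "B", "C"], ["C"]]
interactions = [["A", "C"], ["A", "C"], ["B"]]

commit_couplings = mine_couplings(commits)
interaction_couplings = mine_couplings(interactions)
combined = combine(interaction_couplings, commit_couplings)
```

In this toy input, each history contributes one coupling that the other misses, which is the kind of complementarity a combined model could exploit to raise recall.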

Software Quality

Improving SOA Antipatterns Detection in Service Based Systems by Mining Execution Traces
Mathieu Nayrolles, Naouel Moha, and Petko Valtchev
(Université du Québec à Montréal, Canada)
Service Based Systems (SBSs), like other software systems, evolve due to changes in both user requirements and execution contexts. Continuous evolution could easily deteriorate the design and reduce the Quality of Service (QoS) of SBSs and may result in poor design solutions, commonly known as SOA antipatterns. SOA antipatterns lead to reduced maintainability and reusability of SBSs. It is therefore important to first detect and then remove them. However, techniques for SOA antipattern detection are still in their infancy, and there are hardly any tools for their automatic detection. In this paper, we propose a new and innovative approach for SOA antipattern detection called SOMAD (Service Oriented Mining for Antipattern Detection), which is an evolution of the previously published SODA (Service Oriented Detection For Antipatterns) tool. SOMAD improves SOA antipattern detection by mining execution traces: it detects strong associations between sequences of service/method calls and further filters them using a suite of dedicated metrics. We first present the underlying association mining model and introduce the SBS-oriented rule metrics. We then describe a validating application of SOMAD to two independently developed SBSs. A comparison of our new tool with SODA reveals the superiority of the new tool: its precision is better by a margin ranging from 2.6% to 16.67%, while the recall remains optimal at 100% and the running time is significantly reduced (2.5+ times on the same test subjects).

Mining System Specific Rules from Change Patterns
André Hora, Nicolas Anquetil, Stéphane Ducasse, and Marco Tulio Valente
(INRIA, France; University of Lille, France; UFMG, Brazil)
A significant percentage of warnings reported by tools to detect coding standard violations are false positives. Thus, some work has been dedicated to providing better rules by mining them from source code history, analyzing bug-fixes or changes between system releases. However, software evolves over time, and during development not only are bugs fixed, but features are also added and code is refactored. In such cases, changes must be consistently applied in source code to avoid maintenance problems. In this paper, we propose to extract system-specific rules by mining systematic changes over source code history, i.e., not just from bug-fixes or system releases, to ensure that changes are consistently applied over source code. We focus on structural changes done to support API modification or evolution with the goal of providing better rules to developers. Also, rules are mined from predefined rule patterns that ensure their quality. In order to assess the precision of such specific rules to detect real violations, we compare them with generic rules provided by tools to detect coding standard violations on four real-world systems covering two programming languages. The results show that specific rules are more precise in identifying real violations in source code than generic ones, and thus can complement them.

Empirical Evidence of Code Decay: A Systematic Mapping Study
Ajay Bandi, Byron J. Williams, and Edward B. Allen
(Mississippi State University, USA)
Code decay is a gradual process that negatively impacts the quality of a software system. Developers need trusted measurement techniques to evaluate whether their systems have decayed. The research aims to find what is currently known about code decay detection techniques and metrics used to evaluate decay. We performed a systematic mapping study to determine which techniques and metrics have been empirically evaluated. A review protocol was developed and followed to identify 30 primary studies with empirical evidence of code decay. We categorized detection techniques into two broad groups: human-based and metric-based approaches. We describe the attributes of each approach and distinguish features of several subcategories of both high-level groups. A tabular overview of code decay metrics is also presented. We exclude studies that do not use time (i.e., do not use evaluation of multiple software versions) as a factor when evaluating code decay. This limitation serves to focus the review. We found that coupling metrics are the most widely used for identifying code decay. Researchers use various terms to define code decay, and we recommend additional research to operationalize the terms to provide more consistent analysis.

Mining the Relationship between Anti-patterns Dependencies and Fault-Proneness
Fehmi Jaafar, Yann-Gaël Guéhéneuc, Sylvie Hamel, and Foutse Khomh
(Polytechnique Montréal, Canada; Université de Montréal, Canada)
Anti-patterns describe poor solutions to design and implementation problems which are claimed to make object-oriented systems hard to maintain. Anti-patterns indicate weaknesses in design that may slow down development or increase the risk of faults or failures in the future. Classes in anti-patterns have some dependencies, such as static relationships, that may propagate potential problems to other classes. To the best of our knowledge, the relationship between anti-pattern dependencies (with non-anti-pattern classes) and faults has yet to be studied in detail.
This paper presents the results of an empirical study aimed at analysing anti-pattern dependencies in three open source software systems, namely ArgoUML, JFreeChart, and XercesJ. We show that, in almost all releases of the three systems, classes having dependencies with anti-patterns are more fault-prone than others. We also report other observations about these dependencies, such as their impact on fault prediction. Software organizations could make use of this knowledge about anti-pattern dependencies to better focus their testing and review activities on the riskiest classes, e.g. classes with fault-prone dependencies with anti-patterns.

Traceability and Feature Location

Leveraging Historical Co-change Information for Requirements Traceability
Nasir Ali, Fehmi Jaafar, and Ahmed E. Hassan
(Queen's University, Canada; Université de Montréal, Canada)
Requirements traceability (RT) links requirements to the corresponding source code entities that implement them. Information Retrieval (IR) based RT links recovery approaches are often used to automatically recover RT links. However, such approaches exhibit low accuracy in terms of precision, recall, and ranking. This paper presents an approach (CoChaIR) that is complementary to existing IR-based RT links recovery approaches. CoChaIR leverages historical co-change information of files to improve the accuracy of IR-based RT links recovery approaches. We evaluated the effectiveness of CoChaIR on three datasets, i.e., iTrust, Pooka, and SIP Communicator. We compared CoChaIR with two different IR-based RT links recovery approaches, i.e., the vector space model and the Jensen-Shannon divergence model. Our study results show that CoChaIR significantly improves precision and recall by up to 12.38% and 5.67%, respectively, while decreasing the rank of true positive links by up to 48% and reducing false positive links by up to 44%.

Using Relationships for Matching Textual Domain Models with Existing Code
Raghavan Komondoor, Indrajit Bhattacharya, Deepak D'Souza, and Sachin Kale
(Indian Institute of Science, India; IBM Research, India)
We address the task of mapping a given textual domain model (e.g., an industry-standard reference model) for a given domain (e.g., ERP), with the source code of an independently developed application in the same domain. This has applications in improving the understandability of an existing application, migrating it to a more flexible architecture, or integrating it with other related applications. We use the vector-space model to abstractly represent domain model elements as well as source-code artifacts. The key novelty in our approach is to leverage the relationships between source-code artifacts in a principled way to improve the mapping process. We describe experiments wherein we apply our approach to the task of matching two real, open-source applications to corresponding industry-standard domain models. We demonstrate the overall usefulness of our approach, as well as the role of our propagation techniques in improving the precision and recall of the mapping task.

On the Effectiveness of Accuracy of Automated Feature Location Technique
Takashi Ishio, Shinpei Hayashi, Hiroshi Kazato, and Tsuyoshi Oshima
(Osaka University, Japan; Tokyo Institute of Technology, Japan; NTT Data Intellilink, Japan; NTT, Japan)
Automated feature location techniques have been proposed to extract program elements that are likely to be relevant to a given feature. A more accurate result is expected to enable developers to perform more accurate feature location. However, several experiments assessing traceability recovery have shown that analysts cannot utilize an accurate traceability matrix for their tasks. Because feature location deals with a certain type of traceability links, it is an important question whether the same phenomena are visible in feature location or not. To answer that question, we have conducted a controlled experiment. We have asked 20 subjects to locate features using lists of methods of which the accuracy is controlled artificially. The result differs from the traceability recovery experiments. Subjects given an accurate list would be able to locate a feature more accurately. However, subjects could not locate the complete implementation of features in 83% of tasks. Results show that the accuracy of automated feature location techniques is effective, but it might be insufficient for perfect feature location.

On the Effect of Program Exploration on Maintenance Tasks
Zéphyrin Soh, Foutse Khomh, Yann-Gaël Guéhéneuc, Giuliano Antoniol, and Bram Adams
(Polytechnique Montréal, Canada)
When developers perform a maintenance task, they follow an exploration strategy (ES) that is characterised by how they navigate through the program entities. Studying ES can help to assess how developers understand a program and perform a change task. Various factors could influence how developers explore a program and the way in which they explore a program may affect their performance for a certain task. In this paper, we investigate the ES followed by developers during maintenance tasks and assess the impact of these ES on the duration and effort spent by developers on the tasks. We want to know if developers frequently revisit one (or a set) of program entities (referenced exploration), or if they visit program entities with almost the same frequency (unreferenced exploration) when performing a maintenance task. We mine 1,705 Mylyn interaction histories (IH) from four open-source projects (ECF, Mylyn, PDE, and Eclipse Platform) and perform a user study to verify if both referenced exploration (RE) and unreferenced exploration (UE) were followed by some developers. Using the Gini inequality index on the number of revisits of program entities, we automatically classify interaction histories as RE and UE and perform an empirical study to measure the effect of program exploration on the task duration and effort. We report that, although a UE may require more exploration effort than a RE, a UE is on average 12.30% less time consuming than a RE.
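
To make the classification step in this abstract concrete, the sketch below computes the Gini inequality index over per-entity revisit counts and labels an interaction history as RE when revisits are concentrated on few entities, or UE when they are spread evenly. The cut-off value of 0.5 and the function names are assumptions made for illustration; the abstract does not state the threshold the authors use.

```python
def gini(counts):
    """Gini index of non-negative counts: 0 means perfectly even,
    values near 1 mean concentrated on very few items."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def classify_exploration(revisits, threshold=0.5):
    """Label a history as referenced (RE) or unreferenced (UE) exploration.
    The threshold is a hypothetical cut-off, not taken from the paper."""
    return "RE" if gini(revisits) >= threshold else "UE"


# Hypothetical revisit counts per program entity in two histories.
concentrated = [9, 1, 0, 0]   # one entity revisited far more than the rest
even = [3, 2, 3, 2]           # entities revisited with similar frequency

label_concentrated = classify_exploration(concentrated)
label_even = classify_exploration(even)
```

With these inputs the first history has a Gini index of about 0.7 and the second about 0.1, so they fall on opposite sides of the assumed threshold.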

Practice Track

Practice Papers I

Documenting APIs with Examples: Lessons Learned with the APIMiner Platform
João Eduardo Montandon, Hudson Borges, Daniel Felix, and Marco Tulio Valente
(UFMG, Brazil)
Software development increasingly relies on Application Programming Interfaces (APIs) to increase productivity. However, learning how to use new APIs is in many cases a non-trivial task given their ever-increasing complexity. To help developers during the API learning process, we describe in this paper a platform, called APIMiner, that instruments the standard Java-based API documentation format with concrete examples of usage. The examples are extracted from a private source code repository (composed of real systems) and summarized using a static slicing algorithm. We also describe a particular instantiation of our platform for the Android API. To evaluate the proposed solution, we performed a field study in which professional Android developers used the platform for four months.

Extracting Business Rules from COBOL: A Model-Based Framework
Valerio Cosentino, Jordi Cabot, Patrick Albert, Philippe Bauquel, and Jacques Perronnet
(AtlanMod, France; IBM, France)
Organizations rely on the logic embedded in their Information Systems for their daily operations. This logic implements the business rules in place in the organization, which must be continuously adapted in response to market changes. Unfortunately, this evolution also implies understanding and evolving the underlying software components enforcing those rules. This is challenging because, first, the code implementing the rules is scattered throughout the whole system and, second, most of the time documentation is poor and out-of-date. This is especially true for older systems that have been maintained and evolved for several years (even decades). In those systems, it is not even clear which business rules are enforced or whether rules are still consistent with the current organizational policies.
The goal of this paper is therefore to facilitate the comprehension of legacy systems (in particular COBOL-based ones) by providing a model-driven reverse engineering framework able to extract and visualize the business logic embedded in them.

Evaluating Architecture Stability of Software Projects
Lerina Aversano, Marco Molfetta, and Maria Tortorella
(University of Sannio, Italy)
Reuse of software components depends on different aspects of high-level software artifacts. In particular, software architecture and its stability should be taken into account before selecting software components for reuse. In this direction, this paper presents an empirical study aimed at assessing software architecture stability and its evolution along the software project history. The study entails the gathering and analysis of relevant information from several open source projects. The analysis of the architecture stability of the core components of the analysed projects, together with related trends, is presented as the result.

Migrating a Large Scale Legacy Application to SOA: Challenges and Lessons Learned
Ravi Khadka, Amir Saeidi, Slinger Jansen, Jurriaan Hage, and Geer P. Haas
(Utrecht University, Netherlands; IBM, Netherlands)
This paper presents the findings of a case study of a large scale legacy to service-oriented architecture migration process in the payments domain of a Dutch bank. The paper presents the business drivers that initiated the migration, and describes a 4-phase migration process. For each phase, the paper details benefits of using the techniques, best practices that contribute to the success, and possible challenges that are faced during migration. Based on these observations, the findings are discussed as lessons learned, including the implications of using reverse engineering techniques to facilitate the migration process, adopting a pragmatic migration realization approach, emphasizing the organizational and business perspectives, and harvesting knowledge of the system throughout the system's life cycle.

Practice Papers II

Assessing the Complexity of Upgrading Software Modules
Bram Schoenmakers, Niels van Den Broek, Istvan Nagy, Bogdan Vasilescu, and Alexander Serebrenik
(ASML, Netherlands; Eindhoven University of Technology, Netherlands)
Modern software development frequently involves developing multiple codelines simultaneously. Improvements to one codeline should often be applied to other codelines as well, which is typically a time consuming and error-prone process. In order to reduce this (manual) effort, changes are applied to the system's modules and those affected modules are upgraded on the target system. This is a more coarse-grained approach than upgrading the affected files only. However, when a module is upgraded, one must make sure that all its dependencies are still satisfied.
This paper proposes an approach to assess the ease of upgrading a software system. An algorithm was developed to compute the smallest set of upgrade dependencies, given the current version of a module and the version it has to be upgraded to. Furthermore, a visualization has been designed to explain why upgrading one module requires upgrading many additional modules.
A case study has been performed at ASML to study the ease of upgrading the TwinScan software. The analysis shows that removing elements from interfaces leads to many additional upgrade dependencies. Moreover, based on our analysis we have formulated a number of improvement suggestions, such as a clear separation between the test code and the production code as well as the introduction of a structured process for symbol deprecation and removal.

Analyzing PL/1 Legacy Ecosystems: An Experience Report
Erik Aeschlimann, Mircea Lungu, Oscar Nierstrasz, and Carl Worms
(University of Bern, Switzerland; Credit Suisse, Switzerland)
This paper presents a case study of analyzing a legacy PL/1 ecosystem that has grown for 40 years to support the business needs of a large banking company. In order to support the stakeholders in analyzing it, we developed St1-PL/1, a tool that parses the code for association data and computes structural metrics which it then visualizes using top-down interactive exploration. Before building the tool and after demonstrating it to stakeholders, we conducted several interviews to learn about legacy ecosystem analysis requirements. We briefly introduce the tool and then present results of analysing the case study. We show that although the vision for the future is to have an ecosystem architecture in which systems are as decoupled as possible, the current state of the ecosystem is still removed from this. We also present some of the lessons learned during our experience discussions with stakeholders, which include their interests in automatically assessing the quality of the legacy code.

Psyb0t Malware: A Step-by-Step Decompilation Case Study
Lukáš Ďurfina, Jakub Křoustek, and Petr Zemek
(Brno University of Technology, Czech Republic)
Decompilation (i.e. reverse compilation) represents one of the toughest and most challenging tasks in reverse engineering. An even more difficult task is the decompilation of malware because it typically does not follow standard application binary interface conventions, has stripped symbols, is obfuscated, and can contain polymorphic code. Moreover, in recent years, there has been a rapid expansion of various smart devices, running different types of operating systems on many types of processors, and of malware targeting these platforms. These facts, combined with the boundedness of standard decompilation tools to a particular platform, imply that a considerable amount of effort is needed when decompiling malware for such a diversity of platforms.
This is an experience paper reporting the decompilation of real-world malware. We give a step-by-step case study of decompiling a MIPS worm called psyb0t by using a retargetable decompiler that is being developed within the Lissom project. First, we describe the decompiler in detail. Then, we present the case study. After that, we analyse the results obtained during the decompilation and present our personal experience. The paper is concluded by discussing future research possibilities.

ERA Track

Reusing Reused Code
Tomoya Ishihara, Keisuke Hotta, Yoshiki Higo, and Shinji Kusumoto
(Osaka University, Japan)
Although source code search systems are well known to be helpful for reusing source code, they often suggest larger code than what users actually need. This is because they suggest code based on the structure of programming languages, such as files or classes. In this paper, we propose a new code search technique that considers past reuse. In the proposed technique, code is suggested at the unit of past reuse. The proposed technique detects reused code by using a fine-grained code clone detection technique. We conducted an experiment to compare the proposed technique with an existing technique. The result shows that the proposed technique supports code reuse more effectively than the existing technique.

Specification Extraction by Symbolic Execution
Josef Pichler
(Software Competence Center Hagenberg, Austria)
Technical software systems contain extensive and complex computations that are frequently implemented in an optimized and unstructured way. Computations are, therefore, hard to comprehend from source code. If no other documentation exists, it is a tedious endeavor to understand which input data impact a particular computation and how a program achieves a particular result. We apply symbolic execution to automatically extract computations from source code. Symbolic execution makes it possible to identify input and output data, the actual computation, and the constraints of a particular computation, independently of encountered optimizations and unstructured program elements. The proposed technique may be used to improve maintenance and reengineering activities concerning legacy code in scientific and engineering domains.

An IDE-Based Context-Aware Meta Search Engine
Mohammad Masudur Rahman, Shamima Yeasmin, and Chanchal K. Roy
(University of Saskatchewan, Canada)
Traditional web search forces developers to leave their working environments and look for solutions in web browsers. It often does not consider the context of their programming problems. The context switching between the web browser and the working environment is time-consuming and distracting, and keyword-based traditional search often does not help much in problem solving. In this paper, we propose an Eclipse IDE-based web search solution that collects data from three web search APIs (Google, Yahoo, and Bing) and a programming Q&A site (StackOverflow). It then provides search results within the IDE, taking into account not only the content of the selected error but also the problem context, popularity, and search engine recommendation of the result links. Experiments with 25 runtime errors and exceptions show that the proposed approach outperforms the keyword-based search approaches with a recommendation accuracy of 96%. We also validate the results with a user study involving five prospective participants, where we get a result agreement of 64.28%. While the preliminary results are promising, the approach needs to be further validated with more errors and exceptions, followed by a user study with more participants, to establish itself as a complete IDE-based web search solution.

An Approach to Clone Detection in Behavioural Models
Elizabeth P. Antony, Manar H. Alalfi, and James R. Cordy
(Queen's University, Canada)
In this paper we present an approach for identifying near-miss interaction clones in reverse-engineered UML behavioural models. Our goal is to identify patterns of interaction (“conversations”) that can be used to characterize and abstract the run-time behaviour of web applications and other interactive systems. In order to leverage robust near-miss code clone technology, our approach is text-based, working on the level of XMI, the standard interchange serialization for UML. Behavioural model clone detection presents several challenges. First, it is not clear how to break a continuous stream of interaction between lifelines into meaningful conversational units. Second, unlike programming languages, the XMI text representation for UML is highly non-local, using attributes to reference information in the model file remotely. In this work we use a set of contextualizing source transformations on the XMI text representation to reveal the hidden hierarchical structure of the model and granularize behavioural interactions into conversational units. Then we adapt NiCad, a near-miss code clone detection tool, to help us identify conversational clones in reverse-engineered behavioural models.

Tool Demonstrations

MemBrush: A Practical Tool to Detect Custom Memory Allocators in C Binaries
Xi Chen, Asia Slowinska, and Herbert Bos
(VU University Amsterdam, Netherlands)
Many reversing techniques for data structures rely on the knowledge of memory allocation routines. Typically, they interpose on the system’s malloc and free functions, and track each chunk of memory thus allocated as a data structure. However, many performance-critical applications implement their own custom memory allocators. As a result, current binary analysis techniques for tracking data structures fail on such binaries. We present MemBrush, a new tool to detect memory allocation and deallocation functions in stripped binaries with high accuracy. We evaluated the technique on a large number of real world applications that use custom memory allocators. Our system uses dynamic analysis and detects memory allocation and deallocation routines by searching for functions that comply with a set of generic characteristics of allocators and deallocators.

MemPick: A Tool for Data Structure Detection
Istvan Haller, Asia Slowinska, and Herbert Bos
(VU University Amsterdam, Netherlands)
Most current techniques for data structure reverse engineering are limited to low-level programming constructs, such as individual variables or structs. In practice, pointer networks connect some of these constructs to form higher-level entities like lists and trees. The lack of information about the pointer network limits our ability to efficiently perform forensics and reverse engineering. To fill this gap, we propose MemPick, a tool that detects and classifies high-level data structures used in stripped C/C++ binaries. By analyzing the evolution of the heap during program execution, it identifies and classifies the most commonly used data structures, such as singly- or doubly-linked lists, many types of trees (e.g., AVL, red-black trees, B-trees), and graphs. We evaluated MemPick on a wide variety of popular libraries and real world applications with great success.

Gelato: GEneric LAnguage TOols for Model-Driven Analysis of Legacy Software Systems
Amir Saeidi, Jurriaan Hage, Ravi Khadka, and Slinger Jansen
(Utrecht University, Netherlands)
We present an integrated set of language-independent (generic) tools for analyzing legacy software systems: Gelato. Like any analysis tool, Gelato consists of a set of parsers, tree walkers, transformers, visualizers and pretty printers for different programming languages. Gelato is divided into a set of components, comprising a set of language-specific bundles and a generic core. By providing a generic core, Gelato enables building tools for analyzing legacy systems independently of the languages they are implemented in. To achieve this, Gelato includes a generic extensible imperative language called Kernel, which provides a separation between syntactic and semantic analysis. We have adopted model-driven techniques to develop the Gelato tool set, which is integrated into the Eclipse environment.

Extracting Business Rules from COBOL: A Model-Based Tool
Valerio Cosentino, Jordi Cabot, Patrick Albert, Philippe Bauquel, and Jacques Perronnet
(AtlanMod, France; IBM, France)
This paper presents a Business Rule Extraction tool for COBOL systems. Starting from a COBOL program, we derive a model-based representation of the source code and we provide a set of model transformations to identify and visualize the embedded business rules. In particular, the tool facilitates the definition of an application vocabulary and the identification of relevant business variables. In addition, such variables are used as a starting point to slice the code in order to identify business rules, which are finally represented by means of textual and graphical artifacts. The tool has been developed as an Eclipse plug-in in collaboration with IBM France.

Detecting Dependencies in Enterprise JavaBeans with SQuAVisiT
Alexandru Sutii, Serguei Roubtsov, and Alexander Serebrenik
(Eindhoven University of Technology, Netherlands)
We present recent extensions to SQuAVisiT, the Software Quality Assessment and Visualization Toolset. While SQuAVisiT was designed with traditional software and traditional caller-callee dependencies in mind, the recent popularity of Enterprise JavaBeans (EJB) required extensions that enable analysis of additional forms of dependencies: EJB dependency injections, object-relational (persistence) mappings and Web service mappings. In this paper we discuss the implementation of these extensions in SQuAVisiT and the application of SQuAVisiT to an open-source software system.

REdiffs: Refactoring-Aware Difference Viewer for Java
Shinpei Hayashi, Sirinut Thangthumachit, and Motoshi Saeki
(Tokyo Institute of Technology, Japan)
Comparing and understanding differences between old and new versions of source code are necessary in various software development situations. However, if changes are tangled with refactorings in a single revision, then the resulting source code differences are more complicated. We propose an interactive difference viewer which enables us to separate refactoring effects from source code differences for improving the understandability of the differences.

CCCD: Concolic Code Clone Detection
Daniel E. Krutz and Emad Shihab
(Rochester Institute of Technology, USA)
Code clones are multiple code fragments that produce similar results when provided the same input. Prior research has shown that clones can be harmful since they elevate maintenance costs, increase the number of bugs caused by inconsistent changes to cloned code, and may decrease programmer comprehension due to the increased size of the code base.
To assist in the detection of code clones, we propose a new tool known as Concolic Code Clone Discovery (CCCD). CCCD is the first known clone detection tool that uses concolic analysis as its primary component and is one of only three known techniques which are able to reliably detect the most complicated kind of clones, type-4 clones.

Workshop Summaries

3rd Workshop on Mining Unstructured Data
Alberto Bacchelli, Nicolas Bettenburg, Latifa Guerrouj, and Sonia Haiduc
(Delft University of Technology, Netherlands; Queen's University, Canada; Polytechnique Montréal, Canada; Florida State University, USA)
Software development knowledge resides in the source code and in a number of other artifacts produced during the development process. To extract such knowledge, past software engineering research has extensively focused on mining the source code, i.e., the final product of the development effort. Currently, we witness an emerging trend where researchers strive to exploit the information captured in artifacts such as emails and bug reports, free-form text requirements and specifications, comments and identifiers. Being often expressed in natural language, and not having a well-defined structure, the information stored in these artifacts is defined as unstructured data. Although research communities in Information Retrieval, Data Mining and Natural Language Processing have devised techniques to deal with unstructured data, these techniques are usually limited in scope (i.e., designed for English language text found in newspaper articles) and intended for use in specific scenarios, thus failing to achieve their full potential in a software development context. The workshop on Mining Unstructured Data (MUD) aims to provide a common venue for researchers and practitioners across software engineering, information retrieval and data mining research domains, to share new approaches and emerging results in mining unstructured software engineering data. Through this workshop, we aim to encourage cross-fertilization across different research domains, and to document and evolve the state of the art in mining unstructured data.

Workshop on Open and Original Problems in Software Language Engineering
Anya Helene Bagge and Vadim Zaytsev
(University of Bergen, Norway; CWI, Netherlands)
The OOPSLE workshop is a discussion-oriented and collaborative forum for formulating and addressing open, unsolved and unsolvable problems in software language engineering (SLE), which is a research domain of systematic, disciplined and measurable approaches to the development, evolution and maintenance of artificial languages used in software development. OOPSLE aims to serve as a think tank in selecting candidates for the open problem list, as well as other kinds of unconventional questions and definitions that do not necessarily have clear answers or solutions, thus facilitating the exposure of dark data. We also plan to formulate promising language-related challenges to organise in the future.
