IWSC 2015 – Proceedings

Foreword
Welcome to the 9th International Workshop on Software Clones (IWSC 2015), held on March 6th, 2015, in Montreal, Canada, and co-located with the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2015). Software clones are similar or identical fragments of software artifacts. Clones are used as a proxy for various concerns in software engineering, such as software quality, complexity, architecture, refactoring, evolution, licensing, and plagiarism. Various characteristics of software systems can be uncovered through clone analysis, while system restructuring can be driven by merging clones. The purpose of this workshop is to continue to solidify and give shape to this research area and community. More specifically, the goals are to bring together researchers and practitioners from around the world to evaluate the current state of research and applications, discuss common problems, discover new opportunities for collaboration, exchange ideas, envision new areas of research and applications, and explore synergies with similarity analysis in other areas and disciplines.

Clone Detection and Clone Analysis

An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and Analysis
Toshihiro Kamiya
(Future University Hakodate, Japan)
This paper presents a code-clone detection method and an accompanying analysis method, based on an execution-semantic and arbitrary-granularity model [Kamiya2013] of code fragments. The principal goal of the proposed detection method is to support programming languages in which software developers can define their own "control sentences" with features such as lambdas or lazy evaluation. Code clones detected with the proposed method are a kind of type-3 clone, in which code fragments may extend across the boundaries of procedures or modules. The model also seems useful as a basis for clone metrics (for clone triage) that draw on the contents and contexts of the code fragments in a clone class, and it seems extensible to a unified method of code-clone detection and code search. This paper introduces the execution-semantic and content-and-context-based code clone, and describes its definition, a detection method, an analysis method, and a prototype implementation of a tool chain, which was applied to two open-source products as a preliminary empirical evaluation.

Code Clone Detection using Wavelets
Siim Karus and Karl Kilgi
(University of Tartu, Estonia)
Code clones influence the difficulty of maintaining code, which affects its cost in time and money. In order to manage code clones effectively, it is important to know where the clones are and how they relate to each other. Wavelet analysis has been found to be extremely useful for clone detection in image processing and financial market analysis. Wavelets have the benefit of allowing comparisons that span different scales and strengths. Wavelet analysis also benefits greatly from parallelisation, which has become more affordable thanks to advances in GPU computing and cloud computing. Thus, it makes sense to evaluate wavelet analysis for finding code clones as well. We evaluate a set of wavelet-based, language-independent code clone detection approaches. The experimental evaluation shows that our approach is able to effectively identify more clones than alternative algorithms.

Performance Impact of Lazy Deletion in Metric Trees for Incremental Clone Analysis
Thierry Lavoie and Ettore Merlo
(Polytechnique Montréal, Canada)
Performing clone detection on multiple versions of a software system can be expensive. Incremental clone detection is acknowledged to be a good way to reduce this cost. We extend existing ideas in incremental clone detection to metric trees using lazy deletion. We measured the execution time of the non-incremental and the incremental versions of the clone detector and found that incremental clone detection can save a sizable amount of time even for versions separated by large variations. We discuss the results and propose some future research.

Suggesting Reuse with Clones

An Empirical Study of Identical Function Clones in CRAN
Maëlick Claes, Tom Mens, Narjisse Tabout, and Philippe Grosjean
(University of Mons, Belgium)
Code clone analysis is a very active subject of study, and research on inter-project code clones is starting to emerge. In the context of software package repositories specifically, developers are confronted with the choice between depending on code implemented in other packages and cloning this code in their own package. This article presents an empirical study of identical function clones in the CRAN package archive network, in order to understand the extent of this practice in the R community. Depending on too many packages may hamper maintainability, as unexpected conflicts may arise during package updates. Duplicating functions from other packages may also reduce maintainability, since bug fixes or code changes are not propagated automatically to the clones. We study how the characteristics of cloned functions in CRAN snapshots evolve over time, and classify these clones according to what has prevented package developers from relying on dependencies instead.
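As a rough illustration of what detecting identical function clones across packages can involve, the following Python sketch groups functions by a hash of their source text. It is not the authors' method: the input layout (a mapping from package name to function name to source text), the helper names, and the whitespace normalization are all assumptions made for this example, and the study may define "identical" differently.

```python
import hashlib
from collections import defaultdict


def normalize(source: str) -> str:
    """Collapse runs of whitespace (assumed normalization, not from the paper).

    Note that this also collapses whitespace inside string literals.
    """
    return " ".join(source.split())


def find_identical_functions(packages):
    """Return groups of (package, function) pairs with identical source text.

    `packages` is a hypothetical layout: {package: {function_name: source}}.
    Only groups that span at least two different packages are reported.
    """
    groups = defaultdict(set)
    for package, functions in packages.items():
        for name, source in functions.items():
            digest = hashlib.sha256(normalize(source).encode("utf-8")).hexdigest()
            groups[digest].add((package, name))
    return [
        sorted(members)
        for members in groups.values()
        if len({package for package, _ in members}) > 1
    ]


if __name__ == "__main__":
    example = {
        "pkgA": {"f": "function(x) x + 1", "g": "function(x) x * 2"},
        "pkgB": {"h": "function(x)   x + 1"},
    }
    print(find_identical_functions(example))
```

Hashing keeps the comparison linear in the number of functions, at the price of missing clones that differ by even a single renamed identifier.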

On the Level of Code Suggestion for Reuse
Akio Ohtani, Yoshiki Higo, Tomoya Ishihara, and Shinji Kusumoto
(Osaka University, Japan)
Code search techniques are well known as one of the techniques that help code reuse. If developers input queries that represent the functionality they want, the techniques suggest code fragments that are related to the query. Generally, code search techniques suggest code at the component level of the programming language, such as a class or file. Because of this, developers occasionally need to search for the necessary code within the suggested area. As a countermeasure, there is a code search technique in which code is suggested based on past reuse. That technique ignores structural code blocks, so developers need to add some code to the pasted code or remove some code from it. That is, the advantages and disadvantages of the former technique are the disadvantages and advantages of the latter, respectively. In this research, we conducted a comparative study to reveal which level of code suggestion is more useful for code reuse. In the study, we also compared a hybrid of the two techniques against each of them. As a result, we found that component-level suggestions provided reusable code more precisely. On the other hand, reuse-level suggestions were more helpful for reusing larger code.

Source Code Reuse Evaluation by Using Real/Potential Copy and Paste
Takafumi Ohta, Hiroaki Murakami, Hiroshi Igaki, Yoshiki Higo, and Shinji Kusumoto
(Osaka University, Japan)
Developers often reuse existing software by copy and paste. Source code reuse improves productivity and software quality. On the other hand, source code reuse requires several professional skills of developers. In source code reuse, developers must locate reusable code fragments and judge whether such reusable code is adequate to copy and paste into the source file under development. This paper presents extraction and analysis methods for developers' source code reuse behavior (copy and paste). Our method extracts developers' actual source code reuse (real copy and paste). Then, by using a code clone detection tool, the method extracts code fragments for potential reuse (potential copy and paste). Our study of real and potential copy and paste provides a quantitative assessment of source code reuse by developers.

Supporting Users

Tool Support for Managing Method Clones
Hamid Abdul Basit, Hassan Shahid Khan, Fahad Hamid, and Irtza Suhail
(Lahore University of Management Sciences, Pakistan)
It is not always feasible to refactor or remove all clones in a system, either due to language limitations or other practical considerations. The meta-programming-based reuse technique of VCL can effectively unify and manage clones at the meta-level for better maintenance, even in the presence of clones with differences that render them hard to unify using conventional language-based techniques, or clones that are kept in the system for other purposes. In this paper, we present an automated tool that can unify method clones with VCL. We also discuss various patterns of cloned fragments contained in those methods, and how each pattern can be framed with VCL.

Analysis and Visualization for Clone Refactoring
Minhaz F. Zibran
(University of New Orleans, USA)
Clone analysis and visualization help in understanding the characteristics of clones and indicate potential clones as cost-effective candidates for refactoring. Many studies have analyzed clones and their evolution, and a number of techniques have been proposed for visualizing clones to aid clone analysis. However, clone analyses and visualizations with respect to inheritance hierarchies and call graphs have so far been ignored. In this position paper, we argue that such analyses and visualizations are necessary to help in dealing with clones for refactoring.

What Do Practitioners Ask about Code Clone? A Preliminary Investigation of Stack Overflow
Eunjong Choi, Norihiro Yoshida, Raula Gaikovina Kula, and Katsuro Inoue
(Osaka University, Japan; Nagoya University, Japan)
We present a preliminary investigation of Stack Overflow to reveal practitioners’ interests in code clones. We then discuss possible future directions of research on code clones.

What Do We Need to Know about Clones? Deriving Information Needs from User Goals
Hamid Abdul Basit, Muhammad Hammad, Stan Jarzabek, and Rainer Koschke
(Lahore University of Management Sciences, Pakistan; Punjab Information Technology Board, Pakistan; National University of Singapore, Singapore; University of Bremen, Germany)
Clone detection can be used to achieve diverse objectives such as refactoring, program understanding, bug localization, and plagiarism detection. Each goal takes a different perspective on clone information needs. Different clone detection tools report different information about clones. To gauge the suitability of a given clone detector for a particular user objective, we need to determine which of the information needs implied by the objective the clone detector addresses. In this paper, we make a first step toward gathering clone information needs from the description of user goals. The results of our analysis are useful for various stakeholders such as programmers, managers, tool developers, and researchers.

2015 IEEE 9th International Workshop on Software Clones (IWSC)
