2015 IEEE 31st International Conference on Software Maintenance and Evolution (ICSME),
September 29 – October 1, 2015,
Bremen, Germany
Early Research Achievements Track
Defects and Refactoring
Wed, Sep 30, 08:30 - 10:10, GW2 B2890 (Chairs: Coen De Roover; Foutse Khomh; Lin Tan; Serge Demeyer)
Constrained Feature Selection for Localizing Faults
Tien-Duy B. Le,
David Lo , and
Ming Li
(Singapore Management University, Singapore; Nanjing University, China)
Developers often spend considerable time and effort finding buggy program elements. To help developers debug, many past studies have proposed spectrum-based fault localization techniques. These techniques compare and contrast correct and faulty execution traces and highlight suspicious program elements. In this work, we propose constrained feature selection algorithms that we use to localize faults. Feature selection algorithms are commonly used to identify important features that are helpful for a classification task. By mapping an execution trace to a classification instance and a program element to a feature, we can transform fault localization into a feature selection problem. Unfortunately, existing feature selection algorithms do not perform well on this task, so we improve their performance by adding a constraint to the feature selection formulation based on a specific characteristic of the fault localization problem. We have performed experiments on a popular benchmark containing 154 faulty versions from 8 programs and demonstrate that several variants of our approach can outperform many fault localization techniques proposed in the literature. Using the Wilcoxon rank-sum test and Cliff's d effect size, we also show that the improvements are both statistically significant and substantial.
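The spectrum-based scoring the paper builds on can be illustrated with a short sketch. The following is not the authors' constrained feature selection formulation (the abstract does not spell it out), but the classical Ochiai baseline such work is typically compared against: each program element is scored from the counts of failed and passed tests that cover it. The function names and data layout are illustrative.

```python
import math

def ochiai(ef, ep, nf):
    """Suspiciousness of one program element: ef/ep are the numbers of
    failed/passed tests covering it, nf the failed tests not covering it."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def rank_elements(coverage, outcomes):
    """coverage: one set of covered element ids per test;
    outcomes: parallel list of booleans, True = the test failed."""
    total_failed = sum(outcomes)
    elements = set().union(*coverage)
    scores = {}
    for e in elements:
        ef = sum(1 for cov, failed in zip(coverage, outcomes) if failed and e in cov)
        ep = sum(1 for cov, failed in zip(coverage, outcomes) if not failed and e in cov)
        scores[e] = ochiai(ef, ep, total_failed - ef)
    # most suspicious first
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

An element covered by every failed run and no passed run receives the maximum score of 1.0 and is ranked first for the developer to inspect.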
@InProceedings{ICSME15p501,
author = {Tien-Duy B. Le and David Lo and Ming Li},
title = {Constrained Feature Selection for Localizing Faults},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {501--505},
doi = {},
year = {2015},
}
Crowdsourced Bug Triaging
Ali Sajedi Badashian, Abram Hindle, and Eleni Stroulia
(University of Alberta, Canada)
Bug triaging and assignment is a time-consuming task in big projects. Most research in this area examines the developers' prior development and bug-fixing activities in order to recognize their areas of expertise and assign relevant bug fixes to them. We propose a novel method that exploits a new source of evidence for the developers' expertise, namely their contributions to Q&A platforms such as Stack Overflow. We evaluated this method in the context of the 20 largest GitHub projects, considering 7144 bug reports. Our results demonstrate that our method exhibits superior accuracy to other state-of-the-art methods, and that future bug-assignment algorithms should consider exploring other sources of expertise, beyond the project's version-control system and bug tracker.
@InProceedings{ICSME15p506,
author = {Ali Sajedi Badashian and Abram Hindle and Eleni Stroulia},
title = {Crowdsourced Bug Triaging},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {506--510},
doi = {},
year = {2015},
}
Toward Improving Graftability on Automated Program Repair
Soichi Sumi,
Yoshiki Higo , Keisuke Hotta, and Shinji Kusumoto
(Osaka University, Japan)
In software evolution, many bugs occur, and developers spend a long time fixing them. Program debugging is a costly and difficult task, and automated program repair is a promising way to reduce its cost dramatically. Several repair techniques that reuse existing code lines have been proposed in the past. They reuse code lines already present in the source code to generate variants of a given program (if a code line inserted to fix a given bug is identical to some code line in the existing source code, we call that code line graftable). However, there are many bugs that such techniques cannot automatically repair. One of the reasons is that many bugs require code lines that do not exist in the source code of the software. To mitigate this issue, we are conducting our research with two ideas. The first idea is using a large dataset of source code as the pool of reusable code lines. The second idea is reusing only the structure of code lines, with vocabularies obtained from the faulty code regions. In this paper, we report on the feasibility of the two ideas. More concretely, we found that the first and second ideas improved the graftability of code lines from 34--54% to 43--59% and 56--64%, respectively. If we combine both ideas, graftability improved to 64--69%. In cases where we used the second idea, 24--49% of the variables used in reused code lines could be retrieved from the surrounding code of the given faulty code regions.
@InProceedings{ICSME15p511,
author = {Soichi Sumi and Yoshiki Higo and Keisuke Hotta and Shinji Kusumoto},
title = {Toward Improving Graftability on Automated Program Repair},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {511--515},
doi = {},
year = {2015},
}
Mining Stack Overflow for Discovering Error Patterns in SQL Queries
Csaba Nagy and
Anthony Cleve
(University of Namur, Belgium)
Constructing complex queries in SQL sometimes necessitates language constructs and internal functions that inexperienced developers find hard to comprehend or are simply unaware of. In the worst case, bad usage of these constructs might lead to errors or ineffective queries, or otherwise hamper developers in their tasks.
This paper presents a mining technique for Stack Overflow to identify error-prone patterns in SQL queries. Identifying such patterns can help developers avoid error-prone constructs, and if they have to use such constructs, the related Stack Overflow posts can help them utilize the language properly. Hence, our purpose is to provide the initial steps towards a recommendation system that supports developers in constructing SQL queries.
Our current implementation supports the MySQL dialect; Stack Overflow has over 300,000 questions tagged with MySQL in its database, providing a huge knowledge base where developers ask questions about real problems. Our initial results indicate that our technique is indeed able to identify error-prone patterns among them.
@InProceedings{ICSME15p516,
author = {Csaba Nagy and Anthony Cleve},
title = {Mining Stack Overflow for Discovering Error Patterns in SQL Queries},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {516--520},
doi = {},
year = {2015},
}
Towards Purity-Guided Refactoring in Java
Jiachen Yang, Keisuke Hotta,
Yoshiki Higo , and Shinji Kusumoto
(Osaka University, Japan)
Refactoring source code requires preserving a certain level of semantic behavior, which is difficult for IDEs to check. Therefore, before applying a refactoring, IDEs generally check syntactic pre-conditions instead, which are often more restrictive than checking semantic behavior. On the other hand, source code contains pure functions that have no observable side effects, and whose semantic behavior is therefore easier to check. In this research, we propose purity-guided refactoring, which applies high-level refactorings such as memoization to pure functions that can be detected statically. By combining our purity analysis tool purano with refactoring, we can ensure the preservation of semantic behavior for these detected pure functions, which is impossible with the refactoring operations currently provided by IDEs. As a case study of our approach, we applied memoization refactoring to several open-source Java projects. By profiling their bundled test cases, we observed performance improvements and the preservation of semantics.
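The kind of transformation the paper targets can be sketched independently of its purity analysis. Below is a generic memoization refactoring (the paper works on Java; Python is used here for brevity). Caching is behavior-preserving only because the wrapped function is pure, which is exactly what a static purity analysis would have to establish. The `fib` example is hypothetical, not taken from the paper's case study.

```python
from functools import wraps

def memoize(func):
    """Cache results by argument tuple. Safe only for pure functions:
    no observable side effects, same result for the same arguments."""
    cache = {}
    @wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

# A hypothetical pure function that such a refactoring might target;
# memoization turns its exponential recursion into linear time.
@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Applying this to an impure function (e.g., one that logs or reads mutable state) would change observable behavior, which is why the refactoring is guided by purity detection.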
@InProceedings{ICSME15p521,
author = {Jiachen Yang and Keisuke Hotta and Yoshiki Higo and Shinji Kusumoto},
title = {Towards Purity-Guided Refactoring in Java},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {521--525},
doi = {},
year = {2015},
}
Fitness Workout for Fat Interfaces: Be Slim, Clean, and Flexible
Spyros Kranas,
Apostolos V. Zarras, and Panos Vassiliadis
(University of Ioannina, Greece)
A class that provides a fat interface violates the interface segregation principle, which states that the clients of the class should not be coupled with methods that they do not need. Coping with this problem involves extracting interfaces that satisfy the needs of the clients. In this paper, we envision an interface extraction method that serves a combination of four principles: (1) fitness, as the extracted interfaces have to fit the needs of the clients, (2) clarity, as the interfaces should not be cluttered with duplicated method declarations due to clients' similar needs, (3) flexibility, as it should be easy to maintain the extracted interfaces to cope with client changes, without affecting parts of the software that are not concerned by the changes, and (4) practicality, as the interface extraction should account for practical issues like the number of extracted interfaces, domain/developer specific constraints on what to include in the interfaces, etc. Our preliminary results show that it is feasible to extract interfaces by respecting the aforementioned principles. Moreover, our results reveal a number of open issues around the trade-off between fitness, clarity, flexibility, and practicality.
@InProceedings{ICSME15p526,
author = {Spyros Kranas and Apostolos V. Zarras and Panos Vassiliadis},
title = {Fitness Workout for Fat Interfaces: Be Slim, Clean, and Flexible},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {526--530},
doi = {},
year = {2015},
}
Social and Developers
Thu, Oct 1, 10:40 - 12:20, GW2 B2890 (Chairs: Fabian Beck; Latifa Guerrouj)
Choosing Your Weapons: On Sentiment Analysis Tools for Software Engineering Research
Robbert Jongeling,
Subhajit Datta, and
Alexander Serebrenik
(Eindhoven University of Technology, Netherlands; Singapore University of Technology and Design, Singapore)
Recent years have seen increasing attention to social aspects of software engineering, including studies of the emotions and sentiments experienced and expressed by software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews, and their results might therefore not be applicable in the software engineering domain. In this paper we study whether the sentiment analysis tools agree with the sentiment recognized by human evaluators (as reported in an earlier study) as well as with each other. Furthermore, we evaluate the impact of the choice of a sentiment analysis tool on software engineering studies by conducting a simple study of differences in issue resolution times for positive, negative, and neutral texts. We repeat the study for seven datasets (issue trackers and Stack Overflow questions) and different sentiment analysis tools, and observe that the disagreement between the tools can lead to contradictory conclusions.
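The paper's central observation, that tools trained on different domains can disagree, is easy to reproduce in miniature. The toy lexicon-based scorers below are not SentiStrength or NLTK; the lexicons are invented solely to show how the same developer sentence can be classified differently, for instance when one tool treats "kill" (as in killing a process) as negative and another as neutral jargon.

```python
# Two toy lexicon-based "tools". The word scores are invented for this
# illustration and do not correspond to any real tool's lexicon.
TOOL_A = {"bug": -1, "kill": -2, "great": 2, "fail": -1}
TOOL_B = {"bug": -1, "kill": 0, "great": 1, "fail": -2}  # "kill" as neutral jargon

def polarity(text, lexicon):
    """Classify a text by summing per-word lexicon scores."""
    score = sum(lexicon.get(word, 0) for word in text.lower().split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

On the developer sentence "kill the process", tool A reports negative while tool B reports neutral; aggregated over an entire issue tracker, such disagreements can flip a study's conclusion.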
@InProceedings{ICSME15p531,
author = {Robbert Jongeling and Subhajit Datta and Alexander Serebrenik},
title = {Choosing Your Weapons: On Sentiment Analysis Tools for Software Engineering Research},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {531--535},
doi = {},
year = {2015},
}
Assessing Developer Contribution with Repository Mining-Based Metrics
Jalerson Lima,
Christoph Treude, Fernando Figueira Filho, and Uirá Kulesza
(Federal University of Rio Grande do Norte, Brazil; IFRN, Brazil)
Productivity as a result of individual developers' contributions is an important aspect for software companies to maintain their competitiveness in the market. However, there is no consensus in the literature on how to measure productivity or developer contribution. While some repository mining-based metrics have been proposed, they lack validation of their applicability and usefulness by the individuals who will use them to assess developer contribution: team and project leaders. In this paper, we propose the design of a suite of metrics for the assessment of developer contribution, based on empirical evidence obtained from project and team leaders. In a preliminary evaluation with four software development teams, we found that code contribution and code complexity metrics received the most positive feedback, while participants pointed out several threats of using bug-related metrics for contribution assessment. None of the metrics can be used in isolation, and project leaders and developers need to be aware of the benefits, limitations, and threats of each one. These findings present a first step towards the design of a larger suite of metrics as well as an investigation into the impact of using metrics to assess contribution.
@InProceedings{ICSME15p536,
author = {Jalerson Lima and Christoph Treude and Fernando Figueira Filho and Uirá Kulesza},
title = {Assessing Developer Contribution with Repository Mining-Based Metrics},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {536--540},
doi = {},
year = {2015},
}
What's Hot in Software Engineering Twitter Space?
Abhishek Sharma,
Yuan Tian, and
David Lo
(Singapore Management University, Singapore)
Twitter is a popular means to disseminate information, and currently more than 300 million people use it actively. Software engineers are no exception; Singer et al. have shown that many developers use Twitter to stay current with recent technological trends. At various points in time, many users post microblogs (i.e., tweets) about the same topic on Twitter. We refer to such a reasonably large set of topically-coherent microblogs in the Twitter space, made at a particular point in time, as an event. In this work, we perform an exploratory study of software engineering related events on Twitter. We collect a large set of Twitter messages made by 79,768 Twitter users over a period of 8 months and filter them by five programming language keywords. We then run a state-of-the-art Twitter event detection algorithm borrowed from the Natural Language Processing (NLP) domain. Next, using an open coding procedure, we manually analyze 1,000 events identified by the NLP tool and create eleven categories of events (10 main categories + “others”). We find that external resource sharing, technical discussion, and software product updates are the “hottest” categories. These findings shed light on hot topics on Twitter that are interesting to many people, and they provide guidance to future Twitter analytics studies that develop automated solutions to help developers find fresh, relevant, and interesting pieces of information from the Twitter stream and stay up-to-date with recent trends.
@InProceedings{ICSME15p541,
author = {Abhishek Sharma and Yuan Tian and David Lo},
title = {What's Hot in Software Engineering Twitter Space?},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {541--545},
doi = {},
year = {2015},
}
Validating Metric Thresholds with Developers: An Early Result
Paloma Oliveira,
Marco Tulio Valente,
Alexandre Bergel, and
Alexander Serebrenik
(Federal University of Minas Gerais, Brazil; IFMG, Brazil; University of Chile, Chile; Eindhoven University of Technology, Netherlands)
Thresholds are essential for promoting source code metrics as an effective instrument to control the internal quality of software applications. However, little is known about the relation between software quality as identified by metric thresholds and as perceived by real developers. In this paper, we report the first results of a study designed to validate a technique that extracts relative metric thresholds from benchmark data. We use this technique to extract thresholds from a benchmark of 79 Pharo/Smalltalk applications, which are validated with five experts and 25 developers. Our preliminary results indicate that good quality applications—as cited by experts—respect metric thresholds. In contrast, we observed that non-compliant applications are largely not viewed as requiring more effort to maintain than other applications.
@InProceedings{ICSME15p546,
author = {Paloma Oliveira and Marco Tulio Valente and Alexandre Bergel and Alexander Serebrenik},
title = {Validating Metric Thresholds with Developers: An Early Result},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {546--550},
doi = {},
year = {2015},
}
Towards a Survival Analysis of Database Framework Usage in Java Projects
Mathieu Goeminne and
Tom Mens
(University of Mons, Belgium)
Many software projects rely on a relational database in order to realize part of their functionality. Various database frameworks and object-relational mappings have been developed and used to facilitate data manipulation. Little is known about whether and how such frameworks co-occur, how they complement or compete with each other, and how this changes over time. We empirically studied these aspects for 5 Java database frameworks, based on a corpus of 3,707 GitHub Java projects. In particular, we analysed whether certain database frameworks co-occur frequently, and whether some database frameworks get replaced over time by others. Using the statistical technique of survival analysis, we explored the survival of the database frameworks in the considered projects. This provides useful evidence to software developers about which frameworks can be used successfully in combination and which combinations should be avoided.
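The abstract does not say which survival estimator was used; a standard choice for this kind of question is the Kaplan-Meier estimator, sketched below. Each duration is the time a framework remained in use in a project, with censoring for projects whose history ended before any removal was observed. All names and the data layout are illustrative.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve. durations: how long each framework
    stayed in a project (e.g., in months); observed: True if its removal
    was actually observed, False if the history simply ended (censored).
    Returns a list of (time, survival probability) pairs."""
    event_times = sorted({d for d, o in zip(durations, observed) if o})
    survival, curve = 1.0, []
    for t in event_times:
        # projects still "at risk" just before time t
        at_risk = sum(1 for d in durations if d >= t)
        removals = sum(1 for d, o in zip(durations, observed) if d == t and o)
        survival *= 1 - removals / at_risk
        curve.append((t, survival))
    return curve
```

Censored observations still count toward the at-risk population up to the point they drop out, which is what distinguishes this estimator from naively averaging removal times.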
@InProceedings{ICSME15p551,
author = {Mathieu Goeminne and Tom Mens},
title = {Towards a Survival Analysis of Database Framework Usage in Java Projects},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {551--555},
doi = {},
year = {2015},
}
Maintenance and Analysis
Thu, Oct 1, 13:50 - 15:30, GW2 B2890 (Chairs: Ferenc Rudolf; Giuseppe Scanniello)
Exploring the Use of Deep Learning for Feature Location
Christopher S. Corley, Kostadin Damevski, and
Nicholas A. Kraft
(University of Alabama, USA; Virginia Commonwealth University, USA; ABB Corporate Research, USA)
Deep learning models can infer complex patterns present in natural language text. Relative to n-gram models, deep learning models can capture more complex statistical patterns based on smaller training corpora. In this paper we explore the use of a particular deep learning model, document vectors (DVs), for feature location. DVs seem well suited to use with source code, because they both capture the influence of context on each term in a corpus and map terms into a continuous semantic space that encodes semantic relationships such as synonymy. We present preliminary results that show that a feature location technique (FLT) based on DVs can outperform an analogous FLT based on latent Dirichlet allocation (LDA) and then suggest several directions for future work on the use of deep learning models to improve developer effectiveness in feature location.
@InProceedings{ICSME15p556,
author = {Christopher S. Corley and Kostadin Damevski and Nicholas A. Kraft},
title = {Exploring the Use of Deep Learning for Feature Location},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {556--560},
doi = {},
year = {2015},
}
Using Stereotypes in the Automatic Generation of Natural Language Summaries for C++ Methods
Nahla J. Abid, Natalia Dragan,
Michael L. Collard, and
Jonathan I. Maletic
(Kent State University, USA; University of Akron, USA)
An approach to automatically generate natural language documentation summaries for C++ methods is presented. The approach uses prior work by the authors on stereotyping methods along with the source code analysis framework srcML. First, each method is automatically assigned one or more stereotypes based on static analysis and a set of heuristics. Then, the approach uses the stereotype information, static analysis, and predefined templates to generate a natural-language summary for each method. This summary is automatically added to the code base as a comment for each method. The predefined templates are designed to produce a generic summary for specific method stereotypes. Static analysis is used to extract internal details about the method (e.g., parameters, local variables, calls, etc.). This information is used to specialize the generated summaries.
@InProceedings{ICSME15p561,
author = {Nahla J. Abid and Natalia Dragan and Michael L. Collard and Jonathan I. Maletic},
title = {Using Stereotypes in the Automatic Generation of Natural Language Summaries for C++ Methods},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {561--565},
doi = {},
year = {2015},
}
Keecle: Mining Key Architecturally Relevant Classes using Dynamic Analysis
Liliane do Nascimento Vale and Marcelo de A. Maia
(Federal University of Uberlândia, Brazil; Federal University of Goiás, Brazil)
Reconstructing architectural components from existing software applications is an important task during the software maintenance cycle because those elements either do not exist or are outdated. Reverse engineering techniques are used to reduce the effort demanded during the reconstruction. Unfortunately, there is no widely accepted technique to retrieve software components from source code. Moreover, in several architectural descriptions of systems, a set of architecturally relevant classes is used to represent the set of architectural components. Based on this fact, we propose Keecle, a novel dynamic analysis approach for detecting such classes from execution traces in a semi-automatic manner. Several mechanisms are applied to reduce the size of the traces, and finally the reduced set of key classes is identified using Naïve Bayes classification. We evaluated the approach with two open source systems, in order to assess whether the detected classes map to the actual architectural classes defined in the documentation of those systems. The results were analyzed in terms of precision and recall, and suggest that the proposed approach is effective for revealing key classes that conceptualize architectural components, outperforming a state-of-the-art approach.
@InProceedings{ICSME15p566,
author = {Liliane do Nascimento Vale and Marcelo de A. Maia},
title = {Keecle: Mining Key Architecturally Relevant Classes using Dynamic Analysis},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {566--570},
doi = {},
year = {2015},
}
Combining Software Interrelationship Data across Heterogeneous Software Repositories
Nikola Ilo, Johann Grabner, Thomas Artner, Mario Bernhart, and Thomas Grechenig
(Vienna University of Technology, Austria)
Software interrelationships have an impact on the quality and evolution of software projects and are therefore important to development and maintenance. Package management and build systems result in software ecosystems that usually are syntactically and semantically incompatible with each other, although the described software can overlap. There is currently no general way for querying software interrelationships across these different ecosystems. In this paper, we present our approach to combine and consequently query information about software interrelationships across different ecosystems. We propose an ontology for the semantic modeling of the relationships as linked data. Furthermore, we introduce a temporal storage and query model to handle inconsistencies between different data sources. By providing a scalable and extensible architecture to retrieve and process data from multiple repositories, we establish a foundation for ongoing research activities. We evaluated our approach by integrating the data of several ecosystems and demonstrated its usefulness by creating tools for vulnerability notification and license violation detection.
@InProceedings{ICSME15p571,
author = {Nikola Ilo and Johann Grabner and Thomas Artner and Mario Bernhart and Thomas Grechenig},
title = {Combining Software Interrelationship Data across Heterogeneous Software Repositories},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {571--575},
doi = {},
year = {2015},
}
Recovering Transitive Traceability Links among Software Artifacts
Kazuki Nishikawa,
Hironori Washizaki, Yoshiaki Fukazawa, Keishi Oshima, and Ryota Mibe
(Waseda University, Japan; Hitachi, Japan; Yokohama Research Laboratory, Japan)
Although many methods have been suggested to automatically recover traceability links in software development, they do not cover all link combinations (e.g., links between the source code and test cases) because they rely on specific documents or artifact features (e.g., log documents and the structure of source code). In this paper, we propose a method called the Connecting Links Method (CLM) to recover transitive traceability links between two artifacts using a third artifact. Because CLM treats an intermediate artifact as a document, it can be applied to various kinds of data. Basically, CLM recovers traceability links using the Vector Space Model (VSM) from Information Retrieval (IR). For example, by connecting links between A and B and between B and C, CLM retrieves the link between A and C transitively. In this way, CLM can recover transitive traceability links where existing methods cannot. Using open source software, we demonstrate that CLM can effectively recover links that are hard for VSM to recover directly.
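The idea of connecting links can be sketched as follows. This is an assumption-laden illustration, not CLM itself: direct links are scored by cosine similarity in a simple term-frequency vector space, and a transitive A-to-C link is scored by combining two direct scores through an intermediate artifact B. Taking the maximum product over all intermediaries is one plausible combination rule; the paper may combine scores differently.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity of two texts under a term-frequency VSM."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def transitive_links(src, mid, dst, threshold=0.3):
    """Recover src->dst links through mid artifacts: score each pair by
    the best product of the two direct similarities along any path."""
    links = {}
    for i, a in enumerate(src):
        for k, c in enumerate(dst):
            score = max((cosine(a, b) * cosine(b, c) for b in mid), default=0.0)
            if score >= threshold:
                links[(i, k)] = score
    return links
```

Here a requirement and a test case that share no vocabulary directly can still be linked if both are similar to the same intermediate source file.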
@InProceedings{ICSME15p576,
author = {Kazuki Nishikawa and Hironori Washizaki and Yoshiaki Fukazawa and Keishi Oshima and Ryota Mibe},
title = {Recovering Transitive Traceability Links among Software Artifacts},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {576--580},
doi = {},
year = {2015},
}
Live Object Exploration: Observing and Manipulating Behavior and State of Java Objects
Benjamin Biegel, Benedikt Lesch, and Stephan Diehl
(University of Trier, Germany)
In this paper we introduce a visual representation of Java objects that can be used for observing and manipulating the behavior and state of currently developed classes. It runs separately, e.g., on a tablet, beside an integrated development environment. Within the visualization, developers are able to arbitrarily change the object state, then invoke any method with custom parameters and observe how the object state changes. When the source code of the related class is changed, the visualization keeps the previous object state and adopts the new behavior defined by the underlying source code. This instantly enables developers to observe what functionality objects of a certain class provide, how they manipulate their state, and especially how source code changes influence their behavior. We implemented a first prototype as a touch-enabled web application that is connected to a conventional integrated development environment. In order to gain first practical insights, we evaluated our approach in a pilot user study.
@InProceedings{ICSME15p581,
author = {Benjamin Biegel and Benedikt Lesch and Stephan Diehl},
title = {Live Object Exploration: Observing and Manipulating Behavior and State of Java Objects},
booktitle = {Proc.\ ICSME},
publisher = {IEEE},
pages = {581--585},
doi = {},
year = {2015},
}