SANER 2017
2017 IEEE 24th International Conference on Software Analysis, Evolution, and Reengineering (SANER)

2017 IEEE International Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), February 21, 2017, Klagenfurt, Austria

MaLTeSQuE 2017 – Proceedings

Title Page

Message from the Chairs
Welcome to MaLTeSQuE 2017, the workshop on Machine Learning Techniques for Software Quality Evaluation, held in Klagenfurt on February 21, 2017, as an event co-located with the SANER 2017 conference.
Using Source Code Metrics to Predict Change-Prone Web Services: A Case-Study on eBay Services
Lov Kumar, Santanu Kumar Rath, and Ashish Sureka
(NIT Rourkela, India; ABB Corporate Research, India)
Predicting change-prone object-oriented software using source code metrics is an area that has attracted several researchers’ attention. However, predicting change-prone web services in terms of changes in the WSDL (Web Service Description Language) interface, using source code metrics of the code implementing the services, is a relatively unexplored area. We conduct a case study on change-proneness prediction on an experimental dataset consisting of several versions of eBay web services, wherein we compute the churn between different versions of the WSDL interfaces using the WSDLDiff tool. We compute 21 source code metrics using the Chidamber and Kemerer Java Metrics (CKJM) extended tool to serve as predictors and apply a Least Squares Support Vector Machine (LSSVM) based technique to develop a change-proneness estimator. Our experimental results demonstrate that a predictive model developed using all 21 metrics and a linear kernel yields the best results.
Investigating Code Smell Co-occurrences using Association Rule Learning: A Replicated Study
Fabio Palomba, Rocco Oliveto, and Andrea De Lucia
(Delft University of Technology, Netherlands; University of Salerno, Italy; University of Molise, Italy)
Previous research demonstrated how code smells (i.e., symptoms of the presence of poor design or implementation choices) threaten software maintainability. Moreover, some studies showed that their interaction has a stronger negative impact on the ability of developers to comprehend and enhance the source code when compared to cases when a single code smell instance affects a code element (i.e., a class or a method). While such studies analyzed the effect of the co-presence of multiple smells from the developers’ perspective, little knowledge is currently available regarding which code smell types tend to co-occur in the source code. Indeed, previous studies on smell co-occurrence have been conducted on a small number of code smell types or on small datasets, thus possibly missing important relationships. To corroborate and possibly enlarge the knowledge on the phenomenon, in this paper we provide a large-scale replication of previous studies, taking into account 13 code smell types on a dataset composed of 395 releases of 30 software systems. Code smell co-occurrences have been captured by using association rule mining, an unsupervised learning technique able to discover frequent relationships in a dataset. The results highlighted some expected relationships, but also shed light on co-occurrences missed by previous research in the field.
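As an illustration of association rule mining applied to smell co-occurrence, the following minimal sketch computes support and confidence for pairs of smells. It is not the authors' tooling: the function name `pairwise_rules`, the thresholds, and the sample data are hypothetical, and only rules with one smell on each side are considered.

```python
from itertools import combinations

def pairwise_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for smell pairs."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for smells in transactions:
        unique = sorted(set(smells))
        for smell in unique:
            item_count[smell] = item_count.get(smell, 0) + 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of classes containing both smells
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: share of classes with lhs that also contain rhs
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

# Hypothetical data: each entry lists the smells detected in one class.
classes = [
    ["God Class", "Long Method"],
    ["God Class", "Long Method", "Feature Envy"],
    ["Long Method"],
    ["God Class", "Long Method"],
    ["Feature Envy"],
]
rules = pairwise_rules(classes)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data the rule "God Class -> Long Method" has higher confidence than its reverse, which shows why rules are directional even though co-occurrence itself is symmetric.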
Using Machine Learning to Design a Flexible LOC Counter
Miroslaw Ochodek, Miroslaw Staron, Dominik Bargowski, Wilhelm Meding, and Regina Hebig
(Poznan University of Technology, Poland; Chalmers University of Technology, Sweden; University of Gothenburg, Sweden; Ericsson, Sweden)

The results of counting the size of programs in terms of Lines-of-Code (LOC) depend on the rules used for counting (i.e., the definition of which lines should be counted). In the majority of measurement tools, the rules are statically coded in the tool, and the users of the tools do not know which lines were counted and which were not.

The goal of our research is to investigate how to use machine learning to teach a measurement tool which lines should be counted and which should not. Our interest is to identify which parameters of the learning algorithm can be used to classify lines to be counted.

Our research is based on the design science research methodology where we construct a measurement tool based on machine learning and evaluate it based on open source programs. As a training set, we use industry professionals to classify which lines should be counted.

The results show that classifying lines as to be counted or not achieves an average accuracy varying between 0.90 and 0.99 measured as the Matthews Correlation Coefficient, and between 95% and nearly 100% measured as the percentage of correctly classified lines.

Based on the results we conclude that using machine learning algorithms as the core of modern measurement instruments has a large potential and should be explored further.
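
The two measures reported above can diverge on imbalanced data, which is one reason to report both. The sketch below computes them from a confusion matrix; the counts are invented for illustration and are not taken from the paper.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Percentage-style accuracy: share of correctly classified lines."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator

# Hypothetical counts: most lines are "to be counted", few are not.
tp, tn, fp, fn = 90, 5, 5, 0
acc = accuracy(tp, tn, fp, fn)
score = mcc(tp, tn, fp, fn)
print(f"accuracy={acc:.2f}, MCC={score:.2f}")
```

With these counts the accuracy is high while the MCC is noticeably lower, because half of the minority class is misclassified.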


Machine Learning for Finding Bugs: An Initial Report
Timothy Chappell, Cristina Cifuentes, Padmanabhan Krishnan, and Shlomo Geva
(Queensland University of Technology, Australia; Oracle Labs, Australia)
Static program analysis is a technique to analyse code without executing it, and can be used to find bugs in source code. Many open source and commercial tools have been developed in this space over the past 20 years. Scalability and precision are important for deploying static code analysis tools: numerous false positives and slow runtimes both make a tool hard for development teams to use, where integration into a nightly build is the standard goal. This requires one to identify a suitable abstraction for the static analysis, which is typically a manual process and can be expensive. In this paper we report our findings on using machine learning techniques to detect defects in C programs. We use three off-the-shelf machine learning techniques and a large corpus of programs available for use in both the training and evaluation of the results. We compare the results produced by the machine learning techniques against the Parfait static program analysis tool used internally at Oracle by thousands of developers. While on the surface the initial results were encouraging, further investigation suggests that the machine learning techniques we used are not suitable replacements for static program analysis tools due to the low precision of the results. This could be due to a variety of reasons, including not using domain knowledge such as the semantics of the programming language and a lack of suitable data used in the training process.
Automatic Feature Selection by Regularization to Improve Bug Prediction Accuracy
Haidar Osman, Mohammad Ghafari, and Oscar Nierstrasz
(University of Bern, Switzerland)
Bug prediction has been a hot research topic for the past two decades, during which different machine learning models based on a variety of software metrics have been proposed. Feature selection is a technique that removes noisy and redundant features to improve the accuracy and generalizability of a prediction model. Although feature selection is important, it adds yet another step to the process of building a bug prediction model and increases its complexity. Recent advances in machine learning introduce embedded feature selection methods that allow a prediction model to carry out feature selection automatically as part of the training process. The effect of these methods on bug prediction is unknown. In this paper we study regularization as an embedded feature selection method in bug prediction models. Specifically, we study the impact of three regularization methods (Ridge, Lasso, and ElasticNet) on linear and Poisson Regression as bug predictors for five open source Java systems. Our results show that the three regularization methods reduce the prediction error of the regressors and improve their stability.
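To make the idea of embedded feature selection concrete, here is a minimal coordinate-descent sketch for linear regression under a common elastic-net objective, (1/2n)·||y − Xw||² + α·r·||w||₁ + ½·α·(1 − r)·||w||². Setting r = 1 gives Lasso, r = 0 gives Ridge, and values in between give ElasticNet. This is an illustration only, not the authors' experimental setup: the function names and the toy data are hypothetical, there is no intercept, Poisson regression is not covered, and every feature column is assumed to be non-constant and non-zero.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net_cd(X, y, alpha, l1_ratio, sweeps=100):
    """Coordinate descent for linear regression with L1/L2 penalties."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for row, target in zip(X, y):
                prediction = sum(wk * xk for wk, xk in zip(w, row))
                partial_residual = target - prediction + w[j] * row[j]
                rho += row[j] * partial_residual
                z += row[j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
    return w

# Hypothetical data: the first metric drives the target, the second barely matters.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.05, 1.95, -1.95, -2.05]

lasso_w = elastic_net_cd(X, y, alpha=0.1, l1_ratio=1.0)
ridge_w = elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.0)
print("lasso:", lasso_w)
print("ridge:", ridge_w)
```

In this toy case the L1 penalty sets the weak feature's coefficient to zero, which is the sense in which Lasso selects features during training, while the L2 penalty only shrinks both coefficients.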
Hyperparameter Optimization to Improve Bug Prediction Accuracy
Haidar Osman, Mohammad Ghafari, and Oscar Nierstrasz
(University of Bern, Switzerland)
Bug prediction is a technique that strives to identify where defects will appear in a software system. Bug prediction employs machine learning to predict defects in software entities based on software metrics. These machine learning models usually have adjustable parameters, called hyperparameters, that need to be tuned for the prediction problem at hand. However, most studies in the literature keep the model hyperparameters set to the default values provided by the machine learning frameworks they use. In this paper we investigate whether optimizing the hyperparameters of a machine learning model improves its prediction power. We study two machine learning algorithms: k-nearest neighbours (IBK) and support vector machines (SVM). We carry out experiments on five open source Java systems. Our results show that (i) models differ in their sensitivity to their hyperparameters, (ii) tuning hyperparameters gives at least as accurate models for SVM and significantly more accurate models for IBK, and (iii) most of the default values are changed during the tuning phase. Based on these findings we recommend tuning hyperparameters as a necessary step before using a machine learning model in bug prediction.
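A minimal sketch of hyperparameter tuning for a k-nearest-neighbours classifier is shown below, using an exhaustive grid search scored by leave-one-out accuracy. It does not reproduce the authors' procedure or search strategy: the data, the grid, and all function names are hypothetical, and k = 1 merely stands in for a framework default.

```python
def knn_predict(train, query, k):
    """Majority vote among the k training points closest to query (1-D metric)."""
    neighbours = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if votes * 2 > len(neighbours) else 0

def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i, (value, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += knn_predict(train, value, k) == label
    return hits / len(data)

def grid_search(data, grid):
    """Score every candidate k and return the best one with all scores."""
    scores = {k: loo_accuracy(data, k) for k in grid}
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Hypothetical data: (metric value, 1 if the entity was buggy else 0).
data = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0),
        (6, 1), (7, 1), (8, 0), (9, 1), (10, 1)]

best_k, scores = grid_search(data, grid=[1, 3, 5, 7])
print("scores per k:", scores)
print("selected k:", best_k)
```

By construction the selected value scores at least as well as the stand-in default on the validation measure; whether that gain carries over to unseen data depends on the dataset.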
