PROMISE 2020 – Proceedings

Welcome from the Chairs
It is our pleasure to welcome you to the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 2020), to be held virtually on November 8-9, 2020, co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020).
PROMISE is an annual forum for researchers and practitioners to present, discuss and exchange ideas, results, expertise and experiences in the construction and/or application of predictive models and data analytics in software engineering. Such models and analyses could be targeted at planning, design, implementation, testing, maintenance, quality assurance, evaluation, process improvement, management, decision making, and risk assessment in software and systems development.

Papers

Software Defect Prediction using Tree-Based Ensembles
Hamoud Aljamaan and Amal Alazba
(King Fahd University of Petroleum and Minerals, Saudi Arabia; King Saud University, Saudi Arabia)
Software defect prediction is an active research area in software engineering. Accurate prediction of software defects assists software engineers in guiding software quality assurance activities. In machine learning, ensemble learning has been proven to improve prediction performance over individual machine learning models. Recently, many tree-based ensembles have been proposed in the literature, but their prediction capabilities have not been investigated in defect prediction. In this paper, we empirically investigate the prediction performance of seven tree-based ensembles in defect prediction. Two are bagging ensembles (Random Forest and Extra Trees), while the other five are boosting ensembles (AdaBoost, Gradient Boosting, Hist Gradient Boosting, XGBoost and CatBoost). The study used 11 publicly available MDP NASA software defect datasets. Empirical results indicate the superiority of the tree-based bagging ensembles, Random Forest and Extra Trees, over the tree-based boosting ensembles. However, none of the investigated tree-based ensembles was significantly lower than individual decision trees in prediction performance. Finally, AdaBoost was the worst performing of all the tree-based ensembles.

Improving Real-World Vulnerability Characterization with Vulnerable Slices
Solmaz Salimi, Maryam Ebrahimzadeh, and Mehdi Kharrazi

(Sharif University of Technology, Iran)
Vulnerability detection is an important challenge in the security community. Many different techniques have been proposed, ranging from symbolic execution to fuzzing, to help identify vulnerabilities. Even though there has been considerable improvement in these approaches, they perform poorly on large-scale code bases. There has also been an alternative approach, where software metrics are calculated on the overall code structure in the hope of predicting which code segments are more likely to be vulnerable. The logic has been that code that is more complex with respect to the software metrics is more likely to contain vulnerabilities.
In this paper, we conduct an empirical study with a large dataset of vulnerable code to discuss whether we can change the way we measure metrics to improve vulnerability characterization. More specifically, we introduce vulnerable slices as vulnerable code units on which to measure the software metrics, and then use these newly measured metrics to characterize vulnerable code. The results show that vulnerable slices significantly increase the accuracy of vulnerability characterization. Further, we use vulnerable slices to analyze the dataset of known vulnerabilities, in particular to observe how size and complexity change in real-world vulnerabilities when vulnerable slices are used.

Workload-Aware Reviewer Recommendation using a Multi-objective Search-Based Approach
Wisam Haitham Abbood Al-Zubaidi, Patanamon Thongtanunam, Hoa Khanh Dam, Chakkrit Tantithamthavorn, and Aditya Ghose
(University of Wollongong, Australia; University of Melbourne, Australia; Monash University, Australia)
Reviewer recommendation approaches have been proposed to provide automated support in finding suitable reviewers to review a given patch. However, they mainly focused on reviewer experience, and did not take into account the review workload, which is another important factor for a reviewer to decide if they will accept a review invitation. We set out to empirically investigate the feasibility of automatically recommending reviewers while considering the review workload amongst other factors. We develop a novel approach that leverages a multi-objective meta-heuristic algorithm to search for reviewers guided by two objectives, i.e., (1) maximizing the chance of participating in a review, and (2) minimizing the skewness of the review workload distribution among reviewers. Through an empirical study of 230,090 patches with 7,431 reviewers spread across four open source projects, we find that our approach can recommend reviewers who are potentially suitable for a newly-submitted patch with 19% to 260% higher F-measure than the five benchmarks. Our empirical results demonstrate that the review workload and other important information should be taken into consideration in finding reviewers who are potentially suitable for a newly-submitted patch. In addition, the results show the effectiveness of realizing this approach using a multi-objective search-based approach.
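The two objectives named in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the skewness measure (the third standardized moment), the function names, and the candidate values below are all assumptions made for illustration, and a simple Pareto filter stands in for the multi-objective meta-heuristic search.

```python
from statistics import mean, pstdev


def workload_skewness(open_reviews):
    """Third standardized moment of the per-reviewer workload counts.

    Assumption: the abstract does not say how skewness is measured;
    this is one common definition.
    """
    mu = mean(open_reviews)
    sigma = pstdev(open_reviews)
    if sigma == 0:
        return 0.0  # perfectly even workload
    return mean([((x - mu) / sigma) ** 3 for x in open_reviews])


def dominates(a, b):
    """True if objectives a beat b: maximize chance (index 0), minimize skew (index 1)."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, objectives in candidates.items()
        if not any(dominates(other, objectives) for other in candidates.values())
    ]


# Hypothetical reviewer sets: (participation chance, workload skewness after assignment).
candidates = {
    "A": (0.9, 1.2),
    "B": (0.7, 0.3),
    "C": (0.6, 0.8),  # dominated by B
    "D": (0.9, 1.5),  # dominated by A
}

print(sorted(pareto_front(candidates)))
```

Neither A nor B dominates the other, so a search of this kind returns a set of trade-offs rather than a single best recommendation.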

Evaluating Hyper-parameter Tuning using Random Search in Support Vector Machines for Software Effort Estimation
Leonardo Villalobos-Arias, Christian Quesada-López, Jose Guevara-Coto, Alexandra Martínez, and Marcelo Jenkins
(University of Costa Rica, Costa Rica)
Studies in software effort estimation (SEE) have explored the use of hyper-parameter tuning for machine learning algorithms (MLA) to improve the accuracy of effort estimates. In other contexts, random search (RS) has shown similar results to grid search (GS), while being less computationally expensive. In this paper, we investigate to what extent the random search hyper-parameter tuning approach affects the accuracy and stability of support vector regression (SVR) in SEE. Results were compared to those obtained from ridge regression models and grid search-tuned models. A case study with four data sets extracted from the ISBSG 2018 repository shows that random search exhibits similar performance to grid search, rendering it an attractive alternative technique for hyper-parameter tuning. RS-tuned SVR achieved an increase of 0.227 standardized accuracy (SA) with respect to default hyper-parameters. In addition, random search improved prediction stability of SVR models to a minimum ratio of 0.840. The analysis showed that RS-tuned SVR attained performance equivalent to GS-tuned SVR. Future work includes extending this research to cover other hyper-parameter tuning approaches and machine learning algorithms, as well as using additional data sets.
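The difference between random search and grid search can be sketched in a few lines. This is an illustrative toy, not the study's setup: `toy_validation_error` is a hypothetical stand-in for the cross-validated error of an SVR model, and the parameter names `c` and `epsilon` and their ranges are assumptions.

```python
import math
import random


def toy_validation_error(c, epsilon):
    """Hypothetical error surface with its minimum at c=10, epsilon=0.1."""
    return (math.log10(c) - 1.0) ** 2 + (epsilon - 0.1) ** 2


def random_search(objective, n_iter, seed=0):
    """Sample n_iter random configurations and keep the one with the lowest error."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_iter):
        c = 10 ** rng.uniform(-2, 3)      # log-uniform over [0.01, 1000]
        epsilon = rng.uniform(0.0, 1.0)   # uniform over [0, 1]
        err = objective(c, epsilon)
        if best is None or err < best[0]:
            best = (err, c, epsilon)
    return best


def grid_search(objective, c_values, epsilon_values):
    """Evaluate every combination on a fixed grid and keep the lowest error."""
    return min((objective(c, e), c, e) for c in c_values for e in epsilon_values)


rs_err, rs_c, rs_eps = random_search(toy_validation_error, n_iter=50)
gs_err, gs_c, gs_eps = grid_search(
    toy_validation_error, [0.1, 1.0, 10.0, 100.0], [0.1, 0.5]
)
print(f"random search: error={rs_err:.4f} at c={rs_c:.3g}, epsilon={rs_eps:.3g}")
print(f"grid search:   error={gs_err:.4f} at c={gs_c:.3g}, epsilon={gs_eps:.3g}")
```

The cost of random search is fixed by `n_iter` regardless of how many hyper-parameters are tuned, whereas the grid grows multiplicatively with each added dimension.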

Fault-Insertion and Fault-Fixing: Analysing Developer Activity over Time
David Bowes, Giuseppe Destefanis, Tracy Hall, Jean Petric, and Marco Ortu
(Lancaster University, UK; Brunel University, UK; University of Cagliari, Italy)
Developers inevitably make human errors while coding. These errors can lead to faults in code, some of which may result in system failures. It is important to reduce the faults inserted by developers as well as fix any that slip through. To investigate the fault insertion and fault fixing activities of developers, we identify developers who insert and fix faults, and ask whether code topic 'experts' insert fewer faults, whether experts fix more faults, and whether patterns of insertion and fixing change over time. We perform a time-based analysis of developer activity on six Apache projects using Latent Dirichlet Allocation (LDA), Network Analysis and Topic Modelling. We show that: the majority of the projects we analysed have developers who dominate in the insertion and fixing of faults; faults are less likely to be inserted by developers with code topic expertise; and different projects have different patterns of fault insertion and fixing over time. We recommend that projects identify the code topic expertise of developers and use expertise information to inform the assignment of project work. We propose a preliminary analytics dashboard of data to enable projects to track fault insertion and fixing over time. This dashboard should help projects to identify any anomalous insertion and fixing activity.

Identifying Key Developers using Artifact Traceability Graphs
H. Alperen Çetin and Eray Tüzün
(Bilkent University, Turkey)
Developers are the most important resource for building and maintaining software projects. For various reasons, some developers take on more responsibility, and developers of this type are more valuable and indispensable to the project. Without them, the success of the project would be at risk. We use the term key developers for these essential and valuable developers, and identifying them is a crucial task for managerial decisions such as risk assessment for potential developer resignations. We study key developers under three categories: jacks, mavens and connectors. A typical jack (of all trades) has broad knowledge of the project and is familiar with different parts of the source code, whereas mavens are the developers who are the sole experts in specific parts of the project. Connectors are the developers who link different groups of developers or teams; they act as bridges between teams.
To identify key developers in a software project, we propose to use traceable links among software artifacts, such as the links between change sets and files. First, we build an artifact traceability graph, then we define various metrics to find key developers. We conduct experiments on three open source projects: Hadoop, Hive and Pig. To validate our approach, we use developer comments in issue tracking systems and demonstrate that the key developers identified by our approach match the top commenters by up to 92%.

SEERA: A Software Cost Estimation Dataset for Constrained Environments
Emtinan I. Mustafa and Rasha Osman
(University of Khartoum, Sudan)
The accuracy of software cost estimation depends on the relevancy of the cost estimation dataset, the quality of its data and its suitability for the targeted software development environment. Software development cost is impacted by technical, socio-economic and country-specific organizational and cultural environments. Current publicly available software cost estimation datasets represent environments of North America and Europe, thus limiting their application in technically and economically constrained software industries. In this paper we introduce the SEERA (Software enginEERing in SudAn) cost estimation dataset, a dataset of 120 software development projects representing 42 organizations in Sudan. The SEERA dataset contains 76 attributes and, unlike current cost estimation datasets, is augmented with metadata and the original raw data. This paper describes the data collection process, submitting organizations and project characteristics. In addition, we give a general analysis of the dataset projects to illustrate the impact of local factors on software project cost and compare the data quality of the SEERA dataset to public datasets from the PROMISE repository. The SEERA dataset fills a gap in the diversity of current cost estimation datasets and provides researchers with an opportunity to evaluate the generalization of previous and future cost estimation methods to constrained environments and to develop new techniques that are more suitable for these environments.

An Exploratory Study on Applicability of Cross Project Defect Prediction Approaches to Cross-Company Effort Estimation
Sousuke Amasaki, Hirohisa Aman, and Tomoyuki Yokogawa
(Okayama Prefectural University, Japan; Ehime University, Japan)
BACKGROUND: Research on software effort estimation has been active for decades, especially in developing effort estimation models. Effort estimation models need a dataset collected from completed projects similar to the project to be estimated. This similarity is undermined by dataset shift, and cross-company software effort estimation (CCSEE) has become an attractive research topic. A recent study on the dataset shift problem examined the applicability and the effectiveness of cross-project defect prediction (CPDP) approaches. It was insufficient to reach a conclusion because it examined only a limited number of approaches. AIMS: To investigate the characteristics of CPDP approaches that are applicable and effective for the dataset shift problem in effort estimation. METHOD: We first reviewed the characteristics of 24 CPDP approaches to find applicable approaches. Next, we investigated their effectiveness in effort estimation performance with ten dataset configurations. RESULTS: 16 out of 24 CPDP approaches implemented in the CrossPare framework were found to be applicable to CCSEE. However, only one approach could improve the effort estimation performance. Most of the others degraded it and were harmful. CONCLUSIONS: Most of the CPDP approaches we examined were unhelpful for CCSEE.
