2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2017),
October 30 – November 3, 2017,
Urbana-Champaign, IL, USA
Doctoral Symposium
Mon, Oct 30, 09:00 - 17:30, Room 4405
Learning Effective Changes for Software Projects
Rahul Krishna
(North Carolina State University, USA)
The primary motivation of much of software analytics is decision making. How should these decisions be made? Should one make decisions based on lessons that arise from within a particular project? Or should one generate these decisions from across multiple projects? This work is an attempt to answer these questions. Our work was motivated by the realization that much of the current generation of software analytics tools focuses primarily on prediction. Prediction is certainly a useful task, but it is usually followed by "planning" about what actions need to be taken. This research addresses the planning task by developing methods that support actionable analytics by offering clear guidance on what to do. Specifically, we propose the XTREE and BELLTREE algorithms for generating a set of actionable plans within and across projects. Each of these plans, if followed, will improve the quality of the software project.
@InProceedings{ASE17p1002,
author = {Rahul Krishna},
title = {Learning Effective Changes for Software Projects},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1002--1005},
doi = {},
year = {2017},
}
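The within-project planning step can be illustrated with a minimal, hypothetical sketch in the spirit of XTREE: contrast a defect-prone example against typical "clean" examples and emit the attribute deltas as a plan. This is a simplification for illustration, not the published algorithm; the metric names and `tolerance` cutoff are invented.

```python
# Simplified contrast-based planner: recommend moving each metric of a
# defect-prone module toward the median of non-defective modules.
# NOTE: illustrative sketch only, not the actual XTREE/BELLTREE algorithm.
from statistics import median

def plan_changes(bad, good_examples, tolerance=0.1):
    """Suggest attribute deltas that move `bad` toward the good examples."""
    plan = {}
    for attr, value in bad.items():
        target = median(g[attr] for g in good_examples)
        # Only recommend a change when the gap is material.
        if abs(value - target) > tolerance * max(abs(target), 1):
            plan[attr] = target - value
    return plan

good = [{"loc": 120, "complexity": 4}, {"loc": 150, "complexity": 5}]
bad = {"loc": 900, "complexity": 22}
print(plan_changes(bad, good))
```

The output is a per-attribute delta (e.g., "reduce `loc` by 765"), which is the sense in which such plans are "actionable" rather than merely predictive.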
Characterizing and Taming Non-deterministic Bugs in JavaScript Applications
Jie Wang
(Institute of Software at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China)
JavaScript has become one of the most popular programming languages for both client-side and server-side applications. In JavaScript applications, events may be generated, triggered, and consumed non-deterministically. Thus, JavaScript applications may suffer from non-deterministic bugs when events are triggered and consumed in an unexpected order. In this proposal, we aim to characterize and combat non-deterministic bugs in JavaScript applications. Specifically, we first perform a comprehensive study of real-world non-deterministic bugs in server-side JavaScript applications. To facilitate bug diagnosis, we further propose approaches to isolate the necessary events that are responsible for the occurrence of a failure. We also plan to design new techniques to detect non-deterministic bugs in JavaScript applications.
@InProceedings{ASE17p1006,
author = {Jie Wang},
title = {Characterizing and Taming Non-deterministic Bugs in JavaScript Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1006--1009},
doi = {},
year = {2017},
}
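The event-isolation idea described above can be sketched as a greedy minimization over a recorded event schedule: drop events one at a time and keep each removal while the failure still reproduces. A hedged, minimal sketch (in Python rather than JavaScript, for a self-contained illustration); the `fails` oracle and event names are invented, and real isolation must also handle event dependencies:

```python
# Greedy isolation of failure-inducing events: repeatedly try removing an
# event; if the failure still reproduces, the event was not necessary.
def isolate(events, fails):
    """Return a near-minimal subset of `events` that still triggers the failure."""
    needed = list(events)
    for e in list(events):
        trial = [x for x in needed if x != e]
        if fails(trial):        # failure persists without `e`, so drop it
            needed = trial
    return needed

# Toy failure oracle: the bug fires whenever 'write' is consumed before 'close'.
def fails(schedule):
    return ("write" in schedule and "close" in schedule
            and schedule.index("write") < schedule.index("close"))

print(isolate(["open", "read", "write", "log", "close"], fails))
```

From a five-event schedule, only the two events whose ordering actually causes the failure survive, which is exactly the diagnostic payload a developer needs.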
Towards API-Specific Automatic Program Repair
Sebastian Nielebock
(Otto von Guericke University Magdeburg, Germany)
The domain of Automatic Program Repair (APR) has seen many research contributions in recent years. So far, most approaches target fixing generic bugs in programs (e.g., off-by-one errors). Nevertheless, recent studies reveal that about 50% of real bugs require API-specific fixes (e.g., adding missing API method calls or correcting method ordering), for which existing APR approaches are not designed. In this paper, we address this problem and introduce the notion of an API-specific program repair mechanism. This mechanism detects erroneous code in a similar way to existing APR approaches. However, to fix such bugs, it uses API-specific information from the erroneous code to search for API usage patterns in other software, which can then be used to fix the bug. We provide first insights on the applicability of this mechanism and discuss upcoming research challenges.
@InProceedings{ASE17p1010,
author = {Sebastian Nielebock},
title = {Towards API-Specific Automatic Program Repair},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1010--1013},
doi = {},
year = {2017},
}
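The pattern-search step can be illustrated with a hypothetical sketch: reduce pattern mining to finding the most frequent call sequence in other code that extends the buggy one, and propose the missing calls as the fix. The API names and corpus below are invented; real mining would use richer pattern representations.

```python
# Toy API-fix suggester: find frequent usage patterns that contain the buggy
# call sequence as a subsequence, and report the calls the buggy code lacks.
from collections import Counter

def suggest_fix(buggy_calls, corpus):
    def contains(pattern, sub):
        it = iter(pattern)                  # subsequence check
        return all(c in it for c in sub)
    candidates = Counter(tuple(p) for p in corpus if contains(p, buggy_calls))
    if not candidates:
        return []
    best, _ = candidates.most_common(1)[0]  # most frequent matching pattern
    return [c for c in best if c not in buggy_calls]

corpus = [["open", "read", "close"], ["open", "read", "close"],
          ["open", "write", "flush", "close"]]
print(suggest_fix(["open", "read"], corpus))
```

Here the buggy sequence `open; read` is matched against mined patterns, and the missing `close` call is proposed, mirroring the "adding missing API method calls" class of fixes the abstract targets.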
Managing Software Evolution through Semantic History Slicing
Yi Li
(University of Toronto, Canada)
Software change histories are the result of incremental updates made by developers. As a side-effect of the software development process, version history is a surprisingly useful source of information for understanding, maintaining, and reusing software. However, the traditional commit-based sequential organization of version histories lacks semantic structure and is thus insufficient for many development tasks that require a high-level, semantic understanding of program functionality, such as locating feature implementations and porting hot fixes. In this work, we propose to use well-organized unit tests as identifiers for corresponding software functionalities. We then present a family of automated techniques which analyze the semantics of historical changes and assist developers in many everyday practical settings. For validation, we evaluate our approaches on a benchmark of developer-annotated version history instances obtained from real-world open source software projects on GitHub.
@InProceedings{ASE17p1014,
author = {Yi Li},
title = {Managing Software Evolution through Semantic History Slicing},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1014--1017},
doi = {},
year = {2017},
}
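A much-simplified sketch of test-driven history slicing: keep only the commits that touch artifacts the functionality's unit test depends on. Real semantic slicing analyzes the semantics of the changes themselves; this file-overlap filter, with invented file names, is only meant to show the shape of the problem.

```python
# Toy history slicer: select commits relevant to a test's dependency set.
def slice_history(commits, test_deps):
    """`commits` is an ordered list of (id, changed_files) pairs;
    `test_deps` is the set of files the unit test exercises."""
    return [cid for cid, files in commits if files & test_deps]

history = [
    ("c1", {"core/parser.py"}),
    ("c2", {"docs/README.md"}),
    ("c3", {"core/parser.py", "core/lexer.py"}),
    ("c4", {"ui/theme.css"}),
]
print(slice_history(history, test_deps={"core/parser.py", "core/lexer.py"}))
```

The slice preserves commit order, so it can be replayed, e.g., to port a feature identified by its test onto another branch.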
Towards the Automatic Classification of Traceability Links
Chris Mills
(Florida State University, USA)
A wide range of text-based artifacts contribute to software projects (e.g., source code, test cases, use cases, project requirements, interaction diagrams, etc.). Traceability Link Recovery (TLR) is the software task in which relevant documents in these various sets are linked to one another, uncovering information about the project that is not available when considering only the documents themselves. This information is helpful for enabling other tasks such as improving test coverage, impact analysis, and ensuring that system or regulatory requirements are met. However, while traceability links are useful, performing TLR manually is time-consuming and error-prone. Previous work has applied Information Retrieval (IR) and other techniques to reduce the human effort involved; however, that effort remains significant. In this research we seek to take the next step in reducing it by using machine learning (ML) classification models to predict whether a candidate link is valid or invalid without human oversight. Preliminary results show that this approach has promise for accurately recommending valid links; however, several challenges must still be addressed before the technique can be considered a viable, completely automated solution.
@InProceedings{ASE17p1018,
author = {Chris Mills},
title = {Towards the Automatic Classification of Traceability Links},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1018--1021},
doi = {},
year = {2017},
}
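The classification step described above can be sketched with a minimal, stdlib-only example: each candidate link gets an IR feature (cosine similarity between the two artifacts' term vectors), and a cutoff learned from labeled links decides valid vs. invalid. The training pairs are invented, and a realistic model would use many features and a stronger classifier than a single threshold.

```python
# Toy TLR classifier: one IR feature (TF cosine similarity) plus a
# threshold fitted to labeled candidate links.
import math
from collections import Counter

def cosine(a, b):
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def learn_threshold(labeled):
    """Pick the similarity cutoff that best separates valid from invalid links."""
    scores = [(cosine(s, t), valid) for s, t, valid in labeled]
    return max((th for th, _ in scores),
               key=lambda th: sum((sc >= th) == v for sc, v in scores))

train = [
    ("parse config file", "test config file parsing", True),
    ("render login page", "login page renders fields", True),
    ("parse config file", "database connection pool", False),
]
th = learn_threshold(train)
print(cosine("parse config file", "config file parsing test") >= th)
```

A new candidate link is then accepted or rejected with no human in the loop, which is precisely where the remaining performance challenges the abstract mentions become critical.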
Towards a Software Vulnerability Prediction Model using Traceable Code Patterns and Software Metrics
Kazi Zakia Sultana
(Mississippi State University, USA)
Software security is an important aspect of ensuring software quality. The goal of this study is to help developers evaluate software security using traceable patterns and software metrics during development. The concept of traceable patterns is similar to design patterns, but they can be automatically recognized and extracted from source code. If these patterns can better predict vulnerable code than traditional software metrics, they can be used in developing a vulnerability prediction model to classify code as vulnerable or not. This study explores the performance of traceable patterns in vulnerability prediction, compares them with traditional software metrics, and uses the findings to build an effective vulnerability prediction model. We evaluate security vulnerabilities reported for Apache Tomcat, Apache CXF, and three stand-alone Java web applications. We use machine learning and statistical techniques for predicting vulnerabilities using traceable patterns and metrics as features. We found that patterns have a lower false negative rate and higher recall in detecting vulnerable code than the traditional software metrics.
@InProceedings{ASE17p1022,
author = {Kazi Zakia Sultana},
title = {Towards a Software Vulnerability Prediction Model using Traceable Code Patterns and Software Metrics},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1022--1025},
doi = {},
year = {2017},
}
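The comparison above hinges on recall and false negative rate, which are complementary: FNR = FN / (TP + FN) = 1 − recall. A small helper to compute both from any feature set's predictions (the labels below are invented for illustration):

```python
# Recall and false-negative rate for a vulnerability classifier,
# computed from parallel lists of booleans (True = vulnerable).
def recall_and_fnr(actual, predicted):
    tp = sum(a and p for a, p in zip(actual, predicted))
    fn = sum(a and not p for a, p in zip(actual, predicted))
    recall = tp / (tp + fn) if tp + fn else 0.0
    return recall, 1.0 - recall   # FNR = FN / (TP + FN) = 1 - recall

actual    = [True, True, True, True, False, False]
predicted = [True, True, True, False, False, True]
print(recall_and_fnr(actual, predicted))
```

A low FNR matters in this setting because a missed vulnerability (a false negative) is far more costly than a false alarm.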
Towards Search-Based Modelling and Analysis of Requirements and Architecture Decisions
Saheed A. Busari
(University College London, UK)
Many requirements engineering and software architecture decisions are complicated by uncertainty and multiple conflicting stakeholders' objectives. Using quantitative decision models helps clarify these decisions and allows the use of multi-objective simulation optimisation techniques in analysing the impact of decisions on objectives. Existing requirements and architecture decision support methods that use quantitative decision models are limited by the difficulty of elaborating problem-specific decision models and/or the lack of integrated tool support for automated decision analysis under uncertainty. To address these problems and facilitate requirements and architecture decision analysis, this research proposes a novel modelling language and automated decision analysis technique, implemented in a tool called RADAR. The modelling language is a simplified version of the quantitative AND/OR goal models used in requirements engineering and is similar to the feature models used in software product lines. This research involves developing the RADAR tool and evaluating the tool's applicability, usefulness and scalability on a set of real-world examples.
@InProceedings{ASE17p1026,
author = {Saheed A. Busari},
title = {Towards Search-Based Modelling and Analysis of Requirements and Architecture Decisions},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1026--1029},
doi = {},
year = {2017},
}
Privacy-Aware Data-Intensive Applications
Michele Guerriero
(Politecnico di Milano, Italy)
The rise of Big Data is leading to an increasing demand for data-intensive applications (DIAs), which, in many cases, are expected to process massive amounts of sensitive data. In this context, ensuring data privacy becomes paramount. While the way we design and develop DIAs has radically changed over the last few years in order to deal with Big Data, there has been relatively little effort to make such design privacy-aware. As a result, enforcing privacy policies in large-scale data processing is currently an open research problem. This thesis proposal takes a first step in this direction: after identifying the dataflow model as the reference computational model for large-scale DIAs, we (1) propose a novel language for specifying privacy policies on dataflow applications, along with (2) a dataflow re-writing mechanism to enforce such policies during DIA execution. Although a systematic evaluation still needs to be carried out, preliminary results are promising. We plan to implement our approach within a model-driven solution to ultimately simplify the design and development of privacy-aware DIAs, i.e., DIAs that enforce privacy policies at runtime.
@InProceedings{ASE17p1030,
author = {Michele Guerriero},
title = {Privacy-Aware Data-Intensive Applications},
booktitle = {Proc.\ ASE},
publisher = {IEEE},
pages = {1030--1033},
doi = {},
year = {2017},
}
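The enforcement idea can be sketched very simply: given a dataflow (a sequence of operators with the fields each reads) and a policy marking some fields as sensitive, rewrite the flow by inserting an anonymization operator before any operator that touches a sensitive field. The operator names, field names, and set-based "policy language" below are invented, and a real rewriter would avoid redundant anonymization and handle richer policies.

```python
# Naive dataflow rewriter: prepend an anonymizer before each operator
# that reads policy-protected fields.
def rewrite(dataflow, policy):
    """`dataflow`: list of (op_name, fields_read); `policy`: sensitive fields."""
    out = []
    for op, fields in dataflow:
        touched = fields & policy
        if touched:
            out.append(("anonymize:" + ",".join(sorted(touched)), touched))
        out.append((op, fields))
    return out

flow = [("load", {"name", "age", "zip"}),
        ("aggregate", {"age"}),
        ("export", {"name", "zip"})]
print(rewrite(flow, policy={"name"}))
```

Because the rewriting is applied to the dataflow graph rather than the application code, the same mechanism can, in principle, enforce a policy across any engine that executes the dataflow.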