
2012 34th International Conference on Software Engineering (ICSE), June 2–9, 2012, Zurich, Switzerland

ICSE 2012 – Proceedings


Preface

Title Page


Message from the Chairs
We cordially welcome you to ICSE 2012, the 34th International Conference on Software Engineering, June 2-9, 2012 in Zurich, Switzerland. The conference theme, “Sustainable Software for a Sustainable World,” carries a double meaning: on the one hand, a sustainable world is only possible with massive software support; on the other hand, our daily life is embedded in and dependent on so many software systems that this software can only be created and maintained properly if we do so in a sustainable fashion. The conference offers a strong technical program with broad appeal to researchers, industrial practitioners, students, and educators in the field of software engineering. The ICSE 2012 main conference consists of three keynotes, several tracks, and some special invited presentations.

Committees
List of committees.

Sponsors
Sponsor listing.


Technical Research

Fault Handling
Wed, Jun 6, 10:45 - 12:45

A Systematic Study of Automated Program Repair: Fixing 55 out of 105 Bugs for $8 Each
Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, and Westley Weimer
(University of Virginia, USA; University of New Mexico, USA)


Where Should the Bugs Be Fixed? - More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports
Jian Zhou, Hongyu Zhang, and David Lo
(Tsinghua University, China; Singapore Management University, Singapore)
For a large and evolving software system, the project team may receive a large number of bug reports. Locating the source code files that need to be changed in order to fix these bugs is a challenging task. Once a bug report is received, it is desirable to automatically point out the files that developers should change in order to fix the bug. In this paper, we propose BugLocator, an information retrieval-based method for locating the files relevant to fixing a bug. BugLocator ranks all files based on the textual similarity between the initial bug report and the source code using a revised Vector Space Model (rVSM), taking into consideration information about similar bugs that have been fixed before. We perform large-scale experiments on four open source projects to localize more than 3,000 bugs. The results show that BugLocator can effectively locate the files where the bugs should be fixed. For example, the relevant buggy files for 62.60% of the Eclipse 3.1 bugs are ranked in the top ten among 12,863 files. Our experiments also show that BugLocator outperforms existing state-of-the-art bug localization methods.
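The ranking step can be pictured with a minimal sketch (not the authors' rVSM; the document-length weighting is omitted and the 0.2 combination weight is an assumed value): each file is scored by the cosine similarity between its term vector and the bug report, plus a boost contributed by previously fixed bugs whose reports resemble the new one.

    import math
    from collections import Counter

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def rank_files(bug_report, files, fixed_bugs):
        """files: {path: term Counter}; fixed_bugs: [(report Counter, [paths changed])]."""
        report = Counter(bug_report.lower().split())
        scores = {}
        for path, terms in files.items():
            vsm = cosine(report, terms)          # textual similarity (plain VSM here)
            # similar-bug score: reports resembling this one vote for the files
            # that were changed to fix them
            simi = sum(cosine(report, rep) for rep, paths in fixed_bugs if path in paths)
            scores[path] = vsm + 0.2 * simi      # assumed combination weight
        return sorted(scores, key=scores.get, reverse=True)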

Developer Prioritization in Bug Repositories
Jifeng Xuan, He Jiang, Zhilei Ren, and Weiqin Zou
(Dalian University of Technology, China)
Developers build all the software artifacts in development. Existing work has studied social behavior in software repositories. In one of the most important kinds of software repositories, the bug repository, developers create and update bug reports to support software development and maintenance. However, no prior work has considered the priorities of developers in bug repositories. In this paper, we address the problem of developer prioritization, which aims to rank the contributions of developers. We mainly explore two aspects, namely modeling the developer prioritization in a bug repository and assisting predictive tasks with our model. First, we model how to assign priorities to developers based on a social network technique. Three problems are investigated, including the developer rankings in products, the evolution over time, and the tolerance of noisy comments. Second, we consider leveraging the developer prioritization to improve three prediction tasks in bug repositories, i.e., bug triage, severity identification, and reopened bug prediction. We empirically investigate the performance of our model and its applications in the bug repositories of Eclipse and Mozilla. The results indicate that the developer prioritization can provide knowledge of developer priorities to assist software tasks, especially bug triage.
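As an illustration only (the paper's model is not reproduced here), a social-network-style prioritization could build a directed graph from comment activity on bug reports and rank developers with PageRank; the edge definition below is an assumption.

    import networkx as nx

    def rank_developers(comment_threads):
        """comment_threads: one ordered list of commenter names per bug report."""
        g = nx.DiGraph()
        for commenters in comment_threads:
            for i, who in enumerate(commenters):
                for earlier in commenters[:i]:
                    if earlier != who:
                        g.add_edge(who, earlier)   # later commenter "responds to" earlier ones
        ranks = nx.pagerank(g)
        return sorted(ranks, key=ranks.get, reverse=True)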

WhoseFault: Automatic Developer-to-Fault Assignment through Fault Localization
Francisco Servant and James A. Jones
(UC Irvine, USA)
This paper describes a new technique that automatically selects the most appropriate developers for fixing the fault represented by a failing test case, and provides a diagnosis of where to look for the fault. The technique works by incorporating three key components: (1) fault localization to identify locations whose execution correlates with failure, (2) history mining to determine which developers edited each line of code and when, and (3) expertise assignment to map locations to developers. To our knowledge, the technique is the first to assign developers to execution failures without the need for textual bug reports. We implement this technique in our tool, WhoseFault, and describe an experiment in which we use a large, open-source project to determine how often our tool suggests an assignment to the actual developer who fixed the fault. Our results show that 81% of the time, WhoseFault ranked the developer who actually fixed the fault within its top three suggestions. We also show that our technique improved the results of a baseline technique by between 4% and 40%. Finally, we explore the influence of each of the three components of our technique on its results, and compare our expertise algorithm against an existing expertise assessment technique, finding that our algorithm provides greater accuracy, by up to 37%.

Code Generation and Recovery
Wed, Jun 6, 10:45 - 12:45

Recovering Traceability Links between an API and Its Learning Resources
Barthélémy Dagenais and Martin P. Robillard
(McGill University, Canada)
Large frameworks and libraries require extensive developer learning resources, such as documentation and mailing lists, to be useful. Maintaining these learning resources is challenging partly because they are not explicitly linked to the frameworks' API, and changes in the API are not reflected in the learning resources. Automatically recovering traceability links between an API and learning resources is notoriously difficult due to the inherent ambiguity of unstructured natural language. Code elements mentioned in documents are rarely fully qualified, so readers need to understand the context in which a code element is mentioned. We propose a technique that identifies code-like terms in documents and links these terms to specific code elements in an API, such as methods. In an evaluation study with four open source systems, we found that our technique had an average recall and precision of 96%.
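The first step, spotting code-like terms in free text and resolving them against an API index, can be approximated as follows (the regular expression and the resolve-if-unique rule are illustrative assumptions, not the paper's algorithm):

    import re

    # camelCase identifiers, Class-like names, and calls written with a trailing "()"
    CODE_TERM = re.compile(r'\b(?:[a-z]+[A-Z]\w*|[A-Z]\w*[a-z]\w*)(?:\(\))?')

    def link_terms(sentence, api_index):
        """api_index: {simple name: [fully qualified API elements]}."""
        links = {}
        for term in CODE_TERM.findall(sentence):
            candidates = api_index.get(term.rstrip('()'), [])
            if len(candidates) == 1:        # unambiguous: link directly
                links[term] = candidates[0]
            elif candidates:                # ambiguous: context (nearby types,
                links[term] = candidates    # package names, ...) must disambiguate
        return links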

Generating Range Fixes for Software Configuration
Yingfei Xiong, Arnaud Hubaux, Steven She, and Krzysztof Czarnecki
(University of Waterloo, Canada; University of Namur, Belgium)
To prevent ill-formed configurations, highly configurable software often allows defining constraints over the available options. As these constraints can be complex, fixing a configuration that violates one or more constraints can be challenging. Although several fix-generation approaches exist, their applicability is limited because (1) they typically generate only one fix, failing to cover the solution that the user wants; and (2) they do not fully support non-Boolean constraints, which contain arithmetic, inequality, and string operators.
This paper proposes a novel concept, range fix, for software configuration. A range fix specifies the options to change and the ranges of values for these options. We also design an algorithm that automatically generates range fixes for a violated constraint. We have evaluated our approach with three different strategies for handling constraint interactions, on data from five open source projects. Our evaluation shows that, even with the most complex strategy, our approach generates complete fix lists that are mostly short and concise, in a fraction of a second.
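As a worked example of the concept (ours, not taken from the paper): if the constraint buffer_mb <= cache_mb / 2 is violated by cache_mb = 100 and buffer_mb = 80, a range fix reports the options to change together with their admissible ranges, e.g. "set buffer_mb to any value <= 50" or "set cache_mb to any value >= 160". For small finite domains the ranges can even be found by brute force:

    def range_fixes(current, domains, violated):
        """current: {option: value}; domains: {option: iterable of candidate values};
        violated: predicate over an option assignment. Returns, per single option,
        the (min, max) of values that restore the constraint -- a crude stand-in
        for the paper's symbolic range computation."""
        fixes = {}
        for name, domain in domains.items():
            ok = [v for v in domain if not violated({**current, name: v})]
            if ok:
                fixes[name] = (min(ok), max(ok))
        return fixes

    # range_fixes({'cache_mb': 100, 'buffer_mb': 80},
    #             {'cache_mb': range(0, 513), 'buffer_mb': range(0, 513)},
    #             lambda o: not (o['buffer_mb'] <= o['cache_mb'] / 2))
    # -> {'cache_mb': (160, 512), 'buffer_mb': (0, 50)}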

Graph-Based Pattern-Oriented, Context-Sensitive Source Code Completion
Anh Tuan Nguyen, Tung Thanh Nguyen, Hoan Anh Nguyen, Ahmed Tamrawi, Hung Viet Nguyen, Jafar Al-Kofahi, and Tien N. Nguyen
(Iowa State University, USA)
Code completion helps improve programming productivity. However, current support for code completion is limited to context-free code templates or a single method call of the variable on focus. Using libraries for development, developers often repeat API usages for certain tasks. Therefore, in this paper, we introduce GraPacc, a graph-based pattern-oriented, context-sensitive code completion approach that is based on a database of API usage patterns. GraPacc manages and represents the API usage patterns of multiple variables, methods, and control structures via graph-based models. It extracts the context-sensitive features from the code, e.g. the API elements on focus or under modification, and their relations to other elements. The features are used to search and rank the patterns that are most fitted with the current code. When a pattern is selected, the current code will be completed via our novel graph-based code completion algorithm. Empirical evaluation on several real-world systems and human subjects shows that GraPacc has a high level of accuracy and a better level of usefulness than existing tools.

Automatic Input Rectification
Fan Long, Vijay Ganesh, Michael Carbin, Stelios Sidiroglou, and Martin Rinard
(MIT, USA)
We present a novel technique, automatic input rectification, and a prototype implementation, SOAP. SOAP learns a set of constraints characterizing typical inputs that an application is highly likely to process correctly. When given an atypical input that does not satisfy these constraints, SOAP automatically rectifies the input (i.e., changes the input so that it satisfies the learned constraints). The goal is to automatically convert potentially dangerous inputs into typical inputs that the program is highly likely to process correctly. Our experimental results show that, for a set of benchmark applications (namely, Google Picasa, ImageMagick, VLC, Swfdec, and Dillo), this approach effectively converts malicious inputs (which successfully exploit vulnerabilities in the application) into benign inputs that the application processes correctly. Moreover, a manual code analysis shows that, if an input does satisfy the learned constraints, it is incapable of exploiting these vulnerabilities. We also present the results of a user study designed to evaluate the subjective perceptual quality of outputs from benign but atypical inputs that have been automatically rectified by SOAP to conform to the learned constraints. Specifically, we obtained benign inputs that violate learned constraints, used our input rectifier to obtain rectified inputs, and then paid Amazon Mechanical Turk users to provide their subjective qualitative perception of the difference between the outputs from the original and rectified inputs. The results indicate that rectification can often preserve much, and in many cases all, of the desirable data in the original input.
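A toy rendering of the idea (our simplification on a field-structured input; SOAP itself learns constraints over concrete input file formats): learn per-field bounds from typical inputs, then clamp or truncate fields of an incoming input that violate them.

    def learn_constraints(typical_inputs):
        """typical_inputs: list of dicts mapping field names to ints or byte strings."""
        bounds = {}
        for inp in typical_inputs:
            for field, value in inp.items():
                size = value if isinstance(value, int) else len(value)
                lo, hi = bounds.get(field, (size, size))
                bounds[field] = (min(lo, size), max(hi, size))
        return bounds

    def rectify(inp, bounds):
        fixed = {}
        for field, value in inp.items():
            lo, hi = bounds.get(field, (None, None))
            if lo is None:
                fixed[field] = value                    # never seen: leave untouched
            elif isinstance(value, int):
                fixed[field] = min(max(value, lo), hi)  # clamp out-of-range integers
            else:
                fixed[field] = value[:hi]               # truncate over-long fields
            # a real rectifier must also re-establish format invariants
            # (checksums, length fields) after such changes
        return fixed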

Empirical Studies of Development
Wed, Jun 6, 10:45 - 12:45

Overcoming the Challenges in Cost Estimation for Distributed Software Projects
Narayan Ramasubbu and Rajesh Krishna Balan
(Singapore Management University, Singapore)
In this paper, we describe how we studied, in situ, the operational processes of three large, high-process-maturity distributed software development companies and discovered three common problems they faced with respect to early-stage project cost estimation. We found that project managers faced significant challenges in accurately estimating project costs because the standard metrics-based estimation tools they used (a) did not effectively incorporate diverse distributed project configurations and characteristics, (b) required comprehensive data that was not fully available for all starting projects, and (c) required significant experience to derive accurate estimates. To address these problems, we collaborated with practitioners at all three firms and developed a new learning-oriented, semi-automated early-stage cost estimation solution that was specifically designed for globally distributed software projects. The key idea of our solution was to augment the existing metrics-driven estimation methods with a case repository that stratified past incidents related to project effort estimation, drawn from the historical project databases at the firms, into several generalizable categories. This repository allowed project managers to quickly and effectively “benchmark” their new projects against all past projects across the firms, and thereby learn from them. We deployed our solution at each of our three research sites for real-world field-testing over a period of six months. Project managers of 219 new large globally distributed projects used both our method and the established metrics-based estimation approaches they were accustomed to in order to estimate the cost of their projects. Our approach achieved significantly reduced estimation errors (of up to 60%). This resulted in more than 20% net cost savings, on average, per project – a massive total cost savings across all projects at the three firms.

Characterizing Logging Practices in Open-Source Software
Ding Yuan, Soyeon Park, and Yuanyuan Zhou
(University of Illinois at Urbana-Champaign, USA; UC San Diego, USA)
Software logging is a conventional programming practice. Although its efficacy is often important for users and developers to understand what has happened in a production run, software logging is often done in an arbitrary manner, and so far there has been little study of logging practices in real-world software. This paper makes the first attempt (to the best of our knowledge) to provide a quantitative characterization of the current log messages within four pieces of large open-source software. First, we quantitatively show that software logging is pervasive. By examining developers’ own modifications to logging code in the revision history, we find that developers often do not get the log messages right on their first attempt and thus need to spend a significant amount of effort modifying the log messages as afterthoughts. Our study further provides several interesting findings on where developers spend most of this effort, which can give insights for programmers, tool developers, and language and compiler designers to improve current logging practice. To demonstrate the benefit of our study, we built a simple checker based on one of our findings and effectively detected 138 new pieces of problematic logging code in the studied software (24 of them already confirmed and fixed by developers).

The Impacts of Software Process Improvement on Developers: A Systematic Review
Mathieu Lavallée and Pierre N. Robillard
(École Polytechnique de Montréal, Canada)
This paper presents the results of a systematic review of the impacts of Software Process Improvement (SPI) on developers. The review selected 26 studies from the highest-quality journals, conferences, and workshops in the field. The results were compiled and organized following a grounded theory approach, and the resulting categories were further organized using an Ishikawa (or fishbone) diagram. The Ishikawa diagram models all the factors potentially impacting software developers, and shows both the positive and negative impacts. Positive impacts include a reduction in the number of crises, an increase in team communication and morale, and better requirements and documentation. Negative impacts include increased overhead on developers through the need to collect data and compile documentation, an undue focus on technical approaches, and the fact that SPI is oriented toward management and process quality rather than toward developers and product quality. This systematic review should support future practice through the identification of important obstacles and opportunities for achieving SPI success. Future research should also benefit from the problems and advantages of SPI identified by developers.

Combining Functional and Imperative Programming for Multicore Software: An Empirical Study Evaluating Scala and Java
Victor Pankratius, Felix Schmidt, and Gilda Garretón
(KIT, Germany; Oracle Labs, USA)
Recent multi-paradigm programming languages combine functional and imperative programming styles to make software development easier. Given today's proliferation of multicore processors, parallel programmers are supposed to benefit from this combination, as many difficult problems can be expressed more easily in a functional style while others match an imperative style. Due to a lack of empirical evidence from controlled studies, however, important software engineering questions are largely unanswered. Our paper is the first to provide thorough empirical results by using Scala and Java as a vehicle in a controlled comparative study on multicore software development. Scala combines functional and imperative programming while Java focuses on imperative shared-memory programming. We study thirteen programmers who worked on three projects, including an industrial application, in both Scala and Java. In addition to the resulting 39 Scala programs and 39 Java programs, we obtain data from an industry software engineer who worked on the same project in Scala. We analyze key issues such as effort, code, language usage, performance, and programmer satisfaction. Contrary to popular belief, the functional style does not lead to bad performance. Average Scala run-times are comparable to Java, the lowest run-times are sometimes better, but Java scales better on parallel hardware. We confirm with statistical significance Scala's claim that Scala code is more compact than Java code, but clearly refute its claims of lower programming effort and lower debugging effort. Our study also provides explanations for these observations and points out directions for improving multi-paradigm languages in the future.

Performance Analysis
Wed, Jun 6, 10:45 - 12:45

Uncovering Performance Problems in Java Applications with Reference Propagation Profiling
Dacong Yan, Guoqing Xu, and Atanas Rountev
(Ohio State University, USA; UC Irvine, USA)
Many applications suffer from run-time bloat: excessive memory usage and work to accomplish simple tasks. Bloat significantly affects scalability and performance, and exposing it requires good diagnostic tools. We present a novel analysis that profiles the run-time execution to help programmers uncover potential performance problems. The key idea of the proposed approach is to track object references, starting from object creation statements, through assignment statements, and eventually to statements that perform useful operations. This propagation is abstracted by a representation we refer to as a reference propagation graph. This graph provides path information specific to reference producers and their run-time contexts. Several client analyses demonstrate the use of reference propagation profiling to uncover run-time inefficiencies. We also present a study of the properties of reference propagation graphs produced by profiling 36 Java programs. Several case studies discuss the inefficiencies identified in some of the analyzed programs, as well as the significant improvements obtained after code optimizations.

Performance Debugging in the Large via Mining Millions of Stack Traces
Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie
(Microsoft Research, China; North Carolina State University, USA)
Given limited resources and time before software release, development-site testing and debugging are becoming increasingly insufficient to ensure satisfactory software performance. As a counterpart to debugging in the large, pioneered by the Microsoft Windows Error Reporting (WER) system for crashing/hanging bugs, performance debugging in the large has emerged thanks to infrastructure support for collecting execution traces with performance issues from a huge number of users at deployment sites. However, performance debugging against these numerous and complex traces remains a significant challenge for performance analysts. In this paper, to enable performance debugging in the large in practice, we propose a novel approach, called StackMine, that mines callstack traces to help performance analysts effectively discover highly impactful performance bugs (e.g., bugs impacting many users with long response delay). As a successful technology-transfer effort, StackMine has been applied in performance-debugging activities at a Microsoft performance analysis team since December 2010, especially for large numbers of execution traces. Based on real-adoption experiences of StackMine in practice, we conducted an evaluation of StackMine on performance debugging in the large for Microsoft Windows 7. We also conducted another evaluation on a third-party application. The results highlight substantial benefits offered by StackMine in performance debugging in the large for large-scale software systems.

Automatically Finding Performance Problems with Feedback-Directed Learning Software Testing
Mark Grechanik, Chen Fu, and Qing Xie
(Accenture Technology Labs, USA; University of Illinois at Chicago, USA)
A goal of performance testing is to find situations in which applications unexpectedly exhibit worsened characteristics for certain combinations of input values. A fundamental question in performance testing is how to select a manageable subset of the input data that finds performance problems in applications automatically and quickly.
We offer a novel solution for finding performance problems in applications automatically using black-box software testing. Our solution is an adaptive, feedback-directed learning testing system that learns rules from execution traces of applications and then uses these rules to select test input data automatically for these applications to find more performance problems when compared with exploratory random testing. We have implemented our solution and applied it to a medium-size application at a major insurance company and to an open-source application. Performance problems were found automatically and confirmed by experienced testers and developers.

Predicting Performance via Automated Feature-Interaction Detection
Norbert Siegmund, Sergiy S. Kolesnikov, Christian Kästner, Sven Apel, Don Batory, Marko Rosenmüller, and Gunter Saake
(University of Magdeburg, Germany; University of Passau, Germany; Philipps University of Marburg, Germany; University of Texas at Austin, USA)
Customizable programs and program families provide user-selectable features to allow users to tailor a program to an application scenario. Knowing in advance which feature selection yields the best performance is difficult because a direct measurement of all possible feature combinations is infeasible. Our work aims at predicting program performance based on selected features. However, when features interact, accurate predictions are challenging. An interaction occurs when a particular feature combination has an unexpected influence on performance. We present a method that automatically detects performance-relevant feature interactions to improve prediction accuracy. To this end, we propose three heuristics to reduce the number of measurements required to detect interactions. Our evaluation consists of six real-world case studies from varying domains (e.g., databases, encoding libraries, and web servers) using different configuration techniques (e.g., configuration files and preprocessor flags). Results show an average prediction accuracy of 95%.
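In its simplest additive form (a sketch of the general idea rather than the paper's model), prediction sums a base measurement, per-feature deltas, and correction terms for the detected pairwise interactions:

    def predict_performance(config, base, deltas, interactions):
        """config: set of selected features; base: performance with no optional features;
        deltas: {feature: measured delta}; interactions: {(f1, f2): measured correction}."""
        perf = base + sum(deltas[f] for f in config)
        perf += sum(corr for (f1, f2), corr in interactions.items()
                    if f1 in config and f2 in config)
        return perf

    # predict_performance({'encryption', 'compression'}, base=10.0,
    #                     deltas={'encryption': 4.0, 'compression': 2.5},
    #                     interactions={('encryption', 'compression'): 1.2})  # -> 17.7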

Defect Prediction
Wed, Jun 6, 14:00 - 15:30

Sound Empirical Evidence in Software Testing
Gordon Fraser and Andrea Arcuri
(Saarland University, Germany; Simula Research Laboratory, Norway)
Several promising techniques have been proposed to automate different tasks in software testing, such as test data generation for object-oriented software. However, reported studies in the literature only show the feasibility of the proposed techniques, because the choice of the employed artifacts in the case studies (e.g., software applications) is usually done in a non-systematic way. The chosen case study might be biased, and so it might not be a valid representative of the addressed type of software (e.g., internet applications and embedded systems). The common trend seems to be to accept this fact and get over it by simply discussing it in a threats to validity section. In this paper, we evaluate search-based software testing (in particular the EvoSuite tool) when applied to test data generation for open source projects. To achieve sound empirical results, we randomly selected 100 Java projects from SourceForge, which is the most popular open source repository (more than 300,000 projects with more than two million registered users). The resulting case study not only is very large (8,784 public classes for a total of 291,639 bytecode level branches), but more importantly it is statistically sound and representative for open source projects. Results show that while high coverage on commonly used types of classes is achievable, in practice environmental dependencies prohibit such high coverage, which clearly points out essential future research directions. To support this future research, our SF100 case study can serve as a much needed corpus of classes for test generation.

Privacy and Utility for Defect Prediction: Experiments with MORPH
Fayola Peters and Tim Menzies
(West Virginia University, USA)
Ideally, we can learn lessons from software projects across multiple organizations. However, a major impediment to such knowledge sharing is the privacy concerns of software development organizations. This paper aims to provide defect data-set owners with an effective means of privatizing their data prior to release. We explore MORPH, a data mutator that moves the data a random distance while taking care not to cross class boundaries. The value of training on this MORPHed data is tested via a 10-way within-learning study and a cross-learning study using Random Forests, Naive Bayes, and Logistic Regression on ten object-oriented defect data-sets from the PROMISE data repository. Measured in terms of exposure of sensitive attributes, the MORPHed data was four times more private than the unMORPHed data. Also, in terms of the f-measures, there was little difference between the MORPHed and unMORPHed data (original data and data privatized by data-swapping) for both the cross and within studies. We conclude that, at least for the kinds of OO defect data studied in this project, data can be privatized without concerns for inference efficacy.
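The mutation itself can be sketched as below (our reading of the abstract; the direction of the shift and the range of the random factor are assumptions): each numeric instance is shifted by a random fraction of its offset from the nearest instance of a different class, so it stays on its own side of the class boundary.

    import random

    def morph(instances, labels, r_min=0.15, r_max=0.35):
        """instances: numeric feature vectors; labels: parallel class labels
        (at least two distinct classes are assumed to be present)."""
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

        privatized = []
        for x, lab in zip(instances, labels):
            # nearest unlike neighbor: closest instance carrying a different label
            z = min((o for o, ol in zip(instances, labels) if ol != lab),
                    key=lambda o: dist(x, o))
            r = random.uniform(r_min, r_max)
            privatized.append([xi + r * (xi - zi) for xi, zi in zip(x, z)])
        return privatized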

Bug Prediction Based on Fine-Grained Module Histories
Hideaki Hata, Osamu Mizuno, and Tohru Kikuno
(Osaka University, Japan; Kyoto Institute of Technology, Japan)
Many bug prediction models have been built with historical metrics, which are mined from the version histories of software modules, and many studies have reported the effectiveness of these metrics. Most studies have targeted prediction at the package and file levels. Prediction at a finer granularity, the method level, is desirable because it may yield interesting results compared to coarse-grained (package- and file-level) prediction, such as better performance when quality assurance effort is taken into account and new findings about the correlations between bugs and histories. However, fine-grained prediction has been a challenge because obtaining method histories from existing version control systems is a difficult problem. To tackle this problem, we have developed a fine-grained version control system for Java, Historage. With this system, we target Java software and conduct fine-grained prediction with well-known historical metrics. The results indicate that fine-grained (method-level) prediction outperforms coarse-grained (package- and file-level) prediction when the effort necessary to find bugs is taken into account. Using a correlation analysis, we show that past bug information does not contribute to method-level bug prediction.

Refactoring
Wed, Jun 6, 14:00 - 15:30

Reconciling Manual and Automatic Refactoring
Xi Ge, Quinton L. DuBose, and Emerson Murphy-Hill
(North Carolina State University, USA)
Although useful and widely available, refactoring tools are underused. One cause of this underuse is that a developer sometimes fails to recognize that she is going to refactor before she begins manually refactoring. To address this issue, we conducted a formative study of developers’ manual refactoring process, suggesting that developers’ reliance on “chasing error messages” when manually refactoring is an error-prone manual refactoring strategy. Additionally, our study distilled a set of manual refactoring workflow patterns. Using these patterns, we designed a novel refactoring tool called BeneFactor. BeneFactor detects a developer’s manual refactoring, reminds her that automatic refactoring is available, and can complete her refactoring automatically. By alleviating the burden of recognizing manual refactoring, BeneFactor is designed to help solve the refactoring tool underuse problem.

WitchDoctor: IDE Support for Real-Time Auto-Completion of Refactorings
Stephen R. Foster, William G. Griswold, and Sorin Lerner
(UC San Diego, USA)
Integrated Development Environments (IDEs) have come to perform a wide variety of tasks on behalf of the programmer, refactoring being a classic example. These operations have undeniable benefits, yet their large (and growing) number poses a cognitive scalability problem. Our main contribution is WitchDoctor -- a system that can detect, on the fly, when a programmer is hand-coding a refactoring. The system can then complete the refactoring in the background and propose it to the user long before the user could complete it. This implies a number of technical challenges. The algorithm must 1) be highly efficient, 2) handle unparseable programs, 3) tolerate the variety of ways programmers may perform a given refactoring, 4) use the IDE's proven and familiar refactoring engine to perform the refactoring, even though the refactoring has already begun, and 5) support the wide range of refactorings present in modern IDEs. Our techniques for overcoming these challenges are the technical contributions of this paper. We evaluate WitchDoctor's design and implementation by simulating over 5,000 refactoring operations across three open-source projects. The simulated user is faster and more efficient than an average human user, yet WitchDoctor can detect more than 90% of refactoring operations as they are being performed -- and can complete over a third of refactorings before the simulated user does. All the while, WitchDoctor remains robust in the face of non-parseable programs and unpredictable refactoring scenarios. We also show that WitchDoctor is efficient enough to perform computation on a keystroke-by-keystroke basis, adding an average overhead of only 15 milliseconds per keystroke.

Use, Disuse, and Misuse of Automated Refactorings
Mohsen Vakilian, Nicholas Chen, Stas Negara, Balaji Ambresh Rajkumar, Brian P. Bailey, and Ralph E. Johnson
(University of Illinois at Urbana-Champaign, USA)
Though refactoring tools have been available for more than a decade, research has shown that programmers underutilize such tools. However, little is known about why programmers do not take advantage of these tools. We have conducted a field study of programmers in their natural settings working on their own code. As a result, we collected a set of interaction data from about 1268 hours of programming using our minimally intrusive data collectors. Our quantitative data show that programmers prefer lightweight methods of invoking refactorings, usually perform small changes using the refactoring tool, proceed with an automated refactoring even when it may change the behavior of the program, and rarely preview the automated refactorings. We also interviewed nine of our participants to gain deeper insight into the patterns that we observed in the behavioral data. We found that programmers use predictable automated refactorings even if they have rare bugs or change the behavior of the program. This paper reports some of the factors that affect the use of automated refactorings, such as invocation method, awareness, naming, trust, and predictability, and the major mismatches between programmers' expectations and automated refactorings. The results of this work contribute to producing more effective tools for refactoring complex software.

Human Aspects of Development
Wed, Jun 6, 14:00 - 15:30

Test Confessions: A Study of Testing Practices for Plug-In Systems
Michaela Greiler, Arie van Deursen, and Margaret-Anne Storey
(TU Delft, Netherlands; University of Victoria, Canada)
Testing plug-in based systems is challenging due to complex interactions among many different plug-ins, and variations in version and configuration. The objective of this paper is to find out how developers address this test challenge. To that end, we conduct a qualitative (grounded theory) study, in which we interview 25 senior practitioners about how they test plug-ins and applications built on top of the Eclipse plug-in framework. The outcome is an overview of the testing practices currently used, a set of identified barriers limiting the adoption of test practices, and an explanation of how limited testing is compensated by self-hosting of projects and by involving the community. These results are supported by a structured survey of more than 150 professionals. The study reveals that unit testing plays a key role, whereas plug-in specific integration problems are identified and resolved by the community. Based on our findings, we propose a series of recommendations and areas for future research.

How Do Professional Developers Comprehend Software?
Tobias Roehm, Rebecca Tiarks, Rainer Koschke, and Walid Maalej
(TU Munich, Germany; University of Bremen, Germany)
Research in program comprehension has evolved considerably over the past two decades. However, little is known about how developers practice program comprehension under time and project pressure, and which methods and tools proposed by researchers are used in industry. This paper reports on an observational study of 28 professional developers from seven companies, investigating how developers comprehend software. In particular, we focus on the strategies they follow, the information they need, and the tools they use. We found that developers put themselves in the role of end users by inspecting user interfaces. They try to avoid program comprehension, and employ recurring, structured comprehension strategies depending on work context. Further, we found that standards and experience facilitate comprehension. Program comprehension was considered a subtask of other maintenance tasks rather than a task in itself. We also found that face-to-face communication is preferred to documentation. Overall, our results show a gap between program comprehension research and practice, as we did not observe any use of state-of-the-art comprehension tools and developers seem to be unaware of them. Our findings call for further careful analysis and for reconsidering research agendas.

Asking and Answering Questions about Unfamiliar APIs: An Exploratory Study
Ekwa Duala-Ekoko and Martin P. Robillard
(McGill University, Canada)
The increasing size of APIs and the growing number of available APIs imply that developers must frequently learn how to use unfamiliar APIs. To identify the types of questions developers want answered when working with unfamiliar APIs, and to understand the difficulty they may encounter in answering those questions, we conducted a study involving twenty programmers working on different programming tasks using unfamiliar APIs. Based on the screen-captured videos and the verbalization of the participants, we identified twenty different types of questions programmers ask when working with unfamiliar APIs, and provide new insights into the causes of the difficulties programmers encounter when answering questions about the use of APIs. The questions we have identified and the difficulties we observed can be used for evaluating tools aimed at improving API learning, and for identifying areas of the API learning process where tool support is missing or could be improved.

Bug Detection
Wed, Jun 6, 16:00 - 18:00

Automated Repair of HTML Generation Errors in PHP Applications Using String Constraint Solving
Hesam Samimi, Max Schäfer, Shay Artzi, Todd Millstein, Frank Tip, and Laurie Hendren
(UC Los Angeles, USA; IBM Research, USA; McGill University, Canada)
PHP web applications routinely generate invalid HTML. Modern browsers silently correct HTML errors, but sometimes malformed pages render inconsistently, cause browser crashes, or expose security vulnerabilities. Fixing errors in generated pages is usually straightforward, but repairing the generating PHP program can be much harder. We observe that malformed HTML is often produced by incorrect "constant prints", i.e., statements that print string literals, and present two tools for automatically repairing such HTML generation errors. PHPQuickFix repairs simple bugs by statically analyzing individual prints. PHPRepair handles more general repairs using a dynamic approach. Based on a test suite, the property that all tests should produce their expected output is encoded as a string constraint over variables representing constant prints. Solving this constraint describes how constant prints must be modified to make all tests pass. Both tools were implemented as an Eclipse plugin and evaluated on PHP programs containing hundreds of HTML generation errors, most of which our tools were able to repair automatically.
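The flavor of the dynamic repair can be shown with one constant print and the Z3 string solver standing in for the paper's own constraint solving (the literals and the single-print setup are made up): the unknown is what the constant print should emit, constrained so that the page assembled around it equals the test's expected, well-formed output.

    from z3 import String, StringVal, Concat, Solver, sat

    # page = print(<constant c1>) . dynamic(" world") . print("</b>")
    c1 = String('c1')                      # the literal to be repaired, left symbolic
    page = Concat(c1, StringVal(" world"), StringVal("</b>"))

    s = Solver()
    s.add(page == StringVal("<b>Hello world</b>"))   # expected output of the test

    if s.check() == sat:
        print(s.model()[c1])               # -> "<b>Hello": how the constant print must read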

Leveraging Test Generation and Specification Mining for Automated Bug Detection without False Positives
Michael Pradel and Thomas R. Gross
(ETH Zurich, Switzerland)
Mining specifications and using them for bug detection is a promising way to reveal bugs in programs. Existing approaches suffer from two problems. First, dynamic specification miners require input that drives a program to generate common usage patterns. Second, existing approaches report false positives, that is, spurious warnings that mislead developers and reduce the practicability of the approach. We present a novel technique for dynamically mining and checking specifications without relying on existing input to drive a program and without reporting false positives. Our technique leverages automatically generated tests in two ways: passing tests drive the program during specification mining, and failing test executions are checked against the mined specifications. The output is a set of warnings that show, with concrete test cases, how the program violates commonly accepted specifications. Our implementation reports no false positives and 54 true positives in ten well-tested Java programs.
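A drastically simplified picture of the two roles the generated tests play (illustrative only; the mined specifications in the paper are richer than call pairs): passing executions contribute the observed pairs of consecutive calls on a receiver object, and a failing execution is reported only when it contains a pair never seen in any passing run.

    from collections import defaultdict

    def mine(passing_traces):
        """passing_traces: call-name sequences observed on one receiver object."""
        allowed = defaultdict(set)
        for trace in passing_traces:
            for a, b in zip(trace, trace[1:]):
                allowed[a].add(b)
        return allowed

    def violations(failing_trace, allowed):
        return [(a, b) for a, b in zip(failing_trace, failing_trace[1:])
                if b not in allowed.get(a, set())]

    # allowed = mine([["open", "read", "close"], ["open", "write", "close"]])
    # violations(["open", "close", "read"], allowed)
    # -> [("open", "close"), ("close", "read")]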

Axis: Automatically Fixing Atomicity Violations through Solving Control Constraints
Peng Liu and Charles Zhang
(Hong Kong University of Science and Technology, China)
Atomicity, a general correctness criterion for concurrent programs, is often violated in real-world applications. Worse, the violations are difficult for developers to fix, which makes automatic bug-fixing techniques attractive solutions. The state of the art aims at automating the manual fixing process and cannot provide any theoretical reasoning or guarantees. We provide an automatic approach that applies well-studied control theory to guarantee maximal preservation of the concurrency degree of the original code and the introduction of no deadlocks. Under the hood, we reduce the problem of violation fixing to a constraint solving problem using a Petri net model. Our evaluation on 13 subjects shows that the slowdown incurred by our patches is only 40% of that of the state of the art. With the deadlock-free guarantee, our patches incur moderate overhead (around 10%), which is a worthwhile cost for safety.

CBCD: Cloned Buggy Code Detector
Jingyue Li and Michael D. Ernst
(DNV Research and Innovation, Norway; University of Washington, USA)
Developers often copy, or clone, code in order to reuse or modify functionality. When they do so, they also clone any bugs in the original code. Or, different developers may independently make the same mistake. As one example of a bug, multiple products in a product line may use a component in a similar wrong way. This paper makes two contributions. First, it presents an empirical study of cloned buggy code. In a large industrial product line, about 4% of the bugs are duplicated across more than one product or file. In three open source projects (the Linux kernel, the Git version control system, and the PostgreSQL database) we found 282, 33, and 33 duplicated bugs, respectively. Second, this paper presents a tool, CBCD, that searches for code that is semantically identical to given buggy code. CBCD tests graph isomorphism over the Program Dependency Graph (PDG) representation and uses four optimizations. We evaluated CBCD by searching for known clones of buggy code segments in the three projects and compared the results with text-based, token-based, and AST-based code clone detectors, namely Simian, CCFinder, Deckard, and CloneDR. The evaluation shows that CBCD is fast when searching for possible clones of the buggy code in a large system, and it is more precise for this purpose than the other code clone detectors.
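Stripped of CBCD's PDG construction and its four optimizations, the core query — does the system contain a subgraph matching the PDG of the buggy code? — can be posed with an off-the-shelf graph library (a sketch; the 'kind' node attribute used for matching is an assumption).

    import networkx as nx
    from networkx.algorithms import isomorphism

    def contains_buggy_clone(system_pdg, bug_pdg):
        """Both arguments are nx.DiGraph PDGs whose nodes carry a 'kind' attribute
        (e.g. the operator or called function represented by the node)."""
        matcher = isomorphism.DiGraphMatcher(
            system_pdg, bug_pdg,
            node_match=lambda n1, n2: n1.get('kind') == n2.get('kind'))
        return matcher.subgraph_is_isomorphic()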

Multiversion Software
Wed, Jun 6, 16:00 - 18:00

Crosscutting Revision Control System
Sagi Ifrah and David H. Lorenz
(Open University, Israel)
Large- and medium-scale software projects often require a source code revision control (RC) system. Unfortunately, RC systems do not perform well with the obliviousness and quantification found in aspect-oriented code. When classes are oblivious to aspects, so is the RC system, and the crosscutting effect of aspects is not tracked. In this work, we study this problem in the context of using AspectJ (a standard AOP language) with Subversion (a standard RC system). We describe scenarios where the crosscutting effect of aspects, combined with the concurrent changes that RC supports, can lead to inconsistent states of the code. The work contributes a mechanism that checks in, together with the source code, versions of crosscutting metadata for tracking the effect of aspects. Another contribution of this work is the implementation of a supporting Eclipse plug-in (named XRC) that extends the JDT, AJDT, and SVN plug-ins for Eclipse to provide crosscutting revision control (XRC) for aspect-oriented programming.

Where Does This Code Come from and Where Does It Go? - Integrated Code History Tracker for Open Source Systems -
Katsuro Inoue, Yusuke Sasaki, Pei Xia, and Yuki Manabe
(Osaka University, Japan)
When we reuse a code fragment from an open source system, it is very important to know the history of the code, such as its origin and evolution. In this paper, we propose an integrated approach to code history tracking for open source repositories. This approach takes a query code fragment as its input and returns code fragments containing code clones of the query. It utilizes publicly available code search engines as external resources. Based on this model, we have designed and implemented a prototype system named Ichi Tracker. Using Ichi Tracker, we have conducted three case studies. These case studies show the ancestors and descendants of the code, allowing us to recognize its evolution history.

Improving Early Detection of Software Merge Conflicts
Mário Luís Guimarães and António Rito Silva
(Technical University of Lisbon, Portugal)
Merge conflicts cause software defects which, if detected late, may require expensive resolution. This is especially true when developers work too long without integrating concurrent changes, which in practice is common as integration generally occurs at check-in. Awareness of others' activities has been proposed to help developers detect conflicts earlier. However, it requires developers to detect conflicts by themselves and may overload them with notifications, thus making detection harder. This paper presents a novel solution that continuously merges uncommitted and committed changes to create a background system that is analyzed, compiled, and tested to precisely and accurately detect conflicts on behalf of developers, before check-in. An empirical study confirms that our solution avoids overloading developers and improves early detection of conflicts over existing approaches. Similarly to what happened with continuous compilation, this introduces the case for continuous merging inside the IDE.

A History-Based Matching Approach to Identification of Framework Evolution
Sichen Meng, Xiaoyin Wang, Lu Zhang, and Hong Mei
(Key Laboratory of High Confidence Software Technologies, China; Peking University, China)
In practice, it is common for a framework and its client programs to evolve simultaneously. Thus, developers of client programs may need to migrate their programs to the new release of the framework when the framework evolves. As framework developers can hardly always guarantee backward compatibility during the evolution of a framework, migration of its client programs is often time-consuming and error-prone. To facilitate this migration, researchers have proposed two categories of approaches to identifying framework evolution: operation-based approaches and matching-based approaches. To overcome the main limitations of these two categories, we propose a novel approach named HiMa, which is based on matching each pair of consecutive revisions recorded in the evolution history of the framework and aggregating revision-level rules to obtain framework-evolution rules. We implemented our HiMa approach as an Eclipse plug-in targeting frameworks written in Java and using SVN as the version control system. We further performed an experimental study on HiMa together with a state-of-the-art approach named AURA, using six tasks based on three subject Java frameworks. Our experimental results demonstrate that HiMa achieves higher precision and higher recall than AURA in most circumstances and is never inferior to AURA in terms of precision and recall in any circumstance, although HiMa is computationally more costly than AURA.

Similarity and Classification
Wed, Jun 6, 16:00 - 18:00

Detecting Similar Software Applications
Collin McMillan, Mark Grechanik, and Denys Poshyvanyk
(College of William and Mary, USA; Accenture Technology Labs, USA; University of Illinois at Chicago, USA)
Although popular text search engines allow users to retrieve similar web pages, source code search engines do not have this feature. Detecting similar applications is a notoriously difficult problem, since it implies that similar high-level requirements and their low-level implementations can be detected and matched automatically for different applications. We created a novel approach for automatically detecting Closely reLated ApplicatioNs (CLAN) that helps users detect similar applications for a given Java application. Our main contributions are an extension to a framework of relevance and a novel algorithm that computes a similarity index between Java applications using the notion of semantic layers that correspond to packages and class hierarchies. We have built CLAN and we conducted an experiment with 33 participants to evaluate CLAN and compare it with the closest competitive approach, MUDABlue. The results show with strong statistical significance that CLAN automatically detects similar applications from a large repository of 8,310 Java applications with a higher precision than MUDABlue.

Content Classification of Development Emails
Alberto Bacchelli, Tommaso Dal Sasso, Marco D'Ambros, and Michele Lanza
(University of Lugano, Switzerland)
Emails related to the development of a software system contain information about design choices and issues encountered during the development process. Exploiting the knowledge embedded in emails with automatic tools is challenging, due to the unstructured, noisy, and mixed-language nature of this communication medium. Natural language text is often not well-formed and is interleaved with text in other syntaxes, such as code or stack traces.
We present an approach to classify email content at line level. Our technique classifies email lines in five categories (i.e., text, junk, code, patch, and stack trace) to allow one to subsequently apply ad hoc analysis techniques for each category. We evaluated our approach on a statistically significant set of emails gathered from mailing lists of four unrelated open source systems.
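A rule-of-thumb baseline for the line-level classification can already be written with a few regular expressions (our illustration; the paper's classifier and its features are more sophisticated than this):

    import re

    RULES = [
        ('patch',       re.compile(r'^(\+\+\+ |--- |@@ |[+-](?![+-]))')),
        ('stack_trace', re.compile(r'^\s*(at \w[\w.$]*\(|Caused by:|Traceback)')),
        ('code',        re.compile(r'[;{}]\s*$|^\s*(public|private|def|class|import)\b')),
        ('junk',        re.compile(r'^(>+\s|--\s*$|Sent from|On .+ wrote:)')),
    ]

    def classify_line(line):
        for label, pattern in RULES:
            if pattern.search(line):
                return label
        return 'text'                        # default: natural-language text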

Identifying Linux Bug Fixing Patches
Yuan Tian, Julia Lawall, and David Lo
(Singapore Management University, Singapore; INRIA/LIP6, France)
In the evolution of an operating system there is a continuing tension between the need to develop and test new features, and the need to provide a stable and secure execution environment to users. A compromise, adopted by the developers of the Linux kernel, is to release new versions, including bug fixes and new features, frequently, while maintaining some older "longterm" versions. This strategy raises the problem of how to identify bug fixing patches that are submitted to the current version but should be applied to the longterm versions as well. The current approach is to rely on the individual subsystem maintainers to forward patches that seem relevant to the maintainers of the longterm kernels. The reactivity and diligence of the maintainers, however, vary, and thus many important patches could be missed by this approach.
In this paper, we propose an approach that automatically identifies bug fixing patches based on the changes and commit messages recorded in code repositories. We compare our approach with the keyword-based approach for identifying bug-fixing patches used in the literature, in the context of the Linux kernel. The results show that our approach can achieve a 53.19% improvement in recall as compared to keyword-based approaches, with similar precision.

Active Refinement of Clone Anomaly Reports
Lucia, David Lo, Lingxiao Jiang, and Aditya Budi
(Singapore Management University, Singapore)
Software clones have been widely studied in the recent literature and shown useful for finding bugs because inconsistent changes among clones in a clone group may indicate potential bugs. However, many inconsistent clone groups are not real bugs. The excessive number of false positives could easily impede broad adoption of clone-based bug detection approaches.
In this work, we aim to improve the usability of clone-based bug detection tools by increasing the rate of true positives found when a developer analyzes anomaly reports. Our idea is to control the number of anomaly reports a user sees at a time and to actively incorporate incremental user feedback to continually refine the anomaly reports. Our system first presents the top few anomaly reports from the list generated by a tool in its default ordering. Users then either accept or reject each of the reports. Based on the feedback, our system automatically and iteratively refines a classification model for anomalies and re-sorts the rest of the reports. Our goal is to present the true positives to the users earlier than the default ordering would. The rationale of the idea is based on our observation that false positives among the inconsistent clone groups can share common features (in terms of code structure, programming patterns, etc.), and these features can be learned from the incremental user feedback.
We evaluate our refinement process on three sets of clone-based anomaly reports from three large real programs: the Linux kernel (C), Eclipse, and ArgoUML (Java), extracted by a clone-based anomaly detection tool. The results show that, compared to the original ordering of bug reports, we can improve the rate of true positives found (i.e., true positives are found faster) by 11%, 87%, and 86% for the Linux kernel, Eclipse, and ArgoUML, respectively.
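The feedback loop itself is small (a sketch with scikit-learn's logistic regression standing in for whatever learner is used; extracting numeric features from clone groups is elided):

    from sklearn.linear_model import LogisticRegression

    def refine(reports, features, ask_user, batch=5):
        """reports: ids in the detector's default order; features: {id: numeric vector};
        ask_user(id) -> True for a real bug, False for a false positive."""
        remaining, xs, ys, accepted = list(reports), [], [], []
        while remaining:
            shown, remaining = remaining[:batch], remaining[batch:]
            for rid in shown:
                label = ask_user(rid)
                if label:
                    accepted.append(rid)
                xs.append(features[rid])
                ys.append(int(label))
            if remaining and len(set(ys)) == 2:      # need both classes to retrain
                model = LogisticRegression().fit(xs, ys)
                scores = model.predict_proba([features[r] for r in remaining])[:, 1]
                remaining = [r for _, r in
                             sorted(zip(scores, remaining), key=lambda t: -t[0])]
        return accepted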

Analysis for Evolution
Thu, Jun 7, 10:45 - 12:45

Automated Analysis of CSS Rules to Support Style Maintenance
Ali Mesbah and Shabnam Mirshokraie
(University of British Columbia, Canada)
CSS is a widely used language for describing the presentation semantics of HTML elements on the web. The language has a number of characteristics, such as inheritance and cascading order, which make maintaining CSS code a challenging task for web developers. As a result, it is common for unused rules to accumulate over time. Despite these challenges, CSS analysis has not received much attention from the research community. We propose an automated technique to support styling code maintenance, which (1) analyzes the runtime relationship between the CSS rules and DOM elements of a given web application, and (2) detects unmatched and ineffective selectors, overridden declaration properties, and undefined class values. Our technique, implemented in an open source tool called CILLA, has a high precision and recall rate. The results of our case study, conducted on fifteen open source and industrial web-based systems, show an average of 60% unused CSS selectors in deployed applications, which points to the ubiquity of the problem.
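The unmatched-selector check in isolation amounts to evaluating each selector against the DOM states observed at runtime; below is a static, single-snapshot sketch with lxml and cssselect (CILLA itself works against the dynamic DOM of the running application, and parsing the selectors out of the style sheets is elided here).

    import lxml.html
    from lxml.cssselect import CSSSelector

    def unmatched_selectors(dom_snapshots, selectors):
        """dom_snapshots: HTML strings captured from the application;
        selectors: CSS selectors collected from its style sheets."""
        trees = [lxml.html.fromstring(page) for page in dom_snapshots]
        unused = []
        for sel in selectors:
            try:
                match = CSSSelector(sel)
            except Exception:              # pseudo-classes lxml cannot translate
                continue
            if not any(match(tree) for tree in trees):
                unused.append(sel)         # never matched any observed DOM state
        return unused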

Graph-Based Analysis and Prediction for Software Evolution
Pamela Bhattacharya, Marios Iliofotou, Iulian Neamtiu, and Michalis Faloutsos
(UC Riverside, USA)
We exploit recent advances in analysis of graph topology to better understand software evolution, and to construct predictors that facilitate software development and maintenance. Managing an evolving, collaborative software system is a complex and expensive process, which still cannot ensure software reliability. Emerging techniques in graph mining have revolutionized the modeling of many complex systems and processes. We show how we can use a graph-based characterization of a software system to capture its evolution and facilitate development, by helping us estimate bug severity, prioritize refactoring efforts, and predict defect-prone releases. Our work consists of three main thrusts. First, we construct graphs that capture software structure at two different levels: (a) the product, i.e., source code and module level, and (b) the process, i.e., developer collaboration level. We identify a set of graph metrics that capture interesting properties of these graphs. Second, we study the evolution of eleven open source programs, including Firefox, Eclipse, and MySQL, over the lifespans of the programs, typically a decade or more. Third, we show how our graph metrics can be used to construct predictors for bug severity, high-maintenance software parts, and failure-prone releases. Our work strongly suggests that using graph topology analysis concepts can open many actionable avenues in software engineering research and practice.

Integrated Impact Analysis for Managing Software Changes
Malcom Gethers, Bogdan Dit, Huzefa Kagdi, and Denys Poshyvanyk
(College of William and Mary, USA; Wichita State University, USA)
The paper presents an adaptive approach to perform impact analysis from a given change request to source code. Given a textual change request (e.g., a bug report), a single snapshot (release) of source code, indexed using Latent Semantic Indexing, is used to estimate the impact set. Should additional contextual information be available, the approach configures the best-fit combination to produce an improved impact set. Contextual information includes the execution trace and an initial source code entity verified for change. Combinations of information retrieval, dynamic analysis, and data mining of past source code commits are considered. The research hypothesis is that these combinations help counter the precision or recall deficit of individual techniques and improve the overall accuracy. The tandem operation of the three techniques sets it apart from other related solutions. Automation along with the effective utilization of two key sources of developer knowledge, which are often overlooked in impact analysis at the change request level, is achieved.
To validate our approach, we conducted an empirical evaluation on four open source software systems. A benchmark consisting of a number of maintenance issues, such as feature requests and bug fixes, and their associated source code changes was established by manual examination of these systems and their change history. Our results indicate that there are combinations formed from the augmented developer contextual information that show statistically significant improvement over stand-alone approaches.
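For intuition about the information retrieval component, the sketch below ranks source files against a textual change request using plain term-frequency vectors and cosine similarity. The paper itself uses Latent Semantic Indexing and combines the ranking with dynamic and mining-based evidence, none of which is modeled here; the file contents are invented.

```python
# Illustrative ranking of source files against a textual change request.
# This sketch uses raw term-frequency vectors and cosine similarity only, to
# show the overall shape of IR-based impact analysis, not the paper's LSI.
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-zA-Z]+", text.lower())

def cosine(a, b):
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_files(change_request, files):
    """files: dict mapping file name to its text; returns (score, name) pairs, best first."""
    query = Counter(tokenize(change_request))
    scored = [(cosine(query, Counter(tokenize(text))), name) for name, text in files.items()]
    return sorted(scored, reverse=True)

files = {"Parser.java": "parse token stream syntax error recovery",
         "Cache.java": "evict entry least recently used"}
print(rank_files("syntax error when parsing stream", files))  # Parser.java ranks first
```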

Detecting and Visualizing Inter-worksheet Smells in Spreadsheets
Felienne Hermans, Martin Pinzger, and Arie van Deursen ORCID logo
(TU Delft, Netherlands)
Spreadsheets are often used in business, for simple tasks, as well as for mission-critical tasks such as finance or forecasting. Similar to software, some spreadsheets are of better quality than others, for instance with respect to usability, maintainability or reliability. In contrast with software, however, spreadsheets are rarely checked, tested or certified. In this paper, we aim at developing an approach for detecting smells that indicate weak points in a spreadsheet's design. To that end we first study code smells and transform these code smells to their spreadsheet counterparts. We then present an approach to detect the smells, and communicate located smells to spreadsheet users with data flow diagrams. We analyzed occurrences of these smells in the Euses corpus. Furthermore, we conducted ten case studies in an industrial setting. The results of the evaluation indicate that smells can indeed reveal weaknesses in a spreadsheet's design, and that data flow diagrams are an appropriate way to show those weaknesses.

Debugging
Thu, Jun 7, 10:45 - 12:45

An Empirical Study about the Effectiveness of Debugging When Random Test Cases Are Used
Mariano Ceccato, Alessandro Marchetto, Leonardo Mariani, Cu D. Nguyen, and Paolo Tonella
(Fondazione Bruno Kessler, Italy; University of Milano-Bicocca, Italy)
Automatically generated test cases are usually evaluated in terms of their fault revealing or coverage capability. Besides these two aspects, test cases are also the major source of information for fault localization and fixing. The impact of automatically generated test cases on the debugging activity, compared to the use of manually written test cases, has never been studied before.
In this paper we report the results obtained from two controlled experiments with human subjects performing debugging tasks using automatically generated or manually written test cases. We investigate whether the features of the former type of test cases, which make them less readable and understandable (e.g., unclear test scenarios, meaningless identifiers), have an impact on accuracy and efficiency of debugging. The empirical study is aimed at investigating whether, despite the lack of readability in automatically generated test cases, subjects can still take advantage of them during debugging.

Reducing Confounding Bias in Predicate-Level Statistical Debugging Metrics
Ross Gore and Paul F. Reynolds, Jr.
(University of Virginia, USA)
Statistical debuggers use data collected during test case execution to automatically identify the location of faults within software. Recent work has applied causal inference to eliminate or reduce control and data flow dependence confounding bias in statement-level statistical debuggers. The result is improved effectiveness. This is encouraging but motivates two novel questions: (1) how can causal inference be applied in predicate-level statistical debuggers and (2) what other biases can be eliminated or reduced. Here we address both questions by providing a model that eliminates or reduces control flow dependence and failure flow confounding bias within predicate-level statistical debuggers. We present empirical results demonstrating that our model significantly improves the effectiveness of a variety of predicate-level statistical debuggers, including those that eliminate or reduce only a single source of confounding bias.

BugRedux: Reproducing Field Failures for In-House Debugging
Wei Jin and Alessandro OrsoORCID logo
(Georgia Tech, USA)
A recent survey conducted among developers of the Apache, Eclipse, and Mozilla projects showed that the ability to recreate field failures is considered of fundamental importance when investigating bug reports. Unfortunately, the information typically contained in a bug report, such as memory dumps or call stacks, is usually insufficient for recreating the problem. Even more advanced approaches for gathering field data and helping in-house debugging tend to collect either too little information, and be ineffective, or too much information, and be inefficient. To address these issues, we present BugRedux, a novel general approach for in-house debugging of field failures. BugRedux aims to synthesize, using execution data collected in the field, executions that mimic the observed field failures. We define several instances of BugRedux that collect different types of execution data and perform, through an empirical study, a cost-benefit analysis of the approach and its variations. In the study, we apply BugRedux to 16 failures of 14 real-world programs. Our results are promising in that they show that it is possible to synthesize in-house executions that reproduce failures observed in the field using a suitable set of execution data.

Object-Centric Debugging
Jorge Ressia, Alexandre Bergel, and Oscar Nierstrasz
(University of Bern, Switzerland; University of Chile, Chile)
During the process of developing and maintaining a complex software system, developers pose detailed questions about the runtime behavior of the system. Source code views offer strictly limited insights, so developers often turn to tools like debuggers to inspect and interact with the running system. Unfortunately, traditional debuggers focus on the runtime stack as the key abstraction to support debugging operations, though the questions developers pose often have more to do with objects and their interactions. We propose object-centric debugging as an alternative approach to interacting with a running software system. We show how, by focusing on objects as the key abstraction, natural debugging operations can be defined to answer developer questions related to runtime behavior. We present a running prototype of an object-centric debugger, and demonstrate, with the help of a series of examples, how object-centric debugging offers more effective support for many typical developer tasks than a traditional stack-oriented debugger.

Human Aspects of Process
Thu, Jun 7, 10:45 - 12:45

Disengagement in Pair Programming: Does It Matter?
Laura Plonka, Helen Sharp, and Janet van der Linden
(Open University, UK)
Pair Programming (PP) requires close collaboration and mutual engagement. Most existing empirical studies of PP do not focus on developers’ behaviour during PP sessions, and focus instead on the effects of PP such as productivity. However, disengagement, where a developer is not focusing on solving the task or understanding the problem and allows their partner to work by themselves, can hinder collaboration between developers and have a negative effect on their performance. This paper reports on an empirical study that investigates disengagement. Twenty-one industrial pair programming sessions were video and audio recorded and qualitatively analysed to investigate circumstances that lead to disengagement. We identified five reasons for disengagement: interruptions during the collaboration, the way the work is divided, the simplicity of the task involved, social pressure on inexperienced pair programmers, and time pressure. Our findings suggest that disengagement is sometimes acceptable and agreed upon between the developers in order to speed up problem solving. However, we also found episodes of disengagement where developers “drop out” of their PP sessions and are not able to follow their partner’s work nor contribute to the task at hand, thus losing the expected benefits of pairing. Analysis of sessions conducted under similar circumstances but where mutual engagement was sustained identified three behaviours that help to maintain engagement: encouraging the novice to drive, verbalisation and feedback, and asking for clarification.

Ambient Awareness of Build Status in Collocated Software Teams
John Downs, Beryl Plimmer, and John G. Hosking
(University of Melbourne, Australia; University of Auckland, New Zealand; Australian National University, Australia)
We describe the evaluation of a build awareness system that assists agile software development teams in understanding current build status and who is responsible for any build breakages. The system uses ambient awareness technologies, providing a separate, easily perceived communication channel distinct from standard team workflow. Multiple system configurations and behaviours were evaluated. An evaluation of the system showed that, while there was no significant change in the proportion of build breakages, the overall number of builds increased substantially, and the duration of broken builds decreased. Team members also reported an increased sense of awareness of, and responsibility for, broken builds, and some noted that the system dramatically changed their perception of the build process, making them more cognisant of broken builds.

What Make Long Term Contributors: Willingness and Opportunity in OSS Community
Minghui Zhou ORCID logo and Audris Mockus
(Peking University, China; Key Laboratory of High Confidence Software Technologies, China; Avaya Labs Research, USA)
To survive and succeed, software projects need to attract and retain contributors. We model the individual's chances to become a valuable contributor through their capacity, willingness, and the opportunity to contribute at the time of joining. Using issue tracking data of Mozilla and Gnome, we find that the probability for a new joiner to become a Long Term Contributor (LTC) is associated with her willingness and environment. Specifically, during their first month, future LTCs tend to be more active and show a more community-oriented attitude than other joiners. Joiners who start by commenting on an issue instead of reporting one, or who succeed in getting at least one reported issue fixed, more than double their odds of becoming an LTC. The macro-climate with high project relative sociality and the micro-climate with a large, productive, and clustered peer group increase the odds. In contrast, the macro-climate with high project popularity and the micro-climate with low attention from peers reduce the odds. This implies that the interaction between an individual's attitude and a project's climate is associated with the odds that an individual would become a valuable contributor or disengage from the project. Our findings may provide a basis for empirical approaches to design a better community architecture and to improve the experience of contributors.

Development of Auxiliary Functions: Should You Be Agile? An Empirical Assessment of Pair Programming and Test-First Programming
Otávio Augusto Lazzarini Lemos, Fabiano Cutigi Ferrari, Fábio Fagundes Silveira, and Alessandro Garcia ORCID logo
(UNIFESP, Brazil; UFSCar, Brazil; PUC-Rio, Brazil)
A considerable part of software systems consists of functions that support the main modules, such as array or string manipulation and basic math computation. These auxiliary functions are usually considered less complex, and thus tend to receive less attention from developers. However, failures in these functions might propagate to more critical modules, thereby affecting the system's overall reliability. Given the complementary role of auxiliary functions, a question that arises is whether agile practices, such as pair programming and test-first programming, can improve their correctness without affecting time-to-market. This paper presents an empirical assessment comparing the application of these agile practices with more traditional approaches. Our study comprises independent experiments of pair versus solo programming, and test-first versus test-last programming. The first study involved 85 novice programmers who applied both traditional and agile approaches in the development of six auxiliary functions within three different domains. Our results suggest that the agile practices might bring benefits in this context. In particular, pair programmers delivered correct implementations much more often, and test-first programming encouraged the production of larger and higher-coverage test sets. On the downside, the main experiment showed that both practices significantly increase total development time. A replication of the test-first experiment with professional developers shows similar results.

Models
Thu, Jun 7, 10:45 - 12:45

Maintaining Invariant Traceability through Bidirectional Transformations
Yijun Yu, Yu Lin, Zhenjiang Hu, Soichiro Hidaka, Hiroyuki Kato, and Lionel Montrieux
(Open University, UK; University of Illinois at Urbana-Champaign, USA; National Institute of Informatics, Japan)
Following the "convention over configuration" paradigm, model-driven development (MDD) generates code to implement the "default" behaviour that has been specified by a template separate from the input model, reducing the decision effort of developers. For flexibility, users of MDD are allowed to customise the model and the generated code in parallel. Changes to the model or the code are synchronised by reflecting them on the other side of the code generation, as long as the traceability is unchanged. However, such invariant traceability between corresponding model and code elements can be violated either when (a) users of MDD protect custom changes from the generated code, or when (b) developers of MDD change the template for generating the default behaviour. A mismatch between user and template code is inevitable as they evolve for their own purposes. In this paper, we propose a two-layered invariant traceability framework that reduces the number of mismatches through bidirectional transformations. On top of existing vertical (model<->code) synchronisations between a model and the template code, a horizontal (code<->code) synchronisation between user and template code is supported, aligning the changes in both directions. Our blinkit tool is evaluated using the data set available from the CVS repositories of an MDD project: Eclipse MDT/GMF.

Slicing MATLAB Simulink Models
Robert Reicherdt and Sabine Glesner
(TU Berlin, Germany)
MATLAB Simulink is the most widely used industrial tool for developing complex embedded systems in the automotive sector. The resulting Simulink models often consist of more than ten thousand blocks and a large number of hierarchy levels. To ensure the quality of such models, automated static analyses and slicing are necessary to cope with this complexity. In particular, static analyses are required that operate directly on the models. In this article, we present an approach for slicing Simulink models using dependence graphs and demonstrate its efficiency using case studies from the automotive and avionics domains. With slicing, the complexity of a model can be reduced for a given point of interest by removing unrelated model elements, thus paving the way for subsequent static quality assurance methods.
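The slicing step can be pictured as reachability over a block-level dependence graph, as in the small sketch below. Simulink-specific details such as signal lines, virtual subsystems, and control dependences are deliberately left out, and the block names are made up.

```python
# Minimal backward-slicing sketch over a block-level dependence graph.
# Edges point from a block to the blocks it depends on; the slice for a point
# of interest is everything reachable along those edges.
from collections import deque

def backward_slice(deps, criterion):
    """deps: dict block -> list of blocks it depends on; criterion: block of interest."""
    slice_set, worklist = {criterion}, deque([criterion])
    while worklist:
        block = worklist.popleft()
        for dep in deps.get(block, []):
            if dep not in slice_set:
                slice_set.add(dep)
                worklist.append(dep)
    return slice_set

deps = {"Out": ["Gain"], "Gain": ["Sum"], "Sum": ["In1", "In2"], "Scope": ["In2"]}
print(sorted(backward_slice(deps, "Out")))  # ['Gain', 'In1', 'In2', 'Out', 'Sum']
```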

Partial Evaluation of Model Transformations
Ali Razavi and Kostas Kontogiannis
(University of Waterloo, Canada; National Technical University of Athens, Greece)
Model Transformation is considered an important enabling factor for Model Driven Development. Transformations can be applied not only for the generation of new models from existing ones, but also for the consistent co-evolution of software artifacts that pertain to various phases of the software lifecycle, such as requirement models, design documents, and source code. Furthermore, it is common in practical scenarios to apply such transformations repeatedly and frequently, an activity that can take a significant amount of time and resources, especially when the affected models are complex and highly interdependent. In this paper, we discuss a novel approach for deriving incremental model transformations by the partial evaluation of original model transformation programs. Partial evaluation involves pre-computing parts of the transformation program based on known model dependencies and the type of the applied model change. Such pre-evaluation allows for significant reduction of transformation time in large and complex model repositories. To evaluate the approach, we have implemented QvtMix, a prototype partial evaluator for the Query, View and Transformation Operational Mappings (QVT-OM) language. The experiments indicate that the proposed technique can be used for significantly improving the performance of repetitive applications of model transformations.

Partial Models: Towards Modeling and Reasoning with Uncertainty
Michalis Famelis, Rick Salay, and Marsha Chechik ORCID logo
(University of Toronto, Canada)
Models are good at expressing information about software but not as good at expressing modelers' uncertainty about it. The highly incremental and iterative nature of software development nonetheless requires the ability to express uncertainty and reason with models containing it. In this paper, we build on our earlier work on expressing uncertainty using partial models, by elaborating an approach to reasoning with such models. We evaluate our approach by experimentally comparing it to traditional strategies for dealing with uncertainty as well as by conducting a case study using open source software. We conclude that we are able to reap the benefits of well-managed uncertainty while incurring minimal additional cost.

Concurrency and Exceptions
Thu, Jun 7, 16:00 - 17:30

Static Detection of Resource Contention Problems in Server-Side Scripts
Yunhui Zheng and Xiangyu ZhangORCID logo
(Purdue University, USA)
With modern multi-core architectures, web applications are usually configured to serve multiple requests simultaneously by spawning multiple instances. These instances may access the same external resources such as database tables and files. Such contentions may become severe during peak time, leading to violations of the atomicity of business logic. In this paper, we propose a novel static analysis that detects atomicity violations of external operations for server-side scripts. The analysis differs from traditional atomicity violation detection techniques by focusing on external resources instead of shared memory. It consists of three components. The first one is an interprocedural and path-sensitive resource identity analysis that determines whether multiple operations access the same external resource, which is critical to identifying contentions. The second component infers pairs of external operations that should be executed atomically. Finally, violations are detected by reasoning about serializability of interleaved atomic pairs. Experimental results show that the analysis is highly effective in detecting atomicity violations in real-world web apps.

Amplifying Tests to Validate Exception Handling Code
Pingyu Zhang and Sebastian Elbaum
(University of Nebraska-Lincoln, USA)
Validating code handling exceptional behavior is difficult, particularly when dealing with external resources that may be noisy and unreliable, as it requires: 1) the systematic exploration of the space of exceptions that may be thrown by the external resources, and 2) the setup of the context to trigger specific patterns of exceptions. In this work we present an approach that addresses those difficulties by performing an exhaustive amplification of the space of exceptional behavior associated with an external resource that is exercised by a test suite. Each amplification attempts to expose a program exception handling construct to new behavior by mocking an external resource so that it returns normally or throws an exception following a predefined pattern. Our assessment of the approach indicates that it can be fully automated, is powerful enough to detect 65% of the faults of this kind reported in bug reports, and is precise enough that 77% of the detected anomalies correspond to faults fixed by the developers.
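The following Python fragment illustrates the amplification idea under simplified assumptions: a hand-written stub stands in for the mocked external resource and replays a predefined pattern of normal returns and injected exceptions, so the recovery path of the code under test is exercised for every pattern. The class and function names are hypothetical.

```python
# Sketch of test amplification for exception handling: run the same test against
# every predefined pattern of normal returns ("ok") and injected failures ("raise").
class FlakyResource:
    """Stub external resource that replays a pattern such as ['ok', 'raise']."""
    def __init__(self, pattern):
        self.pattern, self.calls = pattern, 0

    def read(self):
        action = self.pattern[self.calls % len(self.pattern)]
        self.calls += 1
        if action == "raise":
            raise IOError("injected failure")
        return "data"

def copy_twice(resource, sink):
    # Code under test: the exception handling we want to exercise.
    for _ in range(2):
        try:
            sink.append(resource.read())
        except IOError:
            sink.append(None)   # recovery path the amplified tests cover

for pattern in (["ok", "ok"], ["raise", "ok"], ["ok", "raise"], ["raise", "raise"]):
    sink = []
    copy_twice(FlakyResource(pattern), sink)
    print(pattern, "->", sink)
```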

MagicFuzzer: Scalable Deadlock Detection for Large-Scale Applications
Yan Cai and W. K. Chan
(City University of Hong Kong, China)
We present MagicFuzzer, a novel dynamic deadlock detection technique. Unlike existing techniques to locate potential deadlock cycles from an execution, it iteratively prunes lock dependencies that have no incoming or outgoing edges. Combined with a novel thread-specific strategy, it dramatically shrinks the size of the lock dependency set for cycle detection, significantly improving the efficiency and scalability of the detection. In the real deadlock confirmation phase, it uses a new strategy to actively schedule threads of an execution against the whole set of potential deadlock cycles. We have implemented a prototype and evaluated it on large-scale C/C++ programs. The experimental results confirm that our technique is significantly more effective and efficient than existing techniques.
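The pruning step can be sketched on a lock-order graph, where an edge A -> B records that some thread acquired lock B while holding lock A: nodes with no incoming or no outgoing edge cannot lie on a cycle, so they are removed iteratively. This is only an approximation of MagicFuzzer's lock dependency pruning, ignoring its thread-specific strategy and the confirmation phase.

```python
# Iteratively remove lock-graph nodes that cannot be on a cycle; whatever
# survives is a candidate deadlock region to examine for cycles.
def prune_lock_graph(edges):
    edges = set(edges)
    changed = True
    while changed:
        changed = False
        nodes = {a for a, _ in edges} | {b for _, b in edges}
        sources = {a for a, _ in edges}
        targets = {b for _, b in edges}
        removable = {n for n in nodes if n not in sources or n not in targets}
        if removable:
            edges = {(a, b) for a, b in edges if a not in removable and b not in removable}
            changed = True
    return edges  # empty set means no potential deadlock cycle

# L1 and L2 are acquired in both orders (potential deadlock); L3 is pruned away.
print(prune_lock_graph([("L1", "L2"), ("L2", "L1"), ("L2", "L3")]))
# only the two edges of the L1/L2 cycle survive
```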

Software Architecture
Thu, Jun 7, 16:00 - 17:30

Does Organizing Security Patterns Focus Architectural Choices?
Koen Yskout, Riccardo Scandariato, and Wouter Joosen
(KU Leuven, Belgium)
Security patterns can be a valuable vehicle to design secure software. Several proposals have been advanced to improve the usability of security patterns. They often describe extra annotations to be included in the pattern documentation. This paper presents an empirical study that validates whether those proposals provide any real benefit for software architects. A controlled experiment has been executed with 90 master students, who have performed several design tasks involving the hardening of a software architecture via security patterns. The results show that annotations produce benefits in terms of a reduced number of alternatives that need to be considered during the selection of a suitable pattern. However, they do not reduce the time spent in the selection process.

Enhancing Architecture-Implementation Conformance with Change Management and Support for Behavioral Mapping
Yongjie Zheng and Richard N. Taylor
(UC Irvine, USA)
It is essential for software architecture to be consistent with implementation during software development. Existing architecture-implementation mapping approaches are insufficient for a variety of reasons, including lack of support for change management and for mapping behavioral architecture specifications. A new approach called 1.x-way architecture-implementation mapping is presented in this paper to address these issues. Its contributions include deep separation of generated and non-generated code, an architecture change model, architecture-based code regeneration, and architecture change notification. The approach is implemented in ArchStudio 4, an Eclipse-based architecture development environment. To evaluate its utility, we refactored the code of ArchStudio, and replayed changes that had been made to ArchStudio in two research projects by redoing them with the developed tool.

A Tactic-Centric Approach for Automating Traceability of Quality Concerns
Mehdi Mirakhorli, Yonghee Shin, Jane Cleland-Huang, and Murat Cinar
(DePaul University, USA)
The software architectures of business, mission, or safety critical systems must be carefully designed to balance an exacting set of quality concerns describing characteristics such as security, reliability, and performance. Unfortunately, software architectures tend to degrade over time as maintainers modify the system without understanding the underlying architectural decisions. Although this problem can be mitigated by manually tracing architectural decisions into the code, the cost and effort required to do this can be prohibitively expensive. In this paper we therefore present a novel approach for automating the construction of traceability links for architectural tactics. Our approach utilizes machine learning methods and lightweight structural analysis to detect tactic-related classes. The detected tactic-related classes are then mapped to a Tactic Traceability Information Model. We train our trace algorithm using code extracted from fifteen performance-centric and safety-critical open source software systems and then evaluate it against the Apache Hadoop framework. Our results show that automatically generated traceability links can support software maintenance activities while preserving architectural qualities.

Formal Verification
Thu, Jun 7, 16:00 - 17:30

Build Code Analysis with Symbolic Evaluation
Ahmed Tamrawi, Hoan Anh Nguyen, Hung Viet Nguyen, and Tien N. Nguyen
(Iowa State University, USA)
The build process is crucial in software development. However, the analysis support for build code is still limited. In this paper, we present SYMake, an infrastructure and tool for the analysis of build code in make. Due to the dynamic nature of the make language, it is challenging to understand and maintain complex Makefiles. SYMake provides a symbolic evaluation algorithm that processes Makefiles and produces a symbolic dependency graph (SDG), which represents the build dependencies (i.e., rules) among files via commands. During the symbolic evaluation, for each resulting string value in an SDG that represents a part of a file name or a command in a rule, SYMake also provides an acyclic graph (called a T-model) to represent its symbolic evaluation trace. We have used SYMake to develop algorithms and a tool 1) to detect several types of code smells and errors in Makefiles, and 2) to support build code refactoring, e.g., renaming a variable/target even if its name is fragmented and built from multiple substrings. Our empirical evaluation of SYMake's renaming support on several real-world systems showed high accuracy in entity renaming. Our controlled experiment showed that with SYMake, developers were able to understand Makefiles better and to detect more code smells as well as to perform refactoring more accurately.

An Automated Approach to Generating Efficient Constraint Solvers
Dharini Balasubramaniam, Christopher Jefferson, Lars Kotthoff, Ian Miguel, and Peter Nightingale
(University of St. Andrews, UK)
Combinatorial problems appear in numerous settings, from timetabling to industrial design. Constraint solving aims to find solutions to such problems efficiently and automatically. Current constraint solvers are monolithic in design, accepting a broad range of problems. The cost of this convenience is a complex architecture, inhibiting efficiency, extensibility and scalability. Solver components are also tightly coupled with complex restrictions on their configuration, making automated generation of solvers difficult.
We describe a novel, automated, model-driven approach to generating efficient solvers tailored to individual problems and present some results from applying the approach. The main contribution of this work is a solver generation framework called Dominion, which analyses a problem and, based on its characteristics, generates a solver using components chosen from a library. The key benefit of this approach is the ability to solve larger and more difficult problems as a result of applying finer-grained optimisations and using specialised techniques as required.

Simulation-Based Abstractions for Software Product-Line Model Checking
Maxime Cordy, Andreas Classen, Gilles Perrouin, Pierre-Yves Schobbens, Patrick Heymans, and Axel Legay
(University of Namur, Belgium; INRIA, France; LIFL–CNRS, France; IRISA, France; Aalborg University, Denmark; University of Liège, Belgium)
Software Product Line (SPL) engineering is a software engineering paradigm that exploits the commonality between similar software products to reduce life cycle costs and time-to-market. Many SPLs are critical and would benefit from efficient verification through model checking. Model checking SPLs is more difficult than for single systems, since the number of different products is potentially huge. In previous work, we introduced Featured Transition Systems (FTS), a formal, compact representation of SPL behaviour, and provided efficient algorithms to verify FTS. Yet, we still face the state explosion problem, like any model checking-based verification. Model abstraction is the most relevant answer to state explosion. In this paper, we define a novel simulation relation for FTS and provide an algorithm to compute it. We extend well-known simulation preservation properties to FTS and thus lay the theoretical foundations for abstraction-based model checking of SPLs. We evaluate our approach by comparing the cost of FTS-based simulation and abstraction with respect to product-by-product methods. Our results show that FTS are a solid foundation for simulation-based model checking of SPL.

Invariant Generation
Fri, Jun 8, 08:45 - 10:15

Using Dynamic Analysis to Discover Polynomial and Array Invariants
ThanhVu Nguyen, Deepak Kapur, Westley Weimer, and Stephanie Forrest
(University of New Mexico, USA; University of Virginia, USA)
Dynamic invariant analysis identifies likely properties over variables from observed program traces. These properties can aid programmers in refactoring, documenting, and debugging tasks by making dynamic patterns visible statically. Two useful forms of invariants involve relations among polynomials over program variables and relations among array variables. Current dynamic analysis methods support such invariants in only very limited forms. We combine mathematical techniques that have not previously been applied to this problem, namely equation solving, polyhedra construction, and SMT solving, to bring new capabilities to dynamic invariant detection. Using these methods, we show how to find equalities and inequalities among nonlinear polynomials over program variables, and linear relations among array variables of multiple dimensions. Preliminary experiments on 24 mathematical algorithms and an implementation of AES encryption provide evidence that the approach is effective at finding these invariants.
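As a much-reduced illustration of the equation-solving flavor of this approach, the sketch below fits a single template z = a*x*y + b from two observed program states and then checks the candidate relation against the rest of the trace. The paper handles general polynomial and array templates via equation solving, polyhedra construction, and SMT solving; the trace values here are invented.

```python
# Tiny sketch of trace-based invariant inference for one fixed template:
# fit z = a*x*y + b from two observations, then verify on every other one.
from fractions import Fraction

def fit_and_check(states):
    """states: list of (x, y, z) observations; returns (a, b) if z == a*x*y + b
    holds on every observation, else None."""
    (x1, y1, z1), (x2, y2, z2) = states[0], states[1]
    t1, t2 = Fraction(x1 * y1), Fraction(x2 * y2)
    if t1 == t2:
        return None                      # two samples cannot pin down the line
    a = Fraction(z1 - z2) / (t1 - t2)
    b = Fraction(z1) - a * t1
    if all(Fraction(z) == a * x * y + b for x, y, z in states):
        return a, b
    return None

# States observed at the head of a multiplication loop: z tracks x*y exactly.
trace = [(2, 3, 6), (4, 5, 20), (7, 2, 14), (1, 9, 9)]
print(fit_and_check(trace))              # (Fraction(1, 1), Fraction(0, 1))
```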

Metadata Invariants: Checking and Inferring Metadata Coding Conventions
Myoungkyu Song and Eli Tilevich
(Virginia Tech, USA)
As the prevailing programming model of enterprise applications is becoming more declarative, programmers are spending an increasing amount of their time and efforts writing and maintaining metadata, such as XML or annotations. Although metadata is a cornerstone of modern software, automatic bug finding tools cannot ensure that metadata maintains its correctness during refactoring and enhancement. To address this shortcoming, this paper presents metadata invariants, a new abstraction that codifies various naming and typing relationships between metadata and the main source code of a program. We reify this abstraction as a domain-specific language. We also introduce algorithms to infer likely metadata invariants and to apply them to check metadata correctness in the presence of program evolution. We demonstrate how metadata invariant checking can help ensure that metadata remains consistent and correct during program evolution; it finds metadata-related inconsistencies and recommends how they should be corrected. Similar to static bug finding tools, a metadata invariant checker identifies metadata-related bugs as a program is being refactored and enhanced. Because metadata is omnipresent in modern software applications, our approach can help ensure the overall consistency and correctness of software as it evolves.

Generating Obstacle Conditions for Requirements Completeness
Dalal Alrajeh, Jeff Kramer, Axel van Lamsweerde, Alessandra Russo, and Sebastián Uchitel
(Imperial College London, UK; Université Catholique de Louvain, Belgium)
Missing requirements are known to be among the major causes of software failure. They often result from a natural inclination to conceive over-ideal systems where the software-to-be and its environment always behave as expected. Obstacle analysis is a goal-anchored form of risk analysis whereby exceptional conditions that may obstruct system goals are identified, assessed and resolved to produce complete requirements. Various techniques have been proposed for identifying obstacle conditions systematically. Among these, the formal ones have limited applicability or are costly to automate. This paper describes a tool-supported technique for generating a set of obstacle conditions guaranteed to be complete and consistent with respect to the known domain properties. The approach relies on a novel combination of model checking and learning technologies. Obstacles are iteratively learned from counterexample and witness traces produced by model checking against a goal and converted into positive and negative examples, respectively. A comparative evaluation is provided with respect to published results on the manual derivation of obstacles in a real safety-critical system for which failures have been reported.

Regression Testing
Fri, Jun 8, 08:45 - 10:15

make test-zesti: A Symbolic Execution Solution for Improving Regression Testing
Paul Dan Marinescu and Cristian CadarORCID logo
(Imperial College London, UK)
Software testing is an expensive and time consuming process, often involving the manual creation of comprehensive regression test suites. However, current testing methodologies do not take full advantage of these tests. In this paper, we present a technique for amplifying the effect of existing test suites using a lightweight symbolic execution mechanism, which thoroughly checks all sensitive operations (e.g., pointer dereferences) executed by the test suite for errors, and explores additional paths around sensitive operations. We implemented this technique in a prototype system called ZESTI (Zero-Effort Symbolic Test Improvement), and applied it to three open-source code bases—GNU Coreutils, libdwarf and readelf—where it found 52 previously unknown bugs, many of which are out of reach of standard symbolic execution. Our technique works transparently to the tester, requiring no additional human effort or changes to source code or tests.

BALLERINA: Automatic Generation and Clustering of Efficient Random Unit Tests for Multithreaded Code
Adrian Nistor, Qingzhou Luo, Michael Pradel ORCID logo, Thomas R. Gross, and Darko MarinovORCID logo
(University of Illinois at Urbana-Champaign, USA; ETH Zurich, Switzerland)
Testing multithreaded code is hard and expensive. Each multithreaded unit test creates two or more threads, each executing one or more methods on shared objects of the class under test. Such unit tests can be generated at random, but basic generation produces tests that are either slow or do not trigger concurrency bugs. Worse, such tests have many false alarms, which require human effort to filter out. We present BALLERINA, a novel technique for automatic generation of efficient multithreaded random tests that effectively trigger concurrency bugs. BALLERINA makes tests efficient by having only two threads, each executing a single, randomly selected method. BALLERINA increases the chances that such simple parallel code finds bugs by appending it to more complex, randomly generated sequential code. We also propose a clustering technique to reduce the manual effort in inspecting failures of automatically generated multithreaded tests. We evaluate BALLERINA on 14 real-world bugs from 6 popular codebases: Groovy, Java JDK, jFreeChart, Log4j, Lucene, and Pool. The experiments show that tests generated by BALLERINA can find bugs on average 2X-10X faster than various configurations of basic random generation, and our clustering technique reduces the number of inspected failures on average 4X-8X. Using BALLERINA, we found three previously unknown bugs in Apache Pool and Log4j, one of which was already confirmed and fixed.
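The shape of a generated test can be sketched as follows: a random sequential prefix drives a shared object into some state, and then exactly two threads each invoke one randomly chosen method. The class under test and the failure check here are placeholders; BALLERINA targets real Java codebases and additionally clusters the resulting failures.

```python
# Sketch of the test shape: random sequential prefix, then two parallel calls.
import random
import threading

class Counter:
    """Stand-in for the class under test."""
    def __init__(self):
        self.data = {}
    def inc(self, key):
        self.data[key] = self.data.get(key, 0) + 1
    def drop(self, key):
        self.data.pop(key, None)

def random_call(obj, errors):
    try:
        getattr(obj, random.choice(["inc", "drop"]))("key")
    except Exception as exc:                     # an exception here is a candidate bug
        errors.append(exc)

def run_one_random_test(seed):
    random.seed(seed)
    obj, errors = Counter(), []
    for _ in range(random.randint(0, 5)):        # sequential prefix
        random_call(obj, errors)
    threads = [threading.Thread(target=random_call, args=(obj, errors))
               for _ in range(2)]                # exactly two parallel calls
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors

for seed in range(100):
    if run_one_random_test(seed):
        print("suspicious failure with seed", seed)
```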

On-Demand Test Suite Reduction
Dan Hao, Lu Zhang, Xingxia Wu, Hong Mei, and Gregg Rothermel
(Peking University, China; Key Laboratory of High Confidence Software Technologies, China; University of Nebraska, USA)
Most test suite reduction techniques aim to select, from a given test suite, a minimal representative subset of test cases that retains the same code coverage as the suite. Empirical studies have shown, however, that test suites reduced in this manner may lose fault detection capability. Techniques have been proposed to retain certain redundant test cases in the reduced test suite so as to reduce the loss in fault-detection capability, but these still do concede some degree of loss. Thus, these techniques may be applicable only in cases where loose demands are placed on the upper limit of loss in fault-detection capability. In this work we present an on-demand test suite reduction approach, which attempts to select a representative subset satisfying the same test requirements as an initial test suite conceding at most l% loss in fault-detection capability for at least c% of the instances in which it is applied. Our technique collects statistics about loss in fault-detection capability at the level of individual statements and models the problem of test suite reduction as an integer linear programming problem. We have evaluated our approach in the contexts of three scenarios in which it might be used. Our results show that most test suites reduced by our approach satisfy given fault detection capability demands, and that the approach compares favorably with an existing test suite reduction approach.
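For contrast with the on-demand formulation, the classical coverage-preserving baseline can be written as a greedy set-cover heuristic, as sketched below. The paper instead encodes the selection as an integer linear programming problem with statement-level constraints on fault-detection loss, which this sketch does not attempt; the test suite shown is invented.

```python
# Classical greedy coverage-preserving reduction, shown only as a baseline for
# the kind of selection the paper formulates as an integer linear program.
def greedy_reduce(suite):
    """suite: dict test name -> set of covered requirements (e.g., statements)."""
    uncovered = set().union(*suite.values())
    reduced = []
    while uncovered:
        best = max(suite, key=lambda t: len(suite[t] & uncovered))
        if not suite[best] & uncovered:
            break
        reduced.append(best)
        uncovered -= suite[best]
    return reduced

suite = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {4, 5}, "t4": {1, 5}}
print(greedy_reduce(suite))  # ['t1', 't3'] covers all of 1..5
```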

Software Vulnerability
Fri, Jun 8, 08:45 - 10:15

Automated Detection of Client-State Manipulation Vulnerabilities
Anders MøllerORCID logo and Mathias Schwarz
(Aarhus University, Denmark)
Web application programmers must be aware of a wide range of potential security risks. Although the most common pitfalls are well described and categorized in the literature, it remains a challenging task to ensure that all guidelines are followed. For this reason, it is desirable to construct automated tools that can assist the programmers in the application development process by detecting weaknesses. Many vulnerabilities are related to web application code that stores references to application state in the generated HTML documents to work around the statelessness of the HTTP protocol. In this paper, we show that such client-state manipulation vulnerabilities are amenable to tool supported detection. We present a static analysis for the widely used frameworks Java Servlets, JSP, and Struts. Given a web application archive as input, the analysis identifies occurrences of client state and infers the information flow between the client state and the shared application state on the server. This makes it possible to check how client-state manipulation performed by malicious users may affect the shared application state and cause leakage or modifications of sensitive information. The warnings produced by the tool help the application programmer identify vulnerabilities. Moreover, the inferred information can be applied to configure a security filter that automatically guards against attacks. Experiments on a collection of open source web applications indicate that the static analysis is able to effectively help the programmer prevent client-state manipulation vulnerabilities.

Understanding Integer Overflow in C/C++
Will DietzORCID logo, Peng Li, John Regehr ORCID logo, and Vikram AdveORCID logo
(University of Illinois at Urbana-Champaign, USA; University of Utah, USA)
Integer overflow bugs in C and C++ programs are difficult to track down and may lead to fatal errors or exploitable vulnerabilities. Although a number of tools for finding these bugs exist, the situation is complicated because not all overflows are bugs. Better tools need to be constructed---but a thorough understanding of the issues behind these errors does not yet exist. We developed IOC, a dynamic checking tool for integer overflows, and used it to conduct the first detailed empirical study of the prevalence and patterns of occurrence of integer overflows in C and C++ code. Our results show that intentional uses of wraparound behaviors are more common than is widely believed; for example, there are over 200 distinct locations in the SPEC CINT2000 benchmarks where overflow occurs. Although many overflows are intentional, a large number of accidental overflows also occur. Orthogonal to programmers' intent, overflows are found in both well-defined and undefined flavors. Applications executing undefined operations can be, and have been, broken by improvements in compiler optimizations. Looking beyond SPEC, we found and reported undefined integer overflows in SQLite, PostgreSQL, SafeInt, GNU MPC and GMP, Firefox, GCC, LLVM, Python, BIND, and OpenSSL; many of these have since been fixed. Our results show that integer overflow issues in C and C++ are subtle and complex, that they are common even in mature, widely used programs, and that they are widely misunderstood by developers.
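IOC instruments C/C++ programs at compile time; purely as an illustration of the kind of dynamic check it inserts, the Python fragment below flags a signed 32-bit addition that would wrap on a two's-complement machine (where signed overflow is undefined behaviour in C). The function name and the reporting format are invented.

```python
# Illustration only: detect and report wraparound of signed 32-bit addition.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def checked_add_i32(a, b):
    result = a + b                      # exact, since Python ints are unbounded
    if not INT32_MIN <= result <= INT32_MAX:
        print(f"overflow: {a} + {b} wraps in 32-bit signed arithmetic")
        # wrap the way a two's-complement machine would
        result = (result + 2**31) % 2**32 - 2**31
    return result

print(checked_add_i32(2**31 - 1, 1))   # reports overflow, returns -2147483648
```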

A Large Scale Exploratory Analysis of Software Vulnerability Life Cycles
Muhammad Shahzad, Muhammad Zubair Shafiq, and Alex X. Liu
(Michigan State University, USA)
Software systems inherently contain vulnerabilities that have been exploited in the past, resulting in significant revenue losses. The study of vulnerability life cycles can help in the development, deployment, and maintenance of software systems. It can also help in designing future security policies and conducting audits of past incidents. Furthermore, such an analysis can help customers to assess the security risks associated with software products of different vendors. In this paper, we conduct an exploratory measurement study of a large software vulnerability data set containing 46,310 vulnerabilities disclosed from 1988 to 2011. We investigate vulnerabilities along the following seven dimensions: (1) phases in the life cycle of vulnerabilities, (2) evolution of vulnerabilities over the years, (3) functionality of vulnerabilities, (4) access requirement for exploitation of vulnerabilities, (5) risk level of vulnerabilities, (6) software vendors, and (7) software products. Our exploratory analysis uncovers several statistically significant findings that have important implications for software development and deployment.

API Learning
Fri, Jun 8, 10:45 - 12:45

Synthesizing API Usage Examples
Raymond P. L. Buse and Westley Weimer
(University of Virginia, USA)
Key program interfaces are sometimes documented with usage examples: concrete code snippets that characterize common use cases for a particular data type. While such documentation is known to be of great utility, it is burdensome to create and can be incomplete, out of date, or not representative of actual practice. We present an automatic technique for mining and synthesizing succinct and representative human-readable documentation of program interfaces. Our algorithm is based on a combination of path-sensitive dataflow analysis, clustering, and pattern abstraction. It produces output in the form of well-typed program snippets which document initialization, method calls, assignments, looping constructs, and exception handling. In a human study involving over 150 participants, 82% of our generated examples were found to be at least as good as human-written instances, and 94% were strictly preferred to state-of-the-art code search.

Semi-automatically Extracting FAQs to Improve Accessibility of Software Development Knowledge
Stefan Henß, Martin Monperrus, and Mira Mezini
(TU Darmstadt, Germany; University of Lille, France; INRIA, France)


Temporal Analysis of API Usage Concepts
Gias Uddin, Barthélémy Dagenais, and Martin P. RobillardORCID logo
(McGill University, Canada)
Software reuse through Application Programming Interfaces (APIs) is an integral part of software development. The functionality offered by an API is not always accessed uniformly throughout the lifetime of a client program. We propose Temporal API Usage Pattern Mining to detect API usage patterns in terms of their time of introduction into client programs. We detect concepts as distinct groups of API functionality from the change history of a client program. We locate those concepts in the client change history and detect temporal usage patterns, where a pattern contains a set of concepts that were added into the client program in a specific temporal order. We investigated the properties of temporal API usage patterns through a multiple-case study of three APIs and their use in up to 19 client software projects. Our technique was able to detect a number of valuable patterns in two out of three of the APIs investigated. Further investigation showed some patterns to be relatively consistent between clients, produced by multiple developers, and not trivially derivable from program structure or API documentation.

Inferring Method Specifications from Natural Language API Descriptions
Rahul Pandita, Xusheng Xiao, Hao Zhong, Tao Xie, Stephen Oney, and Amit Paradkar
(North Carolina State University, USA; Chinese Academy of Sciences, China; CMU, USA; IBM Research, USA)
Application Programming Interface (API) documents are a typical way of describing legal usage of reusable software libraries, thus facilitating software reuse. However, even with such documents, developers often overlook some documents and build software systems that are inconsistent with the legal usage of those libraries. Existing software verification tools require formal specifications (such as code contracts), and therefore cannot directly verify the legal usage described in natural language text of API documents against the code using that library. However, in practice, most libraries do not come with formal specifications, thus hindering tool-based verification. To address this issue, we propose a novel approach to infer formal specifications from natural language text of API documents. Our evaluation results show that our approach achieves an average of 92% precision and 93% recall in identifying sentences that describe code contracts from more than 2500 sentences of API documents. Furthermore, our results show that our approach has an average 83% accuracy in inferring specifications from over 1600 sentences describing code contracts.

Code Recommenders
Fri, Jun 8, 10:45 - 12:45

Automatic Parameter Recommendation for Practical API Usage
Cheng Zhang, Juyuan Yang, Yi Zhang, Jing Fan, Xin Zhang, Jianjun Zhao, and Peizhao Ou
(Shanghai Jiao Tong University, China)
Programmers extensively use application programming interfaces (APIs) to leverage existing libraries and frameworks. However, correctly and efficiently choosing and using APIs from unfamiliar libraries and frameworks is still a non-trivial task. Programmers often need to pore over API documentation (which is often incomplete) or inspect code examples (which are often absent) to learn API usage patterns. Recently, various techniques have been proposed to alleviate this problem by creating API summarizations, mining code examples, or showing common API call sequences. However, few techniques focus on recommending API parameters.
In this paper, we propose an automated technique, called Precise, to address this problem. Differing from common code completion systems, Precise mines existing code bases, uses an abstract usage instance representation for each API usage example, and then builds a parameter usage database. Upon a request, Precise queries the database for abstract usage instances in similar contexts and generates parameter candidates by concretizing the instances adaptively.
The experimental results show that our technique is more general and applicable than existing code completion systems; specifically, 64% of the parameter recommendations are useful and 53% are exactly the same as the actual parameters needed. We have also performed a user study showing that our technique is useful in practice.
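A deliberately simplified view of the mining step is sketched below: count which argument expressions appear at each (method, position) across existing call sites and suggest the most frequent ones. Precise additionally abstracts the surrounding usage context and concretizes candidates adaptively, which this sketch omits; the corpus shown is invented.

```python
# Frequency-based parameter recommendation over mined call sites (simplified).
from collections import Counter, defaultdict

def mine(call_sites):
    """call_sites: list of (method_name, tuple_of_argument_texts)."""
    db = defaultdict(Counter)
    for method, args in call_sites:
        for position, arg in enumerate(args):
            db[(method, position)][arg] += 1
    return db

def recommend(db, method, position, k=3):
    return [arg for arg, _ in db[(method, position)].most_common(k)]

corpus = [("getBytes", ('"UTF-8"',)), ("getBytes", ('"UTF-8"',)),
          ("getBytes", ('"ISO-8859-1"',)), ("substring", ("0", "len"))]
db = mine(corpus)
print(recommend(db, "getBytes", 0))  # ['"UTF-8"', '"ISO-8859-1"']
```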

On the Naturalness of Software
Abram Hindle, Earl T. Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu
(UC Davis, USA; University of Texas at Dallas, USA)
Natural languages like English are rich, complex, and powerful. The highly creative and graceful use of languages like English and Tamil, by masters like Shakespeare and Avvaiyar, can certainly delight and inspire. But in practice, given cognitive constraints and the exigencies of daily life, most human utterances are far simpler and much more repetitive and predictable. In fact, these utterances can be very usefully modeled using modern statistical methods. This fact has led to the phenomenal success of statistical approaches to speech recognition, natural language translation, question-answering, and text mining and comprehension. We begin with the conjecture that most software is also natural, in the sense that it is created by humans at work, with all the attendant constraints and limitations---and thus, like natural language, it is also likely to be repetitive and predictable. We then proceed to ask whether a) code can be usefully modeled by statistical language models and b) such models can be leveraged to support software engineers. Using the widely adopted n-gram model, we provide empirical evidence supportive of a positive answer to both these questions. We show that code is also very repetitive, and in fact even more so than natural languages. As an example use of the model, we have developed a simple code completion engine for Java that, despite its simplicity, already improves Eclipse's completion capability. We conclude the paper by laying out a vision for future research in this area.
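In the spirit of the paper's n-gram experiments, the following sketch trains a trigram model over token sequences and proposes the most likely next tokens for a two-token context; a real experiment would add proper lexing, a large corpus, and smoothing, all of which are omitted here.

```python
# Minimal trigram language model over code tokens: count continuations of each
# two-token context and suggest the most frequent ones.
from collections import Counter, defaultdict

def train_trigrams(token_sequences):
    model = defaultdict(Counter)
    for tokens in token_sequences:
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            model[(a, b)][c] += 1
    return model

def suggest(model, a, b, k=3):
    return [tok for tok, _ in model[(a, b)].most_common(k)]

corpus = [["for", "(", "int", "i", "=", "0", ";"],
          ["for", "(", "int", "j", "=", "0", ";"],
          ["for", "(", "String", "s", ":", "items", ")"]]
model = train_trigrams(corpus)
print(suggest(model, "for", "("))  # ['int', 'String']
```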

Recommending Source Code for Use in Rapid Software Prototypes
Collin McMillan, Negar Hariri, Denys PoshyvanykORCID logo, Jane Cleland-Huang, and Bamshad Mobasher
(College of William and Mary, USA; DePaul University, USA)
Rapid prototypes are often developed early in the software development process in order to help project stakeholders explore ideas for possible features, and to discover, analyze, and specify requirements for the project. As prototypes are typically thrown away following the initial analysis phase, it is imperative for them to be created quickly with little cost and effort. Tool support for finding and reusing components from open-source repositories offers a major opportunity to reduce this manual effort. In this paper, we present a system for rapid prototyping that facilitates software reuse by mining feature descriptions and source code from open-source repositories. Our system identifies and recommends features and associated source code modules that are relevant to the software product under development. The modules are selected such that they implement as many of the desired features as possible while exhibiting the lowest possible levels of external coupling. We conducted a user study to evaluate our approach, and the results indicated that it returned packages that implemented more features and were considered more relevant than the state-of-the-art approach.

Active Code Completion
Cyrus Omar, YoungSeok Yoon, Thomas D. LaToza, and Brad A. MyersORCID logo
(CMU, USA)
Code completion menus have replaced standalone API browsers for most developers because they are more tightly integrated into the development workflow. Refinements to the code completion menu that incorporate additional sources of information have similarly been shown to be valuable, even relative to standalone counterparts offering similar functionality. In this paper, we describe active code completion, an architecture that allows library developers to introduce interactive and highly-specialized code generation interfaces, called palettes, directly into the editor. Using several empirical methods, we examine the contexts in which such a system could be useful, describe the design constraints governing the system architecture as well as particular code completion interfaces, and design one such system, named Graphite, for the Eclipse Java development environment. Using Graphite, we implement a palette for writing regular expressions as our primary example and conduct a small pilot study. In addition to showing the feasibility of this approach, it provides further evidence in support of the claim that integrating specialized code completion interfaces directly into the editor is valuable to professional developers.

Test Automation
Fri, Jun 8, 10:45 - 12:45

Automated Oracle Creation Support, or: How I Learned to Stop Worrying about Fault Propagation and Love Mutation Testing
Matt Staats, Gregory Gay ORCID logo, and Mats P. E. Heimdahl
(KAIST, South Korea; University of Minnesota, USA)
In testing, the test oracle is the artifact that determines whether an application under test executes correctly. The choice of test oracle can significantly impact the effectiveness of the testing process. However, despite the prevalence of tools that support the selection of test inputs, little work exists for supporting oracle creation. In this work, we propose a method of supporting test oracle creation. This method automatically selects the oracle data (the set of variables monitored during testing) for expected value test oracles. This approach is based on the use of mutation analysis to rank variables in terms of fault-finding effectiveness, thus automating the selection of the oracle data. Experiments over four industrial examples demonstrate that our method may be a cost-effective approach for producing small, effective oracle data, with fault-finding improvements of up to 145.8% over current industrial best practice.

Automating Test Automation
Suresh Thummalapenta, Saurabh Sinha, Nimit Singhania, and Satish Chandra
(IBM Research, India; IBM Research, USA)
Mention a test case, and it conjures up the image of a script or a program that exercises a system under test. In industrial practice, however, test cases often start out as steps described in natural language. These are essentially directions a human tester needs to follow to interact with an application, exercising a given scenario. Since tests need to be executed repeatedly, such manual tests then have to go through test automation to create scripts or programs out of them. Test automation can be expensive in programmer time. We describe a technique to automate test automation. The input to our technique is a sequence of steps written in natural language, and the output is a sequence of procedure calls with accompanying parameters that can drive the application without human intervention. The technique is based on looking at the natural language test steps as consisting of segments that describe actions on targets, except that there can be ambiguity in the action itself, in the order in which segments occur, and in the specification of the target of the action. The technique resolves this ambiguity by backtracking, until it can synthesize a successful sequence of calls. We present an evaluation of our technique on professionally created manual test cases for two open-source web applications as well as a proprietary enterprise application. Our technique could automate over 82% of the steps contained in these test cases with no human intervention, indicating that the technique can reduce the cost of test automation quite effectively.
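The backtracking search can be sketched as follows: each natural-language step yields several candidate (action, target) calls, and the search commits to a candidate only if the remaining steps can still be driven successfully, otherwise it tries the next candidate. The candidate extraction and the application model below are stubs standing in for the real system, and the screen and widget names are invented.

```python
# Backtracking over ambiguous interpretations of natural-language test steps.
def interpret(steps, candidates_for, try_call, state):
    """Return one successful (action, target) call per step, or None."""
    if not steps:
        return []
    for call in candidates_for(steps[0]):
        new_state = try_call(state, call)
        if new_state is not None:                 # this interpretation executes
            rest = interpret(steps[1:], candidates_for, try_call, new_state)
            if rest is not None:
                return [call] + rest
    return None                                   # no interpretation works: backtrack

# Hypothetical application model: which widgets exist on each screen.
screens = {"login": {"user", "password", "submit"}, "home": {"logout"}}

def candidates_for(step):
    words = step.lower().split()
    action = "type" if "type" in words else "click"
    return [(action, w) for w in words if w not in ("click", "type")]

def try_call(screen, call):
    action, target = call
    if target not in screens[screen]:
        return None                               # target not on this screen
    return "home" if (action, target) == ("click", "submit") else screen

print(interpret(["type user", "click the submit button", "click logout"],
                candidates_for, try_call, "login"))
# [('type', 'user'), ('click', 'submit'), ('click', 'logout')]
```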

Stride: Search-Based Deterministic Replay in Polynomial Time via Bounded Linkage
Jinguo Zhou, Xiao Xiao, and Charles ZhangORCID logo
(Hong Kong University of Science and Technology, China)
Deterministic replay remains one of the most effective ways to comprehend concurrent bugs. Existing approaches either maintain the exact shared read-write linkages with a large runtime overhead or use exponential off-line algorithms to search for a feasible interleaved execution. In this paper, we propose Stride, a hybrid solution that records the bounded shared memory access linkages at runtime and infers an equivalent interleaving in polynomial time, under the sequential consistency assumption. The recording scheme eliminates the need for synchronizing the shared read operations, which results in a significant overhead reduction. Compared to the previous state-of-the-art approach to deterministic replay, Stride reduces runtime overhead by 2.5 times on average and produces logs that are, on average, 3.88 times smaller.

iTree: Efficiently Discovering High-Coverage Configurations Using Interaction Trees
Charles Song, Adam Porter, and Jeffrey S. Foster
(University of Maryland, USA)
Software configurability has many benefits, but it also makes programs much harder to test, as in the worst case the program must be tested under every possible configuration. One potential remedy to this problem is combinatorial interaction testing (CIT), in which typically the developer selects a strength t and then computes a covering array containing all t-way configuration option combinations. However, in a prior study we showed that several programs have important high-strength interactions (combinations of a subset of configuration options) that CIT is highly unlikely to generate in practice. In this paper, we propose a new algorithm called interaction tree discovery (iTree) that aims to identify sets of configurations to test that are smaller than those generated by CIT, while also including important high-strength interactions missed by practical applications of CIT. On each iteration of iTree, we first use low-strength CIT to test the program under a set of configurations, and then apply machine learning techniques to discover new interactions that are potentially responsible for any new coverage seen. By repeating this process, iTree builds up a set of configurations likely to contain key high-strength interactions. We evaluated iTree by comparing the coverage it achieves versus covering arrays and randomly generated configuration sets. Our results strongly suggest that iTree can identify high-coverage sets of configurations more effectively than traditional CIT or random sampling.

Validation of Specification
Fri, Jun 8, 10:45 - 12:45

Inferring Class Level Specifications for Distributed Systems
Sandeep Kumar, Siau-Cheng Khoo, Abhik RoychoudhuryORCID logo, and David LoORCID logo
(National University of Singapore, Singapore; Singapore Management University, Singapore)
Distributed systems often contain many behaviorally similar processes, which are conveniently grouped into classes. In system modeling, it is common to specify such systems by describing the class level behavior, instead of object level behavior. While there have been techniques that mine specifications of such distributed systems from their execution traces, these methods only mine object-level specifications involving concrete process objects. This leads to specifications which are large, hard to comprehend, and sensitive to simple changes in the system (such as the number of objects).
In this paper, we develop a class level specification mining framework for distributed systems. A specification that describes interaction snippets between various processes in a distributed system forms a natural and intuitive way to document their behavior. Our mining method groups together such interactions between behaviorally similar processes, and presents a mined specification involving "symbolic" Message Sequence Charts. Our experiments indicate that our mined symbolic specifications are significantly smaller than mined concrete specifications, while at the same time achieving better precision and recall.

Statically Checking API Protocol Conformance with Mined Multi-Object Specifications
Michael Pradel ORCID logo, Ciera Jaspan, Jonathan AldrichORCID logo, and Thomas R. Gross
(ETH Zurich, Switzerland; CMU, USA)
Programmers using an API often must follow protocols that specify when it is legal to call particular methods. Several techniques have been proposed to find violations of such protocols based on mined specifications. However, existing techniques either focus on single-object protocols or on particular kinds of bugs, such as missing method calls. There is no practical technique to find multi-object protocol bugs without a priori known specifications. In this paper, we combine a dynamic analysis that infers multi-object protocols and a static checker of API usage constraints into a fully automatic protocol conformance checker. The combined system statically detects illegal uses of an API without human-written specifications. Our approach finds 41 bugs and code smells in mature, real-world Java programs with a true positive rate of 51%. Furthermore, we show that the analysis reveals bugs not found by state of the art approaches.

Behavioral Validation of JFSL Specifications through Model Synthesis
Carlo Ghezzi and Andrea Mocci
(Politecnico di Milano, Italy)
Contracts are a popular declarative specification technique to describe the behavior of stateful components in terms of pre/post conditions and invariants. Since each operation is specified separately in terms of an abstract implementation, it may be hard to understand and validate the resulting component behavior from contracts in terms of method interactions. In particular, properties expressed through algebraic axioms, which specify the effect of sequences of operations, require complex theorem proving techniques to be validated. In this paper, we propose an automatic small-scope based approach to synthesize incomplete behavioral abstractions for contracts expressed in the JFSL notation. The proposed abstraction technique makes it possible to check that the contract behavior is coherent with behavioral properties expressed as axioms of an algebraic specification. We assess the applicability of our approach by showing how the synthesis methodology can be applied to some classes of contract-based artifacts, such as specifications of data abstractions and requirements engineering models.

Verifying Client-Side Input Validation Functions Using String Analysis
Muath Alkhalaf, Tevfik BultanORCID logo, and Jose L. Gallegos
(UC Santa Barbara, USA)
Client-side computation in web applications is becoming increasingly common due to the popularity of powerful client-side programming languages such as JavaScript. Client-side computation is commonly used to improve an application’s responsiveness by validating user inputs before they are sent to the server. In this paper, we present an analysis technique for checking if a client-side input validation function conforms to a given policy. In our approach, input validation policies are expressed using two regular expressions, one specifying the maximum policy (the upper bound for the set of inputs that should be allowed) and the other specifying the minimum policy (the lower bound for the set of inputs that should be allowed). Using our analysis we can identify two types of errors: 1) the input validation function accepts an input that is not permitted by the maximum policy, or 2) the input validation function rejects an input that is permitted by the minimum policy. We implemented our analysis using dynamic slicing to automatically extract the input validation functions from web applications and using automata-based string analysis to analyze the extracted functions. Our experiments demonstrate that our approach is effective in finding errors in input validation functions that we collected from real-world applications and from tutorials and books for teaching JavaScript.
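
The paper performs a static, automata-based analysis; purely to illustrate the two error types it defines, the simplified sketch below checks a validation function dynamically against sample inputs and the two policy regular expressions. The validator, the policies, and the sample inputs are invented.

    import re

    # Simplified, testing-style illustration of the two error types defined above
    # (the paper itself performs a static, automata-based analysis; this sketch merely
    # checks a handful of sample inputs).

    def check_validator(validator, max_policy, min_policy, samples):
        max_re, min_re = re.compile(max_policy), re.compile(min_policy)
        errors = []
        for s in samples:
            accepted = validator(s)
            if accepted and not max_re.fullmatch(s):
                errors.append(("accepts input outside maximum policy", s))   # error type 1
            if not accepted and min_re.fullmatch(s):
                errors.append(("rejects input inside minimum policy", s))    # error type 2
        return errors

    # Example: a (buggy) US zip-code validator that also accepts letters.
    validator = lambda s: len(s) == 5
    print(check_validator(validator, r"\d{5}(-\d{4})?", r"\d{5}", ["12345", "abcde", "1234"]))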


Keynotes

Digital Formations of the Powerful and the Powerless (Keynote)
Saskia Sassen
(Columbia University, USA)
This talk compares two kinds of socio-technical formations: electronic financial networks and local social activist movements that are globally networked. Both cut across the duality global/national and each has altered the economic and political landscapes for respectively financial elites and social activists. Using these two cases helps illuminate the very diverse ways in which the growth of electronic networks partially transforms existing politico-economic orderings. They are extreme cases, one marked by hypermobility and the other by physical immobility. But they show us that each is only partly so: financial electronic networks are subject to particular types of embeddedness and local activist organizations can benefit from novel electronic potentials for global operation. I show how financial electronic networks and electronic activism reveal two parallel developments associated with particular technical properties of the new ICTs, but also reveal a third, radically divergent outcome, one I interpret as signaling the weight of the specific social logics of users in each case.

Supporting Sustainability with Software - An Industrial Perspective (Keynote)
Frank-Dieter Clesle
(SAP, Germany)
Supporting sustainability with software is often summed up in the expression ‘Green IT’ and directly relates to the reduction of CO2 emissions and energy used by IT. The IT industry accounts for about 2% of overall CO2 emissions. “Green by IT” describes the influence of appropriate software on the remaining 98% of industry. We estimate that the effect of our sustainability-related software on our customers’ CO2 footprint could be 10,000 times higher than our own.
The so-called triple bottom line defines sustainability as covering economic, ecological, and social aspects and the dependencies between them. Based on this definition of sustainability, software should not focus only on greenhouse gas reduction. Other topics, such as consumer protection, sustainable supply, emission reduction (air, water, waste), recycling, human resource management, and intelligent energy usage, must also be focus areas supported by software. Finally, the software industry should not only deliver tools for life cycle assessment (LCA); we should use them ourselves and provide an LCA for our own software.
The industrial question is how to increase short- and long-term profitability by holistically managing economic, social, and environmental risks and opportunities, supported by software.

Whither Software Architecture? (Keynote)
Jeff Kramer
(Imperial College London, UK)
Since its early beginnings in the 1980s, much has been achieved in the research field of software architecture. Among other aspects, this research has produced foundational work on the specification, analysis and component configuration of software architectures, including the development of associated software tools. However, adoption of the research by industry has been largely methodological rather than based on precise specifications in architecture description languages (ADLs) or rigorously underpinned by formal models of behaviour and non-functional attributes. Why is this? Why were the actual formalisms and tools not more widely adopted? Can we draw any lessons from this? In this talk, I hope to explore this further, drawing on my personal experience as a researcher in distributed software architectures.
I particularly hope to tickle the fancy of the younger members of our community, indicating the excitement of research, the benefits of belonging to a vibrant research community such as ours, and the benefits of being an active contributor. For the more mature researchers, there will be some nostalgic memories combined with some inevitable stepping on toes. For both young and old, there will be some thoughts for research opportunities as the need for self-managing adaptive software systems becomes more urgent.


Software Engineering in Practice

Services and Analytics
Wed, Jun 6, 10:45 - 12:45

Towards a Federated Cloud Ecosystem (Invited Industrial Talk)
Clovis Chapman
(Dell, Ireland)
Cloud computing has today become a widespread practice for the provisioning of IT services. Cloud infrastructures provide the means to lease computational resources on demand, typically on a pay per use or subscription model and without the need for significant capital investment into hardware. With enterprises seeking to migrate their services to the cloud to save on deployment costs, cater for rapid growth or generally relieve themselves from the responsibility of maintaining their own computing infrastructures, a diverse range of services is required to help fulfil business processes. In this talk, we discuss some of the challenges involved in deploying and managing an ecosystem of loosely coupled cloud services that may be accessed through and integrate with a wide range of devices and third party applications. In particular, we focus on how projects such as OpenStack are accelerating the evolution towards a federated cloud service ecosystem. We also examine how the portfolio of existing and emerging standards such as OAuth and the Simple Cloud Identity Management framework can be exploited to seamlessly incorporate cloud services into business processes and solve the problem of identity and access management when dealing with applications exploiting services across organisational boundaries.

Specification Patterns from Research to Industry: A Case Study in Service-Based Applications
Domenico Bianculli, Carlo Ghezzi, Cesare Pautasso, and Patrick Senti
(University of Lugano, Switzerland; Politecnico di Milano, Italy; Credit Suisse, Switzerland)
Specification patterns have proven to help developers state precise system requirements, as well as formalize them by means of dedicated specification languages. Most of the past work has focused on the specification of concurrent and real-time systems, and has been limited to a research setting. In this paper we present the results of our study on specification patterns for service-based applications (SBAs). The study focuses on industrial SBAs in the banking domain. We started by performing an extensive analysis of the usage of specification patterns in published research case studies --- representing almost ten years of research in the area of specification, verification, and validation of SBAs. We then compared these patterns with a large body of specifications written by our industrial partner over a similar time period. The paper discusses the outcome of this comparison, indicating that some needs of the industry, especially in the area of requirements specification languages, are not fully met by current software engineering research.

Methodology for Migration of Long Running Process Instances in a Global Large Scale BPM Environment in Credit Suisse's SOA Landscape
Tarmo Ploom, Stefan Scheit, and Axel Glaser
(Credit Suisse, Switzerland)
Research about process instance migration covers mainly changes in process models during the process evolution and their effects on the same runtime environment. But what if the runtime environment - a legacy Business Process Execution (BPE) platform - had to be replaced with a new solution? Several migration aspects must be taken into account. (1) Process models from the old BPE platform have to be converted to the target process definition language on the target BPE platform. (2) Existing Business Process Management (BPM) applications must be integrated via new BPE platform interfaces. (3) Process instances and process instance data state must be migrated. For each of these points an appropriate migration strategy must be chosen. This paper describes the migration methodology which was applied for the BPE platform renewal in Credit Suisse.

Information Needs for Software Development Analytics
Raymond P. L. Buse and Thomas ZimmermannORCID logo
(University of Virginia, USA; Microsoft Research, USA)
Software development is a data rich activity with many sophisticated metrics. Yet engineers often lack the tools and techniques necessary to leverage these potentially powerful information resources toward decision making. In this paper, we present the data and analysis needs of professional software engineers, which we identified among 110 developers and managers in a survey. We asked about their decision making process, their needs for artifacts and indicators, and scenarios in which they would use analytics.
The survey responses lead us to propose several guidelines for analytics tools in software development including: Engineers do not necessarily have much expertise in data analysis; thus tools should be easy to use, fast, and produce concise output. Engineers have diverse analysis needs and consider most indicators to be important; thus tools should at the same time support many different types of artifacts and many indicators. In addition, engineers want to drill down into data based on time, organizational structure, and system architecture.

Mini-Tutorial: Software Analytics
Wed, Jun 6, 14:00 - 15:30

Software Analytics in Practice: Mini Tutorial
Dongmei Zhang ORCID logo and Tao Xie
(Microsoft Research, China; North Carolina State University, USA)
A huge wealth of data exists in the practice of software development. Further rich data are produced by modern software and services in operation, many of which tend to be data-driven and/or data-producing in nature. Hidden in the data is information about the quality of software and services and the dynamics of software development. Software analytics develops and applies data exploration and analysis technologies, such as pattern recognition, machine learning, and information visualization, to software data in order to obtain insightful and actionable information for modern software and services. This tutorial presents the latest research and practice on principles, techniques, and applications of software analytics in practice, highlighting success stories in industry, research achievements that have been transferred to industrial practice, and future research and practice directions in software analytics. Attendees can acquire the skills and knowledge needed to perform industrial research or conduct industrial practice in the field of software analytics and to integrate analytics into their own industrial research, practice, and training.

Invited Industrial Experts
Wed, Jun 6, 16:00 - 18:00

Software as an Engineering Material: How the Affordances of Programming Have Changed and What to Do about It (Invited Industrial Talk)
Keith Braithwaite
(Zühlke Engineering, UK)
A contemporary programmer has astonishingly abundant processing power under their fingers. That power increases much faster than research into programming techniques, and the published results about them, can change. Meanwhile, practitioners still have to make a living by adding value in capital-constrained environments. How have practitioners taken advantage of the relative cheapness of processing power to add value more quickly, to reduce cost, manage risk, and please customers and themselves? And are there any signposts for where they might go next?

Software Architecture - What Does It Mean in Industry? (Invited Industrial Talk)
Eberhard Wolff
(adesso, Germany)
Architecture is crucial for the success of software projects. However, the metaphor is taken from Civil Engineering - does it really fit the building of software? And ultimately software is code - how does architecture help to create the code? This talk gives some practical advice based on several years of industry experience as an architect as well as a coach and trainer for Software Architecture. It shows core elements of software architectures, defines their relation to coding, and discusses how architects are educated in the industry.

How Software Engineering Can Benefit from Traditional Industries - A Practical Experience Report (Invited Industrial Talk)
Tom Sprenger
(AdNovum Informatik, Switzerland)
To be competitive in today's market, the IT industry faces many challenges in the development and maintenance of enterprise information systems. Engineering these large-scale systems efficiently requires making decisions about a number of issues. In addition, customers’ expectations imply continuous software delivery at predictable quality. The operation of such systems demands transparency of the software with regard to lifecycle, change and incident management, as well as cost efficiency. Addressing these challenges, we learned how to benefit from traditional industries. Contrary to the fact that the IT business gladly calls itself an industry, the industrialization of software engineering in most cases moves on a rather modest level. Industrialization means not only building a solution or product on top of managed and well-defined processes, but also having access to structured information about the current conditions of manufacturing at any time. Comparable to the test series and assembly lines of the automobile industry, each individual component and each step from the beginning of manufacturing up to the final product should be equipped with measuring points for quality and progress. Going one step further, the product itself, after it has left the factory, should be able to continuously provide analytic data for diagnostic reasons. Information is automatically collected and builds the basic essentials for process control, optimization and continuous improvement of the software engineering process. This presentation shows, by means of a practical experience report, how AdNovum managed to build its software engineering on a well-balanced system of processes, continuous measurement and control - as well as a healthy portion of pragmatism. We implemented an efficient and predictable software delivery pipeline based on five cornerstones that enables us to ship more than 1500 customer deliveries per year.

Formal Methods
Thu, Jun 7, 10:45 - 12:45

Ten Years of Automated Code Analysis at Microsoft (Invited Industrial Talk)
Wolfram Schulte
(Microsoft Research, USA)
Automated code analysis is technology aimed at locating, describing and repairing areas of weakness in code. Code weaknesses range from security vulnerabilities, logic errors, and concurrency violations to improper resource usage and violations of architectures or coding guidelines. Common to all code analysis techniques is that they build abstractions of code and then check those abstractions for properties of interest. For instance, a type checker computes how types are used, abstract interpreters and symbolic evaluators check how values flow, and model checkers analyze how state evolves. Building modern program analysis tools thus requires a multi-pronged approach to find a variety of weaknesses. In this talk I will discuss and compare several program analysis tools which MSR built during the last ten years. They include theorem provers, program verifiers, bug finders, malware scanners, and test case generators. I will describe the need for their development, their innovation, and their application. Many of these tools had considerable impact on Microsoft's development practices, as well as on the research community. Some of them are being shipped in products such as the Static Driver Verifier or as part of Visual Studio. Performing program analysis as part of quality assurance is meanwhile standard practice in many software development companies. However, several challenges have not yet been resolved. Thus, I will conclude with a set of open challenges in program analysis which will hopefully trigger new directions in our joint quest of delivering predictable software that is free from defects and vulnerabilities.

Large-Scale Formal Verification in Practice: A Process Perspective
June Andronick, Ross Jeffery, Gerwin Klein, Rafal Kolanski, Mark Staples, He Zhang, and Liming Zhu
(NICTA, Australia; UNSW, Australia)
The L4.verified project was a rare success in large-scale, formal verification: it provided a formal, machine-checked, code-level proof of the full functional correctness of the seL4 microkernel. In this paper we report on the development process and management issues of this project, highlighting key success factors. We formulate a detailed descriptive model of its middle-out development process, and analyze the evolution and dependencies of code and proof artifacts. We compare our key findings on verification and re-verification with insights from other verification efforts in the literature. Our analysis of the project is based on complete access to project logs, meeting notes, and version control data over its entire history, including its long-term, ongoing maintenance phase. The aim of this work is to aid understanding of how to successfully run large-scale formal software verification projects.

Constructing Parser for Industrial Software Specifications Containing Formal and Natural Language Description
Futoshi Iwama, Taiga Nakamura, and Hironori Takeuchi
(IBM Research, Japan)
This paper describes a novel framework for creating a parser to process and analyze texts written in a "partially structured" natural language. In many projects, the contents of document artifacts tend to be described as a mixture of formal parts (i.e., text constructs that follow specific conventions) and parts written in arbitrary free text. Formal parsers, typically defined and used to process descriptions with rigidly defined syntax such as program source code, are very precise and efficient in processing the formal part, while parsers developed for natural language processing (NLP) are good at robustly interpreting the free-text part. Therefore, combining these parsers with different characteristics can allow for more flexible and practical processing of various project documents. Unfortunately, conventional approaches to constructing a parser from multiple small parsers were studied extensively only for formal language parsers and are not directly applicable to NLP parsers due to the differences in the way the input text is extracted and evaluated. We propose a method to configure and generate a combined parser by extending an approach based on parser combinators, the operators for composing multiple formal parsers, to support both NLP and formal parsers. The resulting text parser is based on Parsing Expression Grammars, and it benefits from the strengths of both parser types. We demonstrate an application of such a combined parser in practical situations and show that the proposed approach can efficiently construct a parser for analyzing project-specific industrial specification documents.
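
A toy PEG-style parser-combinator sketch in the spirit of this approach: the free_text parser is only a stand-in for an NLP component wrapped behind the same parsing interface, and the example grammar and input are invented.

    import re

    # Tiny PEG-style parser-combinator sketch. A parser is a function
    # text, pos -> (value, new_pos) or None. The free_text parser below is a stand-in
    # for an NLP component; a real system would wrap the NLP parser behind this interface.

    def token(pattern):
        rx = re.compile(pattern)
        def parse(text, pos):
            m = rx.match(text, pos)
            return (m.group(0), m.end()) if m else None
        return parse

    def free_text():
        def parse(text, pos):
            end = text.find(".", pos)
            end = len(text) if end == -1 else end + 1
            return (text[pos:end].strip(), end) if end > pos else None
        return parse

    def seq(*parsers):
        def parse(text, pos):
            values = []
            for p in parsers:
                r = p(text, pos)
                if r is None:
                    return None              # PEG-style: the whole sequence fails
                v, pos = r
                values.append(v)
            return values, pos
        return parse

    # An invented specification line mixing a formal ID with a free-text requirement.
    spec_line = seq(token(r"REQ-\d+"), token(r":\s*"), free_text())
    print(spec_line("REQ-12: The system shall log every failed login attempt.", 0))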

Formal Correctness, Safety, Dependability, and Performance Analysis of a Satellite
Marie-Aude Esteve, Joost-Pieter Katoen ORCID logo, Viet Yen Nguyen, Bart Postma, and Yuri Yushtein
(European Space Agency, Netherlands; RWTH Aachen University, Germany; University of Twente, Netherlands)
This paper reports on the usage of a broad palette of formal modeling and analysis techniques on a regular industrial-size design of an ultra-modern satellite platform. These efforts were carried out in parallel with the conventional software development of the satellite platform. The model itself is expressed in a formalized dialect of AADL. Its formal nature enables rigorous and automated analysis, for which the recently developed COMPASS toolset was used. The whole effort revealed numerous inconsistencies in the early design documents, and the use of formal analyses provided additional insight on discrete system behavior (comprising nearly 50 million states), on hybrid system behavior involving discrete and continuous variables, and enabled the automated generation of large fault trees (66 nodes) for safety analysis that typically are constructed by hand. The model's size pushed the computational tractability of the algorithms underlying the formal analyses, and revealed bottlenecks for future theoretical research. Additionally, the effort led to newly learned practices from which subsequent formal modeling and analysis efforts shall benefit, especially when they are injected in the conventional software development lifecycle. The case demonstrates the feasibility of fully capturing a system-level design as a single comprehensive formal model and analyze it automatically using a toolset based on (probabilistic) model checkers.

Goldfish Bowl Panel: Software Development Analytics
Thu, Jun 7, 16:00 - 17:30

Goldfish Bowl Panel: Software Development Analytics
Tim Menzies and Thomas ZimmermannORCID logo
(West Virginia University, USA; Microsoft Research, USA)
Gaming companies now routinely apply data mining to their user data in order to plan the next release of their software. We predict that such software development analytics will become commonplace in the near future. For example, as large software systems migrate to the cloud, they are divided and sold as dozens of smaller apps; when shopping inside the cloud, users are free to mix and match their apps from multiple vendors (e.g. Google Docs’ word processor with Zoho’s slide manager); to extend, or even retain, market share, cloud vendors must mine their user data in order to understand what features best attract their clients. This panel will address the open issues with analytics. Issues addressed will include the following. What is the potential for software development analytics? What are the strengths and weaknesses of the current generation of analytics tools? How best can we mature those tools?

Re-engineering
Thu, Jun 7, 16:00 - 17:30

Making Sense of Healthcare Benefits
Jonathan Bnayahu, Maayan Goldstein, Mordechai Nisenson, and Yahalomit Simionovici
(IBM Research, Israel)
A key piece of information in healthcare is a patient's benefit plan. It details which treatments and procedures are covered by the health insurer (or payer), and under which conditions. While the most accurate and complete implementation of the plan resides in the payer’s claims adjudication systems, the inherent complexity of these systems forces payers to maintain multiple repositories of benefit information for other service and regulatory needs. In this paper we present a technology that deals with this complexity. We show how a large US health payer benefited from using the visualization, search, summarization and other capabilities of the technology. We argue that this technology can be used to improve productivity and reduce error rate in the benefits administration workflow, leading to lower administrative overhead and cost for health payers, which benefits both payers and patients.

On the Proactive and Interactive Visualization for Feature Evolution Comprehension: An Industrial Investigation
Renato Novais, Camila Nunes, Caio Lima, Elder Cirilo, Francisco Dantas, Alessandro Garcia ORCID logo, and Manoel Mendonça
(Federal University of Bahia, Brazil; Federal Institute of Bahia, Brazil; PUC-Rio, Brazil)
Program comprehension is a key activity throughout the maintenance and evolution of large-scale software systems. Understanding a program often requires analyzing the evolution of individual functionalities, so-called features. The comprehension of evolving features is not trivial as their implementations are often tangled and scattered through many modules. Even worse, existing techniques are limited in providing developers with direct means for visualizing the evolution of features’ code. This work presents a proactive and interactive visualization strategy to enable feature evolution analysis. It proactively identifies code elements of evolving features and provides multiple views to present their structure under different perspectives. The novel visualization strategy was compared to a lightweight visualization strategy based on a tree structure. We ran a controlled experiment with industry developers, who performed feature evolution comprehension tasks on an industrial-strength software system. The results showed that the use of the proposed strategy yielded significant gains in terms of correctness and execution time for feature evolution comprehension tasks.

Extending Static Analysis by Mining Project-Specific Rules
Boya Sun, Gang Shu, Andy Podgurski, and Brian Robinson
(Case Western Reserve University, USA; ABB Research, USA)
Commercial static program analysis tools can be used to detect many defects that are common across applications. However, such tools currently have limited ability to reveal defects that are specific to individual projects, unless specialized checkers are devised and implemented by tool users. Developers do not typically exploit this capability. By contrast, defect mining tools developed by researchers can discover project-specific defects, but they require specialized expertise to employ and they may not be robust enough for general use. We present a hybrid approach in which a sophisticated dependence-based rule mining tool is used to discover project-specific programming rules, which are then transformed automatically into checkers that a commercial static analysis tool can run against a code base to reveal defects. We also present the results of an empirical study in which this approach was applied successfully to two large industrial code bases. Finally, we analyze the potential implications of this approach for software development practice.

Debugging
Fri, Jun 8, 08:45 - 10:15

Debugger Canvas: Industrial Experience with the Code Bubbles Paradigm
Robert DeLine, Andrew Bragdon, Kael Rowan, Jens Jacobsen, and Steven P. Reiss
(Microsoft Research, USA; Brown University, USA)
At ICSE 2010, the Code Bubbles team from Brown University and the Code Canvas team from Microsoft Research presented similar ideas for new user experiences for an integrated development environment. Since then, the two teams formed a collaboration, along with the Microsoft Visual Studio team, to release Debugger Canvas, an industrial version of the Code Bubbles paradigm. With Debugger Canvas, a programmer debugs her code as a collection of code bubbles, annotated with call paths and variable values, on a two-dimensional pan-and-zoom surface. In this experience report, we describe new user interface ideas, describe the rationale behind our design choices, evaluate the performance overhead of the new design, and provide user feedback based on lab participants, post-release usage data, and a user survey and interviews. We conclude that the code bubbles paradigm does scale to existing customer code bases, is best implemented as a mode in the existing user experience rather than a replacement, and is most useful when the user has long or complex call paths, a large or unfamiliar code base, or complex control patterns, like factories or dynamic linking.

Characterizing and Predicting Which Bugs Get Reopened
Thomas ZimmermannORCID logo, Nachiappan Nagappan, Philip J. Guo, and Brendan Murphy
(Microsoft Research, USA; Stanford University, USA; Microsoft Research, UK)
Fixing bugs is an important part of the software development process. An underlying aspect is the effectiveness of fixes: if a fair number of fixed bugs are reopened, it could indicate instability in the software system. To the best of our knowledge there has been little prior work on understanding the dynamics of bug reopens. Towards that end, in this paper, we characterize when bug reports are reopened by using the Microsoft Windows operating system project as an empirical case study. Our analysis is based on a mixed-methods approach. First, we categorize the primary reasons for reopens based on a survey of 358 Microsoft employees. We then reinforce these results with a large-scale quantitative study of Windows bug reports, focusing on factors related to bug report edits and relationships between people involved in handling the bug. Finally, we build statistical models to describe the impact of various metrics on reopening bugs, ranging from the reputation of the opener to how the bug was found.

ReBucket: A Method for Clustering Duplicate Crash Reports Based on Call Stack Similarity
Yingnong Dang ORCID logo, Rongxin Wu, Hongyu Zhang, Dongmei Zhang ORCID logo, and Peter Nobel
(Microsoft Research, China; Tsinghua University, China; Microsoft, USA)
Software often crashes. Once a crash happens, a crash report could be sent to software developers for investigation upon user permission. To facilitate efficient handling of crashes, crash reports received by Microsoft's Windows Error Reporting (WER) system are organized into a set of "buckets". Each bucket contains duplicate crash reports that are deemed manifestations of the same bug. The bucket information is important for prioritizing efforts to resolve crashing bugs. To improve the accuracy of bucketing, we propose ReBucket, a method for clustering crash reports based on call stack matching. ReBucket measures the similarities of call stacks in crash reports and then assigns the reports to appropriate buckets based on the similarity values. We evaluate ReBucket using crash data collected from five widely-used Microsoft products. The results show that ReBucket achieves better overall performance than the existing methods. On average, the F-measure obtained by ReBucket is about 0.88.
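
ReBucket's actual similarity measure is the paper's contribution; the sketch below only illustrates the general shape of call-stack-based bucketing, using a plain longest-common-subsequence similarity over function names and greedy threshold clustering with invented data.

    # Illustrative only: not ReBucket's measure, just the overall shape of the idea.

    def lcs_len(a, b):
        dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
        return dp[len(a)][len(b)]

    def similarity(stack_a, stack_b):
        if not stack_a or not stack_b:
            return 0.0
        return 2.0 * lcs_len(stack_a, stack_b) / (len(stack_a) + len(stack_b))

    def bucket(crash_stacks, threshold=0.7):
        buckets = []                                   # each bucket's first stack is its representative
        for stack in crash_stacks:
            for b in buckets:
                if similarity(stack, b[0]) >= threshold:
                    b.append(stack)
                    break
            else:
                buckets.append([stack])
        return buckets

    reports = [["main", "parse", "read"], ["main", "parse", "read", "memcpy"], ["main", "render"]]
    print(len(bucket(reports)))   # -> 2 buckets in this toy example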

Case Studies
Fri, Jun 8, 08:45 - 10:15

Understanding the Impact of Pair Programming on Developers Attention: A Case Study on a Large Industrial Experimentation
Alberto Sillitti, Giancarlo Succi, and Jelena Vlasenko
(Free University of Bolzano, Italy)
Pair Programming is one of the most studied and debated development techniques. However, at present, we do not have a clear, objective, and quantitative understanding of the claimed benefits of such a development approach. All the available studies focus on the analysis of the effects of Pair Programming (e.g., code quality, development speed, etc.) with different findings and limited replicability of the experiments. This paper adopts a different approach that could be replicated more easily: it investigates how Pair Programming affects the way developers write code and interact with their development machine. In particular, the paper focuses on the effects that Pair Programming has on developers’ attention and productivity. The study was performed on a professional development team observed for ten months, and it finds that Pair Programming helps developers eliminate distracting activities and focus on productive activities.

How Much Does Unused Code Matter for Maintenance?
Sebastian Eder, Maximilian Junker, Elmar Jürgens, Benedikt Hauptmann, Rudolf Vaas, and Karl-Heinz Prommer
(TU Munich, Germany; Munich Re, Germany)
Software systems contain unnecessary code. Its maintenance causes unnecessary costs. We present tool support that employs dynamic analysis of deployed software to detect unused code as an approximation of unnecessary code, and static analysis to reveal its changes during maintenance. We present a case study on the maintenance of unused code in an industrial software system over the course of two years. It quantifies the amount of code that is unused and the amount of maintenance activity that went into it, and makes explicit the potential benefit of tool support that informs maintainers who are about to modify unused code.

Using Knowledge Elicitation to Improve Web Effort Estimation: Lessons from Six Industrial Case Studies
Emilia Mendes
(Zayed University, United Arab Emirates)
This paper details our experience building and validating six different expert-based Web effort estimation models for ICT companies in New Zealand and Brazil. All models were created using Bayesian networks, via eliciting knowledge from domain experts, and validated using data from past finished projects. Post-mortem interviews with the participating companies showed that they found the entire process extremely beneficial and worthwhile, and that all the models created remained in use by those companies.

Testing
Fri, Jun 8, 10:45 - 12:45

Large-Scale Test Automation in the Cloud (Invited Industrial Talk)
John Penix
(Google, USA)
Software development at Google is big and fast. The code base receives 20+ code changes per minute and 50% of the files change every month! Each product is developed and released from ‘head’ relying on automated tests verifying the product behavior. Release frequency varies from multiple times per day to once every few weeks, depending on the product team. With such a huge, fast-moving codebase, it is possible for teams to get stuck spending a lot of time just keeping their build ‘green’. A continuous integration system should help by providing the exact change at which a test started failing, instead of a range of suspect changes or doing a lengthy binary-search for the offending change. We have built a system that uses dependency analysis to determine all the tests a change transitively affects and then runs only those tests for every change. The system is built on top of Google’s cloud computing infrastructure enabling many builds to be executed concurrently, allowing the system to run affected tests as soon as a change is submitted. The use of smart tools and cloud computing infrastructure in the continuous integration system enables quick, effective feedback to development teams.
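
A hedged sketch of the dependency-based test selection idea described here: given a build dependency graph, select only the test targets that transitively depend on a changed file. The graph shape and the test-naming convention are illustrative assumptions, not the actual infrastructure.

    from collections import deque

    # Sketch of dependency-based test selection: dep_graph maps a target to its direct
    # dependencies; we invert the edges and walk them transitively from the changed files.

    def affected_tests(dep_graph, changed, is_test=lambda t: t.endswith("_test")):
        rdeps = {}
        for target, deps in dep_graph.items():
            for d in deps:
                rdeps.setdefault(d, set()).add(target)
        seen, queue = set(changed), deque(changed)
        while queue:                                  # walk reverse edges transitively
            node = queue.popleft()
            for parent in rdeps.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return sorted(t for t in seen if is_test(t))

    graph = {"lib_a": ["util.cc"], "lib_a_test": ["lib_a"], "lib_b_test": ["lib_b"]}
    print(affected_tests(graph, ["util.cc"]))         # -> ['lib_a_test']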

Efficient Reuse of Domain-Specific Test Knowledge: An Industrial Case in the Smart Card Domain
Nicolas Devos, Christophe Ponsard, Jean-Christophe Deprez, Renaud Bauvin, Benedicte Moriau, and Guy Anckaerts
(CETIC, Belgium; STMicroelectronics, Belgium)
While testing is heavily used and largely automated in software development projects, the reuse of test practices across similar projects in a given domain is seldom systematized and supported by adequate methods and tools. This paper presents a practical approach that emerged from a concrete industrial case in the smart card domain at STMicroelectronics Belgium in order to better address this kind of challenge. The central concept is a test knowledge repository organized as a collection of specific patterns named QPatterns. A systematic process was followed, first to gather, structure and abstract the test practices, then to produce and validate an initial repository, and finally to make it evolve later on. Testers can then rely on this repository to produce high quality test plans identifying all the functional and non-functional aspects that have to be addressed, as well as the concrete tests that have to be developed within the context of a new project. Tool support was also developed and integrated in a traceable way into the existing industrial test environment. The approach was validated and is currently under deployment at STMicroelectronics Belgium.

The Quamoco Product Quality Modelling and Assessment Approach
Stefan Wagner, Klaus Lochmann, Lars Heinemann, Michael Kläs, Adam Trendowicz, Reinhold Plösch, Andreas Seidl, Andreas Goeb, and Jonathan Streit
(University of Stuttgart, Germany; TU Munich, Germany; Fraunhofer IESE, Germany; JKU Linz, Austria; Capgemini, Germany; SAP, Germany; itestra, Germany)
Published software quality models either provide abstract quality attributes or concrete quality assessments. There are no models that seamlessly integrate both aspects. In the project Quamoco, we built a comprehensive approach with the aim to close this gap.
For this, we developed in several iterations a meta quality model specifying general concepts, a quality base model covering the most important quality factors and a quality assessment approach. The meta model introduces the new concept of a product factor, which bridges the gap between concrete measurements and abstract quality aspects. Product factors have measures and instruments to operationalise quality by measurements from manual inspection and tool analysis. The base model uses the ISO 25010 quality attributes, which we refine by 200 factors and 600 measures for Java and C# systems.
We found in several empirical validations that the assessment results fit the expectations of experts for the corresponding systems. The empirical analyses also showed that several of the correlations are statistically significant and that the maintainability part of the base model has the highest correlation, which fits the fact that this part is the most comprehensive. Although we still see room for extending and improving the base model, it shows a high correspondence with expert opinions and hence is able to form the basis for repeatable and understandable quality assessments in practice.
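
A highly simplified sketch of the measurement-to-quality-aspect aggregation idea behind product factors (the actual Quamoco model uses calibrated utility functions and a much richer structure); the measure names, factor names, and weights below are invented for illustration.

    # Measures feed product factors, which feed quality aspects; weights are invented.

    def evaluate(measure_values, factor_weights, aspect_weights):
        # measure_values: normalized [0,1] results of tool-based or manual measures
        factors = {f: sum(w * measure_values[m] for m, w in ms.items())
                   for f, ms in factor_weights.items()}          # product factors
        aspects = {a: sum(w * factors[f] for f, w in fs.items())
                   for a, fs in aspect_weights.items()}          # ISO 25010-style aspects
        return aspects

    measures = {"clone_coverage": 0.8, "comment_ratio": 0.6}
    factor_w = {"code_duplication": {"clone_coverage": 1.0},
                "documentation":    {"comment_ratio": 1.0}}
    aspect_w = {"maintainability": {"code_duplication": 0.5, "documentation": 0.5}}
    print(evaluate(measures, factor_w, aspect_w))   # roughly {'maintainability': 0.7}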

Industrial Application of Concolic Testing Approach: A Case Study on libexif by Using CREST-BV and KLEE
Yunho Kim, Moonzoo Kim, YoungJoo Kim, and Yoonkyu Jang
(KAIST, South Korea; Samsung Electronics, South Korea)
As smartphones become popular, manufacturers such as Samsung Electronics are quickly developing smartphones with rich functionality such as cameras and photo editing, which accelerates the adoption of open source applications in smartphone platforms. However, developers often do not know the details of open source applications, because they did not develop the applications themselves. Thus, it is a challenging problem to test open source applications effectively in a short time. This paper reports our experience of applying the concolic testing technique to test libexif, an open source library to manipulate EXIF information in image files. We have demonstrated that the concolic testing technique is effective and efficient at detecting bugs with modest effort in an industrial setting. We also compare two concolic testing tools, CREST-BV and KLEE, in this testing project. Furthermore, we compare the concolic testing results with the analysis results of the Coverity Prevent static analyzer. We detected a memory access bug, a null pointer dereference bug, and four divide-by-zero bugs in libexif through concolic testing, none of which were detected by Coverity Prevent.


Software Engineering Education

The Role of Software Projects in Software Engineering Education
Wed, Jun 6, 11:00 - 12:45 (Chair: Kurt Schneider)

Teaching Software Engineering and Software Project Management: An Integrated and Practical Approach
Gabriele Bavota, Andrea De LuciaORCID logo, Fausto Fasano, Rocco Oliveto, and Carlo Zottoli
(University of Salerno, Italy; University of Molise, Italy)
We present a practical approach for teaching two different courses, Software Engineering (SE) and Software Project Management (SPM), in an integrated way. The two courses are taught in the same semester, thus allowing us to build mixed project teams composed of five to eight Bachelor's students (with development roles) and one or two Master's students (with management roles). The main goal of our approach is to simulate a real-life development scenario, giving students the opportunity to deal with issues arising from typical project situations, such as working in a team, organising the division of work, and coping with time pressure and strict deadlines.

Teaching Collaborative Software Development: A Case Study
Terhi Kilamo, Imed Hammouda, and Mohamed Amine Chatti
(Tampere University of Technology, Finland; RWTH Aachen University, Germany)
Software development is today done in teams of software developers who may be distributed all over the world. Software development has also come to contain more social aspects, and the need for collaboration has become more evident. Teaching the development methods used in collaborative development is therefore important, as skills beyond traditional software development are needed in this modern setting. A novel, student-centric approach was tried out at Tampere University of Technology, where a new environment called KommGame was introduced. This environment includes a reputation system to support its social aspect and thus the learners' collaboration with each other. In this paper, we present the KommGame environment and how it was applied in a course, with practical results.

Using Continuous Integration of Code and Content to Teach Software Engineering with Limited Resources
Jörn Guy Süß and William Billingsley
(University of Queensland, Australia)


Aspects of Teaching Software Engineering
Wed, Jun 6, 14:00 - 15:30 (Chair: Martin Naedele)

Stages in Teaching Software Testing
Tony Cowling
(University of Sheffield, UK)
This paper describes how a staged approach to the development of students’ abilities to engineer software systems applies to the specific issue of teaching software testing. It evaluates the courses relating to software testing in the Software Engineering volume of Computing Curriculum 2001 against a theoretical model that has been developed from a well-established programme in software engineering, from the perspectives of how well the courses support the progressive development of both students’ knowledge of software testing and their ability to test software systems. It is shown that this progressive development is not well supported, and that to improve this some software testing material should be taught earlier than recommended.

Integrating Tools and Frameworks in Undergraduate Software Engineering Curriculum
Christopher Fuhrman, Roger Champagne, and Alain April
(University of Québec, Canada)
We share our experience over the last 10 years for finding, deploying and evaluating software engineering (SE) technologies in an undergraduate program at the ETS in Montreal, Canada. We identify challenges and propose strategies to integrate technologies into an SE curriculum. We demonstrate how technologies are integrated throughout our program, and provide details of the integration in two specific courses.

What Scope Is There for Adopting Evidence-Informed Teaching in SE?
David Budgen, Sarah Drummond, Pearl Brereton, and Nikki Holland
(Durham University, UK; Keele University, UK)
Context: In teaching about software engineering we currently make little use of any empirical knowledge. Aim: To examine the outcomes available from the use of Evidence-Based Software Engineering (EBSE) practices, so as to identify where these can provide support for, and inform, teaching activities. Method: We have examined all known secondary studies published up to the end of 2009, together with those published in major journals to mid-2011, and identified where these provide practical results that are relevant to student needs. Results: Starting with 145 candidate systematic literature reviews (SLRs), we were able to identify and classify potentially useful teaching material from 43 of them. Conclusions: EBSE can potentially lend authority to our teaching, although the coverage of key topics is uneven. Additionally, mapping studies can provide support for research-led teaching.

Software Engineering Education in Industry
Wed, Jun 6, 16:00 - 18:00 (Chair: Grace Lewis)

FOCUS: An Adaptation of a SWEBOK-Based Curriculum for Industry Requirements
Ganesh Samarthyam, Girish Suryanarayana, Arbind Kumar Gupta, and Raghu Nambiar
(Siemens, India)
Siemens Corporate Development Center India (CT DC IN) develops software applications for the industry, energy, health-care, and infrastructure & cities sectors of Siemens. These applications are typically critical in nature and require software practitioners who have considerable competency in the area of software engineering. To enhance the competency of engineers, CT DC IN has introduced an internal curriculum titled "FOundation CUrriculum for Software engineers" (FOCUS) which is an adapted version of IEEE's SWEBOK curriculum. The FOCUS program has been used to train more than 500 engineers in the last three years. In this experience report, we describe the motivation for FOCUS, how it was structured to address the specific needs of CT DC IN, and how the FOCUS program was rolled out within the organization. We also provide results obtained from a survey of the FOCUS participants, their managers, and FOCUS trainers that was conducted to throw light on the effectiveness of the program. We believe the insights from the survey results provide useful pointers to other organizations and academic institutions that are planning to adopt a SWEBOK-based curriculum.

Teaching Distributed Software Engineering
Thu, Jun 7, 11:15 - 12:30 (Chair: Richard LeBlanc)

Ten Tips to Succeed in Global Software Engineering Education
Ivica Crnković, Ivana Bosnić, and Mario Žagar
(Mälardalen University, Sweden; University of Zagreb, Croatia)
The most effective setting for training in Global Software Engineering is to provide a distributed environment for students. In such an environment, students will meet challenges in recognizing problems first-hand. Teaching in a distributed environment is, however, very demanding, challenging and unpredictable compared to teaching in a local environment. Based on nine years of experience, in this paper we present the most important issues that should be taken into consideration to increase the probability of success in teaching a Global Software Engineering course.

Collaboration Patterns in Distributed Software Development Projects
Igor Čavrak, Marin Orlić, and Ivica Crnković
(University of Zagreb, Croatia; Mälardalen University, Sweden)
The need for educating future software engineers in the field of global software engineering is recognized by many educational institutions. In this paper we outline the characteristics of an existing global software development course run over a period of nine years, and present a flexible project framework for conducting student projects in a distributed environment. Based on data collected from fourteen distributed student projects, a set of collaboration patterns is identified and their causes and implications described. Collaboration patterns are a result of the analysis of collaboration links within distributed student teams, and can assist teachers in better understanding of the dynamics found in distributed projects.

Improving PSP Education by Pairing: An Empirical Study
Guoping Rong, He Zhang, Mingjuan Xie, and Dong Shao
(Nanjing University, China; NICTA, Australia; UNSW, Australia)
Handling large classes and maintaining students’ involvement are two of the major challenges in Personal Software Process (PSP) education in universities. In order to tackle these two challenges, we adapted and incorporated some typical practices of Pair Programming (PP) into the PSP class at the summer school of the Software Institute of Nanjing University in 2010, and received positive results, such as higher student involvement and better conformity to process discipline, as well as a (roughly halved) workload in evaluating assignments. However, the experiment did not confirm the expected improvement in the performance of the paired students. Based on this experience and the feedback, we improved the approach in our PSP course in 2011. Accordingly, by analyzing the previous experiment results, we redesigned the experiment with a number of improvements, such as the lab environment, evaluation methods and student selection, to further investigate the effects of this approach in PSP education, in particular on students’ performance. We also introduced several new metrics to enable a comparative analysis of the data collected from both paired students and solo students. The new experiment confirms the value of pairing practices in PSP education. The results show that in the PSP class, compared to solo students, paired students achieve better performance in terms of program quality and exam scores.

Five Days of Empirical Software Engineering: The PASED Experience
Massimiliano Di Penta, Giuliano Antoniol, Daniel M. Germán, Yann-Gaël Guéhéneuc, and Bram Adams
(University of Sannio, Italy; École Polytechnique de Montréal, Canada; University of Victoria, Canada)
Acquiring the skills to plan and conduct different kinds of empirical studies is a mandatory requirement for graduate students working in the field of software engineering. These skills typically can only be developed based on the teaching and experience of the students' supervisor, because of the lack of specific, practical courses providing these skills.
To fill this gap, we organized the first Canadian Summer School on Practical Analyses of Software Engineering Data (PASED). The aim of PASED is to provide - using a "learning by doing" model of teaching - a solid foundation to software engineering graduate students on conducting empirical studies. This paper describes our experience in organizing the PASED school, i.e., what challenges we encountered, how we designed the lectures and laboratories, and what could be improved in the future based on the participants' feedback.


New Ideas and Emerging Results

NIER in Support of Software Engineers
Wed, Jun 6, 10:45 - 12:45

Automatically Detecting Developer Activities and Problems in Software Development Work
Tobias Roehm and Walid Maalej
(TU Munich, Germany)
Detecting the current activity of developers and the problems they are facing is a prerequisite for context-aware assistance and for capturing developers’ experiences during their work. We present an approach to detect the current activity of software developers and whether they are facing a problem. By observing developer actions like changing code or searching the web, we detect whether developers are locating the cause of a problem, searching for a solution, or applying a solution. We model development work as a recurring problem-solution cycle, detect developers’ actions by instrumenting the IDE, translate developer actions to observations using ontologies, and infer developer activities by using Hidden Markov Models. In a preliminary evaluation, our approach was able to correctly detect 72% of all activities. However, a broader, more reliable evaluation is still needed.
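
As a minimal illustration of the inference step mentioned above, the following Viterbi sketch decodes a sequence of observed developer actions into activities; the states, observations, and probabilities are invented for this example, whereas the paper's models are learned from instrumented IDE data.

    # Toy Viterbi decoding over invented developer activities and observed actions.

    def viterbi(observations, states, start_p, trans_p, emit_p):
        V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
        path = {s: [s] for s in states}
        for obs in observations[1:]:
            V.append({})
            new_path = {}
            for s in states:
                prob, prev = max((V[-2][p] * trans_p[p][s] * emit_p[s][obs], p) for p in states)
                V[-1][s] = prob
                new_path[s] = path[prev] + [s]
            path = new_path
        best = max(V[-1], key=V[-1].get)
        return path[best]

    states = ["locate_cause", "search_solution", "apply_solution"]
    obs = ["read_code", "web_search", "edit_code"]
    start = {"locate_cause": 0.6, "search_solution": 0.2, "apply_solution": 0.2}
    trans = {s: {t: (0.6 if s == t else 0.2) for t in states} for s in states}
    emit = {
        "locate_cause":    {"read_code": 0.7, "web_search": 0.2, "edit_code": 0.1},
        "search_solution": {"read_code": 0.2, "web_search": 0.7, "edit_code": 0.1},
        "apply_solution":  {"read_code": 0.2, "web_search": 0.1, "edit_code": 0.7},
    }
    print(viterbi(obs, states, start, trans, emit))
    # -> ['locate_cause', 'search_solution', 'apply_solution'] in this toy setting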

Software Process Improvement through the Identification and Removal of Project-Level Knowledge Flow Obstacles
Susan M. Mitchell and Carolyn B. Seaman
(University of Maryland in Baltimore County, USA)
Uncontrollable costs, schedule overruns, and poor end product quality continue to plague the software engineering field. This research investigates software process improvement (SPI) through the application of knowledge management (KM) at the software project level. A pilot study was conducted to investigate what types of obstacles to knowledge flow exist within a software development project, as well as the potential influence on SPI of their mitigation or removal. The KM technique of “knowledge mapping” was used as a research technique to characterize knowledge flow. Results show that such mitigation or removal was acknowledged by project team members as having the potential for lowering project labor cost, improving schedule adherence, and enhancing final product quality.

Symbiotic General-Purpose and Domain-Specific Languages
Colin Atkinson, Ralph Gerbig, and Bastian Kennel
(University of Mannheim, Germany)
Domain-Specific Modeling Languages (DSMLs) have received great attention in recent years and are expected to play a big role in the future of software engineering as processes become more view-centric. However, they are a "two-edged sword". While they provide strong support for communication within communities, allowing experts to express themselves using concepts tailored to their exact needs, they are a poor vehicle for communication across communities because of their lack of common, transcending concepts. In contrast, General-Purpose Modeling Languages (GPMLs) have the opposite problem - they are poor at the former but good at the latter. The value of models in software engineering would therefore be significantly boosted if the advantages of DSMLs and GPMLs could be combined and models could be viewed in a domain-specific or general-purpose way depending on the needs of the user. In this paper we present an approach for achieving such a synergy based on the orthogonal classification architecture. In this architecture model elements have two classifiers: a linguistic one representing their "general-purpose" and an ontological one representing their "domain-specific" type. By associating visualization symbols with both classifiers it is possible to support two concrete syntaxes at the same time and allow the domain-specific and general-purpose notation to support each other - that is, to form a symbiotic relationship.

Evaluating the Specificity of Text Retrieval Queries to Support Software Engineering Tasks
Sonia Haiduc, Gabriele Bavota, Rocco Oliveto, Andrian Marcus, and Andrea De LuciaORCID logo
(Wayne State University, USA; University of Salerno, Italy; University of Molise, Italy)
Text retrieval approaches have been used to address many software engineering tasks. In most cases, their use involves issuing a textual query to retrieve a set of relevant software artifacts from the system. The performance of all these approaches depends on the quality of the given query (i.e., its ability to describe the information need in such a way that the relevant software artifacts are retrieved during the search). Currently, the only way to tell that a query failed to lead to the expected software artifacts is by investing time and effort in analyzing the search results. In addition, it is often very difficult to ascertain what part of the query leads to poor results. We propose a novel pre-retrieval metric, which reflects the quality of a query by measuring the specificity of its terms. We exemplify the use of the new specificity metric on the task of concept location in source code. A preliminary empirical study shows that our metric is a good effort predictor for text retrieval-based concept location, outperforming existing techniques from the field of natural language document retrieval.
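
The abstract does not give the metric's definition; as a hedged illustration of a pre-retrieval specificity measure, the sketch below scores a query by the average inverse document frequency (IDF) of its terms over the artifact corpus. This is a simple proxy in the same spirit, not the paper's metric.

# Sketch of a simple pre-retrieval query specificity score based on average IDF.
import math

def avg_idf(query, corpus):
    """Average smoothed IDF of the query terms over a corpus of term lists."""
    n_docs = len(corpus)
    doc_sets = [set(doc) for doc in corpus]
    idfs = []
    for term in query:
        df = sum(1 for doc in doc_sets if term in doc)
        idfs.append(math.log((n_docs + 1) / (df + 1)))
    return sum(idfs) / len(idfs) if idfs else 0.0

corpus = [
    ["parse", "xml", "config", "file"],
    ["render", "button", "widget"],
    ["parse", "json", "request"],
]
# Compare the scores of two candidate queries before spending effort on the search.
print(avg_idf(["parse", "widget"], corpus))
print(avg_idf(["parse", "file"], corpus))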

Co-adapting Human Collaborations and Software Architectures
Christoph Dorn and Richard N. Taylor
(UC Irvine, USA)
Human collaboration has become an integral part of large-scale systems for massive online knowledge sharing, content distribution, and social networking. Maintenance of these complex systems, however, still relies on adaptation mechanisms that remain unaware of the prevailing user collaboration patterns. Consequently, a system cannot react to changes in interaction behavior, thereby impeding the collaboration's evolution. In this paper, we make the case for a human architecture model and its mapping onto software architecture elements as fundamental building blocks for system adaptation.

Release Engineering Practices and Pitfalls
Hyrum K. Wright and Dewayne E. Perry
(University of Texas at Austin, USA)
The release and deployment phase of the software development process is often overlooked as part of broader software engineering research. In this paper, we discuss early results from a set of multiple semi-structured interviews with practicing release engineers. Subjects for the interviews are drawn from a number of different commercial software development organizations, and our interviews focus on why release process faults and failures occur, how organizations recover from them, and how they can be predicted, avoided or prevented in the future. Along the way, the interviews provide insight into the state of release engineering today, and interesting relationships between software architecture and release processes.

Augmented Intelligence - The New AI - Unleashing Human Capabilities in Knowledge Work
James M. Corrigan
(Stony Brook University, USA)
In this paper I describe a novel application of contemplative techniques to software engineering, with the goal of augmenting the intellectual capabilities of knowledge workers in the field in four areas: flexibility, attention, creativity, and trust. The augmentation of software engineers’ intellectual capabilities is proposed as a third complement to the traditional focus of methodologies on the process and environmental factors of the software development endeavor. I argue that these capabilities have been shown to be open to improvement through practices traditionally used in spiritual traditions and now increasingly adopted in other fields of knowledge work, such as the medical profession and education. Historically, the intellectual capabilities of software engineers have been treated as a given within any particular software development effort. I argue that this aspect is ripe for inclusion within software development methodologies.

NIER for Mining Product and Process Data
Thu, Jun 7, 10:45 - 12:45

On How Often Code Is Cloned across Repositories
Niko Schwarz, Mircea Lungu, and Romain Robbes
(University of Bern, Switzerland; University of Chile, Chile)
Detecting code duplication in large code bases, or even across project boundaries, is problematic due to the massive amount of data involved. Large-scale clone detection also opens new challenges beyond asking for the provenance of a single clone fragment, such as assessing the prevalence of code clones on the entire code base, and their evolution.
We propose a set of lightweight techniques that may scale up to very large amounts of source code in the presence of multiple versions. The common idea behind these techniques is to use bad hashing to get a quick answer. We report on a case study, the Squeaksource ecosystem, which features thousands of software projects, with more than 40 million versions of methods, across more than seven years of evolution. We provide estimates for the prevalence of type-1, type-2, and type-3 clones in Squeaksource.
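
A hedged sketch of the underlying idea: a cheap, lossy ("bad") hash makes near-identical method versions collide, so clone candidates can be bucketed quickly at scale. The normalization and hash granularity below are illustrative choices, not the paper's.

# Bucket methods by a lossy hash of a normalized token prefix (clone candidates collide).
import re
from collections import defaultdict

def normalize(source):
    """Strip comments and lowercase tokens so near-identical methods hash alike."""
    source = re.sub(r"//[^\n]*|/\*.*?\*/", "", source, flags=re.DOTALL)
    tokens = re.findall(r"\w+|\S", source)
    return tuple(t.lower() for t in tokens)

def bucket_methods(methods):
    """methods: dict name -> source. Returns buckets with more than one member."""
    buckets = defaultdict(list)
    for name, src in methods.items():
        buckets[hash(normalize(src)[:20])].append(name)   # cheap, lossy hash
    return {h: names for h, names in buckets.items() if len(names) > 1}

methods = {
    "Acct.sum": "int sum(int a, int b) { return a + b; }",
    "Util.sum": "int Sum(int A, int B) { return A + B; } // copied",
    "Acct.mul": "int mul(int a, int b) { return a * b; }",
}
print(bucket_methods(methods))   # the two copies land in the same bucket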

Mining Input Sanitization Patterns for Predicting SQL Injection and Cross Site Scripting Vulnerabilities
Lwin Khin Shar and Hee Beng Kuan Tan
(Nanyang Technological University, Singapore)
Static code attributes such as lines of code and cyclomatic complexity have been shown to be useful indicators of defects in software modules. As web applications adopt input sanitization routines to prevent web security risks, static code attributes that represent the characteristics of these routines may be useful for predicting web application vulnerabilities. In this paper, we classify various input sanitization methods into different types and propose a set of static code attributes that represent these types. Then we use data mining methods to predict SQL injection and cross site scripting vulnerabilities in web applications. Preliminary experiments show that our proposed attributes are important indicators of such vulnerabilities.
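
Purely as an illustration of the prediction step, the sketch below trains an off-the-shelf classifier on made-up sanitization attributes; the attribute set, data, and choice of classifier are assumptions, not the paper's setup.

# Illustrative sketch: predict vulnerable sinks from sanitization-related static attributes.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical columns: [uses_std_escaping, uses_custom_regex_check, numeric_cast, no_sanitization]
X = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
]
y = [0, 0, 0, 1, 1, 0]   # 1 = vulnerable sink (SQLI/XSS), 0 = safe

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.predict([[0, 0, 0, 1], [1, 0, 0, 0]]))   # flags the unsanitized sink only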

Inferring Developer Expertise through Defect Analysis
Tung Thanh Nguyen, Tien N. Nguyen, Evelyn Duesterwald, Tim Klinger, and Peter Santhanam
(Iowa State University, USA; IBM Research, USA)
Fixing defects is an essential software development activity. For commercial software vendors, the time to repair defects in deployed business-critical software products or applications is a key quality metric for sustained customer satisfaction. In this paper, we report on the analysis of about 1,500 defect records from an IBM middleware product collected over a five-year period. The analysis includes a characterization of each repaired defect by topic and a ranking of developers by inferred expertise on each topic. We find clear evidence that defect resolution time is strongly influenced by the specific developer and his/her expertise in the defect's topic. To validate our approach, we conducted interviews with the product’s manager who provided us with his own ranking of developer expertise for comparison. We argue that our automated developer expertise ranking can be beneficial in the planning of a software project and is applicable beyond software support in the other phases of the software lifecycle.

Green Mining: Investigating Power Consumption across Versions
Abram Hindle
(University of Alberta, Canada)
Power consumption is increasingly becoming a concern not only for electrical engineers but for software engineers as well, due to the increasing popularity of power-limited contexts such as mobile computing, smartphones, and cloud computing. Software changes can alter software power consumption behaviour and can cause power performance regressions. By tracking software power consumption we can build models to provide suggestions to avoid power regressions. There is much research on software power consumption, but little focus on the relationship between software changes and power consumption. Most work measures the power consumption of a single software task; instead, we seek to extend this work across the history (revisions) of a project. We develop a set of tests for a well-established product and then run those tests across all versions of the product while recording the power usage of these tests. We provide and demonstrate a methodology that enables the analysis of power consumption performance for over 500 nightly builds of Firefox 3.6; we show that software change does induce changes in power consumption. This methodology and case study are a first step towards combining power measurement and mining software repositories research, thus enabling developers to avoid power regressions via power consumption awareness.

Multi-label Software Behavior Learning
Yang Feng and Zhenyu Chen
(Nanjing University, China)
Software behavior learning is an important task in software engineering. Software behavior is usually represented as a program execution. It is expected that similar executions exhibit similar behavior, i.e., reveal the same faults. Existing efforts use single-label learning to assign a single label (fault) to a failing execution. However, a failing execution may be caused by several faults simultaneously. Hence, multiple labels need to be assigned to support software engineering tasks in practice. In this paper, we present multi-label software behavior learning. A well-known multi-label learning algorithm, ML-KNN, is applied to achieve comprehensive learning of software behavior. We conducted a preliminary experiment on two industrial programs: flex and grep. The experimental results show that multi-label learning can produce more precise and complete results than single-label learning.
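
To make the idea concrete, here is a simplified multi-label nearest-neighbour sketch in the spirit of ML-KNN (the real algorithm additionally uses Bayesian reasoning over neighbour label counts): a failing execution receives every fault label shared by a majority of its nearest neighbours. The execution profiles and fault labels are toy data.

# Simplified multi-label kNN: assign all fault labels held by a majority of neighbours.
import numpy as np

def knn_multilabel(train_X, train_Y, x, k=3):
    """train_X: execution profiles, train_Y: sets of fault labels, x: new failing run."""
    dists = np.linalg.norm(np.asarray(train_X) - np.asarray(x), axis=1)
    neighbours = np.argsort(dists)[:k]
    votes = {}
    for i in neighbours:
        for label in train_Y[i]:
            votes[label] = votes.get(label, 0) + 1
    return {label for label, v in votes.items() if v > k / 2}

train_X = [[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 1]]
train_Y = [{"fault1"}, {"fault1", "fault3"}, {"fault2"}, {"fault2", "fault3"}]
print(knn_multilabel(train_X, train_Y, [1, 0, 1, 1]))   # labels shared by most neighbours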

Trends in Object-Oriented Software Evolution: Investigating Network Properties
Alexander Chatzigeorgiou and George Melas
(University of Macedonia, Greece)
The rise of social networks and the accompanying interest to study their evolution has stimulated a number of research efforts to analyze their growth patterns by means of network analysis. The inherent graph-like structure of object-oriented systems calls for the application of the corresponding methods and tools to analyze software evolution. In this paper we investigate network properties of two open-source systems and observe interesting phenomena regarding their growth. Relating the observed evolutionary trends to principles and laws of software design enables a high-level assessment of tendencies in the underlying design quality.
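
As a sketch of this kind of analysis (toy data, not the systems studied in the paper), the snippet below builds a class-dependency graph per release and reports a few network properties that can be tracked across releases.

# Track simple network properties of a class dependency graph over releases.
import networkx as nx

def release_metrics(dependencies):
    """dependencies: iterable of (client_class, supplier_class) pairs for one release."""
    g = nx.DiGraph(dependencies)
    out_degrees = [d for _, d in g.out_degree()]
    return {
        "classes": g.number_of_nodes(),
        "dependencies": g.number_of_edges(),
        "max_out_degree": max(out_degrees),
        "avg_clustering": nx.average_clustering(g.to_undirected()),
    }

release_1 = [("OrderService", "Order"), ("OrderService", "Customer"), ("Order", "Customer")]
release_2 = release_1 + [("Invoice", "Order"), ("Invoice", "Customer"),
                         ("OrderService", "Invoice")]
for i, rel in enumerate([release_1, release_2], start=1):
    print(f"release {i}:", release_metrics(rel))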

Exploring Techniques for Rationale Extraction from Existing Documents
Benjamin Rogers, James Gung, Yechen Qiao, and Janet E. Burge
(Miami University, USA)
The rationale for a software system captures the designers’ and developers’ intent behind the decisions made during its development. This information has many potential uses but is typically not captured explicitly. This paper describes an initial investigation into the use of text mining and parsing techniques for identifying rationale from existing documents. Initial results indicate that the use of linguistic features results in better precision but significantly lower recall than using text mining.

NIER to Leverage Social Aspects
Fri, Jun 8, 08:45 - 10:15

Continuous Social Screencasting to Facilitate Software Tool Discovery
Emerson Murphy-Hill
(North Carolina State University, USA)
The wide variety of software development tools available today has great potential to improve the way developers make software, but that potential goes unfulfilled when developers are not aware of useful tools. In this paper, I introduce the idea of continuous social screencasting, a novel mechanism to help developers gain awareness of relevant tools by enabling them to learn remotely and asynchronously from their peers. The idea builds on the strengths of several existing techniques that developers already use for discovering new tools, including screencasts and online social networks.

UDesignIt: Towards Social Media for Community-Driven Design
Phil Greenwood, Awais Rashid, and James Walkerdine
(Lancaster University, UK)
Online social networks are now commonplace in day-to-day life. They are also increasingly used to drive social action initiatives, led either by government or by communities themselves (e.g., SeeClickFix, LoveLewisham.org, mumsnet). However, such initiatives are mainly used for crowd-sourcing community views or coordinating activities. With the changing global economic and political landscape, there is an ever-pressing need to engage citizens on a large scale, not only in consultations about the systems that affect them, but also directly in the design of these very systems. In this paper we present the UDesignIt platform, which combines social media technologies with software engineering concepts to empower communities to discuss and extract high-level design features. It combines natural language processing, feature modelling, and visual overlays in the form of ``image clouds'' to enable communities and software engineers alike to unlock the knowledge contained in the unstructured and unfiltered content of social media, where people discuss social problems and their solutions. By automatically extracting key themes and presenting them in a structured and organised manner in near real-time, the approach drives a shift towards large-scale engagement of community stakeholders in system design.

Influencing the Adoption of Software Engineering Methods Using Social Software
Leif Singer and Kurt Schneider
(Leibniz Universität Hannover, Germany)
Software engineering research and practice provide a wealth of methods that improve the quality of software and lower the cost of producing it. Yet even when processes mandate their use, these methods are not employed consistently. Software developers and development organizations thus cannot fully benefit from them. We propose a method that, for a given software engineering method, provides instructions on how to improve its adoption using social software. This approach leverages the intrinsic motivation of software developers rather than prescribing behavior. As a result, we believe that software engineering methods will be applied better and more frequently.

Toward Actionable, Broadly Accessible Contests in Software Engineering
Jane Cleland-Huang, Yonghee Shin, Ed Keenan, Adam Czauderna, Greg Leach, Evan Moritz, Malcom Gethers, Denys PoshyvanykORCID logo, Jane Huffman Hayes, and Wenbin Li
(DePaul University, USA; College of William and Mary, USA; University of Kentucky, USA)
Software Engineering challenges and contests are becoming increasingly popular for focusing researchers' efforts on particular problems. Such contests tend to follow either an exploratory model, in which the contest holders provide data and ask the contestants to discover ``interesting things'' they can do with it, or task-oriented contests in which contestants must perform a specific task on a provided dataset. Only occasionally do contests provide more rigorous evaluation mechanisms that precisely specify the task to be performed and the metrics that will be used to evaluate the results. In this paper, we propose actionable and crowd-sourced contests: actionable because the contest describes a precise task, datasets, and evaluation metrics, and also provides a downloadable operating environment for the contest; and crowd-sourced because providing these features creates accessibility to Information Technology hobbyists and students who are attracted by the challenge. Our proposed approach is illustrated using research challenges from the software traceability area as well as an experimental workbench named TraceLab.

CodeTimeline: Storytelling with Versioning Data
Adrian Kuhn and Mirko Stocker
(University of British Columbia, Canada; University of Applied Sciences Rapperswil, Switzerland)
Working with a software system typically requires knowledge of the system's history; however, this knowledge is often only the tribal memory of the development team. In past user studies we observed that, when presented with collaboration views and word clouds from the system's history, engineers start sharing memories linked to those visualizations. In this paper we propose an approach based on a storytelling visualization, designed to entice engineers to share and document their tribal memory. Sticky notes can be used to share memories of a system's lifetime events, such as past design rationales, but also more casual memories like pictures from an after-work beer or a hackathon. We present an early-stage prototype implementation and include two design studies created using that prototype.

NIER for Verification and Evolution
Fri, Jun 8, 10:45 - 12:45

Analyzing Multi-agent Systems with Probabilistic Model Checking Approach
Songzheng Song, Jianye Hao, Yang Liu, Jun Sun, Ho-Fung Leung, and Jin Song Dong
(National University of Singapore, Singapore; Chinese University of Hong Kong, China; University of Technology and Design, Singapore)
Multi-agent systems, which are composed of autonomous agents, have been successfully employed as a modeling paradigm in many scenarios. However, it is challenging to guarantee the correctness of their behaviors due to the complex nature of the autonomous agents, especially when they have stochastic characteristics. In this work, we propose to apply probabilistic model checking to analyze multi-agent systems. A modeling language called PMA is defined to specify such systems, and LTL properties and the logic of knowledge, combined with probabilistic requirements, are supported for analyzing system behaviors. An initial evaluation indicates the effectiveness of our current progress; meanwhile, some challenges and possible solutions are discussed as ongoing work.

Brace: An Assertion Framework for Debugging Cyber-Physical Systems
Kevin Boos, Chien-Liang Fok, Christine Julien, and Miryung Kim
(University of Texas at Austin, USA)
Developing cyber-physical systems (CPS) is challenging because correctness depends on both logical and physical states, which are collectively difficult to observe. Developers often need to repeatedly rerun the system while observing its behavior, tweaking the hardware and software until it meets minimum requirements. This process is tedious, error-prone, and lacks rigor. To address this, we propose BRACE, a framework that simplifies the process by enabling developers to correlate cyber (i.e., logical) and physical properties of the system via assertions. This paper presents our initial investigation into the requirements and semantics of such assertions, which we call CPS assertions. We discuss our experience implementing and using the framework with a mobile robot, and highlight key future research challenges.

Augmenting Test Suites Effectiveness by Increasing Output Diversity
Nadia Alshahwan and Mark HarmanORCID logo
(University College London, UK)
The uniqueness (or otherwise) of test outputs ought to have a bearing on test effectiveness, yet it has not previously been studied. In this paper we introduce a novel test suite adequacy criterion based on output uniqueness. We propose four definitions of output uniqueness with varying degrees of strictness. We present a preliminary evaluation for web application testing that confirms that output uniqueness enhances fault-finding effectiveness. The approach outperforms random augmentation in fault-finding ability by an overall average of 280% in five medium-sized, real-world web applications.
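
As a concrete illustration, using only the loosest whole-output notion of uniqueness (the paper defines four levels of strictness), the sketch below scores a suite by the fraction of distinct outputs and greedily keeps candidate tests whose output has not been observed yet.

# Output-uniqueness score and a greedy augmentation step based on it.

def output_uniqueness(outputs):
    """Fraction of test outputs that are distinct within the suite."""
    return len(set(outputs)) / len(outputs)

def augment(existing_outputs, candidates, run):
    """candidates: test inputs; run: function executing a test and returning its output."""
    seen, selected = set(existing_outputs), []
    for test in candidates:
        out = run(test)
        if out not in seen:
            seen.add(out)
            selected.append(test)
    return selected

run = lambda n: "even" if n % 2 == 0 else "odd"      # toy application under test
existing = [run(t) for t in (2, 4)]
print(output_uniqueness(existing))                    # 0.5: the suite repeats an output
print(augment(existing, [6, 7, 9], run))              # keeps 7 only; it adds a new output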

Improving IDE Recommendations by Considering Global Implications of Existing Recommendations
Kıvanç Muşlu, Yuriy Brun, Reid Holmes, Michael D. ErnstORCID logo, and David Notkin
(University of Washington, USA; University of Waterloo, Canada)
Modern integrated development environments (IDEs) offer recommendations to aid development, such as auto-completions, refactorings, and fixes for compilation errors. Recommendations for each code location are typically computed independently of the other locations. We propose that an IDE should consider the whole codebase, not just the local context, before offering recommendations for a particular location. We demonstrate the potential benefits of our technique by presenting four concrete scenarios in which the Eclipse IDE fails to provide proper Quick Fixes at relevant locations, even though it offers those fixes at other locations. We describe a technique that can augment an existing IDE’s recommendations to account for non-local information. For example, when some compilation errors depend on others, our technique helps the developer decide which errors to resolve first.

Towards Flexible Evolution of Dynamically Adaptive Systems
Gilles Perrouin, Brice Morin, Franck Chauvel, Franck Fleurey, Jacques KleinORCID logo, Yves Le Traon, Olivier Barais, and Jean-Marc Jézéquel
(University of Namur, Belgium; SINTEF, Norway; University of Luxembourg, Luxembourg; IRISA, France)
Modern software systems need to be continuously available under varying conditions. Their ability to dynamically adapt to their execution context is thus increasingly seen as a key to their success. Recently, many approaches have been proposed to design and support the execution of Dynamically Adaptive Systems (DAS). However, the ability of a DAS to evolve is limited to the addition, update, or removal of adaptation rules or reconfiguration scripts. These artifacts are very specific to the control loop managing such a DAS, and runtime evolution of the DAS requirements may affect other parts of the DAS. In this paper, we argue for evolving all parts of the loop. We suggest leveraging recent advances in model-driven techniques to offer an approach that supports the evolution of both systems and their adaptation capabilities. The basic idea is to consider the control loop itself as an adaptive system.

Towards Business Processes Orchestrating the Physical Enterprise with Wireless Sensor Networks
Fabio Casati, Florian Daniel, Guenadi Dantchev, Joakim Eriksson, Niclas Finne, Stamatis Karnouskos, Patricio Moreno Montero, Luca Mottola, Felix Jonathan Oppermann, Gian Pietro Picco, Antonio Quartulli, Kay Römer, Patrik Spiess, Stefano Tranquillini, and Thiemo Voigt
(University of Trento, Italy; SAP, Germany; Swedish Institute of Computer Science, Sweden; Acciona Infraestructuras, Spain; University of Lübeck, Germany)
The industrial adoption of wireless sensor networks (WSNs) is hampered by two main factors. First, there is a lack of integration of WSNs with business process modeling languages and back-ends. Second, programming WSNs is still challenging as it is mainly performed at the operating system level. To this end, we provide makeSense: a unified programming framework and a compilation chain that, from high-level business process specifications, generates code ready for deployment on WSN nodes.

Engineering and Verifying Requirements for Programmable Self-Assembling Nanomachines
Robyn Lutz, Jack Lutz, James Lathrop, Titus Klinge, Eric Henderson, Divita Mathur, and Dalia Abo Sheasha
(Iowa State University, USA; California Institute of Technology, USA)
We propose an extension of van Lamsweerde’s goal-oriented requirements engineering to the domain of programmable DNA nanotechnology. This is a domain in which individual devices (agents) are at most a few dozen nanometers in diameter. These devices are programmed to assemble themselves from molecular components and perform their assigned tasks. The devices carry out their tasks in the probabilistic world of chemical kinetics, so they are individually error-prone. However, the number of devices deployed is roughly on the order of a nanomole, and some goals are achieved when enough of these agents achieve their assigned subgoals. We show that it is useful in this setting to augment the AND/OR goal diagrams to allow goal refinements that are mediated by threshold functions, rather than ANDs or ORs. We illustrate this method by engineering requirements for a system of molecular detectors (DNA origami “pliers” that capture target molecules) invented by Kuzuya, Sakai, Yamazaki, Xu, and Komiyama (2011). We model this system in the Prism probabilistic symbolic model checker, and we use Prism to verify that requirements are satisfied. This gives prima facie evidence that software engineering methods can be used to make DNA nanotechnology more productive, predictable and safe.
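
The key modeling idea, a threshold-mediated refinement of a goal over many unreliable agents, can be illustrated with a short probability computation (the numbers are illustrative; the paper's actual analyses are carried out with the Prism model checker):

# Probability that a threshold goal holds: at least K of N independent agents succeed.
from math import comb

def prob_at_least(n, k, p):
    """P(at least k of n independent agents succeed), each with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# e.g. 100 detector devices, each succeeding with probability 0.6,
# and a parent goal requiring at least 50 of them.
print(prob_at_least(100, 50, 0.6))   # roughly 0.98: unreliable parts, reliable aggregate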


Formal Research Demonstrations

Formal Demos 1
Wed, Jun 6, 16:00 - 18:00

Facilitating Communication between Engineers with CARES
Anja Guzzi and Andrew Begel
(TU Delft, Netherlands; Microsoft Research, USA)
When software developers need to exchange information or coordinate work with colleagues on other teams, they are often faced with the challenge of finding the right person to communicate with. In this paper, we present our tool, CARES (Colleagues and Relevant Engineers’ Support), an integrated development environment (IDE) based tool that enables engineers to easily discover and communicate with the people who have contributed to the source code. CARES has been deployed to 30 professional developers, and we interviewed 8 of them after 3 weeks of evaluation. They reported that CARES helped them to more quickly find, choose, and initiate contact with the most relevant and expedient person who could address their needs.

Interactive Refinement of Combinatorial Test Plans
Itai Segall and Rachel Tzoref-BrillORCID logo
(IBM Research, Israel)
Combinatorial test design (CTD) is an effective test planning technique that reveals faulty feature interactions in a given system. The test space is modeled by a set of parameters, their respective values, and restrictions on the value combinations. A subset of the test space is then automatically constructed so that it covers all valid value combinations of every t parameters, where t is a user input.
When applying CTD to real-life testing problems, it often occurs that the result of CTD cannot be used as is, and manual modifications to the tests are performed. One example is very limited resources that significantly reduce the number of tests that can be used. Another example is complex restrictions that are not captured in the model of the test space. The main concern is that manually modifying the result of CTD might introduce coverage gaps that the user is unaware of. In this paper we present a tool that supports interactive modification of a combinatorial test plan, both manually and with tool assistance. For each modification, the tool displays the new coverage gaps that will be introduced, enabling the user to make informed decisions on what to include in the final set of tests.
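
A hedged sketch of the core check such a tool performs: given the original plan and a user-modified plan, report which pairwise (t = 2) parameter-value combinations are no longer covered. The parameters and values are invented, and the real tool also handles restrictions and higher interaction strengths.

# Report the pairwise coverage gaps introduced by modifying a combinatorial test plan.
from itertools import combinations

def pairs(test):
    """All parameter-value pairs appearing in one test (test: dict param -> value)."""
    return {frozenset(kv) for kv in combinations(sorted(test.items()), 2)}

def coverage_gaps(original_plan, modified_plan):
    covered_before = set().union(*(pairs(t) for t in original_plan))
    covered_after = set().union(*(pairs(t) for t in modified_plan)) if modified_plan else set()
    return covered_before - covered_after

plan = [
    {"os": "linux", "browser": "firefox", "db": "mysql"},
    {"os": "windows", "browser": "chrome", "db": "mysql"},
    {"os": "linux", "browser": "chrome", "db": "postgres"},
]
modified = plan[:2]                      # the user removed the third test
for gap in coverage_gaps(plan, modified):
    print("no longer covered:", dict(gap))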

TraceLab: An Experimental Workbench for Equipping Researchers to Innovate, Synthesize, and Comparatively Evaluate Traceability Solutions
Ed Keenan, Adam Czauderna, Greg Leach, Jane Cleland-Huang, Yonghee Shin, Evan Moritz, Malcom Gethers, Denys PoshyvanykORCID logo, Jonathan Maletic, Jane Huffman Hayes, Alex Dekhtyar, Daria Manukian, Shervin Hossein, and Derek Hearn
(DePaul University, USA; College of William and Mary, USA; Kent State University, USA; University of Kentucky, USA; CalPoly, USA)
TraceLab is designed to empower future traceability research by facilitating innovation and creativity, increasing collaboration between researchers, decreasing the startup costs and effort of new traceability research projects, and fostering technology transfer. To this end, it provides an experimental environment in which researchers can design and execute experiments in TraceLab's visual modeling environment using a library of reusable and user-defined components. TraceLab fosters research competitions by allowing researchers or industrial sponsors to launch research contests intended to focus attention on compelling traceability challenges. Contests are centered around specific traceability tasks, performed on publicly available datasets, and are evaluated using standard metrics incorporated into reusable TraceLab components. TraceLab has been released in beta-test mode to researchers at seven universities and will be publicly released via CoEST.org in the summer of 2012. Furthermore, by late 2012 TraceLab's source code will be released as open source software, licensed under the GPL. TraceLab currently runs on Windows but is designed with cross-platform portability in mind to allow easy ports to Unix and Mac environments.

Specification Engineering and Modular Verification Using a Web-Integrated Verifying Compiler
Charles T. Cook, Heather Harton, Hampton Smith, and Murali Sitaraman
(Clemson University, USA)
This demonstration will present the RESOLVE web-integrated environment, which has been especially built to capture component relationships and allow construction and composition of verified generic components. The environment facilitates team-based software development and has been used in undergraduate CS education at multiple institutions. The environment makes it easy to simulate “what if” scenarios, including the impact of alternative specification styles on verification, and has spawned much research and experimentation. The demonstration will illustrate the issues in generic software verification and the role of higher-order assertions. It will show how logical errors are pinpointed when verification fails. Introductory video URL: http://www.youtube.com/watch?v=9vg3WuxeOkA

Writing Dynamic Service Orchestrations with DSOL
Leandro Sales Pinto, Gianpaolo Cugola, and Carlo Ghezzi
(Politecnico di Milano, Italy)
We present the workflow language DSOL, its runtime system and the tools available to support the development of dynamic service orchestrations. DSOL aims at supporting dynamic, self-managed service compositions that can adapt to changes occurring at runtime.

MASH: A Tool for End-User Plug-In Composition
Leonardo Mariani and Fabrizio Pastore
(University of Milano-Bicocca, Italy)
Most modern Integrated Development Environments are built on plug-in based architectures that can be extended with additional functionalities and plug-ins according to user needs. However, extending an IDE is still a possibility restricted to developers with deep knowledge of the specific development environment and its architecture. In this paper we present MASH, a tool that eases the programming of Integrated Development Environments. The tool supports the definition of workflows that can be quickly designed to integrate functionalities offered by multiple plug-ins, without requiring any knowledge of the internal architecture of the IDE. Workflows can be easily reshaped every time an analysis must be modified, without producing Java code or deploying components in the IDE. Early results suggest that this approach can effectively facilitate programming of IDEs.

BabelRef: Detection and Renaming Tool for Cross-Language Program Entities in Dynamic Web Applications
Hung Viet Nguyen, Hoan Anh Nguyen, Tung Thanh Nguyen, and Tien N. Nguyen
(Iowa State University, USA)
In a dynamic web application, client-side code is often dynamically generated from server-side code. Client-side program entities such as HTML presentation elements and Javascript functions/variables are embedded within server-side string literals or variables' values. However, existing tools for code maintenance such as automatic renaming support only work for program entities in a single language on either the server side or the client side. In this paper, we introduce BabelRef, a novel tool that is able to automatically identify and rename client-side program entities and their references that are embedded within server-side code.

MDSheet: A Framework for Model-Driven Spreadsheet Engineering
Jácome Cunha, João Paulo Fernandes, Jorge Mendes, and João Saraiva
(University of Minho, Portugal; University of Porto, Portugal)
In this paper, we present MDSheet, a framework for the embedding, evolution and inference of spreadsheet models. This framework offers a model-driven software development mechanism for spreadsheet users.

Formal Demos 2
Fri, Jun 8, 10:45 - 12:45

WorkItemExplorer: Visualizing Software Development Tasks Using an Interactive Exploration Environment
Christoph Treude, Patrick Gorman, Lars Grammel, and Margaret-Anne Storey
(University of Victoria, Canada)
This demo introduces WorkItemExplorer, an interactive environment to visually explore data from software development tasks. WorkItemExplorer enables developers and managers to investigate activity and correlations in their task management system by making data exploration flexible and interactive, and by utilizing multiple coordinated views. Our preliminary evaluation shows that WorkItemExplorer is able to answer questions that developers ask, while also enabling them to gain new insights through the free exploration of data.

Runtime Monitoring of Component Changes with Spy@Runtime
Carlo Ghezzi, Andrea Mocci, and Mario Sangiorgio
(Politecnico di Milano, Italy; MIT, USA)
We present SPY@RUNTIME, a tool to infer and work with behavior models. SPY@RUNTIME generates models through a dynamic black box approach and is able to keep them updated with observations coming from actual system execution. We also show how to use models describing the protocol of interaction of a software component to detect and report functional changes as soon as they are discovered. Monitoring functional properties is particularly useful in an open environment in which there is a distributed ownership of a software system. Parts of the system may be changed independently and therefore it becomes necessary to monitor the component’s behavior at run time.

GraPacc: A Graph-Based Pattern-Oriented, Context-Sensitive Code Completion Tool
Anh Tuan Nguyen, Hoan Anh Nguyen, Tung Thanh Nguyen, and Tien N. Nguyen
(Iowa State University, USA)
Code completion tools play an important role in daily development activities. They help developers by auto-completing tedious and detailed code during an editing session. However, existing code completion tools are limited to recommending only context-free code templates and a single method call of the variable under editing. We introduce GraPacc, an advanced, context-sensitive code completion tool that is based on frequent API usage patterns. It extracts context-sensitive features from the code under editing, e.g., the API elements in focus and the current editing point, and their relations to other code elements. It then ranks the relevant API usage patterns and auto-completes the current code with the proper elements according to the chosen pattern.

Code Bubbles: A Practical Working-Set Programming Environment
Steven P. Reiss, Jared N. Bott, and Joseph J. LaViola, Jr.
(Brown University, USA; University of Central Florida, USA)
Our original work on the Code Bubbles environment demonstrated that a working-set based framework for software development showed promise. We have spent the past several years extending the underlying concepts into a fully-functional system. In our demonstration, we will show the current Code Bubbles environment for Java, how it works, how it can be used, and why we prefer it over more traditional programming environments. We will also show how we have extended the framework to enhance software development tasks such as complex debugging, testing, and collaboration. This paper describes the features we will demonstrate.

EVOSS: A Tool for Managing the Evolution of Free and Open Source Software Systems
Davide Di Ruscio, Patrizio Pelliccione, and Alfonso Pierantonio
(University of L'Aquila, Italy)
Software systems increasingly need to deal with continuous evolution. In this paper we present EVOSS, a tool defined to support the upgrade of free and open source software systems. EVOSS is composed of a simulator and a fault-detector component. The simulator is able to predict failures before they can affect the real system. The fault-detector component has been defined to discover inconsistencies in the system configuration model. EVOSS improves on the state of the art: current tools are able to predict only a very limited set of upgrade faults and leave a wide range of faults unpredicted.

Supporting Extract Class Refactoring in Eclipse: The ARIES Project
Gabriele Bavota, Andrea De LuciaORCID logo, Andrian Marcus, Rocco Oliveto, and Fabio Palomba
(University of Salerno, Italy; Wayne State University, USA; University of Molise, Italy)
During software evolution changes are inevitable. These changes may lead to design erosion and the introduction of inadequate design solutions, such as design antipatterns. Several empirical studies provide evidence that the presence of antipatterns is generally associated with lower productivity, greater rework, and more significant design efforts for developers. In order to improve the quality and remove antipatterns, refactoring operations are needed. In this demo, we present the Extract class features of ARIES (Automated Refactoring In EclipSe), an Eclipse plug-in that supports the software engineer in removing the “Blob” antipattern.

EXSYST: Search-Based GUI Testing
Florian Gross, Gordon Fraser, and Andreas Zeller
(Saarland University, Germany)
Test generation tools commonly aim to cover structural artefacts of software, such as either the source code or the user interface. However, focusing only on source code can lead to unrealistic or irrelevant test cases, while only exploring a user interface often misses much of the underlying program behavior. Our EXSYST prototype takes a new approach by exploring user interfaces while aiming to maximize code coverage, thus combining the best of both worlds. Experiments show that such an approach can achieve high code coverage, matching and exceeding that of traditional unit-based test generators; yet, by construction, every test case is realistic and relevant, and every detected failure can be shown to be caused by a real sequence of input events.

JavaMOP: Efficient Parametric Runtime Monitoring Framework
Dongyun Jin, Patrick O’Neil Meredith, Choonghwan Lee, and Grigore Roşu ORCID logo
(University of Illinois at Urbana-Champaign, USA)
Runtime monitoring is a technique usable in all phases of the software development cycle, from initial testing, to debugging, to actually maintaining proper function in production code. Of particular importance are parametric monitoring systems, which allow the specification of properties that relate objects in a program, rather than only global properties. In the past decade, a number of parametric runtime monitoring systems have been developed. Here we give a demonstration of our system, JavaMOP. It is the only parametric monitoring system that allows multiple differing logical formalisms. It is also the most efficient in terms of runtime overhead, and very competitive with respect to memory usage.


Posters and Informal Demonstrations

Posters
Thu, Jun 7, 15:00 - 16:00

Augmenting Test Suites Automatically
Konstantin Rubinov and Jochen Wuttke
(University of Lugano, Switzerland; University of Washington, USA)
We present an approach to augment test suites with automatically generated integration test cases. Our approach utilizes existing test cases to direct generation towards producing complex object interactions and execution sequences that have not been observed before.

Using the GPGPU for Scaling Up Mining Software Repositories
Rina Nagano, Hiroki Nakamura, Yasutaka Kamei, Bram Adams, Kenji Hisazumi, Naoyasu Ubayashi, and Akira Fukuda
(Kyushu University, Japan; École Polytechnique de Montréal, Canada)
The Mining Software Repositories (MSR) field integrates and analyzes data stored in repositories such as source control and bug repositories to support practitioners. Given the abundance of repository data, scaling up MSR analyses has become a major challenge. Recently, researchers have experimented with conventional techniques like supercomputers or cloud computing, but these are either too expensive or too hard to configure. This paper proposes to scale up MSR analysis using ``general-purpose computing on graphics processing units'' (GPGPU) on off-the-shelf video cards. In a representative MSR case study measuring co-change on the version history of the Eclipse project, we find that the GPU approach is up to a factor of 43.9 faster than a CPU-only approach.
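
For readers unfamiliar with the analysis being accelerated, here is a CPU-only sketch of co-change counting: each commit is a binary file vector, and a matrix product counts how often file pairs changed together; it is this kind of dense linear algebra that maps well to a GPU. The history below is a toy example, not Eclipse.

# CPU sketch of co-change counting via a commit-by-file incidence matrix.
import numpy as np

files = ["A.java", "B.java", "C.java"]
# rows = commits, columns = files, 1 = file touched in that commit
commits = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
], dtype=np.int64)

cochange = commits.T @ commits          # [i, j] = number of commits touching both i and j
for i, j in zip(*np.triu_indices(len(files), k=1)):
    print(f"{files[i]} and {files[j]} co-changed {cochange[i, j]} times")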

FastFix: Monitoring Control for Remote Software Maintenance
Dennis Pagano, Miguel A. Juan, Alessandra Bagnato, Tobias Roehm, Bernd Brügge, and Walid Maalej
(TU Munich, Germany; S2 Grupo, Spain; TXT e-solutions, Italy)
Software maintenance and support services are key factors to the customer perception of software product quality. The overall goal of FastFix is to provide developers with a real-time maintenance environment that increases efficiency and reduces costs, improving accuracy in identification of failure causes and facilitating their resolution. To achieve this goal, FastFix observes application execution and user interaction at runtime. We give an overview of the functionality of FastFix and present one of its main application scenarios.

Modeling Cloud Performance with Kriging
Alessio Gambi and Giovanni Toffetti
(University of Lugano, Switzerland)
Cloud infrastructures allow service providers to implement elastic applications. These can be scaled at runtime to dynamically adjust their resource allocation and maintain a consistent quality of service in response to changing working conditions, such as flash crowds or periodic peaks. Providers need models that predict the system performance of different resource allocations to fully exploit dynamic application scaling. Traditional performance models such as linear models and queuing networks can be too simplistic for real Cloud applications; moreover, they are not robust to change. We propose a performance modelling approach that is practical for highly variable elastic applications in the Cloud and automatically adapts to changing working conditions. We show the effectiveness of the proposed approach for the synthesis of a self-adaptive controller.
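
To illustrate the modelling style (Kriging is closely related to Gaussian process regression), the hedged sketch below fits a GP to synthetic (request rate, allocated VMs) versus response-time data and queries it for candidate allocations. The features, data, and kernel are assumptions for the example, not the paper's setup.

# Kriging-style performance model via Gaussian process regression (synthetic data).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform([10, 1], [200, 8], size=(40, 2))          # [requests/s, #VMs]
y = 50 + 3 * X[:, 0] / X[:, 1] + rng.normal(0, 5, 40)     # synthetic response time (ms)

model = GaussianProcessRegressor(kernel=RBF([50.0, 2.0]) + WhiteKernel(), normalize_y=True)
model.fit(X, y)

candidate_allocations = np.array([[150, v] for v in range(1, 9)])
mean, std = model.predict(candidate_allocations, return_std=True)
for (rate, vms), m, s in zip(candidate_allocations, mean, std):
    print(f"{int(vms)} VMs at {int(rate)} req/s -> predicted {m:.0f} +/- {s:.0f} ms")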

SOA Adoption in the Italian Industry
Maurizio Leotta, Filippo Ricca ORCID logo, Marina Ribaudo, Gianna Reggio, Egidio Astesiano, and Tullio Vernazza
(Università di Genova, Italy)
We conducted a personal opinion survey in two rounds – in 2008 and 2011 – with the aim of investigating the level of knowledge and adoption of SOA in the Italian industry. We were also interested in understanding whether the trend of SOA is positive or negative, and which methods, technologies, and tools are actually used in industry. The main findings of this survey are the following: (1) SOA is a relevant phenomenon in Italy, (2) Web services and RESTful services are well known and used, and (3) orchestration languages and UDDI are little known and little used. These results suggest that in Italy SOA is interpreted more simplistically than its full definition (i.e., without the concepts of orchestration/choreography and registry). Currently, the adoption of SOA is medium/low with a stable/positive trend of pervasiveness.

A Bidirectional Model-Driven Spreadsheet Environment
Jácome Cunha, João Paulo Fernandes, Jorge Mendes, and João Saraiva
(University of Minho, Portugal)
In this extended abstract we present a bidirectional model-driven framework to develop spreadsheets. By being model-driven, our approach allows a spreadsheet model to be evolved while the data is automatically co-evolved. The bidirectional component achieves precisely the inverse: the data can be evolved and a new model to which the data conforms is obtained automatically.

A Self-Healing Technique for Java Applications
Antonio Carzaniga, Alessandra Gorla, Andrea Mattavelli, and Nicolò Perino
(University of Lugano, Switzerland)
Despite the best design practices and testing techniques, many faults exist and manifest themselves in deployed software. In this paper we propose a self-healing framework that aims to mask fault manifestations at runtime in Java applications by automatically applying workarounds. The framework integrates a checkpoint-recovery mechanism to restore a consistent state after the failure, and a mechanism to replace the Java code at runtime to apply the workaround.

When Open Source Turns Cold on Innovation - The Challenges of Navigating Licensing Complexities in New Research Domains
Christopher Forbes, Iman Keivanloo, and Juergen Rilling
(Concordia University, Canada)
In this poster, we review the limitations open source licences introduce to the application of Linked Data in Software Engineering. We investigate whether open source licences support special requirements to publish source code as Linked Data on the Internet.

Informal Demonstrations
Thu, Jun 7, 16:00 - 17:30

Language Modularity with the MPS Language Workbench
Markus Voelter and Vaclav Pech
(itemis, Germany; voelter ingenieurbuero fuer softwaretechnologie, Germany; JetBrains, USA)
JetBrains MPS is a comprehensive environment for language engineering. New languages can be defined as standalone languages or as modular extensions of existing languages. Since MPS is a projectional editor, syntactic forms other than text are possible, including tables or mathematical symbols. This demo will show MPS based on mbeddr C, a novel approach for embedded software development that makes use of incremental language extension on the basis of C.

Mining Application Repository to Recommend XML Configuration Snippets
Sheng Huang, Yi Qi Lu, Yanghua Xiao, and Wei Wang
(Fudan University, China)
Framework-based applications controlled by XML configuration files are widely used in current commercial applications. However, most of these frameworks are complex or not well documented, which makes it challenging for programmers to use them correctly. To overcome these difficulties, we propose a new tool that recommends XML configuration snippets automatically by mining tree patterns and pattern associations from the application repository, with the aim of assisting programmers in generating proper XML configurations during the production phase. In this demo, we showcase this tool by presenting the major techniques behind it and its typical usage scenarios.

Locating Features in Dynamically Configured Avionics Software
Maxime Ouellet, Ettore Merlo, Neset Sozen, and Martin Gagnon
(École Polytechnique de Montréal, Canada; CMC Electronics, Canada)
Locating features in software is an important activity for program comprehension and to support software reengineering. We present a novel automated approach to locate features in source code based on static analysis and model checking. The technique is aimed at dynamically configured software, which is software in which the activation of specific features is controlled by configuration variables. The approach is evaluated on an industrial avionics system.

Detecting Metadata Bugs on the Fly
Myoungkyu Song and Eli Tilevich
(Virginia Tech, USA)
Programmers are spending a large and increasing amount of their time writing and modifying metadata, such as Java annotations and XML deployment descriptors. And yet, automatic bug finding tools cannot find metadata-related bugs introduced during program refactoring and enhancement. To address this shortcoming, we have created metadata invariants, a new programming abstraction that expresses naming and typing relationships between metadata and the main source code of a program. A paper that appears in the main technical program of ICSE 2012 describes the idea, concept, and prototype of metadata invariants. The goal of this demo is to supplement that paper with a demonstration of our Eclipse plugin, Metadata Bug Finder (MBF). MBF takes as input a script written in our domain-specific language that describes a set of metadata coding conventions the programmer wishes to enforce. Then after each file save operation, MBF checks the edited codebase for the presence of any violations of the given metadata programming conventions. These violations are immediately reported to the programmer as potential metadata-related bugs. By making the programmer aware of these potential bugs, MBF prevents them from seeping into production, thereby improving the overall correctness of the edited codebase.
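
As an illustration of what a metadata invariant can express, the sketch below checks an invented rule tying a Java annotation to a class-naming convention; it is not the plugin's actual DSL or checker, just a minimal stand-in for the kind of violation MBF reports.

# Flag annotated Java classes whose names violate a naming convention tied to the annotation.
import re

RULE = {"annotation": "@WebServlet", "class_suffix": "Servlet"}   # hypothetical rule

def check_metadata_invariant(java_source, rule):
    """Return classes carrying the annotation whose names lack the required suffix."""
    pattern = re.compile(re.escape(rule["annotation"]) + r"[\s\S]*?class\s+(\w+)")
    return [name for name in pattern.findall(java_source)
            if not name.endswith(rule["class_suffix"])]

source = """
@WebServlet("/login")
public class LoginServlet { }

@WebServlet("/report")
public class ReportPage { }
"""
print(check_metadata_invariant(source, RULE))   # ['ReportPage'] violates the convention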

Blaze
Jan-Peter Krämer, Joachim Kurz, Thorsten Karrer, and Jan Borchers
(RWTH Aachen University, Germany)


ConTexter Feedback System
Tristan Wehrmaker, Stefan Gärtner, and Kurt Schneider
(Leibniz Universität Hannover, Germany)
Today’s large-scale software-intensive systems exhibit increasing complexity due to a broad spectrum of technical and socio-technical components. Due to the very dynamic character of such systems as well as fast-evolving technologies, most requirements cannot be planned a priori. To overcome this problem, we suggest a method that gathers end-user needs for requirements engineers at any time by applying a geographically deployed feedback system. End-user needs are gathered in situ by utilizing mobile devices. In this paper, we present the implementation of our feedback system, which enables end-users to submit feedback with smartphones at very low effort and cost.

xMapper: An Architecture-Implementation Mapping Tool
Yongjie Zheng and Richard N. Taylor
(UC Irvine, USA)
xMapper is an Eclipse-based tool that implements a new architecture-implementation mapping approach called 1.x-way mapping. xMapper is able to record various architecture changes during software development, and automatically map specific kinds of architecture changes to code in specific ways. In addition, xMapper supports the mapping of behavioral architecture specifications modeled as UML-like sequence diagrams and state diagrams.

ConcernReCS: Finding Code Smells in Software Aspectization
Péricles Alves, Diogo Santana, and Eduardo Figueiredo
(UFMG, Brazil)
Refactoring object-oriented (OO) code to aspects is an error-prone task. To support this task, this paper presents ConcernReCS, an Eclipse plug-in to help developers to avoid recurring mistakes during software aspectization. Based on a map of concerns, ConcernReCS automatically finds and reports error-prone scenarios in OO source code; i.e., before the concerns have been refactored to aspects.

Egidio: A Non-Invasive Approach for Synthesizing Organizational Models
Saulius Astromskis, Andrea Janes, and Alireza Rezaei Mahdiraji
(Free University of Bolzano, Italy)
To understand and improve processes in organizations, six key questions need to be answered: what, how, where, who, when, and why. Organizations with established processes have IT systems that gather information about some or all of these key questions. Software organizations usually have defined processes, but they typically lack information about how processes are actually executed. Moreover, there is no explicit information about process instances and activities. Existing process mining techniques face problems in coping with such an environment. We propose a tool, Egidio, which uses non-invasively collected data to build organizational models. In particular, we illustrate the tool within a software company, where it is able to extract different aspects of the development processes. The main contribution of Egidio is its ability to mine processes and organizational models from fine-grained data collected in a non-invasive manner, without interrupting the developers’ work.

SDiC: Context-Based Retrieval in Eclipse
Bruno Antunes, Joel Cordeiro, and Paulo Gomes
(University of Coimbra, Portugal)
While working in an IDE, developers typically deal with a large number of different artifacts at the same time. The software development process requires that they repeatedly switch between different artifacts, which often depends on searching for these artifacts in the source code structure. We propose a tool that integrates context-based search and recommendation of source code artifacts in Eclipse. The artifacts are collected from the workspace of the developer and represented using ontologies. A context model of the developer is used to improve search and give recommendations of these artifacts, which are ranked according to their relevance to the developer. The tool was tested by a group of developers and the results show that contextual information has an important role in retrieving relevant information for developers.

An Integrated Bug Processing Framework
Xiangyu Zhang, Mengxiang Lin, and Kai Yu
(Beihang University, China)
Software debugging starts with bug reports. Test engineers confirm bugs and determine the corresponding developers to fix them. However, the analysis of bug reports is time-consuming and manual inspection is difficult and tedious. To improve the efficiency of the whole process, we propose a bug processing framework that integrates bug report analysis and fault localization. An instance of the framework is implemented for regression faults. Preliminary results on a large open source application demonstrate both efficiency and effectiveness.

Repository for Model Driven Development (ReMoDD)
Robert B. France, James M. Bieman, Sai Pradeep Mandalaparty, Betty H. C. Cheng, and Adam C. Jensen
(Colorado State University, USA; Michigan State University, USA)
The Repository for Model-Driven Development (ReMoDD) contains artifacts that support Model-Driven Development (MDD) research and education. ReMoDD is collecting (1) documented MDD case studies, (2) examples of models reflecting good and bad modeling practices, (3) reference models (including metamodels) that can be used as the basis for comparing and evaluating MDD techniques, (4) generic models and transformations reflecting reusable modeling experience, (5) descriptions of modeling techniques, practices and experiences, and (6) modeling exercises and problems that can be used to develop classroom assignments and projects. ReMoDD provides a single point of access to shared artifacts reflecting high-quality MDD experience and knowledge from industry and academia. This access facilitates sharing of relevant knowledge and experience that improve MDD activities in research, education and industry.


Doctoral Symposium

Posters 1-12
Mon, Jun 4, 10:00 - 10:30

Going Global with Agile Service Networks
Damian A. Tamburri
(VU University Amsterdam, Netherlands)
Agile Service Networks (ASNs) are emergent networks of service-based applications (nodes) that collaborate through agile (i.e., adaptable) transactions. Global Software Engineering (GSE) comprises the management of project teams distanced in both space and time, collaborating in the same development effort. The GSE setting poses challenges that are both technical (e.g., geolocalization of resources, information continuity between timezones) and social (e.g., collaboration between different cultures, fear of competition). ASNs can be used to build an adaptable social network (ASN_GSE) supporting the collaborations (edges of ASN_GSE) of GSE teams (nodes of ASN_GSE).

Using Structural and Semantic Information to Support Software Refactoring
Gabriele Bavota
(University of Salerno, Italy)
In the software life cycle, the internal structure of the system undergoes continuous modification. These changes push the source code away from its original design, often reducing its quality. In such cases, refactoring techniques can be applied to improve the design quality of the system. Approaches existing in the literature mainly exploit structural relationships in the source code, e.g., method calls, to support the software engineer in identifying refactoring solutions. However, semantic information is also embedded in the source code by the developers, e.g., the terms used in comments. This research investigates the usefulness of combining structural and semantic information to support software refactoring.

An Approach to Variability Management in Service-Oriented Product Lines
Sedigheh Khoshnevis
(Shahid Beheshti University G.C., Iran)
Service-oriented product lines (SOPLs) are dynamic software product lines in which products are developed based on services and a service-oriented architecture. Although there are similarities between components and services, there are important differences, so we cannot simply reuse component-based product line engineering methods and techniques for SOPL engineering. These differences stem from the fact that services can be discovered as black-box elements from external repositories. Moreover, services can be dynamically bound and are business-aligned. Therefore, it is necessary to analyze the conformance of discovered external services with the variability of services in the SOPL, which must be aligned with the variable business needs. Variability must be managed: it must be represented (modeled), used (instantiated and checked for conformance), and maintained (evolved) over time. Feature models are insufficient for modeling variability in SOPLs because services cannot simply be mapped to one or more features, and identifying such a mapping requires knowing the detailed implementation of the services. This research aims at providing an approach to managing the variability in SOPLs so that external services can be involved in SOPL engineering. This paper presents an overview of the proposal.

Using Machine Learning to Enhance Automated Requirements Model Transformation
Erol-Valeriu Chioaşcă
(University of Manchester, UK)
Textual specification documents do not represent a suitable starting point for software development because of the inherent problems of natural language, such as ambiguity, imprecision and incompleteness. To overcome these shortcomings, experts derive analysis models such as requirements models. However, these models are difficult and costly to create manually. Furthermore, the level of abstraction of the models is too low, thus hindering the automated transformation process. We propose a novel approach which uses high-abstraction requirements models in the form of Object System Models (OSMs) as targets for the transformation of natural language specifications, in conjunction with appropriate text mining and machine learning techniques. OSMs allow the interpretation of the textual specification based on a small set of facts and provide structural and behavioral information. This approach will allow both (1) the enhancement of minimal specifications and, in the case of comprehensive specifications, (2) the determination of the most suitable structure of reusable requirements.

Security Testing of Web Applications: A Research Plan
Andrea Avancini
(Fondazione Bruno Kessler, Italy)
Cross-site scripting (XSS) vulnerabilities are specific flaws related to web applications, in which missing input validation can be exploited by attackers to inject malicious code into the application under attack. To guarantee high quality of web applications in terms of security, we propose a structured approach, inspired by software testing. In this paper we present our research plan and ongoing work to use security testing to address problems of potentially attackable code. Static analysis is used to reveal candidate vulnerabilities as a set of execution conditions that could lead to an attack. We then resort to automatic test case generation to obtain those input values that make the application execution satisfy such conditions. Eventually, we propose a security oracle to assess whether such test cases are instances of successful attacks.
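To make the security-oracle idea concrete, the following sketch (not taken from the thesis; the application, payload, and function names are hypothetical) flags a test case as a successful attack when the injected payload is reflected unescaped in the generated page:

    # Illustrative sketch only: a toy security oracle for reflected XSS.
    # A real oracle would inspect the parsed page, not just the raw string.
    PAYLOAD = "<script>alert('xss')</script>"   # hypothetical injected input

    def render_comment_page(comment):
        # Stand-in for the application under test; a safe version would
        # HTML-escape the comment before embedding it in the page.
        return "<html><body><p>" + comment + "</p></body></html>"

    def security_oracle(page, payload=PAYLOAD):
        # The attack succeeds if the payload survives unescaped in the output.
        return payload in page

    page = render_comment_page(PAYLOAD)   # input produced by test case generation
    print("vulnerable" if security_oracle(page) else "safe")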

Application of Self-Adaptive Techniques to Federated Authorization Models
Christopher Bailey
(University of Kent, UK)
Authorization infrastructures are an integral part of any network where resources need to be protected. As organisations start to federate access to their resources, authorization infrastructures become increasingly difficult to manage, to the point where relying only on human resources becomes unfeasible. In our work, we propose a Self-Adaptive Authorization Framework (SAAF) that is capable of monitoring the usage of resources and of controlling access to resources through the manipulation of authorization assets (e.g., authorization policies, access rights and sessions) in response to the identification of abnormal usage. As part of this work, we explore the use of models for facilitating the autonomic management of federated authorization infrastructures by 1) classifying access behaviour exhibited by users, 2) modelling authorization assets, including their usage, to identify abnormal behaviour, and 3) managing authorization through the adaptation and reflection of modelled authorization assets. SAAF will be evaluated by integrating it into an existing authorization infrastructure, allowing the simulation of abnormal usage scenarios.

Improving Information Retrieval-Based Concept Location Using Contextual Relationships
Tezcan Dilshener
(Open University, UK)
Existing techniques based on information retrieval (IR) fall short of providing adequate support for software engineers who need to find all the relevant program elements implementing a business concept. Such techniques usually consider only the conceptual relations based on lexical similarities during concept mapping. However, it is also fundamental to consider the contextual relationships existing within an application's business domain to aid concept location. To this end, this paper proposes to use domain-specific ontological relations during concept mapping and location activities when implementing business requirements.

Effective Specification of Decision Rights and Accountabilities for Better Performing Software Engineering Projects
Monde Kalumbilo
(University College London, UK)
A governance system for a software project involves the distribution and management of decision rights, a central governance concept. Decision rights grant the authority to make decisions and to be held accountable for decision outcomes. Although prior research indicates that the exercise and degree of ownership of decision rights have an impact on software project performance, little attention has been devoted to understanding the underlying distribution of decision rights within software projects, particularly in terms of what decisions must be made, who should make these decisions, and what constitutes an effective distribution of decision rights. In this paper, a research agenda to reveal such knowledge is presented. This report represents the first output of our work in this area.

Search Based Design of Software Product Lines Architectures
Thelma Elita Colanzi
(Federal University of Paraná, Brazil)
The Product-Line Architecture (PLA) is the main artifact of a Software Product Line (SPL). However, obtaining a modular, extensible and reusable PLA is a people-intensive and non-trivial task, related to different and possibly conflicting factors. Hence, PLA design is a hard problem, and finding the best architecture can be formulated as an optimization problem with many factors. Similar software engineering problems have been efficiently solved by search-based algorithms in the field known as Search-based Software Engineering. The existing approaches used to optimize software architectures are not suitable since they do not encompass specific characteristics of SPLs. To ease SPL development and to automate PLA design, this work introduces a multi-objective optimization approach to PLA design. The approach is now being implemented using evolutionary algorithms. Empirical studies will be performed to validate the chosen neighborhood operators, SPL measures and search algorithms. Finally, we intend to compare the results of the proposed approach with PLAs designed by human architects.

Software Fault Localization Based on Program Slicing Spectrum
Wanzhi Wen
(Southeast University, China; Chinese Academy of Sciences, China)
During the software development and maintenance stages, programmers frequently have to debug the software. One of the most difficult and complex tasks in debugging is software fault localization. A commonly used approach computes the suspiciousness of program elements from failed and passed test executions. However, this technique does not give full consideration to the dependences between program elements, so its capacity for efficient fault localization is limited. Our research introduces program slicing and statistical methods to extract the dependences between program elements and refine the execution history, and then builds program slicing spectra to rank suspicious elements by a statistical metric. We expect that our method will contribute directly to improving the effectiveness and accuracy of software fault localization and reduce software development and maintenance effort and cost.
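As a rough illustration of spectrum-based ranking (the thesis does not specify its statistical metric here, so the classic Tarantula formula is used as a stand-in, and the element names and counts below are invented), suspiciousness can be computed from how often each element appears in failing versus passing executions:

    # Sketch: Tarantula-style suspiciousness over (slice-filtered) spectra.
    # failed_cover / passed_cover map an element to the number of failing /
    # passing runs whose spectrum contains it.
    def suspiciousness(failed_cover, passed_cover, total_failed, total_passed):
        elements = set(failed_cover) | set(passed_cover)
        scores = {}
        for e in elements:
            fail = failed_cover.get(e, 0) / total_failed if total_failed else 0.0
            ok = passed_cover.get(e, 0) / total_passed if total_passed else 0.0
            scores[e] = fail / (fail + ok) if fail + ok else 0.0
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    ranking = suspiciousness({"s3": 2, "s5": 1}, {"s3": 1, "s7": 4},
                             total_failed=2, total_passed=4)
    print(ranking)   # s5, which never appears in passing runs, ranks highest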

Architectural Task Allocation in Distributed Environment: A Traceability Perspective
Salma Imtiaz
(International Islamic University, Pakistan)
Task allocation in distributed development is challenging due to the intricate dependencies between distributed sites/teams and the prior need for multifaceted information. Existing approaches in the literature allocate tasks between distributed sites based on limited criteria, irrespective of the communication and coordination needs of the people involved. Conway's law relates product architecture to the communication and coordination needs of the people. Product architecture consists of multiple views based on different perspectives, and task allocation needs information about these architectural views and their interrelationships. Task allocation also depends on factors not depicted in the product architecture, such as the temporal, knowledge and cultural dependencies between distributed sites, referred to as external factors in this research. A well-conceived task allocation strategy will reduce communication and coordination dependencies between sites/teams, resulting in reduced time delay and smooth distributed development. This research aims to develop and validate a task allocation strategy for distributed environments based on information from the system architecture. The strategy would consider all important factors during task allocation, resulting in reduced communication and coordination overhead and time delay.

Using Invariant Relations in the Termination Analysis of While Loops
Wided Ghardallou
(University of Tunis El Manar, Tunisia)
Proving program termination plays an important role in ensuring the reliability of software systems. Many researchers have devoted attention to this long-standing open problem; most of them have been interested in proving that iterative programs terminate for a given input. In this paper, we present a method for a more interesting and challenging problem, namely the generation of the termination condition of while loops, i.e., the condition on initial states under which a loop terminates normally. To this effect, we use a concept introduced by Mili et al., viz. invariant relations.
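As a small illustrative instance (not taken from the paper), consider the following integer loop and the termination condition one would expect such a method to generate:

    % Loop: while (x != 0) { x := x - 2; }   over the integers
    % Generated termination condition over initial states:
    \[
      T \;=\; \{\, x \in \mathbb{Z} \;\mid\; x \ge 0 \;\wedge\; x \bmod 2 = 0 \,\}
    \]
    % i.e., the loop terminates normally exactly when x starts non-negative and even.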

Presentations 1-4
Mon, Jun 4, 11:00 - 12:45

Software Regression as Change of Input Partitioning
Marcel Böhme
(National University of Singapore, Singapore)


A Generic Methodology to Derive Domain-Specific Performance Feedback for Developers
Dennis Westermann
(SAP Research, Germany)
The performance of a system directly influences business-critical metrics such as total cost of ownership (TCO) and user satisfaction. However, building responsive, resource-efficient and scalable applications is a challenging task. Thus, software engineering approaches are required to support software architects and developers in meeting these challenges. In this PhD research abstract, we propose a novel performance evaluation process applied during the software development phase. The goal is to increase the performance awareness of developers by providing feedback on performance properties that is integrated into the everyday development process. The feedback is based on domain-specific prediction functions derived by a generic methodology that executes a series of systematic measurements. We apply and validate the approach in different development scenarios at SAP.
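As a minimal illustration of how a prediction function might be derived from systematic measurements (the measurement points, the payload-size parameter, the linear model, and the use of NumPy below are assumptions made for this sketch, not SAP's actual methodology):

    # Hypothetical sketch: fit a domain-specific prediction function for
    # response time as a function of payload size from measured data points.
    import numpy as np

    payload_kb  = np.array([10, 50, 100, 200, 400])   # measured configurations
    response_ms = np.array([12, 31, 58, 110, 215])    # measured response times

    coeffs = np.polyfit(payload_kb, response_ms, deg=1)   # time ~ a*size + b
    predict = np.poly1d(coeffs)

    # Feedback shown to the developer for an as-yet-unmeasured payload size.
    print(round(float(predict(300)), 1), "ms (predicted)")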

Towards the Verification of Multi-diagram UML Models
Alfredo Motta
(Politecnico di Milano, Italy)
UML is a general-purpose modeling language that offers a heterogeneous set of diagrams to describe the different views of a software system. While there seems to be a general consensus on the semantics of some individual diagrams, the composite semantics of the different views is still an open problem. During my PhD I am considering a significant and consistent set of UML diagrams in which time-related properties can be modeled carefully, and I am giving them a formal semantics based on metric temporal logic. The use of logic is aimed at capturing the composite semantics of the different views efficiently. The result is then used to feed a bounded model/satisfiability checker so that users can verify these systems, even from the initial phases of the design. The final goal is to realize an advanced modeling framework where users can seamlessly exploit both a well-known modeling notation and advanced verification capabilities.

Documenting and Sharing Knowledge about Code
Anja Guzzi
(TU Delft, Netherlands)
Software engineers spend a considerable amount of time on program comprehension. Current research has primarily focused on assisting the developer trying to build up his understanding of the code. This knowledge remains only in the mind of the developer and, as time elapses, often “disappears”. In this research, we shift the focus to the developer who is using her Integrated Development Environment (IDE) for writing, modifying, or reading the code, and who actually understands the code she is working with. The objective of this PhD research is to seek ways to support this developer to document and share her knowledge with the rest of the team. In particular, we investigate the full potential of micro-blogging integrated into the IDE for addressing the program comprehension problem.

Presentations 5-6
Mon, Jun 4, 14:00 - 14:50

Timely and Efficient Facilitation of Coordination of Software Developers’ Activities
Kelly Blincoe
(Drexel University, USA)
Work dependencies often exist between the developers of a software project. These dependencies frequently result in a need for coordination between the involved developers. However, developers are not always aware of these Coordination Requirements. Current methods that detect the need to coordinate rely on information which is available only after development work has been completed, which does not enable developers to act on their coordination needs. Furthermore, even if developers were aware of all Coordination Requirements, they would likely be overwhelmed by their large number and would not be able to follow up effectively with the developers involved in each dependent task. I will investigate a more timely method to determine Coordination Requirements in a software development team as they emerge, and how to focus the developers' attention on the most crucial ones. Further, I hope to prove that direct interpersonal communication is not always necessary to fulfill these requirements and to gain insight into how we can develop tools that encourage cheaper forms of coordination.

Stack Layout Transformation: Towards Diversity for Securing Binary Programs
Benjamin Rodes
(University of Virginia, USA)
Despite protracted efforts by both researchers and practitioners, security vulnerabilities remain in modern software. Artificial diversity is an effective defense against many types of attack, and one form, address-space randomization, has been widely applied. Present artificial diversity implementations are either coarse-grained or require source code. Because of the widespread use of software of unknown provenance, e.g., libraries for which no source code is provided or available, building diversity into the source code is not always possible. I investigate an approach to stack layout transformation that operates on x86 binary programs, which would allow users to obfuscate vulnerabilities and increase their confidence in the software's dependability. The proposed approach is speculative: the stack frame layout for a function is inferred from the binary and assessed by executing the transformed program. Upon assessment failure, the inferred layout is refined in the hope of better reflecting the actual function layout.

Posters 13-25
Mon, Jun 4, 14:50 - 15:30

Synthesis of Event-Based Controllers: A Software Engineering Challenge
Nicolás D'Ippolito
(Imperial College London, UK)
Existing software engineering techniques for the automatic synthesis of event-based controllers have various limitations. In the context of the world/machine approach, such limitations can be seen as restrictions on the expressiveness of the controller goals and domain model specifications, or on the relation between the controllable and monitorable actions. In this thesis we aim to provide techniques that overcome such limitations, e.g. by supporting more expressive goal specifications, distinguishing controllable from monitorable actions, and guaranteeing achievement of the desired goals, thereby improving the state of the art in the synthesis of event-based controllers. Moreover, we plan to provide efficient tools supporting the developed techniques and to evaluate them by modelling known case studies from the software engineering literature, ultimately showing that, by allowing more expressive controller goals and domain model specifications and by explicitly distinguishing controllable and monitorable actions, such case studies can be modelled more accurately and solutions guaranteeing satisfaction of the goals can be obtained.

Empirically Researching Development of International Software
Malte Ressin
(University of West London, UK)
Software localization is an important process for the international acceptance of software products. However, software development and localization do not always fit together without friction. In our empirical software engineering research, we examine the interplay of software development and software localization by gathering and analyzing qualitative and quantitative data from professionals in relevant roles. Our aim is to co-validate issues and inform practice about the development of international software.

Model Translations among Big-Step Modeling Languages
Fathiyeh Faghih
(University of Waterloo, Canada)
Model Driven Engineering (MDE) is a progressive area that tries to fill the gap between problem definition and software development. There are many modeling languages proposed for use in MDE. A challenge is how to provide automatic analysis for these models without having to create new analyzers for each different language. In this research, we tackle this problem for a family of modeling languages using a semantically configurable model translation framework.

HARPPIE: Hyper Algorithmic Recipe for Productive Parallelism Intensive Endeavors
Pedro Monteiro
(Universidade Nova de Lisboa, Portugal)
Over the last few years, parallelism has been gaining importance and multicore processing is now common. The massification of parallelism is driving research and development of novel techniques to overcome the current limits of parallel computing. However, parallelization research focuses mainly on ever-increasing performance, and much remains to be accomplished in improving productivity in the development of parallel software. This PhD research aims to develop methods and tools that dilute the complexity of parallel programming and enable non-expert programmers to fully benefit from a new generation of parallelism-driven programming platforms. Although much work remains to be done before the skill requirements of parallel programming come within reach of a medium-skill programming workforce, it is our belief that this research will help bridge that gap.

On the Analysis of Evolution of Software Artefacts and Programs
Fehmi Jaafar
(University of Montreal, Canada)
The literature describes several approaches to identify the artefacts of programs that evolve together, to reveal the (hidden) dependencies among these artefacts, and to infer and describe their evolution trends. We propose the use of biological methods to group artefacts, to detect co-evolution among them, and to construct their phylogenetic trees to express their evolution trends. First, we introduced the novel concepts of macro co-changes (MCCs), i.e., artefacts that co-change within a large time interval, and of dephase macro co-changes (DMCCs), i.e., macro co-changes that always happen with the same shifts in time. We developed an approach, Macocha, to identify these new patterns of artefact co-evolution in large programs. We are now analysing the evolution of classes playing roles in design patterns and/or anti-patterns; in parallel to the previous work, we are detecting which classes are in macro co-change or in dephase macro co-change with design motifs. Results tend to show that classes playing roles in design motifs have specific evolution trends. Finally, we are implementing an approach, Profilo, to analyse the evolution of artefacts and versions of large object-oriented programs. Profilo creates a phylogenetic tree of the different versions of a program that describes version evolution and the relations among versions and programs. We will also evaluate the usefulness of our tools through lab and field studies.

Societal Computing
Swapneel Sheth
(Columbia University, USA)
Social Computing research focuses on online social behavior and on using artifacts derived from it for providing recommendations and other useful community knowledge. Unfortunately, some of that behavior and knowledge incur societal costs, particularly with regard to Privacy, which is viewed quite differently by different populations as well as regulated differently in different locales. But clever technical solutions to those challenges may impose additional societal costs, e.g., by consuming substantial resources at odds with Green Computing, another major area of societal concern. We propose a new crosscutting research area, Societal Computing, that focuses on the technical tradeoffs among computational models and application domains that raise significant societal issues. This dissertation, advised by Prof. Gail Kaiser, will focus on privacy concerns in the context of Societal Computing and will aim to address research topics such as design patterns and architectures for privacy tradeoffs, better understanding of users' privacy requirements so that tradeoffs with other areas such as green computing can be dealt with in a more effective manner, and better visualization techniques for making privacy and its tradeoffs more understandable.

Finding Suitable Programs: Semantic Search with Incomplete and Lightweight Specifications
Kathryn T. Stolee
(University of Nebraska-Lincoln, USA)
Finding suitable code for reuse is a common task for programmers. Two general approaches dominate the code search literature: syntactic and semantic. While queries for syntactic search are easy to compose, the results are often vague or irrelevant. On the other hand, a semantic search may return relevant results, but current techniques require developers to write specifications by hand, are costly because potentially matching code needs to be executed to verify congruence with the specifications, or only return exact matches. In this work, we propose an approach for semantic search in which programmers specify lightweight, incomplete specifications and an SMT solver automatically identifies programs from a repository, encoded as constraints, that match the specifications. The repository of programs is encoded automatically offline so the search for matching programs is efficient. The program encodings cover various levels of abstraction to enable partial matches when no or few exact matches exist. We instantiate this approach on a subset of the Yahoo! Pipes mashup language and plan to extend our techniques to more traditional programming languages as the research progresses.
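A minimal sketch of the underlying idea, assuming the z3-solver Python package and an invented snippet encoding (this is not the authors' actual encoding): a repository snippet is stored as a constraint over its input and output, and a lightweight specification given as input/output examples matches the snippet when every example is satisfiable together with that constraint.

    # Illustrative only: SMT-based matching of a lightweight specification
    # against a repository program encoded as a constraint.
    from z3 import Int, Solver, And, sat

    inp, out = Int("inp"), Int("out")

    # Offline encoding of a repository snippet, e.g. "return inp + 1;"
    snippet_encoding = (out == inp + 1)

    # Lightweight, incomplete specification: input/output examples.
    spec_examples = [(3, 4), (10, 11)]

    def matches(encoding, examples):
        for i, o in examples:
            s = Solver()
            s.add(And(encoding, inp == i, out == o))
            if s.check() != sat:   # this example is inconsistent with the snippet
                return False
        return True

    print(matches(snippet_encoding, spec_examples))   # True: candidate for reuse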

Certification-Based Development of Critical Systems
Panayiotis Steele
(University of Virginia, USA)
Safety-critical systems certification is a complex endeavor. Regulating agencies are moving to goal-based standards in an effort to remedy significant problems of prescriptive standards. However, goal-based standards introduce new difficulties into the development and certification processes. In this work I introduce Certification-Based Development, or CBD. CBD is a process framework designed to mitigate these difficulties by meeting the needs of a specific certifying agency with regard to a specific system.

Testing and Debugging UML Models Based on fUML
Tanja Mayerhofer
(Vienna University of Technology, Austria)
Model-driven development, which has recently gained momentum in academia as well as in industry, has changed the software engineering process significantly from being code-centric to being model-centric. Models are considered the key artifacts, and as a result the success of the whole software development process relies on these models and their quality. Consequently, there is an urgent need for adequate methods to ensure high model quality. Model execution can serve as the crucial basis for such methods by enabling models to be tested and debugged automatically. Therefore, lessons learned from testing and debugging code may serve as a valuable source of inspiration. However, the peculiarities of models in comparison to code, such as multiple views and different abstraction levels, impede the direct adoption of existing methods for models. Thus, we claim that the currently available tool support for model testing and debugging is still insufficient because these peculiarities are not adequately addressed. In this work, we aim to tackle these shortcomings by proposing a novel model execution environment based on fUML, which enables efficient testing and debugging of UML models.

Bridging the Divide between Software Developers and Operators Using Logs
Weiyi Shang
(Queen's University, Canada)
There is a growing gap between the software development and operation worlds. Software developers rarely divulge development knowledge about the software to operators, while operators rarely communicate field knowledge to developers. To improve the quality and reduce the operational cost of large-scale software systems, bridging the gap between these two worlds is essential. This thesis proposes the use of logs as a mechanism to bridge the gap between these two worlds. Logs are messages generated from statements inserted by developers in the source code and are often used by operators to monitor the field operation of a system. However, the rich knowledge in logs has not yet been fully used because of their unstructured nature, their large scale, and the use of ad hoc log analysis techniques. Through case studies on large commercial and open source systems, we plan to demonstrate the value of logs as a tool to support developers and operators.

The Co-evolution of Socio-technical Structures in Sustainable Software Development: Lessons from the Open Source Software Communities
Marcelo Serrano Zanetti
(ETH Zurich, Switzerland)
Software development depends on many factors, including technical, human and social aspects. Due to the complexity of this dependence, a unifying framework must be defined and for this purpose we adopt the complex networks methodology. We use a data-driven approach based on a large collection of open source software projects extracted from online project development platforms. The preliminary results presented in this article reveal that the network perspective yields key insights into the sustainability of software development.

Log-Based Testing
Alexander Elyasov
(Utrecht University, Netherlands)
This thesis presents ongoing research on using logs for software testing. We propose a complex and generic logging and diagnosis framework that can be used efficiently for the continuous testing of future Internet applications. To simplify the diagnosis of logs, we suggest reducing their size by means of rewriting.

Moving Mobile Applications between Mobile Devices Seamlessly
Volker Schuchardt
(University of Duisburg-Essen, Germany)
Users prefer to use multiple mobile devices interchangeably, switching between them. A solution to this requirement is the migration of applications between mobile devices at runtime. In our vision, instead of synchronizing just the application's data, a simple swiping gesture can be used to move an application from a device A to a device B. Afterwards, the user is able to use the same application, including its current state, on device B. To achieve this, we plan to put the running application on device A into a paused state, take a snapshot, move the application to device B by using a middleware on both devices, extract the snapshot on device B, and finally resume the application on device B from its paused state. The outcome of the research will be a framework and either a kernel module or an API to migrate mobile applications.


ACM Student Research Competition
Thu, Jun 7, 16:00 - 17:30

Timely Detection of Coordination Requirements to Support Collaboration among Software Developers
Kelly Blincoe
(Drexel University, USA)
Work dependencies often exist between the developers of a software project. These dependencies frequently result in a need for coordination between the involved developers. However, developers are not always aware of these Coordination Requirements. Current methods which detect the need to coordinate rely on information which is available only after development work has been completed. This does not enable developers to act on their coordination needs. I have investigated a more timely method to determine Coordination Requirements in a software development team as they emerge.

Improving Failure-Inducing Changes Identification Using Coverage Analysis
Kai Yu
(Beihang University, China)
Delta debugging has been proposed for identifying failure-inducing changes. Despite promising results, two practical factors thwart the application of delta debugging: the large number of tests and misleading false positives. To address these issues, we present a combination of coverage analysis and delta debugging that automatically isolates failure-inducing changes. Evaluations on twelve real regressions in GNU software demonstrate both the speed gains and the effectiveness improvements.
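The sketch below illustrates the combination in spirit only: the change identifiers, coverage sets, and the simplified bisection, which assumes a single failure-inducing change, are invented, and the actual technique builds on full delta debugging.

    # Sketch of coverage-filtered change isolation (illustrative only). Changes
    # not covered by the failing test are discarded up front; a simplified
    # bisection then narrows down the failure-inducing subset.
    def isolate(changes, covered, test_fails):
        candidates = [c for c in changes if c in covered]   # coverage filter
        while len(candidates) > 1:
            half = len(candidates) // 2
            left, right = candidates[:half], candidates[half:]
            if test_fails(left):
                candidates = left
            elif test_fails(right):
                candidates = right
            else:
                break   # interacting changes: fall back to full delta debugging
        return candidates

    # Example: change "c3" alone makes the regression test fail.
    changes = ["c1", "c2", "c3", "c4"]
    covered = {"c2", "c3"}
    print(isolate(changes, covered, lambda cs: "c3" in cs))   # ['c3']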

A Study on Improving Static Analysis Tools: Why Are We Not Using Them?
Brittany Johnson
(North Carolina State University, USA)
Using static analysis tools for automating code inspections can be beneficial for software engineers. Despite the benefits of using static analysis tools, research suggests that these tools are underused. In this research, we propose to investigate why developers are not widely using static analysis tools and how current tools could potentially be improved to increase usage.

Winbook: A Social Networking Based Framework for Collaborative Requirements Elicitation and WinWin Negotiations
Nupul Kukreja
(University of Southern California, USA)
Easy-to-use groupware for diverse stakeholder negotiation has been a continuing challenge [7, 8, 9]. USC's fifth-generation wiki-based win-win negotiation support tool [1] was not as successful in improving over the previous four generations [2] as hoped: it encountered problems with non-technical stakeholder usage. The popularity of Facebook and Gmail ushered in a new era of widely used social networking capabilities, which I have been using to develop and experiment with a new way of performing collaborative requirements elicitation and management. By marrying the way people collaborate on Facebook and organize their emails on Gmail, I have built a social networking-like platform that helps achieve better usage of the WinWin negotiation framework [4]. Initial usage results on 14 small projects involving non-technical stakeholders have shown profound implications for the way requirements are negotiated and used throughout the system and software definition and development processes. Subsequently, Winbook has also been adopted as part of a project to bridge requirements and architecting for a major US government organization.
Keywords – collaborative requirements elicitation; WinWin negotiations; social networking

Using Automatic Static Analysis to Identify Technical Debt
Antonio Vetrò
(Politecnico di Torino, Italy; Fraunhofer CESE, USA)
The technical debt (TD) metaphor describes a tradeoff between short-term and long-term goals in software development. Developers in such situations accept compromises in one dimension (e.g. maintainability) to meet an urgent demand in another dimension (e.g. delivering a release on time). Since TD accrues interest in terms of the time spent to correct the code and accomplish quality goals, the accumulation of TD in software systems is dangerous because it can lead to more difficult and expensive maintenance. The research presented in this paper focuses on the usage of automatic static analysis to identify technical debt at the code level with respect to different quality dimensions. The methodological approach is that of empirical software engineering; both past and currently achieved results are presented, focusing on functionality, efficiency and maintainability.

Coupled Evolution of Model-Driven Spreadsheets
Jorge Mendes
(University of Minho, Portugal)
Spreadsheets are increasingly used as programming languages in the construction of large and complex systems. Being a highly flexible framework, however, spreadsheets lack important programming language features such as abstraction or encapsulation, and this flexibility comes at a price: spreadsheets are populated with significant amounts of errors. One of the approaches that tries to overcome this problem advocates model-driven spreadsheet development: a spreadsheet model is defined, from which a concrete spreadsheet is generated. Although this approach has proved effective in other contexts, it still needs to accommodate the future evolution of both the model and its instance, so that they remain synchronized at all times. In this paper, we propose a pair of transformation sets, one working at the model level and the other at the instance level, such that each transformation in one set is related to a transformation in the other set. With our approach, we ensure model/data compliance while allowing for model and data evolution.
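The following sketch illustrates the notion of a coupled transformation pair using an invented representation of models and instances (it is not the paper's transformation language): an "add column" step at the model level is paired with the instance-level step that keeps existing data conforming.

    # Illustrative sketch of a coupled transformation pair: the model-level
    # step adds a column to the spreadsheet model, and the related
    # instance-level step updates every existing row accordingly.
    def add_column_model(model, name, default):
        model = dict(model)
        model["columns"] = model["columns"] + [name]
        model["defaults"] = {**model["defaults"], name: default}
        return model

    def add_column_instance(rows, name, default):
        return [{**row, name: default} for row in rows]

    model = {"columns": ["item", "price"], "defaults": {}}
    rows = [{"item": "pen", "price": 2}, {"item": "book", "price": 15}]

    model2 = add_column_model(model, "quantity", 0)    # evolve the model...
    rows2 = add_column_instance(rows, "quantity", 0)   # ...and its instance together
    print(model2["columns"], rows2)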

Managing Evolution of Software Product Line
Cheng Thao
(University of Wisconsin-Milwaukee, USA)
In software product line engineering, core assets are shared among multiple products. Core assets and products generally evolve independently. Developers need to capture evolution in both contexts and to propagate changes in both directions between the core assets and the products. We propose a version control system that supports product line engineering by supporting the evolution of the product line, product derivation, and change propagation from core assets to products and vice versa.

Enabling Dynamic Metamodels through Constraint-Driven Modeling
Andreas Demuth
(JKU Linz, Austria)
Metamodels are commonly used in Model-Driven Engineering to define available model elements and structures. However, metamodels are likely to change during development for various reasons like requirement changes or evolving domain knowledge. Updating a metamodel typically leads to non-conformance issues with existing models. Hence, evolution strategies must be developed. Additionally, the tool implementation must also be updated to support the evolved metamodel. We propose the use of metamodel-independent tools with unified modeling concepts for working with all kinds of metamodels and models. By applying the Constraint-Driven Modeling approach and generating model constraints from metamodels automatically, we solve the described issues and enable dynamic, evolving metamodels. A prototype implementation has shown the feasibility of the approach and performance tests suggest that it also scales with increasing model sizes.

Assisting End-User Development in Browser-Based Mashup Tools
Soudip Roy Chowdhury
(University of Trento, Italy)
Despite recent progress in end-user development and particularly in mashup application development, developing even simple mashups is still non-trivial and requires intimate knowledge of the functionality of web APIs and services, their interfaces, parameter settings, data mappings, and so on. We aim to assist less skilled developers in composing their own mashups by interactively recommending composition knowledge in the form of modeling patterns and by fostering knowledge reuse. Our prototype system demonstrates our idea of interactive recommendation and automated pattern weaving, which involves recommending relevant composition patterns to users during development and, once a pattern is selected, automatically applying the changes it suggests to the mashup model under development. The experimental evaluation of our prototype implementation demonstrates that even complex composition patterns can be efficiently stored, queried and woven into the model under development in browser-based mashup tools.

Hot Clones: Combining Search-Driven Development, Clone Management, and Code Provenance
Niko Schwarz
(University of Bern, Switzerland)
Code duplication is common in current programming practice: programmers search for snippets of code, incorporate them into their projects and then modify them to their needs. In today's practice, no automated scheme is in place to inform both parties of any distant changes to the code. As code snippets continue to evolve both on the side of the user and on the side of the author, both may wish to benefit from remote bug fixes or refinements; authors may also be interested in the actual usage of their code snippets, and researchers could gather information on clone usage. We propose to maintain a link between software clones across repositories and outline how such links can be created and maintained.

Capturing and Exploiting Fine-Grained IDE Interactions
Zhongxian Gu
(UC Davis, USA)
Developers interact with IDEs intensively to maximize productivity. A developer's interactions with an IDE reflect his thought process and work habits. In this paper, we propose a general framework to capture and exploit all types of IDE interactions. We have two explicit goals for the framework: systematic interception of comprehensive user interactions, and ease of use in writing customized applications. To this end, we developed IDE++ on top of the Eclipse IDE. For evaluation, we built applications upon the framework to illustrate 1) the need for capturing comprehensive, fine-grained IDE interactions, and 2) IDE++'s ease of use. We believe that IDE++ is a step toward building the next generation of customizable and intelligent IDEs.

Restructuring Unit Tests with TestSurgeon
Pablo Estefó
(University of Chile, Chile)
The software engineering community has produced effective techniques for software maintainability; however, less effort has been dedicated to making unit tests modular and extensible. TestSurgeon is a profiler for unit tests that collects information from test execution. It proposes a metric for similarity between tests and provides a visualization to help developers restructure their unit tests.
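The abstract does not define the similarity metric, so the sketch below uses a simple stand-in: the Jaccard similarity of the sets of methods two tests exercise, as a profiler such as TestSurgeon could record (the test and method names are invented).

    # Sketch only: compare two unit tests by the Jaccard similarity of the
    # sets of methods they exercise, as recorded during profiled execution.
    def jaccard(covered_a, covered_b):
        a, b = set(covered_a), set(covered_b)
        return len(a & b) / len(a | b) if a | b else 1.0

    test_add    = {"Stack.push", "Stack.size"}
    test_remove = {"Stack.push", "Stack.pop", "Stack.size"}
    print(jaccard(test_add, test_remove))   # ~0.67: candidates for restructuring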

A Requirements-Based Approach for the Design of Adaptive Systems
Vítor E. Silva Souza
(University of Trento, Italy)
Complexity is now one of the major challenges for the IT industry. Systems might become too complex to be managed by humans and thus will have to be self-managed: they will have to self-configure for operation, self-protect from attacks, self-heal from errors and self-tune for optimal performance. (Self-)adaptive systems evaluate their own behavior and change it when the evaluation indicates that they are not accomplishing the software's purpose or when better functionality and performance are possible.
To that end, we need to monitor the behavior of the running system and compare it to an explicit formulation of requirements and domain assumptions. Feedback loops (e.g., the MAPE loop) constitute an architectural solution for this and, as proposed by past research, should be a first class citizen in the design of such systems. We advocate that adaptive systems should be designed this way from as early as Requirements Engineering and that reasoning over requirements is fundamental for run-time adaptation.
We therefore propose an approach for the design of adaptive systems that is based on requirements and inspired by control theory. Our proposal is goal-oriented and targets software-intensive socio-technical systems, in an attempt to integrate control-loop approaches with decentralized, agent-inspired approaches. Our final objective is a set of extensions to state-of-the-art goal-oriented modeling languages that allow practitioners to clearly specify the requirements of adaptive systems, together with a run-time framework that helps developers implement such requirements.

Petri Nets State Space Analysis in the Cloud
Matteo Camilli
(University of Milan, Italy)
Several techniques have been studied for addressing the state-space explosion problem in model checking. One of them is to use distributed memory and computation for storing and exploring the state space of the model of a system. In this report, we present and compare different multi-threaded, distributed, and cloud approaches to facing the state-space explosion problem. The experimental results show, in particular, the convenience of cloud approaches.
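At the core of each back end is an exhaustive exploration of the reachable markings; the sequential sketch below shows that core on a toy net invented for illustration (the distributed and cloud variants essentially partition the visited set and the frontier across workers).

    # Minimal sequential sketch of reachability-graph exploration for a Petri net.
    from collections import deque

    # Transitions as (consume, produce) vectors over places (p0, p1, p2).
    transitions = [((1, 0, 0), (0, 1, 0)),   # t0: move a token p0 -> p1
                   ((0, 1, 0), (0, 0, 1))]   # t1: move a token p1 -> p2

    def successors(marking):
        for consume, produce in transitions:
            if all(m >= c for m, c in zip(marking, consume)):
                yield tuple(m - c + p for m, c, p in zip(marking, consume, produce))

    def explore(initial):
        visited, frontier = {initial}, deque([initial])
        while frontier:
            m = frontier.popleft()
            for succ in successors(m):
                if succ not in visited:
                    visited.add(succ)
                    frontier.append(succ)
        return visited

    print(sorted(explore((2, 0, 0))))   # all reachable markings of the toy net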

Mining Java Class Identifier Naming Conventions
Simon Butler
(Open University, UK)
Classes represent key elements of knowledge in object-oriented source code. Class identifier names describe the knowledge recorded in the class and, much of the time, record some detail of the lineage of the class. We investigate the structure of Java class names, identifying common naming patterns and the way components of class identifier names are repeated in inheritance hierarchies. Detailed knowledge of class identifier name structures can be used to improve the accuracy of concept location tools, to support reverse engineering of domain models and requirements traceability, and to support development teams through class identifier naming recommendation systems.
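A small sketch of the kind of analysis involved (the splitting heuristic is an illustrative choice, not the paper's actual mining procedure): class identifiers are split into camel-case components and checked for repetition of the superclass's trailing component in the subclass name.

    # Illustrative sketch: split camel-case Java class identifiers into
    # components and check how often a subclass repeats the trailing
    # component (head noun) of its superclass name.
    import re
    from collections import Counter

    def components(name):
        return re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z0-9]*", name)

    hierarchy = [("AbstractButton", "JButton"),
                 ("InputStream", "FileInputStream"),
                 ("Exception", "IOException")]

    suffix_reuse = Counter(
        "reuses superclass suffix" if components(sup)[-1] in components(sub)
        else "does not"
        for sup, sub in hierarchy)
    print(suffix_reuse)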

Online Sharing and Integration of Results from Mining Software Repositories
Iman Keivanloo
(Concordia University, Canada)
The mining of software repositories involves the extraction of both basic and value-added information from existing software repositories. Depending on the stakeholders (e.g., researchers, management), these repositories are mined several times for different application purposes. To avoid unnecessary pre-processing steps and to improve productivity, the sharing and integration of extracted facts and results are needed. The motivation of this research is to introduce a novel collaborative sharing platform for software datasets that supports on-the-fly inter-dataset integration. We want to facilitate and promote a paradigm shift in the source code analysis domain, similar to the one brought about by Wikipedia in the knowledge-sharing domain. In this paper, we present the SeCold project, which is the first online, publicly available software ecosystem Linked Data dataset. As part of this research, not only is theoretical background on how to publish such datasets provided, but also the actual dataset. SeCold contains about two billion facts, such as source code statements, software licenses, and code clones from over 18,000 software projects. SeCold is also an official member of the Linked Data cloud and one of the eight largest online Linked Data datasets available on the cloud.


Invited Summaries

Refounding Software Engineering: The Semat Initiative (Invited Presentation)
Mira Kajko-Mattsson, Ivar Jacobson, Ian Spence, Paul McMahon, Brian Elvesæter, Arne J. Berre, Michael Striewe, Michael Goedicke, Shihong Huang, Bruce MacIsaac, and Ed Seymour
(KTH Royal Institute of Technology, Sweden; Ivar Jacobson Int., UK; PEM Systems, USA; SINTEF, Norway; University of Duisburg-Essen, Germany; Florida Atlantic University, USA; IBM, USA; Fujitsu, UK)
The new software engineering initiative, Semat, is in the process of developing a kernel for software engineering that stands on a solid theoretical basis. So far, it has suggested a set of kernel elements for software engineering and basic language constructs for defining the elements and their usage. This paper describes a session during which Semat results and status will be presented. The presentation will be followed by a discussion panel.

Summary of the ICSE 2012 Workshops
Alessandro Orso and Ralf Reussner
(Georgia Tech, USA; KIT, Germany; FZI, Germany)
The workshops of ICSE 2012 provide a forum for researchers and practitioners to exchange and discuss scientific ideas before they have matured to warrant conference or journal publication. ICSE Workshops also serve as incubators for scientific communities that form and share a particular research agenda.

Summary of the ICSE 2012 Tutorials and Technical Briefings
Andreas Leitner and Oscar Nierstrasz
(Google, Switzerland; University of Bern, Switzerland)
This year ICSE is offering a mix of half-day and full day tutorials in addition to shorter technical briefings in selected domains. Whereas tutorials cover a wide range of mature topics of both academic and practical interest, technical briefings are intended to provide a compact introduction to the state-of-the-art in an emerging area.
