OOPSLA 2016 – Proceedings

Message from the Chairs
Welcome to the 2016 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). The selection of papers in this year’s conference showcases the breadth and depth of research in programming languages and software development. Of the 30 paper topics listed on the submission form, the top-5 for accepted papers have been “language design”, “language implementation”, “concurrency”, “program analysis”, and “tools”. Closely following are “compilers”, “programming models and paradigms”, and “type systems and logics”. These topics are an excellent reflection of the mix of subjects that OOPSLA treats, as well as the conference’s flavor.

Optimization and Performance

A Compiler for Throughput Optimization of Graph Algorithms on GPUs
Sreepathi Pai and Keshav Pingali
(University of Texas at Austin, USA)
Writing high-performance GPU implementations of graph algorithms can be challenging. In this paper, we argue that three optimizations called throughput optimizations are key to high performance for this application class. These optimizations describe a large implementation space, making it unrealistic for programmers to implement them by hand.
To address this problem, we have implemented these optimizations in a compiler that produces CUDA code from an intermediate-level program representation called IrGL. Compared to state-of-the-art handwritten CUDA implementations of eight graph applications, code generated by the IrGL compiler is up to 5.95x faster (median 1.4x) for five applications and never more than 30% slower for the others. Throughput optimizations contribute an improvement up to 4.16x (median 1.4x) to the performance of unoptimized IrGL code.

Automatic Parallelization of Pure Method Calls via Conditional Future Synthesis
Rishi Surendran and Vivek Sarkar
(Rice University, USA)
We introduce a novel approach for using futures to automatically parallelize the execution of pure method calls. Our approach is built on three new techniques to address the challenge of automatic parallelization via future synthesis: candidate future synthesis, parallelism benefit analysis, and threshold expression synthesis. During candidate future synthesis, our system annotates pure method calls as async expressions and synthesizes a parallel program with future objects and their type declarations. Next, the system performs a parallel benefit analysis to determine which async expressions may need to be executed sequentially due to overhead reasons, based on execution profile information collected from multiple test inputs. Finally, threshold expression synthesis uses the output from parallelism benefit analysis to synthesize predicate expressions that can be used to determine at runtime if a specific pure method call should be executed sequentially or in parallel.
We have implemented our approach, and the results obtained from an experimental evaluation of the complete system on a range of sequential Java benchmarks are very encouraging. Our evaluation shows that our approach can provide significant parallel speedups of up to 7.4× (geometric mean of 3.69×) relative to the sequential programs when using 8 processor cores, with zero programmer effort beyond providing the sequential program and test cases for parallelism benefit analysis.
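The runtime threshold idea described above can be sketched in Python. This is an illustration only: `THRESHOLD`, `fib_conditional`, and the use of a thread pool are assumptions made for the example, not the authors' system, which targets Java and synthesizes its predicates from execution profiles.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cutoff, hand-picked here; the approach described above would
# synthesize such a predicate from profile data.
THRESHOLD = 20


def fib(n: int) -> int:
    """A pure method: the result depends only on the argument."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_conditional(n: int, pool: ThreadPoolExecutor) -> int:
    """Run one recursive call as a future only when the predicate holds."""
    if n < 2:
        return n
    if n > THRESHOLD:
        # Large input: evaluate fib(n - 1) as a future, fib(n - 2) inline.
        future = pool.submit(fib, n - 1)
        right = fib(n - 2)
        return future.result() + right
    # Small input: task overhead would dominate, so stay sequential.
    return fib(n - 1) + fib(n - 2)
```

Only the top-level call spawns a task, so the pool cannot deadlock on nested futures. CPython threads will not produce a real speedup for this CPU-bound function; the sketch shows the control structure, not the performance.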

Portable Inter-workgroup Barrier Synchronisation for GPUs
Tyler Sorensen, Alastair F. Donaldson, Mark Batty, Ganesh Gopalakrishnan, and Zvonimir Rakamarić
(Imperial College London, UK; University of Kent, UK; University of Utah, USA)
Despite the growing popularity of GPGPU programming, there is not yet a portable and formally-specified barrier that one can use to synchronise across workgroups. Moreover, the occupancy-bound execution model of GPUs breaks assumptions inherent in traditional software execution barriers, exposing them to deadlock. We present an occupancy discovery protocol that dynamically discovers a safe estimate of the occupancy for a given GPU and kernel, allowing for a starvation-free (and hence, deadlock-free) inter-workgroup barrier by restricting the number of workgroups according to this estimate. We implement this idea by adapting an existing, previously non-portable, GPU inter-workgroup barrier to use OpenCL 2.0 atomic operations, and prove that the barrier meets its natural specification in terms of synchronisation.
We assess the portability of our approach over eight GPUs spanning four vendors, comparing the performance of our method against alternative methods. Our key findings include: (1) the recall of our discovery protocol is nearly 100%; (2) runtime comparisons vary substantially across GPUs and applications; and (3) our method provides portable and safe inter-workgroup synchronisation across the applications we study.

Parallel Incremental Whole-Program Optimizations for Scala.js
Sébastien Doeraene and Tobias Schlatter
(EPFL, Switzerland)
Whole-program optimizations are powerful tools that can dramatically improve performance, size and other aspects of programs. Because they depend on global knowledge, they must typically be reapplied to the whole program when small changes are made, which makes them too slow for the development cycle. This is an issue for some environments that require, or benefit a lot from, whole-program optimizations, such as compilation to JavaScript or to the Dalvik VM, because their development cycle is slowed down either by the lack of optimizations, or by the time spent on applying them.
We present a new approach to designing incremental whole-program optimizers for object-oriented and functional languages: when part of a program changes, only the portions affected by the changes are reoptimized. An incremental optimizer using this approach for Scala.js, the Scala to JavaScript compiler, demonstrates speedups from 10x to 100x compared to its batch version. As a result, the optimizer's running time becomes insignificant compared to separate compilation, making it fit for use on every compilation run during the development cycle. We also show how to parallelize the incremental algorithm to take advantage of multicore hardware.
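The core idea of reoptimizing only the affected portions can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the program representation (method name mapped to a body string and a list of callees), the one-level "inlining" that stands in for a whole-program optimization, and the class name `IncrementalOptimizer`. It is not the Scala.js optimizer's algorithm, and it ignores deleted methods.

```python
def optimize_method(name, program):
    """Toy optimization: inline the source body of each direct callee."""
    body, callees = program[name]
    for callee in callees:
        body = body.replace(callee + "()", "(" + program[callee][0] + ")")
    return body


class IncrementalOptimizer:
    def __init__(self):
        self.optimized = {}    # method name -> optimized body
        self.reoptimized = []  # methods processed by the latest run

    def run(self, program, changed):
        # A method is affected if it is new, if it changed, or if it
        # inlines the body of a method that changed.
        affected = {
            name
            for name, (_, callees) in program.items()
            if name in changed
            or name not in self.optimized
            or any(callee in changed for callee in callees)
        }
        for name in sorted(affected):
            self.optimized[name] = optimize_method(name, program)
        self.reoptimized = sorted(affected)
        return self.optimized
```

In this toy model a method depends only on the source bodies of its direct callees, so one level of invalidation is enough. A real optimizer would have to track whatever global facts each optimization consumed.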

Semantics and Verification

Semantics-Based Program Verifiers for All Languages
Andrei Stefănescu, Daejun Park, Shijiao Yuwen, Yilong Li, and Grigore Roşu
(University of Illinois at Urbana-Champaign, USA; Runtime Verification, USA)
We present a language-independent verification framework that can be instantiated with an operational semantics to automatically generate a program verifier. The framework treats both the operational semantics and the program correctness specifications as reachability rules between matching logic patterns, and uses the sound and relatively complete reachability logic proof system to prove the specifications using the semantics. We instantiate the framework with the semantics of one academic language, KernelC, as well as with three recent semantics of real-world languages, C, Java, and JavaScript, developed independently of our verification infrastructure. We evaluate our approach empirically and show that the generated program verifiers can check automatically the full functional correctness of challenging heap-manipulating programs implementing operations on list and tree data structures, like AVL trees. This is the first approach that can turn the operational semantics of real-world languages into correct-by-construction automatic verifiers.

Hoare-Style Specifications as Correctness Conditions for Non-linearizable Concurrent Objects
Ilya Sergey, Aleksandar Nanevski, Anindya Banerjee, and Germán Andrés Delbianco
(University College London, UK; IMDEA Software Institute, Spain)
Designing efficient concurrent objects often requires abandoning the standard specification technique of linearizability in favor of more relaxed correctness conditions. However, the variety of alternatives makes it difficult to choose which condition to employ, and how to compose them when using objects specified by different conditions.
In this work, we propose a uniform alternative in the form of Hoare logic, which can explicitly capture, in the auxiliary state, the interference of environment threads. We demonstrate the expressiveness of our method by verifying a number of concurrent objects and their clients, which have so far been specified only by non-standard conditions of concurrency-aware linearizability, quiescent, and quantitative quiescent consistency. We report on the implementation of the ideas in an existing Coq-based tool, providing the first mechanized proofs for all the examples in the paper.

An Operational Semantics for C/C++11 Concurrency
Kyndylan Nienhuis, Kayvan Memarian, and Peter Sewell
(University of Cambridge, UK)
The C/C++11 concurrency model balances two goals: it is relaxed enough to be efficiently implementable and (leaving aside the “thin-air” problem) it is strong enough to give useful guarantees to programmers. It is mathematically precise and has been used in verification research and compiler testing. However, the model is expressed in an axiomatic style, as predicates on complete candidate executions. This suffices for computing the set of allowed executions of a small litmus test, but it does not directly support the incremental construction of executions of larger programs. It is also at odds with conventional operational semantics, as used implicitly in the rest of the C/C++ standards.
Our main contribution is the development of an operational model for C/C++11 concurrency. This covers all the features of the previous formalised axiomatic model, and we have a mechanised proof that the two are equivalent, in Isabelle/HOL. We also integrate this semantics with an operational semantics for sequential C (described elsewhere); the combined semantics can incrementally execute programs in a small fragment of C.
Doing this uncovered several new aspects of the C/C++11 model: we show that one cannot build an equivalent operational model that simply follows program order, sequential consistent order, or the synchronises-with order. The first negative result is forced by hardware-observable behaviour, but the latter two are not, and so might be ameliorated by changing C/C++11. More generally, we hope that this work, with its focus on incremental construction of executions, will inform the future design of new concurrency models.

Modeling and Analysis of Remote Memory Access Programming
Andrei Marian Dan, Patrick Lam, Torsten Hoefler, and Martin Vechev
(ETH Zurich, Switzerland; University of Waterloo, Canada)
Recent advances in networking hardware have led to a new generation of Remote Memory Access (RMA) networks in which processors from different machines can communicate directly, bypassing the operating system and allowing higher performance. Researchers and practitioners have proposed libraries and programming models for RMA to enable the development of applications running on these networks.
However, the memory models implied by these RMA libraries and languages are often loosely specified, poorly understood, and differ depending on the underlying network architecture and other factors. Hence, it is difficult to precisely reason about the semantics of RMA programs or how changes in the network architecture affect them.
We address this problem with the following contributions: (i) a coreRMA language which serves as a common foundation, formalizing the essential characteristics of RMA programming; (ii) complete axiomatic semantics for that language; (iii) integration of our semantics with an existing constraint solver, enabling us to exhaustively generate coreRMA programs (litmus tests) up to a specified bound and check whether the tests satisfy their specification; and (iv) extensive validation of our semantics on real-world RMA systems. We generated and ran 7441 litmus tests using each of the low-level RMA network APIs: DMAPP, VPI Verbs, and Portals 4. Our results confirmed that our model successfully captures behaviors exhibited by these networks. Moreover, we found RMA programs that behave inconsistently with existing documentation, confirmed by network experts.
Our work provides an important step towards understanding existing RMA networks, thus influencing the design of future RMA interfaces and hardware.

Program Synthesis

Deriving Divide-and-Conquer Dynamic Programming Algorithms using Solver-Aided Transformations
Shachar Itzhaky, Rohit Singh, Armando Solar-Lezama, Kuat Yessenov, Yongquan Lu, Charles Leiserson, and Rezaul Chowdhury
(Massachusetts Institute of Technology, USA; Stony Brook University, USA)
We introduce a framework allowing domain experts to manipulate computational terms in the interest of deriving better, more efficient implementations. It employs deductive reasoning to generate provably correct efficient implementations from a very high-level specification of an algorithm, and inductive constraint-based synthesis to improve automation. Semantic information is encoded into program terms through the use of refinement types.
In this paper, we develop the technique in the context of a system called Bellmania that uses solver-aided tactics to derive parallel divide-and-conquer implementations of dynamic programming algorithms that have better locality and are significantly more efficient than traditional loop-based implementations. Bellmania includes a high-level language for specifying dynamic programming algorithms and a calculus that facilitates gradual transformation of these specifications into efficient implementations. These transformations formalize the divide-and-conquer technique; a visualization interface helps users to interactively guide the process, while an SMT-based back-end verifies each step and takes care of low-level reasoning required for parallelism.
We have used the system to generate provably correct implementations of several algorithms, including some important algorithms from computational biology, and show that the performance is comparable to that of the best manually optimized code.

Speeding Up Machine-Code Synthesis
Venkatesh Srinivasan, Tushar Sharma, and Thomas Reps
(University of Wisconsin-Madison, USA; GrammaTech, USA)
Machine-code synthesis is the problem of searching for an instruction sequence that implements a semantic specification, given as a formula in quantifier-free bit-vector logic (QFBV). Instruction sets like Intel's IA-32 have around 43,000 unique instruction schemas; this huge instruction pool, along with the exponential cost inherent in enumerative synthesis, results in an enormous search space for a machine-code synthesizer: even for relatively small specifications, the synthesizer might take several hours or days to find an implementation. In this paper, we present several improvements to the algorithms used in a state-of-the-art machine-code synthesizer McSynth. In addition to a novel pruning heuristic, our improvements incorporate a number of ideas known from the literature, which we adapt in novel ways for the purpose of speeding up machine-code synthesis. Our experiments for Intel's IA-32 instruction set show that our improvements enable synthesis of code for 12 out of 14 formulas on which McSynth times out, speeding up the synthesis time by at least 1981X, and for the remaining formulas, speeds up synthesis by 3X.

Automated Reasoning for Web Page Layout
Pavel Panchekha and Emina Torlak
(University of Washington, USA)
Web pages define their appearance using Cascading Style Sheets, a modular language for layout of tree-structured documents. In principle, using CSS is easy: the developer specifies declarative constraints on the layout of an HTML document (such as the positioning of nodes in the HTML tree), and the browser solves the constraints to produce a box-based rendering of that document. In practice, however, the subtleties of CSS semantics make it difficult to develop stylesheets that produce the intended layout across different user preferences and browser settings.
This paper presents the first mechanized formalization of a substantial fragment of the CSS semantics. This formalization is equipped with an efficient reduction to the theory of quantifier-free linear real arithmetic, enabling effective automated reasoning about CSS stylesheets and their behavior. We implement this reduction in Cassius, a solver-aided framework for building semantics-aware tools for CSS. To demonstrate the utility of Cassius, we prototype new tools for automated verification, debugging, and synthesis of CSS code. We show that these tools work on fragments of real-world websites, and that Cassius is a practical first step toward solver-aided programming for the web.

FIDEX: Filtering Spreadsheet Data using Examples
Xinyu Wang, Sumit Gulwani, and Rishabh Singh
(University of Texas at Austin, USA; Microsoft Research, USA)
Data filtering in spreadsheets is a common problem faced by millions of end-users. The task of data filtering requires a computational model that can separate intended positive and negative string instances. We present a system, FIDEX, that can efficiently learn desired data filtering expressions from a small set of positive and negative string examples.
There are two key ideas of our approach. First, we design an expressive DSL to represent disjunctive filter expressions needed for several real-world data filtering tasks. Second, we develop an efficient synthesis algorithm for incrementally learning consistent filter expressions in the DSL from very few positive and negative examples. A DAG-based data structure is used to succinctly represent a large number of filter expressions, and two corresponding operators are defined for algorithmically handling positive and negative examples, namely, the intersection and subtraction operators. FIDEX is able to learn data filters for 452 out of 460 real-world data filtering tasks in real time (0.22s), using only 2.2 positive string instances and 2.7 negative string instances on average.
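The task of learning a filter from positive and negative strings can be illustrated with a small Python sketch. Note the swap in technique: this is a brute-force enumeration over three hypothetical predicate kinds (`startswith`, `endswith`, `contains`), not the DSL or the DAG-based synthesis algorithm described above, and `learn_filter` is a name invented for the example.

```python
def substrings(s):
    """All non-empty substrings of s."""
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}


# Hypothetical predicate kinds; a real filter DSL would be much richer.
KINDS = {
    "startswith": str.startswith,
    "endswith": str.endswith,
    "contains": lambda s, token: token in s,
}


def learn_filter(positives, negatives):
    """Return (kind, token) accepting all positives and no negatives, or None.

    `positives` must be non-empty.
    """
    # Any usable token must occur in every positive example.
    tokens = set.intersection(*(substrings(p) for p in positives))
    for kind, test in KINDS.items():
        # Prefer longer tokens, which are more specific.
        for token in sorted(tokens, key=lambda t: (-len(t), t)):
            accepts_all = all(test(p, token) for p in positives)
            rejects_all = not any(test(n, token) for n in negatives)
            if accepts_all and rejects_all:
                return kind, token
    return None
```

For example, `learn_filter(["report.csv", "data.csv"], ["notes.txt", "csv_guide.pdf"])` returns `("endswith", ".csv")`.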

Language Design and Programming Models I

Extensible Access Control with Authorization Contracts
Scott Moore, Christos Dimoulas, Robert Bruce Findler, Matthew Flatt, and Stephen Chong
(Harvard University, USA; Northwestern University, USA; University of Utah, USA)
Existing programming language access control frameworks do not meet the needs of all software components. We propose an expressive framework for implementing access control monitors for components. The basis of the framework is a novel concept: the authority environment. An authority environment associates rights with an execution context. The building blocks of access control monitors in our framework are authorization contracts: software contracts that manage authority environments. We demonstrate the expressiveness of our framework by implementing a diverse set of existing access control mechanisms and writing custom access control monitors for three realistic case studies.

Gentrification Gone too Far? Affordable 2nd-Class Values for Fun and (Co-)Effect
Leo Osvald, Grégory Essertel, Xilun Wu, Lilliam I. González Alayón, and Tiark Rompf
(Purdue University, USA)
First-class functions dramatically increase expressiveness, at the expense of static guarantees. In ALGOL or PASCAL, functions could be passed as arguments but never escape their defining scope. Therefore, function arguments could serve as temporary access tokens or capabilities, enabling callees to perform some action, but only for the duration of the call. In modern languages, such programming patterns are no longer available.
The central thrust of this paper is to re-introduce second-class functions and other values alongside first-class entities in modern languages. We formalize second-class values with stack-bounded lifetimes as an extension to simply-typed λ calculus, and for richer type systems such as F_<: and systems with path-dependent types. We generalize the binary first- vs second-class distinction to arbitrary privilege lattices, with the underlying type lattice as a special case. In this setting, abstract types naturally enable privilege parametricity. We prove type soundness and lifetime properties in Coq.
We implement our system as an extension of Scala, and present several case studies. First, we modify the Scala Collections library and add privilege annotations to all higher-order functions. Privilege parametricity is key to retaining the high degree of code reuse between sequential and parallel as well as lazy and eager collections. Second, we use scoped capabilities to introduce a model of checked exceptions in the Scala library, with only a few changes to the code. Third, we employ second-class capabilities for memory safety in a region-based off-heap memory library.

Incremental Forest: A DSL for Efficiently Managing Filestores
Jonathan DiLorenzo, Richard Zhang, Erin Menzies, Kathleen Fisher, and Nate Foster
(Cornell University, USA; University of Pennsylvania, USA; Tufts University, USA)
File systems are often used to store persistent application data, but manipulating file systems using standard APIs can be difficult for programmers. Forest is a domain-specific language that bridges the gap between the on-disk and in-memory representations of file system data. Given a high-level specification of the structure, contents, and properties of a collection of directories, files, and symbolic links, the Forest compiler generates tools for loading, storing, and validating that data. Unfortunately, the initial implementation of Forest offered few mechanisms for controlling cost—e.g., the run-time system could load gigabytes of data, even if only a few bytes were needed. This paper introduces Incremental Forest (iForest), an extension to Forest with an explicit delay construct that programmers can use to precisely control costs. We describe the design of iForest using a series of running examples, present a formal semantics in a core calculus, and define a simple cost model that accurately characterizes the resources needed to use a given specification. We propose skins, which allow programmers to modify the delay structure of a specification in a compositional way, and develop a static type system for ensuring compatibility between specifications and skins. We prove the soundness and completeness of the type system and a variety of algebraic properties of skins. We describe an OCaml implementation and evaluate its performance on applications developed in collaboration with watershed hydrologists.
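The effect of an explicit delay construct can be sketched in Python. This is only an analogy under stated assumptions: the `Delayed` class and `load_directory` function are hypothetical names, and the sketch has none of iForest's specifications, skins, or type system. It shows only that directory structure can be loaded eagerly while file contents are read on demand.

```python
import os
import tempfile


class Delayed:
    """Hypothetical wrapper: remembers a path and reads it only when forced."""

    def __init__(self, path):
        self.path = path
        self.loaded = False
        self._value = None

    def force(self):
        # Pay the I/O cost on first use, then reuse the cached contents.
        if not self.loaded:
            with open(self.path, "r", encoding="utf-8") as f:
                self._value = f.read()
            self.loaded = True
        return self._value


def load_directory(root):
    """List regular files eagerly, but delay reading their contents."""
    return {
        name: Delayed(os.path.join(root, name))
        for name in sorted(os.listdir(root))
        if os.path.isfile(os.path.join(root, name))
    }
```

A caller that needs one file forces only that entry; the other entries never touch the disk beyond the directory listing.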

LaCasa: Lightweight Affinity and Object Capabilities in Scala
Philipp Haller and Alex Loiko
(KTH, Sweden; Google, Sweden)
Aliasing is a known source of challenges in the context of imperative object-oriented languages, which have led to important advances in type systems for aliasing control. However, their large-scale adoption has turned out to be a surprisingly difficult challenge. While new language designs show promise, they do not address the need for aliasing control in existing languages.
This paper presents a new approach to isolation and uniqueness in an existing, widely-used language, Scala. The approach is unique in the way it addresses some of the most important obstacles to the adoption of type system extensions for aliasing control. First, adaptation of existing code requires only a minimal set of annotations. Only a single bit of information is required per class. Surprisingly, the paper shows that this information can be provided by the object-capability discipline, widely-used in program security. We formalize our approach as a type system and prove key soundness theorems. The type system is implemented for the full Scala language, providing, for the first time, a sound integration with Scala's local type inference. Finally, we empirically evaluate the conformity of existing Scala open-source code on a corpus of over 75,000 LOC.

Programming Frameworks, Tools, and Methodologies

Purposes, Concepts, Misfits, and a Redesign of Git
Santiago Perez De Rosso and Daniel Jackson
(Massachusetts Institute of Technology, USA)
Git is a widely used version control system that is powerful but complicated. Its complexity may not be an inevitable consequence of its power but rather evidence of flaws in its design. To explore this hypothesis, we analyzed the design of Git using a theory that identifies concepts, purposes, and misfits. Some well-known difficulties with Git are described, and explained as misfits in which underlying concepts fail to meet their intended purpose. Based on this analysis, we designed a reworking of Git (called Gitless) that attempts to remedy these flaws.
To correlate misfits with issues reported by users, we conducted a study of Stack Overflow questions. And to determine whether users experienced fewer complications using Gitless in place of Git, we conducted a small user study. Results suggest our approach can be profitable in identifying, analyzing, and fixing design problems.

Apex: Automatic Programming Assignment Error Explanation
Dohyeong Kim, Yonghwi Kwon, Peng Liu, I. Luk Kim, David Mitchel Perry, Xiangyu Zhang, and Gustavo Rodriguez-Rivera
(Purdue University, USA)
This paper presents Apex, a system that can automatically generate explanations for programming assignment bugs, regarding where the bugs are and how the root causes led to the runtime failures. It works by comparing the passing execution of a correct implementation (provided by the instructor) and the failing execution of the buggy implementation (submitted by the student). The technique overcomes a number of technical challenges caused by syntactic and semantic differences of the two implementations. It collects the symbolic traces of the executions and matches assignment statements in the two execution traces by reasoning about symbolic equivalence. It then matches predicates by aligning the control dependences of the matched assignment statements, avoiding direct matching of path conditions which are usually quite different. Our evaluation shows that Apex is very effective for 205 buggy real world student submissions of 4 programming assignments, and a set of 15 programming assignment type of buggy programs collected from stackoverflow.com, precisely pinpointing the root causes and capturing the causality for 94.5% of them. The evaluation on a standard benchmark set with over 700 student bugs shows similar results. A user study in the classroom shows that Apex has substantially improved student productivity.

Asserting Reliable Convergence for Configuration Management Scripts
Oliver Hanappi, Waldemar Hummer, and Schahram Dustdar
(Vienna University of Technology, Austria)
The rise of elastically scaling applications that frequently deploy new machines has led to the adoption of DevOps practices across the cloud engineering stack. So-called configuration management tools utilize scripts that are based on declarative resource descriptions and make the system converge to the desired state. It is crucial for convergent configurations to be able to gracefully handle transient faults, e.g., network outages when downloading and installing software packages. In this paper we introduce a conceptual framework for asserting reliable convergence in configuration management. Based on a formal definition of configuration scripts and their resources, we utilize state transition graphs to test whether a script makes the system converge to the desired state under different conditions. In our generalized model, configuration actions are partially ordered, often resulting in prohibitively many possible execution orders. To reduce this problem space, we define and analyze a property called preservation, and we show that if preservation holds for all pairs of resources, then convergence holds for the entire configuration. Our implementation builds on Puppet, but the approach is equally applicable to other frameworks like Chef, Ansible, etc. We perform a comprehensive evaluation based on real world Puppet scripts and show the effectiveness of the approach. Our tool is able to detect all idempotence and convergence related issues in a set of existing Puppet scripts with known issues as well as some hitherto undiscovered bugs in a large random sample of scripts.
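The notions of idempotence and convergence used above can be illustrated with a toy Python model. All names here are hypothetical, system state is a plain dictionary, and convergence is checked by brute force over every execution order. That enumeration is exactly the cost the preservation property is meant to avoid, so this is not the authors' Puppet-based tool or their test method.

```python
from itertools import permutations


def install_pkg(state):
    return {**state, "nginx": "installed"}


def write_conf(state):
    # Takes effect only once the package is present; otherwise a no-op.
    if state.get("nginx") == "installed":
        return {**state, "conf": "v2"}
    return state


def append_line(state):
    # Deliberately non-idempotent: appends on every run.
    return {**state, "hosts": state.get("hosts", "") + "10.0.0.1 db\n"}


def is_idempotent(action, state):
    """Applying the action a second time must not change the state."""
    once = action(state)
    return action(once) == once


def converges(actions, state, desired, max_runs=3):
    """True if every ordering reaches `desired` within `max_runs` full runs."""
    for order in permutations(actions):
        current = dict(state)
        for _ in range(max_runs):
            for action in order:
                current = action(current)
            if all(current.get(k) == v for k, v in desired.items()):
                break  # this ordering converged
        else:
            return False  # this ordering never reached the desired state
    return True
```

With `max_runs=1` the pair `[install_pkg, write_conf]` fails to converge, because the ordering that writes the configuration first needs a second run to succeed.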

Dependent Partitioning
Sean Treichler, Michael Bauer, Rahul Sharma, Elliott Slaughter, and Alex Aiken
(Stanford University, USA; NVIDIA Research, USA)
A key problem in parallel programming is how data is partitioned: divided into subsets that can be operated on in parallel and, in distributed memory machines, spread across multiple address spaces.
We present a dependent partitioning framework that allows an application to concisely describe relationships between partitions. Applications first establish independent partitions, which may contain arbitrary subsets of application data, permitting the expression of arbitrary application-specific data distributions. Dependent partitions are then derived from these using the dependent partitioning operations provided by the framework. By directly capturing inter-partition relationships, our framework can soundly and precisely reason about programs to perform important program analyses crucial to ensuring correctness and achieving good performance. As an example of the reasoning made possible, we present a static analysis that discharges most consistency checks on partitioned data during compilation.
We describe an implementation of our framework within Regent, a language designed for the Legion programming model. The use of dependent partitioning constructs results in an 86-96% decrease in the lines of code required to describe the partitioning, eliminates many of the expensive dynamic checks required for soundness by the current Regent partitioning implementation, and speeds up the computation of partitions by 2.6-12.7X even on a single thread. Additionally, we show that a distributed implementation incorporated into the Legion runtime system allows partitioning of data sets that are too large to fit on a single node and yields a further 29X speedup of partitioning operations on 64 nodes.
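The relationship between an independent partition and a partition derived from it can be sketched in plain Python. The function names `partition_by_field` and `derive_by_pointer` are hypothetical and are not the Regent or Legion API. The example assumes a graph whose nodes are partitioned first and whose edges are then partitioned according to the part that owns one of their endpoints.

```python
from collections import defaultdict


def partition_by_field(elements, color_of):
    """Independent partition: group elements by an application-chosen color."""
    parts = defaultdict(set)
    for element in elements:
        parts[color_of(element)].add(element)
    return dict(parts)


def derive_by_pointer(node_parts, edges, endpoint):
    """Dependent partition: place each edge in the part owning endpoint(edge).

    Assumes `node_parts` is disjoint and covers every endpoint.
    """
    owner = {node: color for color, nodes in node_parts.items() for node in nodes}
    parts = defaultdict(set)
    for edge in edges:
        parts[owner[endpoint(edge)]].add(edge)
    return dict(parts)
```

Because the edge partition is computed from the node partition rather than written by hand, the two cannot drift out of sync when the node distribution changes.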

Static Analysis

Accelerating Program Analyses by Cross-Program Training
Sulekha Kulkarni, Ravi Mangal, Xin Zhang, and Mayur Naik
(Georgia Tech, USA)
Practical programs share large modules of code. However, many program analyses are ineffective at reusing analysis results for shared code across programs. We present POLYMER, an analysis optimizer to address this problem. POLYMER runs the analysis offline on a corpus of training programs and learns analysis facts over shared code. It prunes the learnt facts to eliminate intermediate computations and then reuses these pruned facts to accelerate the analysis of other programs that share code with the training corpus. We have implemented POLYMER to accelerate analyses specified in Datalog, and apply it to optimize two analyses for Java programs: a call-graph analysis that is flow- and context-insensitive, and a points-to analysis that is flow- and context-sensitive. We evaluate the resulting analyses on ten programs from the DaCapo suite that share the JDK library. POLYMER achieves average speedups of 2.6× for the call-graph analysis and 5.2× for the points-to analysis.

An Improved Algorithm for Slicing Machine Code
Venkatesh Srinivasan and Thomas Reps
(University of Wisconsin-Madison, USA; GrammaTech, USA)
Machine-code slicing is an important primitive for building binary analysis and rewriting tools, such as taint trackers, fault localizers, and partial evaluators. However, it is not easy to create a machine-code slicer that exhibits a high level of precision. Moreover, the problem of creating such a tool is compounded by the fact that a small amount of local imprecision can be amplified via cascade effects.
Most instructions in instruction sets such as Intel's IA-32 and ARM are multi-assignments: they have several inputs and several outputs (registers, flags, and memory locations). This aspect of the instruction set introduces a granularity issue during slicing: there are often instructions at which we would like the slice to include only a subset of the instruction's semantics, whereas the slice is forced to include the entire instruction. Consequently, the slice computed by state-of-the-art tools is very imprecise, often including essentially the entire program.
This paper presents an algorithm to slice machine code more accurately. To counter the granularity issue, our algorithm performs slicing at the microcode level, instead of the instruction level, and obtains a more precise microcode slice. To reconstitute a machine-code program from a microcode slice, our algorithm uses machine-code synthesis. Our experiments on IA-32 binaries of FreeBSD utilities show that, in comparison to slices computed by a state-of-the-art tool, our algorithm reduces the size of backward slices by 33%, and forward slices by 70%.

Call Graphs for Languages with Parametric Polymorphism
Dmitry Petrashko, Vlad Ureche, Ondřej Lhoták, and Martin Odersky
(EPFL, Switzerland; University of Waterloo, Canada)
The performance of contemporary object oriented languages depends on optimizations such as devirtualization, inlining, and specialization, and these in turn depend on precise call graph analysis. Existing call graph analyses do not take advantage of the information provided by the rich type systems of contemporary languages, in particular generic type arguments. Many existing approaches analyze Java bytecode, in which generic types have been erased. This paper shows that this discarded information is actually very useful as the context in a context-sensitive analysis, where it significantly improves precision and keeps the running time small. Specifically, we propose and evaluate call graph construction algorithms in which the contexts of a method are (i) the type arguments passed to its type parameters, and (ii) the static types of the arguments passed to its term parameters. The use of static types from the caller as context is effective because it allows more precise dispatch of call sites inside the callee.
Our evaluation indicates that the average number of contexts required per method is small. We implement the analysis in the Dotty compiler for Scala, and evaluate it on programs that use the type-parametric Scala collections library and on the Dotty compiler itself. The context-sensitive analysis runs 1.4x faster than a context-insensitive one and discovers 20% more monomorphic call sites at the same time. When applied to method specialization, the imprecision in a context-insensitive call graph would require the average method to be cloned 22 times, whereas the context-sensitive call graph indicates a much more practical 1.00 to 1.50 clones per method.
We applied the proposed analysis to automatically specialize generic methods. The resulting automatic transformation achieves the same performance as state-of-the-art techniques requiring manual annotations, while reducing the size of the generated bytecode by up to 5×.

Type Inference for Static Compilation of JavaScript
Satish Chandra, Colin S. Gordon, Jean-Baptiste Jeannin, Cole Schlesinger, Manu Sridharan, Frank Tip, and Youngil Choi
(Samsung Research, USA; Drexel University, USA; Northeastern University, USA; Samsung Electronics, South Korea)
We present a type system and inference algorithm for a rich subset of JavaScript equipped with objects, structural subtyping, prototype inheritance, and first-class methods. The type system supports abstract and recursive objects, and is expressive enough to accommodate several standard benchmarks with only minor workarounds. The invariants enforced by the types enable an ahead-of-time compiler to carry out optimizations typically beyond the reach of static compilers for dynamic languages. Unlike previous inference techniques for prototype inheritance, our algorithm uses a combination of lower and upper bound propagation to infer types and discover type errors in all code, including uninvoked functions. The inference is expressed in a simple constraint language, designed to leverage off-the-shelf fixed point solvers. We prove soundness for both the type system and inference algorithm. An experimental evaluation showed that the inference is powerful, handling the aforementioned benchmarks with no manual type annotation, and that the inferred types enable effective static compilation.

Concurrency Analysis and Model Checking

Directed Synthesis of Failing Concurrent Executions
Malavika Samak, Omer Tripp, and Murali Krishna Ramanathan
(IISc Bangalore, India; Google, USA)
Detecting concurrency-induced bugs in multithreaded libraries can be challenging due to the intricacies associated with their manifestation. This includes invocation of multiple methods, synthesis of inputs to the methods to reach the failing location, and crafting of thread interleavings that cause the erroneous behavior. Neither fuzzing-based testing techniques nor over-approximate static analyses are well positioned to detect such subtle defects while retaining high accuracy alongside satisfactory coverage.
In this paper, we propose a directed, iterative and scalable testing engine that combines the strengths of static and dynamic analysis to help synthesize concurrent executions to expose complex concurrency-induced bugs. Our engine accepts as input the library, its client (either sequential or concurrent) and a specification of correctness. Then, it iteratively refines the client to generate an execution that can break the input specification. Each step of the iterative process includes statically identifying sub-goals towards the goal of failing the specification, generating a plan toward meeting these goals, and merging of the paths traversed dynamically with the plan computed statically via constraint solving to generate a new client. The engine reports full reproduction scenarios, guaranteed to be true, for the bugs it finds.
We have created a prototype of our approach named MINION. We validated MINION by applying it to well-tested concurrent classes from popular Java libraries, including the latest versions of openjdk and google-guava. We were able to detect 31 real crashes across 10 classes in a total of 23 minutes, including previously unknown bugs. Comparison with three other tools reveals that combined, they report only 9 of the 31 crashes (and no other crashes beyond MINION). This is because several of these bugs manifest under deeply nested path conditions (observed maximum of 11), deep nesting of method invocations (observed maximum of 6) and multiple refinement iterations to generate the crash-inducing client.

Maximal Causality Reduction for TSO and PSO
Shiyou Huang and Jeff Huang
(Texas A&M University, USA)
Verifying concurrent programs is challenging due to the exponentially large thread interleaving space. The problem is exacerbated by relaxed memory models such as Total Store Order (TSO) and Partial Store Order (PSO) which further explode the interleaving space by reordering instructions. A recent advance, Maximal Causality Reduction (MCR), has shown great promise to improve verification effectiveness by maximally reducing redundant explorations. However, the original MCR only works for the Sequential Consistency (SC) memory model, but not for TSO and PSO. In this paper, we develop novel extensions to MCR by solving two key problems under TSO and PSO: 1) generating interleavings that can reach new states by encoding the operational semantics of TSO and PSO with first-order logical constraints and solving them with SMT solvers, and 2) enforcing TSO and PSO interleavings by developing novel replay algorithms that allow executions out of the program order. We show that our approach successfully enables MCR to effectively explore TSO and PSO interleavings. We have compared our approach with a recent Dynamic Partial Order Reduction (DPOR) algorithm for TSO and PSO and a SAT-based stateless model checking approach. Our results show that our approach is much more effective than the other approaches for both state-space exploration and bug finding – on average it explores 5-10X fewer executions and finds many bugs that the other tools cannot find.

Precise and Maximal Race Detection from Incomplete Traces
Jeff Huang and Arun K. Rajagopalan
(Texas A&M University, USA)
We present RDIT, a novel dynamic technique to detect data races in multithreaded programs with incomplete trace information, i.e., in the presence of missing events. RDIT is both precise and maximal: it does not report any false alarms and it detects a maximal set of true traces from the observed incomplete trace. RDIT is underpinned by a sound BarrierPair model that abstracts away the missing events by capturing the invocation data of their enclosing methods. By making the least conservative abstraction that a missing method introduces synchronization only when it has a memory address in scope that overlaps with other events or other missing methods, and by formulating maximal thread causality as logical constraints, RDIT guarantees to precisely detect races with maximal capability. RDIT has been applied in seven real-world large concurrent systems and has detected dozens of true races with zero false alarms. Comparatively, existing algorithms such as Happens-Before, Causal-Precedes, and Maximal-Causality, which are known to be precise, all report many false alarms when synchronizations are missing.

Stateless Model Checking with Data-Race Preemption Points
Ben Blum and Garth Gibson
(Carnegie Mellon University, USA)
Stateless model checking is a powerful technique for testing concurrent programs, but suffers from exponential state space explosion when the test input parameters are too large. Several reduction techniques can mitigate this explosion, but even after pruning equivalent interleavings, the state space size is often intractable. Most prior tools are limited to preempting only on synchronization APIs, which reduces the space further, but can miss unsynchronized thread communication bugs. Data race detection, another concurrency testing approach, focuses on suspicious memory access pairs during a single test execution. It avoids concerns of state space size, but may report races that do not lead to observable failures, which jeopardizes a user’s willingness to use the analysis.
We present Quicksand, a new stateless model checking framework which manages the exploration of many state spaces using different preemption points. It uses state space estimation to prioritize jobs most likely to complete in a fixed CPU budget, and it incorporates data-race analysis to add new preemption points on the fly. Preempting threads during a data race’s instructions can automatically classify the race as buggy or benign, and uncovers new bugs not reachable by prior model checkers. It also enables full verification of all possible schedules when every data race is verified as benign within the CPU budget. In our evaluation, Quicksand found 1.25x as many bugs and verified 4.3x as many tests compared to prior model checking approaches.

Language Design and Programming Models II

Automatic Enforcement of Expressive Security Policies using Enclaves
Anitha Gollamudi and Stephen Chong
(Harvard University, USA)
Hardware-based enclave protection mechanisms, such as Intel’s SGX, ARM’s TrustZone, and Apple’s Secure Enclave, can protect code and data from powerful low-level attackers. In this work, we use enclaves to enforce strong application-specific information security policies.
We present IMP_E, a novel calculus that captures the essence of SGX-like enclave mechanisms, and show that a security-type system for IMP_E can enforce expressive confidentiality policies (including erasure policies and delimited release policies) against powerful low-level attackers, including attackers that can arbitrarily corrupt non-enclave code, and, under some circumstances, corrupt enclave code. We present a translation from an expressive security-typed calculus (that is not aware of enclaves) to IMP_E. The translation automatically places code and data into enclaves to enforce the security policies of the source program.

Chain: Tasks and Channels for Reliable Intermittent Programs
Alexei Colin and Brandon Lucia
(Carnegie Mellon University, USA)
Energy harvesting computers enable general-purpose computing using energy collected from their environment. Energy-autonomy of such devices has great potential, but their intermittent power supply poses a challenge. Intermittent program execution compromises progress and leaves state inconsistent. This work describes Chain: a new model for programming intermittent devices.
A Chain program is a set of programmer-defined tasks that compute and exchange data through channels. Chain guarantees forward progress at task granularity. A task is restartable and never sees inconsistent state, because its input and output channels are separated. Our system supports language features for expressing advanced data exchange patterns and for encapsulating reusable functionality.
Chain fundamentally differs from state-of-the-art checkpointing approaches and does not incur the associated overhead. We implement Chain as C language extensions and a runtime library. We used Chain to implement four applications: machine learning, encryption, compression, and sensing. In experiments, Chain ensured consistency where prior approaches failed and improved throughput by 2-7x over the leading state-of-the-art system.

GEMs: Shared-Memory Parallel Programming for Node.js
Daniele Bonetta, Luca Salucci, Stefan Marr, and Walter Binder
(Oracle Labs, Austria; University of Lugano, Switzerland; JKU Linz, Austria)
JavaScript is the most popular programming language for client-side Web applications, and Node.js has popularized the language for server-side computing, too. In this domain, the minimal support for parallel programming remains however a major limitation. In this paper we introduce a novel parallel programming abstraction called Generic Messages (GEMs). GEMs allow one to combine message passing and shared-memory parallelism, extending the classes of parallel applications that can be built with Node.js. GEMs have customizable semantics and enable several forms of thread safety, isolation, and concurrency control. GEMs are designed as convenient JavaScript abstractions that expose high-level and safe parallelism models to the developer. Experiments show that GEMs outperform equivalent Node.js applications thanks to their usage of shared memory.

OrcO: A Concurrency-First Approach to Objects
Arthur Michener Peters, David Kitchin, John A. Thywissen, and William R. Cook
(University of Texas at Austin, USA; Google, USA)
The majority of modern programming languages provide concurrency and object-orientation in some form. However, object-oriented concurrency remains cumbersome in many situations. We introduce the language OrcO, Orc with concurrent Objects, which enables a flexible style of concurrent object-oriented programming. OrcO extends the Orc programming language by adding abstractions for programming-in-the-large; namely objects, classes, and inheritance. OrcO objects are designed to be orthogonal to concurrency, allowing the concurrent structure and object structure of a program to evolve independently. This paper describes OrcO's goals and design and provides examples of how OrcO can be used to deftly handle events, object management, and object composition.

Principles, Across the Compilation Stack

Semantic Subtyping for Imperative Object-Oriented Languages
Davide Ancona and Andrea Corradi
(University of Genoa, Italy)
Semantic subtyping is an approach for defining sound and complete procedures to decide subtyping for expressive types, including union and intersection types; although it has been exploited especially in functional languages for XML based programming, recently it has been partially investigated in the context of object-oriented languages, and a sound and complete subtyping algorithm has been proposed for record types, but restricted to immutable fields, with union and recursive types interpreted coinductively to support cyclic objects. In this work we address the problem of studying semantic subtyping for imperative object-oriented languages, where fields can be mutable; in particular, we add read/write field annotations to record types, and, besides union, we consider intersection types as well, while maintaining coinductive interpretation of recursive types. In this way, we get a richer notion of type with a flexible subtyping relation, able to express a variety of type invariants useful for enforcing static guarantees for mutable objects. The addition of these features radically changes the definition of subtyping, and, hence, the corresponding decision procedure, and surprisingly invalidates some subtyping laws that hold in the functional setting. We propose an intuitive model where mutable record values contain type information to specify the values that can be correctly stored in fields. Such a model, and the corresponding subtyping rules, require particular care to avoid circularity between coinductive judgments and their negations which, by duality, have to be interpreted inductively. A sound and complete subtyping algorithm is provided, together with a prototype implementation.

Parsing with First-Class Derivatives
Jonathan Immanuel Brachthäuser, Tillmann Rendel, and Klaus Ostermann
(University of Tübingen, Germany)
Brzozowski derivatives, well known in the context of regular expressions, have recently been rediscovered to give a simplified explanation to parsers of context-free languages. We add derivatives as a novel first-class feature to a standard parser combinator language. First-class derivatives enable an inversion of the control flow, allowing to implement modular parsers for languages that previously required separate pre-processing steps or cross-cutting modifications of the parsers. We show that our framework offers new opportunities for reuse and supports a modular definition of interesting use cases of layout-sensitive parsing.
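
For readers unfamiliar with the underlying notion, the following is a minimal sketch of classic Brzozowski derivatives for regular expressions only. It is not the authors' parser combinator library, and all class and function names (Empty, Eps, Chr, Alt, Seq, Star, nullable, derive, matches) are assumptions chosen for illustration. The derivative of a language with respect to a character is the set of suffixes that may follow that character; matching repeatedly derives and then checks whether the empty string is accepted.

```python
from dataclasses import dataclass

# Regular-expression AST. All names are illustrative, not taken from the paper.
@dataclass(frozen=True)
class Empty:            # matches nothing
    pass

@dataclass(frozen=True)
class Eps:              # matches only the empty string
    pass

@dataclass(frozen=True)
class Chr:              # matches a single character
    c: str

@dataclass(frozen=True)
class Alt:              # left | right
    left: object
    right: object

@dataclass(frozen=True)
class Seq:              # first followed by second
    first: object
    second: object

@dataclass(frozen=True)
class Star:             # zero or more repetitions of inner
    inner: object

def nullable(r):
    """Return True if r accepts the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Seq):
        return nullable(r.first) and nullable(r.second)
    return False  # Empty and Chr

def derive(r, ch):
    """Brzozowski derivative: what remains of r after consuming ch."""
    if isinstance(r, Chr):
        return Eps() if r.c == ch else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, ch), derive(r.right, ch))
    if isinstance(r, Seq):
        head = Seq(derive(r.first, ch), r.second)
        # If first can be empty, ch may also be consumed by second.
        return Alt(head, derive(r.second, ch)) if nullable(r.first) else head
    if isinstance(r, Star):
        return Seq(derive(r.inner, ch), r)
    return Empty()  # Empty and Eps have no non-empty words

def matches(r, text):
    """Derive once per input character, then test for the empty string."""
    for ch in text:
        r = derive(r, ch)
    return nullable(r)
```

With `Star(Seq(Chr("a"), Chr("b")))`, `matches` accepts "abab" and rejects "aba". The sketch omits simplification of intermediate expressions, so they grow with the input length.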

The Missing Link: Explaining ELF Static Linking, Semantically
Stephen Kell, Dominic P. Mulligan, and Peter Sewell
(University of Cambridge, UK)
Beneath the surface, software usually depends on complex linker behaviour to work as intended. Even linking hello_world.c is surprisingly involved, and systems software such as libc and operating system kernels rely on a host of linker features. But linking is poorly understood by working programmers and has largely been neglected by language researchers.
In this paper we survey the many use-cases that linkers support and the poorly specified linker speak by which they are controlled: metadata in object files, command-line options, and linker-script language. We provide the first validated formalisation of a realistic executable and linkable format (ELF), and capture aspects of the Application Binary Interfaces for four mainstream platforms (AArch64, AMD64, Power64, and IA32). Using these, we develop an executable specification of static linking, covering (among other things) enough to link small C programs (we use the example of bzip2) into a correctly running executable. We provide our specification in Lem and Isabelle/HOL forms. This is the first formal specification of mainstream linking. We have used the Isabelle/HOL version to prove a sample correctness property for one case of AMD64 ABI relocation, demonstrating that the specification supports formal proof, and as a first step towards the much more ambitious goal of verified linking. Our work should enable several novel strands of research, including linker-aware verified compilation and program analysis, and better languages for controlling linking.

Type Soundness for Dependent Object Types (DOT)
Tiark Rompf and Nada Amin
(Purdue University, USA; EPFL, Switzerland)
Scala’s type system unifies aspects of ML modules, object-oriented, and functional programming. The Dependent Object Types (DOT) family of calculi has been proposed as a new theoretic foundation for Scala and similar expressive languages. Unfortunately, type soundness has only been established for restricted subsets of DOT. In fact, it has been shown that important Scala features such as type refinement or a subtyping relation with lattice structure break at least one key metatheoretic property such as environment narrowing or invertible subtyping transitivity, which are usually required for a type soundness proof. The main contribution of this paper is to demonstrate how, perhaps surprisingly, even though these properties are lost in their full generality, a rich DOT calculus that includes recursive type refinement and a subtyping lattice with intersection types can still be proved sound. The key insight is that subtyping transitivity only needs to be invertible in code paths executed at runtime, with contexts consisting entirely of valid runtime objects, whereas inconsistent subtyping contexts can be permitted for code that is never executed.

Runtime Support

Efficient and Thread-Safe Objects for Dynamically-Typed Languages
Benoit Daloze, Stefan Marr, Daniele Bonetta, and Hanspeter Mössenböck
(JKU Linz, Austria; Oracle Labs, Austria)
We are in the multi-core era. Dynamically-typed languages are in widespread use, but their support for multithreading still lags behind. One of the reasons is that the sophisticated techniques they use to efficiently represent their dynamic object models are often unsafe in multithreaded environments.
This paper defines safety requirements for dynamic object models in multithreaded environments. Based on these requirements, a language-agnostic and thread-safe object model is designed that maintains the efficiency of sequential approaches. This is achieved by ensuring that field reads do not require synchronization and field updates only need to synchronize on objects shared between threads.
Basing our work on JRuby+Truffle, we show that our safe object model has zero overhead on peak performance for thread-local objects and only 3% average overhead on parallel benchmarks where field updates require synchronization. Thus, it can be a foundation for safe and efficient multithreaded VMs for a wide range of dynamic languages.

Hybrid STM/HTM for Nested Transactions on OpenJDK
Keith Chapman, Antony L. Hosking, and J. Eliot B. Moss
(Purdue University, USA; Australian National University, Australia; Data61, Australia; University of Massachusetts at Amherst, USA)
Transactional memory (TM) has long been advocated as a promising pathway to more automated concurrency control for scaling concurrent programs running on parallel hardware. Software TM (STM) has the benefit of being able to run general transactional programs, but at the significant cost of overheads imposed to log memory accesses, mediate access conflicts, and maintain other transaction metadata. Recently, hardware manufacturers have begun to offer commodity hardware TM (HTM) support in their processors wherein the transaction metadata is maintained “for free” in hardware. However, HTM approaches are only best-effort: they cannot successfully run all transactional programs, whether because of hardware capacity issues (causing large transactions to fail), or compatibility restrictions on the processor instructions permitted within hardware transactions (causing transactions that execute those instructions to fail). In such cases, programs must include failure-handling code to attempt the computation by some other software means, since retrying the transaction would be futile. Thus, a canonical use of HTM is lock elision: replacing lock regions with transactions, retrying some number of times in the case of conflicts, but falling back to locking when HTM fails for other reasons.
Here, we describe how software and hardware schemes can combine seamlessly into a hybrid system in support of transactional programs, allowing use of low-cost HTM when it works, but reverting to STM when it doesn’t. We describe heuristics used to make this choice dynamically and automatically, but allowing the transition back to HTM opportunistically. Our implementation is for an extension of Java having syntax for both open and closed nested transactions, and boosting, running on the OpenJDK, with dynamic injection of STM mechanisms (into code variants used under STM) and HTM instructions (into code variants used under HTM). Both schemes are compatible to allow different threads to run concurrently with either mechanism, while preserving transaction safety. Using a standard synthetic benchmark we demonstrate that HTM offers significant acceleration of both closed and open nested transactions, while yielding parallel scaling up to the limits of the hardware, whereupon scaling in software continues but with the penalty to throughput imposed by software mechanisms.
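
The lock-elision pattern mentioned in the abstract (retry on conflicts, fall back to a lock on other failures) can be sketched as follows. This is only a control-flow illustration under stated assumptions: Python has no HTM interface, so `try_hardware_txn` is a hypothetical caller-supplied stand-in for a hardware transaction attempt, and `ConflictAbort`, `CapacityAbort`, and `run_elided` are invented names. The paper's system falls back to STM rather than a plain lock.

```python
import threading

class ConflictAbort(Exception):
    """Hypothetical signal: the transaction lost a data conflict (worth retrying)."""

class CapacityAbort(Exception):
    """Hypothetical signal: the transaction cannot succeed in hardware (retry is futile)."""

def run_elided(critical_section, try_hardware_txn, fallback_lock, max_retries=3):
    """Run critical_section transactionally if possible, else under fallback_lock.

    try_hardware_txn is an assumed callable that either returns the result of
    critical_section or raises ConflictAbort / CapacityAbort.
    """
    for _ in range(max_retries):
        try:
            return try_hardware_txn(critical_section)
        except ConflictAbort:
            continue   # conflicts are transient: try again
        except CapacityAbort:
            break      # non-transient failure: stop retrying
    # Fallback path: conventional mutual exclusion.
    with fallback_lock:
        return critical_section()
```

A caller would pass a `threading.Lock()` as `fallback_lock`. A real implementation must also make the transactional path observe the fallback lock so that both paths stay mutually exclusive, which this sketch does not model.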

Makalu: Fast Recoverable Allocation of Non-volatile Memory
Kumud Bhandari, Dhruva R. Chakrabarti, and Hans-J. Boehm
(Rice University, USA; Hewlett Packard Labs, USA; Google, USA)
Byte addressable non-volatile memory (NVRAM) is likely to supplement, and perhaps eventually replace, DRAM. Applications can then persist data structures directly in memory instead of serializing them and storing them onto a durable block device. However, failures during execution can leave data structures in NVRAM unreachable or corrupt. In this paper, we present Makalu, a system that addresses non-volatile memory management. Makalu offers an integrated allocator and recovery-time garbage collector that maintains internal consistency, avoids NVRAM memory leaks, and is efficient, all in the face of failures.
We show that a careful allocator design can support a less restrictive and a much more familiar programming model than existing persistent memory allocators. Our allocator significantly reduces the per allocation persistence overhead by lazily persisting non-essential metadata and by employing a post-failure recovery-time garbage collector. Experimental results show that the resulting online speed and scalability of our allocator are comparable to well-known transient allocators, and significantly better than state-of-the-art persistent allocators.

Prioritized Garbage Collection: Explicit GC Support for Software Caches
Diogenes Nunez, Samuel Z. Guyer, and Emery D. Berger
(Tufts University, USA; University of Massachusetts at Amherst, USA)
Programmers routinely trade space for time to increase performance, often in the form of caching or memoization. In managed languages like Java or JavaScript, however, this space-time tradeoff is complex. Using more space translates into higher garbage collection costs, especially at the limit of available memory. Existing runtime systems provide limited support for space-sensitive algorithms, forcing programmers into difficult and often brittle choices about provisioning.
This paper presents prioritized garbage collection, a cooperative programming language and runtime solution to this problem. Prioritized GC provides an interface similar to soft references, called priority references, which identify objects that the collector can reclaim eagerly if necessary. The key difference is an API for defining the policy that governs when priority references are cleared and in what order. Application code specifies a priority value for each reference and a target memory bound. The collector reclaims references, lowest priority first, until the total memory footprint of the cache fits within the bound. We use this API to implement a space-aware least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement for existing caches, such as Google’s Guava library. The garbage collector automatically grows and shrinks the Sache in response to available memory and workload with minimal provisioning information from the programmer. Using a Sache, it is almost impossible for an application to experience a memory leak, memory pressure, or an out-of-memory crash caused by software caching.
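
The reclamation policy described above (clear references lowest priority first until the cache fits a memory bound) can be illustrated with a small sketch. It models only the ordering policy on an ordinary dictionary, not a garbage collector or the paper's API. The function name `reclaim_by_priority` and the `(priority, size_bytes)` entry layout are assumptions made for this example.

```python
def reclaim_by_priority(entries, memory_bound):
    """Evict lowest-priority entries until the total size fits memory_bound.

    entries maps key -> (priority, size_bytes); this layout is hypothetical.
    Returns (kept_entries, evicted_keys) and leaves the input unchanged.
    """
    kept = dict(entries)
    footprint = sum(size for _, size in kept.values())
    evicted = []
    # sorted() builds a separate list, so deleting from kept in the loop is safe.
    for key in sorted(kept, key=lambda k: kept[k][0]):
        if footprint <= memory_bound:
            break
        footprint -= kept[key][1]
        evicted.append(key)
        del kept[key]
    return kept, evicted
```

For an LRU-style cache, the priority could be a last-access timestamp, so that the least recently used entries are reclaimed first.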

Program Modeling and Learning

Computing Repair Alternatives for Malformed Programs using Constraint Attribute Grammars
Friedrich Steimann, Jörg Hagemann, and Bastian Ulke
(Fernuniversität in Hagen, Germany)
Attribute grammars decorate the nodes of a program's parse tree with attributes whose values are defined by equations encoding the (static) semantics of a programming language. We show how replacing the equations of an attribute grammar with equivalent constraints that can be solved by a constraint solver allows us to compute repairs of a malformed program solely from a specification that was originally designed for checking its well-formedness. We present two repair modes --- shallow and deep fixing --- whose computed repair alternatives are guaranteed to repair every error on which they are invoked. While shallow fixing may introduce new errors, deep fixing never does; to make it tractable, we implement it using neighborhood search. We demonstrate the feasibility of our approach by implementing it on top of ExtendJ, an attribute grammar based Java compiler, and by applying it to an example from the Java EE context, detecting and fixing well-formedness errors (both real and injected) in a body of 14 open-source subject programs.

Probabilistic Model for Code with Decision Trees
Veselin Raychev, Pavol Bielik, and Martin Vechev
(ETH Zurich, Switzerland)
In this paper we introduce a new approach for learning precise and general probabilistic models of code based on decision tree learning. Our approach directly benefits an emerging class of statistical programming tools which leverage probabilistic models of code learned over large codebases (e.g., GitHub) to make predictions about new programs (e.g., code completion, repair, etc).
The key idea is to phrase the problem of learning a probabilistic model of code as learning a decision tree in a domain specific language over abstract syntax trees (called TGen). This allows us to condition the prediction of a program element on a dynamically computed context. Further, our problem formulation enables us to easily instantiate known decision tree learning algorithms such as ID3, but also to obtain new variants we refer to as ID3+ and E13, not previously explored and ones that outperform ID3 in prediction accuracy.
Our approach is general and can be used to learn a probabilistic model of any programming language. We implemented our approach in a system called Deep3 and evaluated it for the challenging task of learning probabilistic models of JavaScript and Python. Our experimental results indicate that Deep3 predicts elements of JavaScript and Python code with precision above 82% and 69%, respectively. Further, Deep3 often significantly outperforms state-of-the-art approaches in overall prediction accuracy.

Ringer: Web Automation by Demonstration
Shaon Barman, Sarah Chasins, Rastislav Bodik, and Sumit Gulwani
(University of California at Berkeley, USA; University of Washington, USA; Microsoft Research, USA)
With increasing amounts of data available on the web and a diverse range of users interested in programmatically accessing that data, web automation must become easier. Automation helps users complete many tedious interactions, such as scraping data, completing forms, or transferring data between websites. However, writing web automation scripts typically requires an expert programmer because the writer must be able to reverse engineer the target webpage. We have built a record and replay tool, Ringer, that makes web automation accessible to non-coders. Ringer takes a user demonstration as input and creates a script that interacts with the page as a user would. This approach makes Ringer scripts more robust to webpage changes because user-facing interfaces remain relatively stable compared to the underlying webpage implementations. We evaluated our approach on benchmarks recorded on real webpages and found that it replayed 4x more benchmarks than a state-of-the-art replay tool.

Scalable Verification of Border Gateway Protocol Configurations with an SMT Solver
Konstantin Weitz, Doug Woos, Emina Torlak, Michael D. Ernst, Arvind Krishnamurthy, and Zachary Tatlock
(University of Washington, USA)
Internet Service Providers (ISPs) use the Border Gateway Protocol (BGP) to announce and exchange routes for delivering packets through the internet. ISPs must carefully configure their BGP routers to ensure traffic is routed reliably and securely. Correctly configuring BGP routers has proven challenging in practice, and misconfiguration has led to worldwide outages and traffic hijacks. This paper presents Bagpipe, a system that enables ISPs to declaratively express BGP policies and that automatically verifies that router configurations implement such policies. The novel initial network reduction soundly reduces policy verification to a search for counterexamples in a finite space. An SMT-based symbolic execution engine performs this search efficiently. Bagpipe reduces the size of its search space using predicate abstraction and parallelizes its search using symbolic variable hoisting. Bagpipe's policy specification language is expressive: we expressed policies inferred from real AS configurations, policies from the literature, and policies for 10 Juniper TechLibrary configuration scenarios. Bagpipe is efficient: we ran it on three ASes with a total of over 240,000 lines of Cisco and Juniper BGP configuration. Bagpipe is effective: it revealed 19 policy violations without issuing any false positives.

Typing, in Practice

A Practical Framework for Type Inference Error Explanation
Calvin Loncaric, Satish Chandra, Cole Schlesinger, and Manu Sridharan
(University of Washington, USA; Samsung Research, USA)
Many languages have support for automatic type inference. But when inference fails, the reported error messages can be unhelpful, highlighting a code location far from the source of the problem. Several lines of work have emerged proposing error reports derived from correcting sets: a set of program points that, when fixed, produce a well-typed program. Unfortunately, these approaches are tightly tied to specific languages; targeting a new language requires encoding a type inference algorithm for the language in a custom constraint system specific to the error reporting tool.
We show how to produce correcting set-based error reports by leveraging existing type inference implementations, easing the burden of adoption and, as type inference algorithms tend to be efficient in practice, producing error reports of comparable quality to similar error reporting tools orders of magnitude faster. Many type inference algorithms are already formulated as dual phases of type constraint generation and solving; rather than (re)implementing type inference in an error explanation tool, we isolate the solving phase and treat it as an oracle for solving typing constraints. Given any set of typing constraints, error explanation proceeds by iteratively removing conflicting constraints from the initial constraint set until discovering a subset on which the solver succeeds; the constraints removed form a correcting set. Our approach is agnostic to the semantics of any particular language or type system, instead leveraging the existing type inference engine to give meaning to constraints.
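The removal loop described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the solver is a toy union-find oracle over equality constraints, the set of concrete types is an assumption, and the greedy loop makes no claim of finding a minimal correcting set.

```python
from typing import List, Optional, Tuple

Constraint = Tuple[str, str]        # "these two types must be equal"
CONCRETE = {"int", "str", "bool"}   # assumed concrete types; other names are type variables

def solve(constraints: List[Constraint]) -> Optional[Constraint]:
    """Toy oracle: None if satisfiable, else the first constraint that conflicts."""
    parent = {}

    def find(t: str) -> str:
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    for c in constraints:
        ra, rb = find(c[0]), find(c[1])
        if ra == rb:
            continue
        if ra in CONCRETE and rb in CONCRETE:
            return c                        # two different concrete types clash
        if ra in CONCRETE:
            parent[rb] = ra                 # keep the concrete type as the root
        else:
            parent[ra] = rb
    return None

def correcting_set(constraints: List[Constraint]) -> List[Constraint]:
    """Remove conflicting constraints until the oracle accepts the remainder."""
    remaining = list(constraints)
    removed = []
    while (bad := solve(remaining)) is not None:
        remaining.remove(bad)
        removed.append(bad)
    return removed

# Hypothetical constraints: b is forced to be both int (via a) and str.
constraints = [("a", "int"), ("b", "a"), ("b", "str"), ("c", "bool")]
result = correcting_set(constraints)
```

The explanation code only calls `solve`, so swapping in a different language's constraint solver would not change `correcting_set`, which mirrors the language-agnostic design the abstract describes.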

Dynamically Diagnosing Type Errors in Unsafe Code
Stephen Kell
(University of Cambridge, UK)
Existing approaches for detecting type errors in unsafe languages are limited. Static analysis methods are imprecise, and often require source-level changes, while most dynamic methods check only memory properties (bounds, liveness, etc.), owing to a lack of run-time type information. This paper describes libcrunch, a system for binary-compatible run-time type checking of unmodified unsafe code, currently focusing on C. Practical experience shows that our prototype implementation is easily applicable to many real codebases without source-level modification, correctly flags programmer errors with a very low rate of false positives, offers a very low run-time overhead, and covers classes of error caught by no previously existing tool.

First-Class Effect Reflection for Effect-Guided Programming
Yuheng Long, Yu David Liu, and Hridesh Rajan
(Iowa State University, USA; SUNY Binghamton, USA)
This paper introduces a novel type-and-effect calculus, first-class effects, where the computational effect of an expression can be programmatically reflected, passed around as values, and analyzed at run time. A broad range of designs "hard-coded" in existing effect-guided analyses (from thread scheduling and version-consistent software updating to data zeroing) can be naturally supported through the programming abstractions. The core technical development is a type system with a number of features, including a hybrid type system that integrates static and dynamic effect analyses, a refinement type system to verify application-specific effect management properties, and a double-bounded type system that computes both an over-approximation and an under-approximation of effects. We introduce and establish a notion of soundness called trace consistency, defined in terms of how the effect and trace correspond. The property sheds foundational insight on "good" first-class effect programming.

Java and Scala's Type Systems are Unsound: The Existential Crisis of Null Pointers
Nada Amin and Ross Tate
(EPFL, Switzerland; Cornell University, USA)
We present short programs that demonstrate the unsoundness of Java and Scala's current type systems. In particular, these programs provide parametrically polymorphic functions that can turn any type into any type without (down)casting. Fortunately, parametric polymorphism was not integrated into the Java Virtual Machine (JVM), so these examples do not demonstrate any unsoundness of the JVM. Nonetheless, we discuss broader implications of these findings on the field of programming languages.

Bug Detection Analysis and Model Checking

Finding Compiler Bugs via Live Code Mutation
Chengnian Sun, Vu Le, and Zhendong Su
(University of California at Davis, USA)
Validating optimizing compilers is challenging because it is hard to generate valid test programs (i.e., those that do not expose any undefined behavior). Equivalence Modulo Inputs (EMI) is an effective, promising methodology to tackle this problem. Given a test program with some inputs, EMI mutates the program to derive variants that are semantically equivalent w.r.t. these inputs. The state-of-the-art instantiations of EMI are Orion and Athena, both of which rely on deleting code from or inserting code into code regions that are not executed under the inputs. Although both have demonstrated their ability in finding many bugs in GCC and LLVM, they are still limited due to their mutation strategies that operate only on dead code regions.
This paper presents a novel EMI technique that allows mutation in the entire program (i.e., both live and dead regions). By removing the restriction of mutating only the dead regions, our technique significantly increases the EMI variant space. It also helps to more thoroughly stress test compilers as compilers must optimize mutated live code, whereas mutated dead code might be eliminated. Finally, our technique also makes compiler bugs more noticeable as miscompilations on mutated dead code may not be observable.
We have realized the proposed technique in Hermes. The evaluation demonstrates Hermes’s effectiveness. In 13 months, Hermes found 168 confirmed, valid bugs in GCC and LLVM, of which 132 have already been fixed.
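To make the EMI idea concrete, the sketch below shows the earlier dead-region strategy that the abstract contrasts with (deleting code not executed under the profiled input), not Hermes's live-code mutation. Everything here is an assumption for illustration: programs are nested Python tuples, and a tiny interpreter stands in for the compiler under test.

```python
# A statement is ("set", var, expr) or ("if", cond, body);
# expr and cond are functions of the environment (hypothetical toy language).

def run(prog, env, trace, prefix=()):
    """Interpret `prog`, recording the position path of every executed statement."""
    for i, stmt in enumerate(prog):
        path = prefix + (i,)
        trace.add(path)
        if stmt[0] == "set":
            _, var, expr = stmt
            env[var] = expr(env)
        else:
            _, cond, body = stmt
            if cond(env):
                run(body, env, trace, path)
    return env

def prune_dead(prog, trace, prefix=()):
    """Build an EMI variant by deleting statements never executed in `trace`."""
    out = []
    for i, stmt in enumerate(prog):
        path = prefix + (i,)
        if path not in trace:
            continue  # dead under the profiled input, so deleting it is allowed
        if stmt[0] == "if":
            out.append(("if", stmt[1], prune_dead(stmt[2], trace, path)))
        else:
            out.append(stmt)
    return out

def execute(prog, x):
    """Run `prog` on input x; return its output y and the execution trace."""
    trace = set()
    env = run(prog, {"x": x}, trace)
    return env["y"], trace

program = [
    ("set", "y", lambda e: e["x"] * 2),
    ("if", lambda e: e["x"] < 0, [("set", "y", lambda e: -e["y"])]),
    ("if", lambda e: e["x"] > 0, [("set", "y", lambda e: e["y"] + 1)]),
]

original_out, profile = execute(program, 3)   # profile with input x = 3
variant = prune_dead(program, profile)        # body of the x < 0 branch is removed
variant_out, _ = execute(variant, 3)
```

The variant must agree with the original on the profiled input, but it is free to differ on other inputs (here, any negative x). In real EMI testing both programs would be compiled, and a disagreement on the profiled input would point to a miscompilation.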

Finding Resume and Restart Errors in Android Applications
Zhiyong Shan, Tanzirul Azim, and Iulian Neamtiu
(University of Central Missouri, USA; University of California at Riverside, USA; New Jersey Institute of Technology, USA)
Smartphone apps create and handle a large variety of "instance" data that has to persist across runs, such as the current navigation route, workout results, antivirus settings, or game state. Due to the nature of the smartphone platform, an app can be paused, sent into background, or killed at any time. If the instance data is not saved and restored between runs, in addition to data loss, partially-saved or corrupted data can crash the app upon resume or restart. While smartphone platforms offer API support for data-saving and data-retrieving operations, the use of this API is ad-hoc: left to the programmer, rather than enforced by the compiler. We have observed that several categories of bugs (including data loss, failure to resume/restart, or resuming/restarting in the wrong state) are due to incorrect handling of instance data and are easily triggered by just pressing the "Home" or "Back" buttons. To help address this problem, we have constructed a tool chain for Android (the KREfinder static analysis and the KREreproducer input generator) that helps find and reproduce such incorrect handling. We have evaluated our approach by running the static analysis on 324 apps, of which 49 were further analyzed manually. Results indicate that our approach is (i) effective, as it has discovered 49 bugs, including in popular Android apps, and (ii) efficient, completing on average in 61 seconds per app. More generally, our approach helps determine whether an app saves too much or too little state.

Low-Overhead and Fully Automated Statistical Debugging with Abstraction Refinement
Zhiqiang Zuo, Lu Fang, Siau-Cheng Khoo, Guoqing Xu, and Shan Lu
(University of California at Irvine, USA; National University of Singapore, Singapore; University of Chicago, USA)
Cooperative statistical debugging is an effective approach for diagnosing production-run failures. To quickly identify failure predictors from the huge program predicate space, existing techniques rely on random or heuristics-guided predicate sampling at the user side. However, none of them can satisfy the requirements of low cost, low diagnosis latency, and high diagnosis quality simultaneously, which are all indispensable for statistical debugging to be practical.
This paper presents a new technique that tackles the above challenges. We formulate the technique as an instance of abstraction refinement, where efficient abstract-level profiling is first applied to the whole program and its execution brings information that can pinpoint suspicious coarse-grained entities that need to be refined. The refinement profiles a corresponding set of fine-grained entities, and generates feedback that determines what to prune and what to refine next. The process is fully automated, and more importantly, guided by a mathematically rigorous analysis that guarantees that our approach produces the same debugging results as an exhaustive analysis in deterministic settings.
We have implemented this technique for both C and Java, on both a single machine and a distributed system. A thorough evaluation demonstrates that our approach yields (1) an order of magnitude reduction in the user-side runtime overhead even compared to a sampling-based approach and (2) two orders of magnitude reduction in the size of data transferred over the network, completely automatically and without sacrificing any debugging capability.

To Be Precise: Regression Aware Debugging
Rohan Bavishi, Awanish Pandey, and Subhajit Roy
(IIT Kanpur, India)
Bounded model checking based debugging solutions search for mutations of program expressions that produce the expected output for a currently failing test. However, the current localization tools are not regression aware: they do not use information from the passing tests in their localization formula. On the other hand, the current repair tools attempt to guarantee regression freedom: when provided with a set of passing tests, they guarantee that none of these tests can break due to the suggested repair patch, thereby constructing a large repair formula.
In this paper, we propose regression awareness as a means to improve the quality of localization and to scale repair. To enable regression awareness, we summarize the proof of correctness of each passing test by computing Craig Interpolants over a symbolic encoding of the passing execution, and use these summaries as additional soft constraints while synthesizing altered executions corresponding to failing tests. Intuitively, these additional constraints act as roadblocks, thereby discouraging executions that may damage the proof of a passing test. We use a partial MAXSAT solver to relax the proofs in a systematic way, and use a ranking function that penalizes mutations that damage the existing proofs.
We have implemented our algorithms into a tool, TINTIN, that enables regression aware localization and repair. For localizations, our strategy is effective in extracting a superior ranking of suspicious locations: on a set of 52 different versions across 12 different programs spanning three benchmark suites, TINTIN achieves a saving of developer effort by almost 45% (in terms of the locations that must be examined by a developer to reach the ground-truth repair) in the worst case and 27% in the average case over existing techniques. For automated repairs, on our set of benchmarks, TINTIN achieves a 2.3X speedup over existing techniques without sacrificing much on the ranking of the repair patches: the ground-truth repair appears as the topmost suggestion in more than 70% of our benchmarks.
