POPL 2016 – Author Index 

Adams, Michael D. 
POPL'16: "Pushdown Control-Flow Analysis ..."
Pushdown Control-Flow Analysis for Free
Thomas Gilray, Steven Lyde, Michael D. Adams, Matthew Might, and David Van Horn (University of Utah, USA; University of Maryland, USA)
Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been published that provide perfect call-stack precision in a computable manner: CFA2, PDCFA, and AAC. Unfortunately, implementing CFA2 and PDCFA requires significant engineering effort. Furthermore, all three are computationally expensive. For a monovariant analysis, CFA2 is in O(2^n), PDCFA is in O(n^6), and AAC is in O(n^8).
In this paper, we describe a new technique that builds on these but is both straightforward to implement and computationally inexpensive. The crucial insight is an unusual state-dependent allocation strategy for the addresses of continuations. Our technique imposes only a constant-factor overhead on the underlying analysis and costs only O(n^3) in the monovariant case. We present the intuitions behind this development, benchmarks demonstrating its efficacy, and a proof of the precision of this analysis.
@InProceedings{POPL16p691,
author = {Thomas Gilray and Steven Lyde and Michael D. Adams and Matthew Might and David Van Horn},
title = {Pushdown Control-Flow Analysis for Free},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {691--704},
doi = {10.1145/2837614.2837631},
year = {2016},
}


Albarghouthi, Aws 
POPL'16: "Maximal Specification Synthesis ..."
Maximal Specification Synthesis
Aws Albarghouthi, Isil Dillig, and Arie Gurfinkel (University of Wisconsin-Madison, USA; University of Texas at Austin, USA; Carnegie Mellon University, USA)
Many problems in program analysis, verification, and synthesis require inferring specifications of unknown procedures. Motivated by a broad range of applications, we formulate the problem of maximal specification inference: Given a postcondition Phi and a program P calling a set of unknown procedures F_1,…,F_n, what are the most permissive specifications of procedures F_i that ensure correctness of P? In other words, we are looking for the smallest number of assumptions we need to make about the behaviours of F_i in order to prove that P satisfies its postcondition. To solve this problem, we present a novel approach that utilizes a counterexample-guided inductive synthesis loop and reduces the maximal specification inference problem to multi-abduction. We formulate the novel notion of multi-abduction as a generalization of classical logical abduction and present an algorithm for solving multi-abduction problems. On the practical side, we evaluate our specification inference technique on a range of benchmarks and demonstrate its ability to synthesize specifications of kernel routines invoked by device drivers.
@InProceedings{POPL16p789,
author = {Aws Albarghouthi and Isil Dillig and Arie Gurfinkel},
title = {Maximal Specification Synthesis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {789--801},
doi = {10.1145/2837614.2837628},
year = {2016},
}


Altenkirch, Thorsten 
POPL'16: "Type Theory in Type Theory ..."
Type Theory in Type Theory using Quotient Inductive Types
Thorsten Altenkirch and Ambrus Kaposi (University of Nottingham, UK)
We present an internal formalisation of a type theory with dependent types in Type Theory using a special case of higher inductive types from Homotopy Type Theory which we call quotient inductive types (QITs). Our formalisation of type theory avoids referring to preterms or a typability relation but defines directly well typed objects by an inductive definition. We use the elimination principle to define the set-theoretic and logical predicate interpretation. The work has been formalized using the Agda system extended with QITs using postulates.
@InProceedings{POPL16p18,
author = {Thorsten Altenkirch and Ambrus Kaposi},
title = {Type Theory in Type Theory using Quotient Inductive Types},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {18--29},
doi = {10.1145/2837614.2837638},
year = {2016},
}


Andrysco, Marc 
POPL'16: "Printing Floating-Point Numbers: ..."
Printing Floating-Point Numbers: A Faster, Always Correct Method
Marc Andrysco, Ranjit Jhala, and Sorin Lerner (University of California at San Diego, USA)
Floating-point numbers are an essential part of modern software, recently gaining particular prominence on the web as the exclusive numeric format of Javascript. To use floating-point numbers, we require a way to convert binary machine representations into human readable decimal outputs. Existing conversion algorithms make tradeoffs between completeness and performance. The classic Dragon4 algorithm by Steele and White and its later refinements achieve completeness (i.e. produce correct and optimal outputs on all inputs) by using arbitrary precision integer (bignum) arithmetic which leads to a high performance cost. On the other hand, the recent Grisu3 algorithm by Loitsch shows how to recover performance by using native integer arithmetic but sacrifices optimality for 0.5% of all inputs. We present Errol, a new complete algorithm that is guaranteed to produce correct and optimal results for all inputs while simultaneously being 2x faster than the incomplete Grisu3 and 4x faster than previous complete methods.
@InProceedings{POPL16p555,
author = {Marc Andrysco and Ranjit Jhala and Sorin Lerner},
title = {Printing Floating-Point Numbers: A Faster, Always Correct Method},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {555--567},
doi = {10.1145/2837614.2837654},
year = {2016},
}


Bao, Wenlei 
POPL'16: "PolyCheck: Dynamic Verification ..."
PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs
Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, and P. Sadayappan (Ohio State University, USA; Pacific Northwest National Laboratory, USA; Inria, France)
High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations to restructure programs to improve data locality and expose parallelism. Guaranteeing the correctness of program transformations is essential, and to date three main approaches have been developed: proof of equivalence of affine programs, matching the execution traces of programs, and checking bit-by-bit equivalence of program outputs. Each technique suffers from limitations in the kind of transformations supported, space complexity, or the sensitivity to the testing dataset. In this paper, we take a novel approach that addresses all three limitations to provide an automatic bug checker to verify any iteration reordering transformations on affine programs, including non-affine transformations, with space consumption proportional to the original program data and robust to arbitrary datasets of a given size. We achieve this by exploiting the structure of affine program control and dataflow to generate at compile-time lightweight checker code to be executed within the transformed program. Experimental results assess the correctness and effectiveness of our method and its increased coverage over previous approaches.
@InProceedings{POPL16p539,
author = {Wenlei Bao and Sriram Krishnamoorthy and Louis-Noël Pouchet and Fabrice Rastello and P. Sadayappan},
title = {PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {539--554},
doi = {10.1145/2837614.2837656},
year = {2016},
}


Barceló, Pablo 
POPL'16: "String Solving with Word Equations ..."
String Solving with Word Equations and Transducers: Towards a Logic for Analysing Mutation XSS
Anthony W. Lin and Pablo Barceló (Yale-NUS College, Singapore; University of Chile, Chile)
We study the fundamental issue of decidability of satisfiability over string logics with concatenations and finite-state transducers as atomic operations. Although restricting to one type of operations yields decidability, little is known about the decidability of their combined theory, which is especially relevant when analysing security vulnerabilities of dynamic web pages in a more realistic browser model. On the one hand, word equations (string logic with concatenations) cannot precisely capture sanitisation functions (e.g. htmlescape) and implicit browser transductions (e.g. innerHTML mutations). On the other hand, transducers suffer from the reverse problem of being able to model sanitisation functions and browser transductions, but not string concatenations. Naively combining word equations and transducers easily leads to an undecidable logic. Our main contribution is to show that the "straight-line fragment" of the logic is decidable (complexity ranges from PSPACE to EXPSPACE). The fragment can express the program logics of straight-line string-manipulating programs with concatenations and transductions as atomic operations, which arise when performing bounded model checking or dynamic symbolic executions. We demonstrate that the logic can naturally express constraints required for analysing mutation XSS in web applications. Finally, the logic remains decidable in the presence of length, letter-counting, regular, indexOf, and disequality constraints.
@InProceedings{POPL16p123,
author = {Anthony W. Lin and Pablo Barceló},
title = {String Solving with Word Equations and Transducers: Towards a Logic for Analysing Mutation XSS},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {123--136},
doi = {10.1145/2837614.2837641},
year = {2016},
}


Bartel, Alexandre 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Batty, Mark 
POPL'16: "Overhauling SC Atomics in ..."
Overhauling SC Atomics in C11 and OpenCL
Mark Batty, Alastair F. Donaldson, and John Wickerson (University of Kent, UK; Imperial College London, UK)
Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A consequence of our simplification is that the SC operations in an execution no longer need to be totally ordered. This relaxation enables, for the first time, efficient and exhaustive simulation of litmus tests that use SC atomics. We extend our improved C11 model to obtain the first rigorous memory model formalisation for OpenCL (which extends C11 with support for heterogeneous many-core programming). In the OpenCL setting, we refine the SC axioms still further to give a sensible semantics to SC operations that employ a ‘memory scope’ to restrict their visibility to specific threads. Our overhaul requires slight strengthenings of both the C11 and the OpenCL memory models, causing some behaviours to become disallowed. We argue that these strengthenings are natural, and that all of the formalised C11 and OpenCL compilation schemes of which we are aware (Power and x86 CPUs for C11, AMD GPUs for OpenCL) remain valid in our revised models. Using the HERD memory model simulator, we show that our overhaul leads to an exponential improvement in simulation time for C11 litmus tests compared with the original model, making *exhaustive* simulation competitive, timewise, with the *non-exhaustive* CDSChecker tool.
@InProceedings{POPL16p634,
author = {Mark Batty and Alastair F. Donaldson and John Wickerson},
title = {Overhauling SC Atomics in C11 and OpenCL},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {634--648},
doi = {10.1145/2837614.2837637},
year = {2016},
}


Bell, Christian J. 
POPL'16: "Chapar: Certified Causally ..."
Chapar: Certified Causally Consistent Distributed Key-Value Stores
Mohsen Lesani, Christian J. Bell, and Adam Chlipala (Massachusetts Institute of Technology, USA)
Today’s Internet services are often expected to stay available and render high responsiveness even in the face of site crashes and network partitions. Theoretical results state that causal consistency is one of the strongest consistency guarantees that is possible under these requirements, and many practical systems provide causally consistent key-value stores. In this paper, we present a framework called Chapar for modular verification of causal consistency for replicated key-value store implementations and their client programs. Specifically, we formulate separate correctness conditions for key-value store implementations and for their clients. The interface between the two is a novel operational semantics for causal consistency. We have verified the causal consistency of two key-value store implementations from the literature using a novel proof technique. We have also implemented a simple automatic model checker for the correctness of client programs. The two independently verified results for the implementations and clients can be composed to conclude the correctness of any of the programs when executed with any of the implementations. We have developed and checked our framework in Coq, extracted it to OCaml, and built executable stores.
@InProceedings{POPL16p357,
author = {Mohsen Lesani and Christian J. Bell and Adam Chlipala},
title = {Chapar: Certified Causally Consistent Distributed Key-Value Stores},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {357--370},
doi = {10.1145/2837614.2837622},
year = {2016},
}


Bhargavan, Karthikeyan 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Bhaskaracharya, Somashekaracharya G. 
POPL'16: "SMO: An Integrated Approach ..."
SMO: An Integrated Approach to Intra-array and Inter-array Storage Optimization
Somashekaracharya G. Bhaskaracharya, Uday Bondhugula, and Albert Cohen (Indian Institute of Science, India; National Instruments, India; Inria, France; ENS, France)
The polyhedral model provides an expressive intermediate representation that is convenient for the analysis and subsequent transformation of affine loop nests. Several heuristics exist for achieving complex program transformations in this model. However, there is also considerable scope to utilize this model to tackle the problem of automatic memory footprint optimization. In this paper, we present a new automatic storage optimization technique which can be used to achieve both intra-array as well as inter-array storage reuse with a pre-determined schedule for the computation. Our approach works by finding statement-wise storage partitioning hyperplanes that partition a unified global array space so that values with overlapping live ranges are not mapped to the same partition. Our heuristic is driven by a fourfold objective function which not only minimizes the dimensionality and storage requirements of arrays required for each high-level statement, but also maximizes inter-statement storage reuse. The storage mappings obtained using our heuristic can be asymptotically better than those obtained by any existing technique. We implement our technique and demonstrate its practical impact by evaluating its effectiveness on several benchmarks chosen from the domains of image processing, stencil computations, and high-performance computing.
@InProceedings{POPL16p526,
author = {Somashekaracharya G. Bhaskaracharya and Uday Bondhugula and Albert Cohen},
title = {SMO: An Integrated Approach to Intra-array and Inter-array Storage Optimization},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {526--538},
doi = {10.1145/2837614.2837636},
year = {2016},
}


Bielik, Pavol 
POPL'16: "Learning Programs from Noisy ..."
Learning Programs from Noisy Data
Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause (ETH Zurich, Switzerland)
We present a new approach for learning programs from noisy datasets. Our approach is based on two new concepts: a regularized program generator which produces a candidate program based on a small sample of the entire dataset while avoiding overfitting, and a dataset sampler which carefully samples the dataset by leveraging the candidate program's score on that dataset. The two components are connected in a continuous feedback-directed loop. We show how to apply this approach to two settings: one where the dataset has a bound on the noise, and another without a noise bound. The second setting leads to a new way of performing approximate empirical risk minimization on hypotheses classes formed by a discrete search space. We then present two new kinds of program synthesizers which target the two noise settings. First, we introduce a novel regularized bitstream synthesizer that successfully generates programs even in the presence of incorrect examples. We show that the synthesizer can detect errors in the examples while combating overfitting, a major problem in existing synthesis techniques. We also show how the approach can be used in a setting where the dataset grows dynamically via new examples (e.g., provided by a human). Second, we present a novel technique for constructing statistical code completion systems. These are systems trained on massive datasets of open source programs, also known as "Big Code". The key idea is to introduce a domain specific language (DSL) over trees and to learn functions in that DSL directly from the dataset. These learned functions then condition the predictions made by the system. This is a flexible and powerful technique which generalizes several existing works as we no longer need to decide a priori on what the prediction should be conditioned (another benefit is that the learned functions are a natural mechanism for explaining the prediction).
As a result, our code completion system surpasses the prediction capabilities of existing, hard-wired systems.
@InProceedings{POPL16p761,
author = {Veselin Raychev and Pavol Bielik and Martin Vechev and Andreas Krause},
title = {Learning Programs from Noisy Data},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {761--774},
doi = {10.1145/2837614.2837671},
year = {2016},
}


Bjørner, Nikolaj 
POPL'16: "Scaling Network Verification ..."
Scaling Network Verification using Symmetry and Surgery
Gordon D. Plotkin, Nikolaj Bjørner, Nuno P. Lopes, Andrey Rybalchenko, and George Varghese (University of Edinburgh, UK; Microsoft Research, USA; Microsoft Research, UK)
On the surface, large data centers with about 100,000 stations and nearly a million routing rules are complex and hard to verify. However, these networks are highly regular by design; for example they employ fat tree topologies with backup routers interconnected by redundant patterns. To exploit these regularities, we introduce network transformations: given a reachability formula and a network, we transform the network into a simpler to verify network and a corresponding transformed formula, such that the original formula is valid in the network if and only if the transformed formula is valid in the transformed network. Our network transformations exploit network surgery (in which irrelevant or redundant sets of nodes, headers, ports, or rules are "sliced" away) and network symmetry (say between backup routers). The validity of these transformations is established using a formal theory of networks. In particular, using Van Benthem-Hennessy-Milner style bisimulation, we show that one can generally associate bisimulations to transformations connecting networks and formulas with their transforms. Our work is a development in an area of current wide interest: applying programming language techniques (in our case bisimulation and modal logic) to problems in switching networks. We provide experimental evidence that our network transformations can speed up by 65x the task of verifying the communication between all pairs of Virtual Machines in a large datacenter network with about 100,000 VMs. An all-pair reachability calculation, which formerly took 5.5 days, can be done in 2 hours, and can be easily parallelized to complete in
@InProceedings{POPL16p69,
author = {Gordon D. Plotkin and Nikolaj Bjørner and Nuno P. Lopes and Andrey Rybalchenko and George Varghese},
title = {Scaling Network Verification using Symmetry and Surgery},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {69--83},
doi = {10.1145/2837614.2837657},
year = {2016},
}


Bondhugula, Uday 
POPL'16: "SMO: An Integrated Approach ..."
SMO: An Integrated Approach to Intra-array and Inter-array Storage Optimization
Somashekaracharya G. Bhaskaracharya, Uday Bondhugula, and Albert Cohen (Indian Institute of Science, India; National Instruments, India; Inria, France; ENS, France)
The polyhedral model provides an expressive intermediate representation that is convenient for the analysis and subsequent transformation of affine loop nests. Several heuristics exist for achieving complex program transformations in this model. However, there is also considerable scope to utilize this model to tackle the problem of automatic memory footprint optimization. In this paper, we present a new automatic storage optimization technique which can be used to achieve both intra-array as well as inter-array storage reuse with a pre-determined schedule for the computation. Our approach works by finding statement-wise storage partitioning hyperplanes that partition a unified global array space so that values with overlapping live ranges are not mapped to the same partition. Our heuristic is driven by a fourfold objective function which not only minimizes the dimensionality and storage requirements of arrays required for each high-level statement, but also maximizes inter-statement storage reuse. The storage mappings obtained using our heuristic can be asymptotically better than those obtained by any existing technique. We implement our technique and demonstrate its practical impact by evaluating its effectiveness on several benchmarks chosen from the domains of image processing, stencil computations, and high-performance computing.
@InProceedings{POPL16p526,
author = {Somashekaracharya G. Bhaskaracharya and Uday Bondhugula and Albert Cohen},
title = {SMO: An Integrated Approach to Intra-array and Inter-array Storage Optimization},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {526--538},
doi = {10.1145/2837614.2837636},
year = {2016},
}


Borgström, Johannes 
POPL'16: "Fabular: Regression Formulas ..."
Fabular: Regression Formulas as Probabilistic Programming
Johannes Borgström, Andrew D. Gordon, Long Ouyang, Claudio Russo, Adam Ścibior, and Marcin Szymczak (Uppsala University, Sweden; Microsoft Research, UK; University of Edinburgh, UK; Stanford University, USA; University of Cambridge, UK; MPI Tübingen, Germany)
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
@InProceedings{POPL16p271,
author = {Johannes Borgström and Andrew D. Gordon and Long Ouyang and Claudio Russo and Adam Ścibior and Marcin Szymczak},
title = {Fabular: Regression Formulas as Probabilistic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {271--283},
doi = {10.1145/2837614.2837653},
year = {2016},
}


Bornholt, James 
POPL'16: "Optimizing Synthesis with ..."
Optimizing Synthesis with Metasketches
James Bornholt, Emina Torlak, Dan Grossman, and Luis Ceze (University of Washington, USA)
Many advanced programming tools, for both end-users and expert developers, rely on program synthesis to automatically generate implementations from high-level specifications. These tools often need to employ tricky, custom-built synthesis algorithms because they require synthesized programs to be not only correct, but also optimal with respect to a desired cost metric, such as program size. Finding these optimal solutions efficiently requires domain-specific search strategies, but existing synthesizers hard-code the strategy, making them difficult to reuse. This paper presents metasketches, a general framework for specifying and solving optimal synthesis problems. Metasketches make the search strategy a part of the problem definition by specifying a fragmentation of the search space into an ordered set of classic sketches. We provide two cooperating search algorithms to effectively solve metasketches. A global optimizing search coordinates the activities of local searches, informing them of the costs of potentially-optimal solutions as they explore different regions of the candidate space in parallel. The local searches execute an incremental form of counterexample-guided inductive synthesis to incorporate information sent from the global search. We present Synapse, an implementation of these algorithms, and show that it effectively solves optimal synthesis problems with a variety of different cost functions. In addition, metasketches can be used to accelerate classic (non-optimal) synthesis by explicitly controlling the search strategy, and we show that Synapse solves classic synthesis problems that state-of-the-art tools cannot.
@InProceedings{POPL16p775,
author = {James Bornholt and Emina Torlak and Dan Grossman and Luis Ceze},
title = {Optimizing Synthesis with Metasketches},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {775--788},
doi = {10.1145/2837614.2837666},
year = {2016},
}


Brotherston, James 
POPL'16: "Model Checking for Symbolic-Heap ..."
Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates
James Brotherston, Nikos Gorogiannis, Max Kanovich, and Reuben Rowe (University College London, UK; Middlesex University, UK; National Research University Higher School of Economics, Russia)
We investigate the *model checking* problem for symbolic-heap separation logic with user-defined inductive predicates, i.e., the problem of checking that a given stack-heap memory state satisfies a given formula in this language, as arises e.g. in software testing or runtime verification. First, we show that the problem is *decidable*; specifically, we present a bottom-up fixed point algorithm that decides the problem and runs in exponential time in the size of the problem instance. Second, we show that, while model checking for the full language is EXPTIME-complete, the problem becomes NP-complete or PTIME-solvable when we impose natural syntactic restrictions on the schemata defining the inductive predicates. We additionally present NP and PTIME algorithms for these restricted fragments. Finally, we report on the experimental performance of our procedures on a variety of specifications extracted from programs, exercising multiple combinations of syntactic restrictions.
@InProceedings{POPL16p84,
author = {James Brotherston and Nikos Gorogiannis and Max Kanovich and Reuben Rowe},
title = {Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {84--96},
doi = {10.1145/2837614.2837621},
year = {2016},
}


Brown, Matt 
POPL'16: "Breaking through the Normalization ..."
Breaking through the Normalization Barrier: A Self-Interpreter for F-omega
Matt Brown and Jens Palsberg (University of California at Los Angeles, USA)
According to conventional wisdom, a self-interpreter for a strongly normalizing lambda-calculus is impossible. We call this the normalization barrier. The normalization barrier stems from a theorem in computability theory that says that a total universal function for the total computable functions is impossible. In this paper we break through the normalization barrier and define a self-interpreter for System F_omega, a strongly normalizing lambda-calculus. After a careful analysis of the classical theorem, we show that static type checking in F_omega can exclude the proof's diagonalization gadget, leaving open the possibility for a self-interpreter. Along with the self-interpreter, we program four other operations in F_omega, including a continuation-passing style transformation. Our operations rely on a new approach to program representation that may be useful in theorem provers and compilers.
@InProceedings{POPL16p5,
author = {Matt Brown and Jens Palsberg},
title = {Breaking through the Normalization Barrier: A Self-Interpreter for F-omega},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {5--17},
doi = {10.1145/2837614.2837623},
year = {2016},
}


Cai, Yufei 
POPL'16: "System F-omega with Equirecursive ..."
System F-omega with Equirecursive Types for Datatype-Generic Programming
Yufei Cai, Paolo G. Giarrusso, and Klaus Ostermann (University of Tübingen, Germany)
Traversing an algebraic datatype by hand requires boilerplate code which duplicates the structure of the datatype. Datatype-generic programming (DGP) aims to eliminate such boilerplate code by decomposing algebraic datatypes into type constructor applications from which generic traversals can be synthesized. However, different traversals require different decompositions, which yield isomorphic but unequal types. This hinders the interoperability of different DGP techniques. In this paper, we propose Fωμ, an extension of the higher-order polymorphic lambda calculus Fω with records, variants, and equirecursive types. We prove the soundness of the type system, and show that type checking for first-order recursive types is decidable with a practical type checking algorithm. In our soundness proof we define type equality by interpreting types as infinitary λ-terms (in particular, Berarducci-trees). To decide type equality we β-normalize types, and then use an extension of equivalence checking for usual equirecursive types. Thanks to equirecursive types, new decompositions for a datatype can be added modularly and still interoperate with each other, allowing multiple DGP techniques to work together. We sketch how generic traversals can be synthesized, and apply these components to some examples. Since the set of datatype decompositions becomes extensible, System Fωμ enables using DGP techniques incrementally, instead of planning for them upfront or doing invasive refactoring.
@InProceedings{POPL16p30,
author = {Yufei Cai and Paolo G. Giarrusso and Klaus Ostermann},
title = {System F-omega with Equirecursive Types for Datatype-Generic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {30--43},
doi = {10.1145/2837614.2837660},
year = {2016},
}


Cardelli, Luca 
POPL'16: "Symbolic Computation of Differential ..."
Symbolic Computation of Differential Equivalences
Luca Cardelli, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin (Microsoft Research, UK; University of Oxford, UK; IMT Lucca, Italy)
Ordinary differential equations (ODEs) are widespread in many natural sciences including chemistry, ecology, and systems biology, and in disciplines such as control theory and electrical engineering. Building on the celebrated molecules-as-processes paradigm, they have become increasingly popular in computer science, with high-level languages and formal methods such as Petri nets, process algebra, and rule-based systems that are interpreted as ODEs. We consider the problem of comparing and minimizing ODEs automatically. Influenced by traditional approaches in the theory of programming, we propose differential equivalence relations. We study them for a basic intermediate language, for which we have decidability results, that can be targeted by a class of high-level specifications. An ODE implicitly represents an uncountable state space, hence reasoning techniques cannot be borrowed from established domains such as probabilistic programs with finite-state Markov chain semantics. We provide novel symbolic procedures to check an equivalence and compute the largest one via partition refinement algorithms that use satisfiability modulo theories. We illustrate the generality of our framework by showing that differential equivalences include (i) well-known notions for the minimization of continuous-time Markov chains (lumpability), (ii) bisimulations for chemical reaction networks recently proposed by Cardelli et al., and (iii) behavioral relations for process algebra with ODE semantics. With a prototype implementation we are able to detect equivalences in biochemical models from the literature that cannot be reduced using competing automatic techniques.
@InProceedings{POPL16p137,
author = {Luca Cardelli and Mirco Tribastone and Max Tschaikowski and Andrea Vandin},
title = {Symbolic Computation of Differential Equivalences},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {137--150},
doi = {10.1145/2837614.2837649},
year = {2016},
}


Ceze, Luis 
POPL'16: "Optimizing Synthesis with ..."
Optimizing Synthesis with Metasketches
James Bornholt, Emina Torlak, Dan Grossman, and Luis Ceze (University of Washington, USA)
Many advanced programming tools, for both end-users and expert developers, rely on program synthesis to automatically generate implementations from high-level specifications. These tools often need to employ tricky, custom-built synthesis algorithms because they require synthesized programs to be not only correct, but also optimal with respect to a desired cost metric, such as program size. Finding these optimal solutions efficiently requires domain-specific search strategies, but existing synthesizers hard-code the strategy, making them difficult to reuse. This paper presents metasketches, a general framework for specifying and solving optimal synthesis problems. Metasketches make the search strategy a part of the problem definition by specifying a fragmentation of the search space into an ordered set of classic sketches. We provide two cooperating search algorithms to effectively solve metasketches. A global optimizing search coordinates the activities of local searches, informing them of the costs of potentially-optimal solutions as they explore different regions of the candidate space in parallel. The local searches execute an incremental form of counterexample-guided inductive synthesis to incorporate information sent from the global search. We present Synapse, an implementation of these algorithms, and show that it effectively solves optimal synthesis problems with a variety of different cost functions. In addition, metasketches can be used to accelerate classic (non-optimal) synthesis by explicitly controlling the search strategy, and we show that Synapse solves classic synthesis problems that state-of-the-art tools cannot.
@InProceedings{POPL16p775,
author = {James Bornholt and Emina Torlak and Dan Grossman and Luis Ceze},
title = {Optimizing Synthesis with Metasketches},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {775--788},
doi = {10.1145/2837614.2837666},
year = {2016},
}


Chatterjee, Krishnendu 
POPL'16: "Algorithmic Analysis of Qualitative ..."
Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs
Krishnendu Chatterjee, Hongfei Fu, Petr Novotný, and Rouzbeh Hasheminezhad (IST Austria, Austria; Institute of Software at Chinese Academy of Sciences, China; Sharif University of Technology, Iran)
In this paper, we consider termination of probabilistic programs with real-valued variables. The questions concerned are: 1. qualitative ones that ask (i) whether the program terminates with probability 1 (almost-sure termination) and (ii) whether the expected termination time is finite (finite termination); 2. quantitative ones that ask (i) to approximate the expected termination time (expectation problem) and (ii) to compute a bound B such that the probability to terminate after B steps decreases exponentially (concentration problem). To solve these questions, we utilize the notion of ranking supermartingales which is a powerful approach for proving termination of probabilistic programs. In detail, we focus on algorithmic synthesis of linear ranking-supermartingales over affine probabilistic programs (APP's) with both angelic and demonic non-determinism. An important subclass of APP's is LRAPP which is defined as the class of all APP's over which a linear ranking-supermartingale exists. Our main contributions are as follows. Firstly, we show that the membership problem of LRAPP (i) can be decided in polynomial time for APP's with at most demonic non-determinism, and (ii) is NP-hard and in PSPACE for APP's with angelic non-determinism; moreover, the NP-hardness result holds already for APP's without probability and demonic non-determinism. Secondly, we show that the concentration problem over LRAPP can be solved in the same complexity as for the membership problem of LRAPP. Finally, we show that the expectation problem over LRAPP can be solved in 2EXPTIME and is PSPACE-hard even for APP's without probability and non-determinism (i.e., deterministic programs). Our experimental results demonstrate the effectiveness of our approach to answer the qualitative and quantitative questions over APP's with at most demonic non-determinism.
@InProceedings{POPL16p327,
author = {Krishnendu Chatterjee and Hongfei Fu and Petr Novotný and Rouzbeh Hasheminezhad},
title = {Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {327--342},
doi = {10.1145/2837614.2837639},
year = {2016},
}
POPL'16: "Algorithms for Algebraic Path ..."
Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components
Krishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen, and Andreas Pavlogiannis (IST Austria, Austria)
We study algorithmic questions for concurrent systems where the transitions are labeled from a complete, closed semiring, and path properties are algebraic with semiring operations. The algebraic path properties can model dataflow analysis problems, the shortest path problem, and many other natural problems that arise in program analysis. We consider that each component of the concurrent system is a graph with constant treewidth, a property satisfied by the control-flow graphs of most programs. We allow for multiple possible queries, which arise naturally in demand driven dataflow analysis. The study of multiple queries allows us to consider the tradeoff between the resource usage of the one-time preprocessing and for each individual query. The traditional approach constructs the product graph of all components and applies the best-known graph algorithm on the product. In this approach, even the answer to a single query requires the transitive closure (i.e., the results of all possible queries), which provides no room for tradeoff between preprocessing and query time. Our main contributions are algorithms that significantly improve the worst-case running time of the traditional approach, and provide various tradeoffs depending on the number of queries. For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time. Furthermore, we establish conditional optimality results showing that the worst-case running time of our algorithms cannot be improved without achieving major breakthroughs in graph algorithms (i.e., improving the worst-case bound for the shortest path problem in general graphs). Preliminary experimental results show that our algorithms perform favorably on several benchmarks.
@InProceedings{POPL16p733,
author = {Krishnendu Chatterjee and Amir Kafshdar Goharshady and Rasmus Ibsen-Jensen and Andreas Pavlogiannis},
title = {Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {733--747},
doi = {10.1145/2837614.2837624},
year = {2016},
}


Chen, Sheng 
POPL'16: "Principal Type Inference for ..."
Principal Type Inference for GADTs
Sheng Chen and Martin Erwig (University of Louisiana at Lafayette, USA; Oregon State University, USA)
We present a new method for GADT type inference that improves the precision of previous approaches. In particular, our approach accepts more type-correct programs than previous approaches when they do not employ type annotations. A side benefit of our approach is that it can detect a wide range of runtime errors that are missed by previous approaches. Our method is based on the idea to represent type refinements in pattern-matching branches by choice types, which facilitate a separation of the typing and reconciliation phases and thus support case expressions. This idea is formalized in a type system, which is both sound and a conservative extension of the classical Hindley-Milner system. We present the results of an empirical evaluation that compares our algorithm with previous approaches.
@InProceedings{POPL16p416,
author = {Sheng Chen and Martin Erwig},
title = {Principal Type Inference for GADTs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {416--428},
doi = {10.1145/2837614.2837665},
year = {2016},
}


Cheung, Shing-Chi
POPL'16: "Casper: An Efficient Approach ..."
Casper: An Efficient Approach to Call Trace Collection
Rongxin Wu, Xiao Xiao, Shing-Chi Cheung, Hongyu Zhang, and Charles Zhang (Hong Kong University of Science and Technology, China; Microsoft Research, China)
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overheads by instrumenting only a small number of call sites while keeping the capability of recovering the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria of a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach causes significantly lower runtime (and space) overhead than two state-of-the-art approaches.
@InProceedings{POPL16p678,
author = {Rongxin Wu and Xiao Xiao and Shing-Chi Cheung and Hongyu Zhang and Charles Zhang},
title = {Casper: An Efficient Approach to Call Trace Collection},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {678--690},
doi = {10.1145/2837614.2837619},
year = {2016},
}


Chlipala, Adam 
POPL'16: "Chapar: Certified Causally ..."
Chapar: Certified Causally Consistent Distributed Key-Value Stores
Mohsen Lesani, Christian J. Bell, and Adam Chlipala (Massachusetts Institute of Technology, USA)
Today’s Internet services are often expected to stay available and render high responsiveness even in the face of site crashes and network partitions. Theoretical results state that causal consistency is one of the strongest consistency guarantees that is possible under these requirements, and many practical systems provide causally consistent key-value stores. In this paper, we present a framework called Chapar for modular verification of causal consistency for replicated key-value store implementations and their client programs. Specifically, we formulate separate correctness conditions for key-value store implementations and for their clients. The interface between the two is a novel operational semantics for causal consistency. We have verified the causal consistency of two key-value store implementations from the literature using a novel proof technique. We have also implemented a simple automatic model checker for the correctness of client programs. The two independently verified results for the implementations and clients can be composed to conclude the correctness of any of the programs when executed with any of the implementations. We have developed and checked our framework in Coq, extracted it to OCaml, and built executable stores.
@InProceedings{POPL16p357,
author = {Mohsen Lesani and Christian J. Bell and Adam Chlipala},
title = {Chapar: Certified Causally Consistent Distributed Key-Value Stores},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {357--370},
doi = {10.1145/2837614.2837622},
year = {2016},
}


Cimini, Matteo 
POPL'16: "The Gradualizer: A Methodology ..."
The Gradualizer: A Methodology and Algorithm for Generating Gradual Type Systems
Matteo Cimini and Jeremy G. Siek (Indiana University, USA)
Many languages are beginning to integrate dynamic and static typing. Siek and Taha offered gradual typing as an approach to this integration that provides a coherent and full-span migration between the two disciplines. However, the literature lacks a general methodology for designing gradually typed languages. Our first contribution is to provide a methodology for deriving the gradual type system and the compilation to the cast calculus. Based on this methodology, we present the Gradualizer, an algorithm that generates a gradual type system from a well-formed type system and also generates a compiler to the cast calculus. Our algorithm handles a large class of type systems and generates systems that are correct with respect to the formal criteria of gradual typing. We also report on an implementation of the Gradualizer that takes a type system expressed in lambda-prolog and outputs its gradually typed version and a compiler to the cast calculus in lambda-prolog.
@InProceedings{POPL16p443,
author = {Matteo Cimini and Jeremy G. Siek},
title = {The Gradualizer: A Methodology and Algorithm for Generating Gradual Type Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {443--455},
doi = {10.1145/2837614.2837632},
year = {2016},
}


Cîrstea, Corina 
POPL'16: "Lattice-Theoretic Progress ..."
Lattice-Theoretic Progress Measures and Coalgebraic Model Checking
Ichiro Hasuo, Shunsuke Shimizu, and Corina Cîrstea (University of Tokyo, Japan; University of Southampton, UK)
In the context of formal verification in general and model checking in particular, parity games serve as a mighty vehicle: many problems are encoded as parity games, which are then solved by the seminal algorithm by Jurdzinski. In this paper we identify the essence of this workflow to be the notion of progress measure, and formalize it in general, possibly infinitary, lattice-theoretic terms. Our view on progress measures is that they are to nested/alternating fixed points what invariants are to safety/greatest fixed points, and what ranking functions are to liveness/least fixed points. That is, progress measures are a combination of the latter two notions (invariant and ranking function) that have been extensively studied in the context of (program) verification. We then apply our theory of progress measures to a general model-checking framework, where systems are categorically presented as coalgebras. The framework's theoretical robustness is witnessed by a smooth transfer from the branching-time setting to the linear-time one. Although the framework can be used to derive some decision procedures for finite settings, we also expect the proposed framework to form a basis for sound proof methods for some undecidable/infinitary problems.
@InProceedings{POPL16p718,
author = {Ichiro Hasuo and Shunsuke Shimizu and Corina Cîrstea},
title = {Lattice-Theoretic Progress Measures and Coalgebraic Model Checking},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {718--732},
doi = {10.1145/2837614.2837673},
year = {2016},
}


Clark, Alison M. 
POPL'16: "Abstracting Gradual Typing ..."
Abstracting Gradual Typing
Ronald Garcia, Alison M. Clark, and Éric Tanter (University of British Columbia, Canada; University of Chile, Chile)
Language researchers and designers have extended a wide variety of type systems to support gradual typing, which enables languages to seamlessly combine dynamic and static checking. These efforts consistently demonstrate that designing a satisfactory gradual counterpart to a static type system is challenging, and this challenge only increases with the sophistication of the type system. Gradual type system designers need more formal tools to help them conceptualize, structure, and evaluate their designs. In this paper, we propose a new formal foundation for gradual typing, drawing on principles from abstract interpretation to give gradual types a semantics in terms of pre-existing static types. Abstracting Gradual Typing (AGT for short) yields a formal account of consistency, one of the cornerstones of the gradual typing approach, that subsumes existing notions of consistency, which were developed through intuition and ad hoc reasoning. Given a syntax-directed static typing judgment, the AGT approach induces a corresponding gradual typing judgment. Then the type safety proof for the underlying static discipline induces a dynamic semantics for gradual programs defined over source-language typing derivations. The AGT approach does not resort to an externally justified cast calculus: instead, runtime checks naturally arise by deducing evidence for consistent judgments during proof reduction. To illustrate the approach, we develop a novel gradually-typed counterpart for a language with record subtyping. Gradual languages designed with the AGT approach satisfy by construction the refined criteria for gradual typing set forth by Siek and colleagues.
@InProceedings{POPL16p429,
author = {Ronald Garcia and Alison M. Clark and Éric Tanter},
title = {Abstracting Gradual Typing},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {429--442},
doi = {10.1145/2837614.2837670},
year = {2016},
}


Cohen, Albert 
POPL'16: "SMO: An Integrated Approach ..."
SMO: An Integrated Approach to Intra-array and Inter-array Storage Optimization
Somashekaracharya G. Bhaskaracharya, Uday Bondhugula, and Albert Cohen (Indian Institute of Science, India; National Instruments, India; Inria, France; ENS, France)
The polyhedral model provides an expressive intermediate representation that is convenient for the analysis and subsequent transformation of affine loop nests. Several heuristics exist for achieving complex program transformations in this model. However, there is also considerable scope to utilize this model to tackle the problem of automatic memory footprint optimization. In this paper, we present a new automatic storage optimization technique which can be used to achieve both intra-array as well as inter-array storage reuse with a predetermined schedule for the computation. Our approach works by finding statement-wise storage partitioning hyperplanes that partition a unified global array space so that values with overlapping live ranges are not mapped to the same partition. Our heuristic is driven by a fourfold objective function which not only minimizes the dimensionality and storage requirements of arrays required for each high-level statement, but also maximizes inter-statement storage reuse. The storage mappings obtained using our heuristic can be asymptotically better than those obtained by any existing technique. We implement our technique and demonstrate its practical impact by evaluating its effectiveness on several benchmarks chosen from the domains of image processing, stencil computations, and high-performance computing.
@InProceedings{POPL16p526,
author = {Somashekaracharya G. Bhaskaracharya and Uday Bondhugula and Albert Cohen},
title = {SMO: An Integrated Approach to Intra-array and Inter-array Storage Optimization},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {526--538},
doi = {10.1145/2837614.2837636},
year = {2016},
}


Curien, Pierre-Louis
POPL'16: "A Theory of Effects and Resources: ..."
A Theory of Effects and Resources: Adjunction Models and Polarised Calculi
Pierre-Louis Curien, Marcelo Fiore, and Guillaume Munch-Maccagnoni (University of Paris Diderot, France; Inria, France; University of Cambridge, UK)
We consider the Curry-Howard-Lambek correspondence for effectful computation and resource management, specifically proposing polarised calculi together with presheaf-enriched adjunction models as the starting point for a comprehensive semantic theory relating logical systems, typed calculi, and categorical models in this context. Our thesis is that the combination of effects and resources should be considered orthogonally. Model theoretically, this leads to an understanding of our categorical models from two complementary perspectives: (i) as a linearisation of CBPV (Call-by-Push-Value) adjunction models, and (ii) as an extension of linear/nonlinear adjunction models with an adjoint resolution of computational effects. When the linear structure is cartesian and the resource structure is trivial we recover Levy’s notion of CBPV adjunction model, while when the effect structure is trivial we have Benton’s linear/nonlinear adjunction models. Further instances of our model theory include the dialogue categories with a resource modality of Melliès and Tabareau, and the [E]EC ([Enriched] Effect Calculus) models of Egger, Møgelberg and Simpson. Our development substantiates the approach by providing a lifting theorem of linear models into cartesian ones. To each of our categorical models we systematically associate a typed term calculus, each of which corresponds to a variant of the sequent calculi LJ (Intuitionistic Logic) or ILL (Intuitionistic Linear Logic). The adjoint resolution of effects corresponds to polarisation whereby, syntactically, types locally determine a strict or lazy evaluation order and, semantically, the associativity of cuts is relaxed. In particular, our results show that polarisation provides a computational interpretation of CBPV in direct style. Further, we characterise depolarised models: those where the cut is associative, and where the evaluation order is unimportant. We explain possible advantages of this style of calculi for the operational semantics of effects.
@InProceedings{POPL16p44,
author = {Pierre-Louis Curien and Marcelo Fiore and Guillaume Munch-Maccagnoni},
title = {A Theory of Effects and Resources: Adjunction Models and Polarised Calculi},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {44--56},
doi = {10.1145/2837614.2837652},
year = {2016},
}


Deacon, Will 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


Delignat-Lavaud, Antoine 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Dering, Matthew 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Devriese, Dominique 
POPL'16: "Fully-Abstract Compilation ..."
Fully-Abstract Compilation by Approximate Back-Translation
Dominique Devriese, Marco Patrignani, and Frank Piessens (KU Leuven, Belgium)
A compiler is fully-abstract if the compilation from source language programs to target language programs reflects and preserves behavioural equivalence. Such compilers have important security benefits, as they limit the power of an attacker interacting with the program in the target language to that of an attacker interacting with the program in the source language. Proving compiler full-abstraction is, however, rather complicated. A common proof technique is based on the back-translation of target-level program contexts to behaviourally-equivalent source-level contexts. However, constructing such a back-translation is problematic when the source language is not strong enough to embed an encoding of the target language. For instance, when compiling from the simply-typed λ-calculus (λτ) to the untyped λ-calculus (λu), the lack of recursive types in λτ prevents such a back-translation. We propose a general and elegant solution for this problem. The key insight is that it suffices to construct an approximate back-translation. The approximation is only accurate up to a certain number of steps and conservative beyond that, in the sense that the context generated by the back-translation may diverge when the original would not, but not vice versa. Based on this insight, we describe a general technique for proving compiler full-abstraction and demonstrate it on a compiler from λτ to λu. The proof uses asymmetric cross-language logical relations and makes innovative use of step-indexing to express the relation between a context and its approximate back-translation. We believe this proof technique can scale to challenging settings and enable simpler, more scalable proofs of compiler full-abstraction.
@InProceedings{POPL16p164,
author = {Dominique Devriese and Marco Patrignani and Frank Piessens},
title = {Fully-Abstract Compilation by Approximate Back-Translation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {164--177},
doi = {10.1145/2837614.2837618},
year = {2016},
}


Dillig, Isil 
POPL'16: "Maximal Specification Synthesis ..."
Maximal Specification Synthesis
Aws Albarghouthi, Isil Dillig, and Arie Gurfinkel (University of Wisconsin-Madison, USA; University of Texas at Austin, USA; Carnegie Mellon University, USA)
Many problems in program analysis, verification, and synthesis require inferring specifications of unknown procedures. Motivated by a broad range of applications, we formulate the problem of maximal specification inference: Given a postcondition Phi and a program P calling a set of unknown procedures F_1,…,F_n, what are the most permissive specifications of procedures F_i that ensure correctness of P? In other words, we are looking for the smallest number of assumptions we need to make about the behaviours of F_i in order to prove that P satisfies its postcondition. To solve this problem, we present a novel approach that utilizes a counterexample-guided inductive synthesis loop and reduces the maximal specification inference problem to multi-abduction. We formulate the novel notion of multi-abduction as a generalization of classical logical abduction and present an algorithm for solving multi-abduction problems. On the practical side, we evaluate our specification inference technique on a range of benchmarks and demonstrate its ability to synthesize specifications of kernel routines invoked by device drivers.
@InProceedings{POPL16p789,
author = {Aws Albarghouthi and Isil Dillig and Arie Gurfinkel},
title = {Maximal Specification Synthesis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {789--801},
doi = {10.1145/2837614.2837628},
year = {2016},
}


Donaldson, Alastair F. 
POPL'16: "Overhauling SC Atomics in ..."
Overhauling SC Atomics in C11 and OpenCL
Mark Batty, Alastair F. Donaldson, and John Wickerson (University of Kent, UK; Imperial College London, UK)
Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A consequence of our simplification is that the SC operations in an execution no longer need to be totally ordered. This relaxation enables, for the first time, efficient and exhaustive simulation of litmus tests that use SC atomics. We extend our improved C11 model to obtain the first rigorous memory model formalisation for OpenCL (which extends C11 with support for heterogeneous many-core programming). In the OpenCL setting, we refine the SC axioms still further to give a sensible semantics to SC operations that employ a ‘memory scope’ to restrict their visibility to specific threads. Our overhaul requires slight strengthenings of both the C11 and the OpenCL memory models, causing some behaviours to become disallowed. We argue that these strengthenings are natural, and that all of the formalised C11 and OpenCL compilation schemes of which we are aware (Power and x86 CPUs for C11, AMD GPUs for OpenCL) remain valid in our revised models. Using the HERD memory model simulator, we show that our overhaul leads to an exponential improvement in simulation time for C11 litmus tests compared with the original model, making *exhaustive* simulation competitive, time-wise, with the *non-exhaustive* CDSChecker tool.
@InProceedings{POPL16p634,
author = {Mark Batty and Alastair F. Donaldson and John Wickerson},
title = {Overhauling SC Atomics in C11 and OpenCL},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {634--648},
doi = {10.1145/2837614.2837637},
year = {2016},
}


Drăgoi, Cezara 
POPL'16: "PSync: A Partially Synchronous ..."
PSync: A Partially Synchronous Language for Fault-Tolerant Distributed Algorithms
Cezara Drăgoi, Thomas A. Henzinger, and Damien Zufferey (Inria, France; ENS, France; CNRS, France; IST Austria, Austria; Massachusetts Institute of Technology, USA)
Fault-tolerant distributed algorithms play an important role in many critical/high-availability applications. These algorithms are notoriously difficult to implement correctly, due to asynchronous communication and the occurrence of faults, such as the network dropping messages or computers crashing. We introduce PSync, a domain specific language based on the Heard-Of model, which views asynchronous faulty systems as synchronous ones with an adversarial environment that simulates asynchrony and faults by dropping messages. We define a runtime system for PSync that efficiently executes on asynchronous networks. We formalise the relation between the runtime system and PSync in terms of observational refinement. The high-level lockstep abstraction introduced by PSync simplifies the design and implementation of fault-tolerant distributed algorithms and enables automated formal verification. We have implemented an embedding of PSync in the Scala programming language with a runtime system for partially synchronous networks. We show the applicability of PSync by implementing several important fault-tolerant distributed algorithms and we compare the implementation of consensus algorithms in PSync against implementations in other languages in terms of code size, runtime efficiency, and verification.
@InProceedings{POPL16p400,
author = {Cezara Drăgoi and Thomas A. Henzinger and Damien Zufferey},
title = {PSync: A Partially Synchronous Language for Fault-Tolerant Distributed Algorithms},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {400--415},
doi = {10.1145/2837614.2837650},
year = {2016},
}


Dreyer, Derek 
POPL'16: "Lightweight Verification of ..."
Lightweight Verification of Separate Compilation
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis (Seoul National University, South Korea; MPI-SWS, Germany)
Major compiler verification efforts, such as the CompCert project, have traditionally simplified the verification problem by restricting attention to the correctness of whole-program compilation, leaving open the question of how to verify the correctness of separate compilation. Recently, a number of sophisticated techniques have been proposed for proving more flexible, compositional notions of compiler correctness, but these approaches tend to be quite heavyweight compared to the simple "closed simulations" used in verifying whole-program compilation. Applying such techniques to a compiler like CompCert, as Stewart et al. have done, involves major changes and extensions to its original verification. In this paper, we show that if we aim somewhat lower (to prove correctness of separate compilation, but only for a *single* compiler), we can drastically simplify the proof effort. Toward this end, we develop several lightweight techniques that recast the compositional verification problem in terms of whole-program compilation, thereby enabling us to largely reuse the closed-simulation proofs from existing compiler verifications. We demonstrate the effectiveness of these techniques by applying them to CompCert 2.4, converting its verification of whole-program compilation into a verification of separate compilation in less than two person-months. This conversion only required a small number of changes to the original proofs, and uncovered two compiler bugs along the way. The result is SepCompCert, the first verification of separate compilation for the full CompCert compiler.
@InProceedings{POPL16p178,
author = {Jeehoon Kang and Yoonseung Kim and Chung-Kil Hur and Derek Dreyer and Viktor Vafeiadis},
title = {Lightweight Verification of Separate Compilation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {178--190},
doi = {10.1145/2837614.2837642},
year = {2016},
}


El-Yaniv, Ran 
POPL'16: "Estimating Types in Binaries ..."
Estimating Types in Binaries using Predictive Modeling
Omer Katz, Ran El-Yaniv, and Eran Yahav (Technion, Israel)
Reverse engineering is an important tool in mitigating vulnerabilities in binaries. As a lot of software is developed in object-oriented languages, reverse engineering of object-oriented code is of critical importance. One of the major hurdles in reverse engineering binaries compiled from object-oriented code is the use of dynamic dispatch. In the absence of debug information, any dynamic dispatch may seem to jump to many possible targets, posing a significant challenge to a reverse engineer trying to track the program flow. We present a novel technique that allows us to statically determine the likely targets of virtual function calls. Our technique uses object tracelets – statically constructed sequences of operations performed on an object – to capture potential runtime behaviors of the object. Our analysis automatically pre-labels some of the object tracelets by relying on instances where the type of an object is known. The resulting type-labeled tracelets are then used to train a statistical language model (SLM) for each type. We then use the resulting ensemble of SLMs over unlabeled tracelets to generate a ranking of their most likely types, from which we deduce the likely targets of dynamic dispatches. We have implemented our technique and evaluated it over real-world C++ binaries. Our evaluation shows that when there are multiple alternative targets, our approach can drastically reduce the number of targets that have to be considered by a reverse engineer.
@InProceedings{POPL16p313,
author = {Omer Katz and Ran El-Yaniv and Eran Yahav},
title = {Estimating Types in Binaries using Predictive Modeling},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {313--326},
doi = {10.1145/2837614.2837674},
year = {2016},
}


Emmi, Michael 
POPL'16: "Symbolic Abstract Data Type ..."
Symbolic Abstract Data Type Inference
Michael Emmi and Constantin Enea (IMDEA Software Institute, Spain; University of Paris Diderot, France)
Formal specification is a vital ingredient to scalable verification of software systems. In the case of efficient implementations of concurrent objects like atomic registers, queues, and locks, symbolic formal representations of their abstract data types (ADTs) enable efficient modular reasoning, decoupling clients from implementations. Writing adequate formal specifications, however, is a complex task requiring rare expertise. In practice, programmers write reference implementations as informal specifications. In this work we demonstrate that effective symbolic ADT representations can be automatically generated from the executions of reference implementations. Our approach exploits two key features of naturally-occurring ADTs: violations can be decomposed into a small set of representative patterns, and these patterns manifest in executions with few operations. By identifying certain algebraic properties of naturally-occurring ADTs, and exhaustively sampling executions up to a small number of operations, we generate concise symbolic ADT representations which are complete in practice, enabling the application of efficient symbolic verification algorithms without the burden of manual specification. Furthermore, the concise ADT violation patterns we generate are human-readable, and can serve as useful, formal documentation.
@InProceedings{POPL16p513,
author = {Michael Emmi and Constantin Enea},
title = {Symbolic Abstract Data Type Inference},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {513--525},
doi = {10.1145/2837614.2837645},
year = {2016},
}


Enea, Constantin 
POPL'16: "Symbolic Abstract Data Type ..."
Symbolic Abstract Data Type Inference
Michael Emmi and Constantin Enea (IMDEA Software Institute, Spain; University of Paris Diderot, France)
Formal specification is a vital ingredient to scalable verification of software systems. In the case of efficient implementations of concurrent objects like atomic registers, queues, and locks, symbolic formal representations of their abstract data types (ADTs) enable efficient modular reasoning, decoupling clients from implementations. Writing adequate formal specifications, however, is a complex task requiring rare expertise. In practice, programmers write reference implementations as informal specifications. In this work we demonstrate that effective symbolic ADT representations can be automatically generated from the executions of reference implementations. Our approach exploits two key features of naturally-occurring ADTs: violations can be decomposed into a small set of representative patterns, and these patterns manifest in executions with few operations. By identifying certain algebraic properties of naturally-occurring ADTs, and exhaustively sampling executions up to a small number of operations, we generate concise symbolic ADT representations which are complete in practice, enabling the application of efficient symbolic verification algorithms without the burden of manual specification. Furthermore, the concise ADT violation patterns we generate are human-readable, and can serve as useful, formal documentation.
@InProceedings{POPL16p513,
author = {Michael Emmi and Constantin Enea},
title = {Symbolic Abstract Data Type Inference},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {513--525},
doi = {10.1145/2837614.2837645},
year = {2016},
}


Erdweg, Sebastian 
POPL'16: "Sound Type-Dependent Syntactic ..."
Sound Type-Dependent Syntactic Language Extension
Florian Lorenzen and Sebastian Erdweg (TU Berlin, Germany; TU Darmstadt, Germany)
Syntactic language extensions can introduce new facilities into a programming language while requiring little implementation effort and modest changes to the compiler. It is typical to desugar language extensions in a distinguished compiler phase after parsing or type checking, not affecting any of the later compiler phases. If desugaring happens before type checking, the desugaring cannot depend on typing information and type errors are reported in terms of the generated code. If desugaring happens after type checking, the code generated by the desugaring is not type checked and may introduce vulnerabilities. Both options are undesirable. We propose a system for syntactic extensibility where desugaring happens after type checking and desugarings are guaranteed to only generate well-typed code. A major novelty of our work is that desugarings operate on typing derivations instead of plain syntax trees. This provides desugarings access to typing information and forms the basis for the soundness guarantee we provide, namely that a desugaring generates a valid typing derivation. We have implemented our system for syntactic extensibility in a language-independent fashion and instantiated it for a substantial subset of Java, including generics and inheritance. We provide a sound Java extension for Scala-like for-comprehensions.
@InProceedings{POPL16p204,
author = {Florian Lorenzen and Sebastian Erdweg},
title = {Sound Type-Dependent Syntactic Language Extension},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {204--216},
doi = {10.1145/2837614.2837644},
year = {2016},
}


Erwig, Martin 
POPL'16: "Principal Type Inference for ..."
Principal Type Inference for GADTs
Sheng Chen and Martin Erwig (University of Louisiana at Lafayette, USA; Oregon State University, USA)
We present a new method for GADT type inference that improves the precision of previous approaches. In particular, our approach accepts more type-correct programs than previous approaches when they do not employ type annotations. A side benefit of our approach is that it can detect a wide range of runtime errors that are missed by previous approaches. Our method is based on the idea of representing type refinements in pattern-matching branches by choice types, which facilitate a separation of the typing and reconciliation phases and thus support case expressions. This idea is formalized in a type system, which is both sound and a conservative extension of the classical Hindley-Milner system. We present the results of an empirical evaluation that compares our algorithm with previous approaches.
@InProceedings{POPL16p416,
author = {Sheng Chen and Martin Erwig},
title = {Principal Type Inference for GADTs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {416--428},
doi = {10.1145/2837614.2837665},
year = {2016},
}


Felleisen, Matthias 
POPL'16: "Is Sound Gradual Typing Dead? ..."
Is Sound Gradual Typing Dead?
Asumu Takikawa, Daniel Feltey, Ben Greenman, Max S. New, Jan Vitek, and Matthias Felleisen (Northeastern University, USA)
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert runtime checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear from pathological cases.
@InProceedings{POPL16p456,
author = {Asumu Takikawa and Daniel Feltey and Ben Greenman and Max S. New and Jan Vitek and Matthias Felleisen},
title = {Is Sound Gradual Typing Dead?},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {456--468},
doi = {10.1145/2837614.2837630},
year = {2016},
}
Artifacts Available


Feltey, Daniel 
POPL'16: "Is Sound Gradual Typing Dead? ..."
Is Sound Gradual Typing Dead?
Asumu Takikawa, Daniel Feltey, Ben Greenman, Max S. New, Jan Vitek, and Matthias Felleisen (Northeastern University, USA)
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert runtime checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear from pathological cases.
@InProceedings{POPL16p456,
author = {Asumu Takikawa and Daniel Feltey and Ben Greenman and Max S. New and Jan Vitek and Matthias Felleisen},
title = {Is Sound Gradual Typing Dead?},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {456--468},
doi = {10.1145/2837614.2837630},
year = {2016},
}
Artifacts Available


Feng, Xinyu 
POPL'16: "A Program Logic for Concurrent ..."
A Program Logic for Concurrent Objects under Fair Scheduling
Hongjin Liang and Xinyu Feng (University of Science and Technology of China, China)
Existing work on verifying concurrent objects is mostly concerned with safety only, e.g., partial correctness or linearizability. Although there has been recent work verifying lock-freedom of non-blocking objects, much less effort has been devoted to deadlock-freedom and starvation-freedom, progress properties of blocking objects. These properties are more challenging to verify than lock-freedom because they allow the progress of one thread to depend on the progress of another, assuming fair scheduling.
We propose LiLi, a new rely-guarantee style program logic for verifying linearizability and progress together for concurrent objects under fair scheduling. The rely-guarantee style logic unifies thread-modular reasoning about both starvation-freedom and deadlock-freedom in one framework. It also establishes progress-aware abstraction for concurrent objects, which can be applied when verifying safety and liveness of client code. We have successfully applied the logic to verify starvation-freedom or deadlock-freedom of representative algorithms such as ticket locks, queue locks, lock-coupling lists, optimistic lists and lazy lists.
@InProceedings{POPL16p385,
author = {Hongjin Liang and Xinyu Feng},
title = {A Program Logic for Concurrent Objects under Fair Scheduling},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {385--399},
doi = {10.1145/2837614.2837635},
year = {2016},
}


Ferreira, Carla 
POPL'16: "'Cause I'm Strong ..."
'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems
Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro (IMDEA Software Institute, Spain; University of Oxford, UK; Universidade Nova Lisboa, Portugal; Sorbonne, France; Inria, France; UPMC, France)
Large-scale distributed systems often rely on replicated databases that allow a
programmer to request different data consistency guarantees for different
operations, and thereby control their performance. Using such databases is far
from trivial: requesting stronger consistency in too many places may hurt
performance, and requesting it in too few places may violate correctness. To
help programmers in this task, we propose the first proof rule for establishing
that a particular choice of consistency guarantees for various operations on a
replicated database is enough to ensure the preservation of a given data
integrity invariant. Our rule is modular: it allows reasoning about the
behaviour of every operation separately under some assumption on the behaviour
of other operations. This leads to simple reasoning, which we have automated in
an SMT-based tool. We present a nontrivial proof of soundness of our rule and
illustrate its use on several examples.
@InProceedings{POPL16p371,
author = {Alexey Gotsman and Hongseok Yang and Carla Ferreira and Mahsa Najafzadeh and Marc Shapiro},
title = {'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {371--384},
doi = {10.1145/2837614.2837625},
year = {2016},
}
Publisher's Version
Article Search


Fiore, Marcelo 
POPL'16: "A Theory of Effects and Resources: ..."
A Theory of Effects and Resources: Adjunction Models and Polarised Calculi
Pierre-Louis Curien, Marcelo Fiore, and Guillaume Munch-Maccagnoni (University of Paris Diderot, France; Inria, France; University of Cambridge, UK)
We consider the Curry-Howard-Lambek correspondence for effectful computation and resource management, specifically proposing polarised calculi together with presheaf-enriched adjunction models as the starting point for a comprehensive semantic theory relating logical systems, typed calculi, and categorical models in this context. Our thesis is that the combination of effects and resources should be considered orthogonally. Model theoretically, this leads to an understanding of our categorical models from two complementary perspectives: (i) as a linearisation of CBPV (Call-by-Push-Value) adjunction models, and (ii) as an extension of linear/nonlinear adjunction models with an adjoint resolution of computational effects. When the linear structure is cartesian and the resource structure is trivial we recover Levy’s notion of CBPV adjunction model, while when the effect structure is trivial we have Benton’s linear/nonlinear adjunction models. Further instances of our model theory include the dialogue categories with a resource modality of Melliès and Tabareau, and the [E]EC ([Enriched] Effect Calculus) models of Egger, Møgelberg and Simpson. Our development substantiates the approach by providing a lifting theorem of linear models into cartesian ones. To each of our categorical models we systematically associate a typed term calculus, each of which corresponds to a variant of the sequent calculi LJ (Intuitionistic Logic) or ILL (Intuitionistic Linear Logic). The adjoint resolution of effects corresponds to polarisation whereby, syntactically, types locally determine a strict or lazy evaluation order and, semantically, the associativity of cuts is relaxed. In particular, our results show that polarisation provides a computational interpretation of CBPV in direct style. Further, we characterise depolarised models: those where the cut is associative, and where the evaluation order is unimportant.
We explain possible advantages of this style of calculi for the operational semantics of effects.
@InProceedings{POPL16p44,
author = {Pierre-Louis Curien and Marcelo Fiore and Guillaume Munch-Maccagnoni},
title = {A Theory of Effects and Resources: Adjunction Models and Polarised Calculi},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {44--56},
doi = {10.1145/2837614.2837652},
year = {2016},
}
Publisher's Version
Article Search


Flatt, Matthew 
POPL'16: "Binding as Sets of Scopes ..."
Binding as Sets of Scopes
Matthew Flatt (University of Utah, USA)
Our new macro expander for Racket builds on a novel approach to hygiene. Instead of basing macro expansion on variable renamings that are mediated by expansion history, our new expander tracks binding through a set of scopes that an identifier acquires from both binding forms and macro expansions. The resulting model of macro expansion is simpler and more uniform than one based on renaming, and it is sufficiently compatible with Racket's old expander to be practical.
@InProceedings{POPL16p705,
author = {Matthew Flatt},
title = {Binding as Sets of Scopes},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {705--717},
doi = {10.1145/2837614.2837620},
year = {2016},
}
Publisher's Version
Article Search
Info
Artifacts Available


Flur, Shaked 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}
Publisher's Version
Article Search
Info


Forest, Simon 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS 1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS 1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}
Publisher's Version
Article Search
Info


Fournet, Cédric 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS 1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS 1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}
Publisher's Version
Article Search
Info


Frankle, Jonathan 
POPL'16: "Example-Directed Synthesis: ..."
Example-Directed Synthesis: A Type-Theoretic Interpretation
Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic (Princeton University, USA; Grinnell College, USA; University of Pennsylvania, USA)
Input-output examples have emerged as a practical and user-friendly
specification mechanism for program synthesis in many environments.
While example-driven tools have demonstrated tangible impact that has
inspired adoption in industry, their underlying semantics are less well-understood:
what are "examples" and how do they
relate to other kinds of specifications? This paper
demonstrates that examples can, in general, be interpreted
as refinement types. Seen in this light, program
synthesis is the task of finding an inhabitant of
such a type. This insight provides an immediate
semantic interpretation for examples. Moreover,
it enables us to exploit decades of research in type theory as
well as its correspondence with intuitionistic logic rather
than designing ad hoc theoretical frameworks for synthesis from scratch.
We put this observation into practice by formalizing synthesis
as proof search in a sequent calculus with
intersection and union refinements that we prove
to be sound with respect to a conventional type system.
In addition, we show how to handle negative examples,
which arise from user feedback or counterexample-guided loops.
This theory serves as the basis for a prototype
implementation that extends our core language to
support ML-style algebraic data types and structurally
inductive functions. Users can also specify
synthesis goals using polymorphic refinements and
import monomorphic libraries.
The prototype serves as a vehicle
for empirically evaluating a number of different
strategies for resolving the nondeterminism of the sequent
calculus (bottom-up theorem-proving,
term enumeration with refinement type checking, and
combinations of both), the results of which classify, explain, and
validate the design choices of existing synthesis systems.
It also provides a platform for measuring the practical
value of a specification language that combines
"examples" with the more general expressiveness of refinements.
@InProceedings{POPL16p802,
author = {Jonathan Frankle and Peter-Michael Osera and David Walker and Steve Zdancewic},
title = {Example-Directed Synthesis: A Type-Theoretic Interpretation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {802--815},
doi = {10.1145/2837614.2837629},
year = {2016},
}
Publisher's Version
Article Search


Fu, Hongfei 
POPL'16: "Algorithmic Analysis of Qualitative ..."
Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs
Krishnendu Chatterjee, Hongfei Fu, Petr Novotný, and Rouzbeh Hasheminezhad (IST Austria, Austria; Institute of Software at Chinese Academy of Sciences, China; Sharif University of Technology, Iran)
In this paper, we consider termination of probabilistic programs with real-valued variables. The questions concerned are: 1. qualitative ones that ask (i) whether the program terminates with probability 1 (almost-sure termination) and (ii) whether the expected termination time is finite (finite termination); 2. quantitative ones that ask (i) to approximate the expected termination time (expectation problem) and (ii) to compute a bound B such that the probability to terminate after B steps decreases exponentially (concentration problem). To solve these questions, we utilize the notion of ranking supermartingales which is a powerful approach for proving termination of probabilistic programs. In detail, we focus on algorithmic synthesis of linear ranking-supermartingales over affine probabilistic programs (APP's) with both angelic and demonic nondeterminism. An important subclass of APP's is LRAPP which is defined as the class of all APP's over which a linear ranking-supermartingale exists. Our main contributions are as follows. Firstly, we show that the membership problem of LRAPP (i) can be decided in polynomial time for APP's with at most demonic nondeterminism, and (ii) is NP-hard and in PSPACE for APP's with angelic nondeterminism; moreover, the NP-hardness result holds already for APP's without probability and demonic nondeterminism. Secondly, we show that the concentration problem over LRAPP can be solved in the same complexity as for the membership problem of LRAPP. Finally, we show that the expectation problem over LRAPP can be solved in 2EXPTIME and is PSPACE-hard even for APP's without probability and nondeterminism (i.e., deterministic programs). Our experimental results demonstrate the effectiveness of our approach to answer the qualitative and quantitative questions over APP's with at most demonic nondeterminism.
@InProceedings{POPL16p327,
author = {Krishnendu Chatterjee and Hongfei Fu and Petr Novotný and Rouzbeh Hasheminezhad},
title = {Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {327--342},
doi = {10.1145/2837614.2837639},
year = {2016},
}
Publisher's Version
Article Search


Garcia, Ronald 
POPL'16: "Abstracting Gradual Typing ..."
Abstracting Gradual Typing
Ronald Garcia, Alison M. Clark, and Éric Tanter (University of British Columbia, Canada; University of Chile, Chile)
Language researchers and designers have extended a wide variety of type systems to support gradual typing, which enables languages to seamlessly combine dynamic and static checking. These efforts consistently demonstrate that designing a satisfactory gradual counterpart to a static type system is challenging, and this challenge only increases with the sophistication of the type system. Gradual type system designers need more formal tools to help them conceptualize, structure, and evaluate their designs. In this paper, we propose a new formal foundation for gradual typing, drawing on principles from abstract interpretation to give gradual types a semantics in terms of preexisting static types. Abstracting Gradual Typing (AGT for short) yields a formal account of consistency, one of the cornerstones of the gradual typing approach, that subsumes existing notions of consistency, which were developed through intuition and ad hoc reasoning. Given a syntax-directed static typing judgment, the AGT approach induces a corresponding gradual typing judgment. Then the type safety proof for the underlying static discipline induces a dynamic semantics for gradual programs defined over source-language typing derivations. The AGT approach does not resort to an externally justified cast calculus: instead, runtime checks naturally arise by deducing evidence for consistent judgments during proof reduction. To illustrate the approach, we develop a novel gradually-typed counterpart for a language with record subtyping. Gradual languages designed with the AGT approach satisfy by construction the refined criteria for gradual typing set forth by Siek and colleagues.
@InProceedings{POPL16p429,
author = {Ronald Garcia and Alison M. Clark and Éric Tanter},
title = {Abstracting Gradual Typing},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {429--442},
doi = {10.1145/2837614.2837670},
year = {2016},
}
Publisher's Version
Article Search


Garg, Pranav 
POPL'16: "Learning Invariants using ..."
Learning Invariants using Decision Trees and Implication Counterexamples
Pranav Garg, Daniel Neider, P. Madhusudan, and Dan Roth (University of Illinois at Urbana-Champaign, USA)
Inductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first learning algorithms in this model with implication counterexamples that are based on machine learning techniques. In particular, we extend classical decision-tree learning algorithms in machine learning to handle implication samples, building new scalable ways to construct small decision trees using statistical measures. We also develop a decision-tree learning algorithm in this model that is guaranteed to converge to the right concept (invariant) if one exists. We implement the learners and an appropriate teacher, and show that the resulting invariant synthesis is efficient and convergent for a large suite of programs.
@InProceedings{POPL16p499,
author = {Pranav Garg and Daniel Neider and P. Madhusudan and Dan Roth},
title = {Learning Invariants using Decision Trees and Implication Counterexamples},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {499--512},
doi = {10.1145/2837614.2837664},
year = {2016},
}
Publisher's Version
Article Search
Artifacts Available


Giannarakis, Nick 
POPL'16: "Taming Release-Acquire Consistency ..."
Taming Release-Acquire Consistency
Ori Lahav, Nick Giannarakis, and Viktor Vafeiadis (MPI-SWS, Germany)
We introduce a strengthening of the release-acquire fragment of the C11 memory model that (i) forbids dubious behaviors that are not observed in any implementation; (ii) supports fence instructions that restore sequential consistency; and (iii) admits an equivalent intuitive operational semantics based on point-to-point communication. This strengthening has no additional implementation cost: it allows the same local optimizations as C11 release and acquire accesses, and has exactly the same compilation schemes to the x86-TSO and Power architectures. In fact, the compilation to Power is complete with respect to a recent axiomatic model of Power; that is, the compiled program exhibits exactly the same behaviors as the source one. Moreover, we provide criteria for placing enough fence instructions to ensure sequential consistency, and apply them to an efficient RCU implementation.
@InProceedings{POPL16p649,
author = {Ori Lahav and Nick Giannarakis and Viktor Vafeiadis},
title = {Taming Release-Acquire Consistency},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {649--662},
doi = {10.1145/2837614.2837643},
year = {2016},
}
Publisher's Version
Article Search
Info


Giarrusso, Paolo G. 
POPL'16: "System F-omega with Equirecursive ..."
System F-omega with Equirecursive Types for Datatype-Generic Programming
Yufei Cai, Paolo G. Giarrusso, and Klaus Ostermann (University of Tübingen, Germany)
Traversing an algebraic datatype by hand requires boilerplate code which duplicates the structure of the datatype. Datatype-generic programming (DGP) aims to eliminate such boilerplate code by decomposing algebraic datatypes into type constructor applications from which generic traversals can be synthesized. However, different traversals require different decompositions, which yield isomorphic but unequal types. This hinders the interoperability of different DGP techniques. In this paper, we propose Fωμ, an extension of the higher-order polymorphic lambda calculus Fω with records, variants, and equirecursive types. We prove the soundness of the type system, and show that type checking for first-order recursive types is decidable with a practical type checking algorithm. In our soundness proof we define type equality by interpreting types as infinitary λ-terms (in particular, Berarducci-trees). To decide type equality we β-normalize types, and then use an extension of equivalence checking for usual equirecursive types. Thanks to equirecursive types, new decompositions for a datatype can be added modularly and still interoperate with each other, allowing multiple DGP techniques to work together. We sketch how generic traversals can be synthesized, and apply these components to some examples. Since the set of datatype decomposition becomes extensible, System Fωμ enables using DGP techniques incrementally, instead of planning for them upfront or doing invasive refactoring.
@InProceedings{POPL16p30,
author = {Yufei Cai and Paolo G. Giarrusso and Klaus Ostermann},
title = {System F-omega with Equirecursive Types for Datatype-Generic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {30--43},
doi = {10.1145/2837614.2837660},
year = {2016},
}
Publisher's Version
Article Search
Info


Gilray, Thomas 
POPL'16: "Pushdown Control-Flow Analysis ..."
Pushdown Control-Flow Analysis for Free
Thomas Gilray, Steven Lyde, Michael D. Adams, Matthew Might, and David Van Horn (University of Utah, USA; University of Maryland, USA)
Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been published that provide perfect call-stack precision in a computable manner: CFA2, PDCFA, and AAC. Unfortunately, implementing CFA2 and PDCFA requires significant engineering effort. Furthermore, all three are computationally expensive. For a monovariant analysis, CFA2 is in O(2^n), PDCFA is in O(n^6), and AAC is in O(n^8).
In this paper, we describe a new technique that builds on these but is both straightforward to implement and computationally inexpensive. The crucial insight is an unusual state-dependent allocation strategy for the addresses of continuations. Our technique imposes only a constant-factor overhead on the underlying analysis and costs only O(n^3) in the monovariant case. We present the intuitions behind this development, benchmarks demonstrating its efficacy, and a proof of the precision of this analysis.
@InProceedings{POPL16p691,
author = {Thomas Gilray and Steven Lyde and Michael D. Adams and Matthew Might and David Van Horn},
title = {Pushdown Control-Flow Analysis for Free},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {691--704},
doi = {10.1145/2837614.2837631},
year = {2016},
}
Publisher's Version
Article Search


Gimenez, Stéphane 
POPL'16: "The Complexity of Interaction ..."
The Complexity of Interaction
Stéphane Gimenez and Georg Moser (University of Innsbruck, Austria)
In this paper, we analyze the complexity of functional programs written in the interaction-net computation model, an asynchronous, parallel and confluent model that generalizes linear-logic proof nets. Employing user-defined sized and scheduled types, we certify concrete time, space and space-time complexity bounds for both sequential and parallel reductions of interaction-net programs by suitably assigning complexity potentials to typed nodes. The relevance of this approach is illustrated on archetypal programming examples. The provided analysis is precise, compositional and is, in theory, not restricted to particular complexity classes.
@InProceedings{POPL16p243,
author = {Stéphane Gimenez and Georg Moser},
title = {The Complexity of Interaction},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {243--255},
doi = {10.1145/2837614.2837646},
year = {2016},
}
Publisher's Version
Article Search


Goharshady, Amir Kafshdar 
POPL'16: "Algorithms for Algebraic Path ..."
Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components
Krishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen, and Andreas Pavlogiannis (IST Austria, Austria)
We study algorithmic questions for concurrent systems where the transitions are labeled from a complete, closed semiring, and path properties are algebraic with semiring operations. The algebraic path properties can model dataflow analysis problems, the shortest path problem, and many other natural problems that arise in program analysis. We consider that each component of the concurrent system is a graph with constant treewidth, a property satisfied by the control-flow graphs of most programs. We allow for multiple possible queries, which arise naturally in demand driven dataflow analysis. The study of multiple queries allows us to consider the tradeoff between the resource usage of the one-time preprocessing and for each individual query. The traditional approach constructs the product graph of all components and applies the best-known graph algorithm on the product. In this approach, even the answer to a single query requires the transitive closure (i.e., the results of all possible queries), which provides no room for tradeoff between preprocessing and query time. Our main contributions are algorithms that significantly improve the worst-case running time of the traditional approach, and provide various tradeoffs depending on the number of queries. For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time. Furthermore, we establish conditional optimality results showing that the worst-case running time of our algorithms cannot be improved without achieving major breakthroughs in graph algorithms (i.e., improving the worst-case bound for the shortest path problem in general graphs).
Preliminary experimental results show that our algorithms perform favorably on several benchmarks.
@InProceedings{POPL16p733,
author = {Krishnendu Chatterjee and Amir Kafshdar Goharshady and Rasmus Ibsen-Jensen and Andreas Pavlogiannis},
title = {Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {733--747},
doi = {10.1145/2837614.2837624},
year = {2016},
}
Publisher's Version
Article Search


Gommerstadt, Hannah 
POPL'16: "Monitors and Blame Assignment ..."
Monitors and Blame Assignment for Higher-Order Session Types
Limin Jia, Hannah Gommerstadt, and Frank Pfenning (Carnegie Mellon University, USA)
Session types provide a means to prescribe the communication behavior between concurrent message-passing processes. However, in a distributed setting, some processes may be written in languages that do not support static typing of sessions or may be compromised by a malicious intruder, violating invariants of the session types. In such a setting, dynamically monitoring communication between processes becomes a necessity for identifying undesirable actions. In this paper, we show how to dynamically monitor communication to enforce adherence to session types in a higher-order setting. We present a system of blame assignment in the case when the monitor detects an undesirable action and an alarm is raised. We prove that dynamic monitoring does not change system behavior for well-typed processes, and that one of an indicated set of possible culprits must have been compromised in case of an alarm.
@InProceedings{POPL16p582,
author = {Limin Jia and Hannah Gommerstadt and Frank Pfenning},
title = {Monitors and Blame Assignment for Higher-Order Session Types},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {582--594},
doi = {10.1145/2837614.2837662},
year = {2016},
}
Publisher's Version
Article Search


Gordon, Andrew D. 
POPL'16: "Fabular: Regression Formulas ..."
Fabular: Regression Formulas as Probabilistic Programming
Johannes Borgström, Andrew D. Gordon, Long Ouyang, Claudio Russo, Adam Ścibior, and Marcin Szymczak (Uppsala University, Sweden; Microsoft Research, UK; University of Edinburgh, UK; Stanford University, USA; University of Cambridge, UK; MPI Tübingen, Germany)
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
@InProceedings{POPL16p271,
author = {Johannes Borgström and Andrew D. Gordon and Long Ouyang and Claudio Russo and Adam Ścibior and Marcin Szymczak},
title = {Fabular: Regression Formulas as Probabilistic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {271--283},
doi = {10.1145/2837614.2837653},
year = {2016},
}


Gorogiannis, Nikos 
POPL'16: "Model Checking for Symbolic-Heap ..."
Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates
James Brotherston, Nikos Gorogiannis, Max Kanovich, and Reuben Rowe (University College London, UK; Middlesex University, UK; National Research University Higher School of Economics, Russia)
We investigate the *model checking* problem for symbolic-heap separation logic with user-defined inductive predicates, i.e., the problem of checking that a given stack-heap memory state satisfies a given formula in this language, as arises e.g. in software testing or runtime verification. First, we show that the problem is *decidable*; specifically, we present a bottom-up fixed point algorithm that decides the problem and runs in exponential time in the size of the problem instance. Second, we show that, while model checking for the full language is EXPTIME-complete, the problem becomes NP-complete or PTIME-solvable when we impose natural syntactic restrictions on the schemata defining the inductive predicates. We additionally present NP and PTIME algorithms for these restricted fragments. Finally, we report on the experimental performance of our procedures on a variety of specifications extracted from programs, exercising multiple combinations of syntactic restrictions.
@InProceedings{POPL16p84,
author = {James Brotherston and Nikos Gorogiannis and Max Kanovich and Reuben Rowe},
title = {Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {84--96},
doi = {10.1145/2837614.2837621},
year = {2016},
}


Gotsman, Alexey 
POPL'16: "'Cause I'm Strong ..."
'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems
Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro (IMDEA Software Institute, Spain; University of Oxford, UK; Universidade Nova Lisboa, Portugal; Sorbonne, France; Inria, France; UPMC, France)
Large-scale distributed systems often rely on replicated databases that allow a
programmer to request different data consistency guarantees for different
operations, and thereby control their performance. Using such databases is far
from trivial: requesting stronger consistency in too many places may hurt
performance, and requesting it in too few places may violate correctness. To
help programmers in this task, we propose the first proof rule for establishing
that a particular choice of consistency guarantees for various operations on a
replicated database is enough to ensure the preservation of a given data
integrity invariant. Our rule is modular: it allows reasoning about the
behaviour of every operation separately under some assumption on the behaviour
of other operations. This leads to simple reasoning, which we have automated in
an SMT-based tool. We present a nontrivial proof of soundness of our rule and
illustrate its use on several examples.
@InProceedings{POPL16p371,
author = {Alexey Gotsman and Hongseok Yang and Carla Ferreira and Mahsa Najafzadeh and Marc Shapiro},
title = {'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {371--384},
doi = {10.1145/2837614.2837625},
year = {2016},
}


Grathwohl, Bjørn Bugge 
POPL'16: "Kleenex: Compiling Nondeterministic ..."
Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers
Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, and Sebastian Paaske Tørholm (University of Copenhagen, Denmark; Jobindex, Denmark)
We present and illustrate Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with essentially optimal streaming behavior, worst-case linear-time performance and sustained high throughput. Its underlying theory is based on transducer decomposition into oracle and action machines: the oracle machine performs streaming greedy disambiguation of the input; the action machine performs the output actions. In use cases Kleenex achieves consistently high throughput rates around the 1 Gbps range on stock hardware. It performs well, especially in complex use cases, in comparison to both specialized and related tools such as GNUawk, GNUsed, GNUgrep, RE2, Ragel and regular-expression libraries.
@InProceedings{POPL16p284,
author = {Bjørn Bugge Grathwohl and Fritz Henglein and Ulrik Terp Rasmussen and Kristoffer Aalund Søholm and Sebastian Paaske Tørholm},
title = {Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {284--297},
doi = {10.1145/2837614.2837647},
year = {2016},
}


Gray, Kathryn E. 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


Greenman, Ben 
POPL'16: "Is Sound Gradual Typing Dead? ..."
Is Sound Gradual Typing Dead?
Asumu Takikawa, Daniel Feltey, Ben Greenman, Max S. New, Jan Vitek, and Matthias Felleisen (Northeastern University, USA)
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert run-time checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear of pathological cases.
@InProceedings{POPL16p456,
author = {Asumu Takikawa and Daniel Feltey and Ben Greenman and Max S. New and Jan Vitek and Matthias Felleisen},
title = {Is Sound Gradual Typing Dead?},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {456--468},
doi = {10.1145/2837614.2837630},
year = {2016},
}
Artifacts Available


Grigore, Radu 
POPL'16: "Abstraction Refinement Guided ..."
Abstraction Refinement Guided by a Learnt Probabilistic Model
Radu Grigore and Hongseok Yang (University of Oxford, UK)
The core challenge in designing an effective static program analysis is to find a good program abstraction, one that retains only details relevant to a given query. In this paper, we present a new approach for automatically finding such an abstraction. Our approach uses a pessimistic strategy, which can optionally use guidance from a probabilistic model. Our approach applies to parametric static analyses implemented in Datalog, and is based on counterexample-guided abstraction refinement. For each untried abstraction, our probabilistic model provides a probability of success, while the size of the abstraction provides an estimate of its cost in terms of analysis time. Combining these two metrics, probability and cost, our refinement algorithm picks an optimal abstraction. Our probabilistic model is a variant of the Erdos-Renyi random graph model, and it is tunable by what we call hyperparameters. We present a method to learn good values for these hyperparameters, by observing past runs of the analysis on an existing codebase. We evaluate our approach on an object sensitive pointer analysis for Java programs, with two client analyses (PolySite and Downcast).
@InProceedings{POPL16p485,
author = {Radu Grigore and Hongseok Yang},
title = {Abstraction Refinement Guided by a Learnt Probabilistic Model},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {485--498},
doi = {10.1145/2837614.2837663},
year = {2016},
}


Grossman, Dan 
POPL'16: "Optimizing Synthesis with ..."
Optimizing Synthesis with Metasketches
James Bornholt, Emina Torlak, Dan Grossman, and Luis Ceze (University of Washington, USA)
Many advanced programming tools, for both end-users and expert developers, rely on program synthesis to automatically generate implementations from high-level specifications. These tools often need to employ tricky, custom-built synthesis algorithms because they require synthesized programs to be not only correct, but also optimal with respect to a desired cost metric, such as program size. Finding these optimal solutions efficiently requires domain-specific search strategies, but existing synthesizers hard-code the strategy, making them difficult to reuse. This paper presents metasketches, a general framework for specifying and solving optimal synthesis problems. Metasketches make the search strategy a part of the problem definition by specifying a fragmentation of the search space into an ordered set of classic sketches. We provide two cooperating search algorithms to effectively solve metasketches. A global optimizing search coordinates the activities of local searches, informing them of the costs of potentially-optimal solutions as they explore different regions of the candidate space in parallel. The local searches execute an incremental form of counterexample-guided inductive synthesis to incorporate information sent from the global search. We present Synapse, an implementation of these algorithms, and show that it effectively solves optimal synthesis problems with a variety of different cost functions. In addition, metasketches can be used to accelerate classic (non-optimal) synthesis by explicitly controlling the search strategy, and we show that Synapse solves classic synthesis problems that state-of-the-art tools cannot.
@InProceedings{POPL16p775,
author = {James Bornholt and Emina Torlak and Dan Grossman and Luis Ceze},
title = {Optimizing Synthesis with Metasketches},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {775--788},
doi = {10.1145/2837614.2837666},
year = {2016},
}


Gulwani, Sumit 
POPL'16: "Transforming Spreadsheet Data ..."
Transforming Spreadsheet Data Types using Examples
Rishabh Singh and Sumit Gulwani (Microsoft Research, USA)
Cleaning spreadsheet data types is a common problem faced by millions of spreadsheet users. Data types such as date, time, name, and units are ubiquitous in spreadsheets, and cleaning transformations on these data types involve parsing and pretty printing their string representations. This presents many challenges to users because cleaning such data requires some background knowledge about the data itself and moreover this data is typically nonuniform, unstructured, and ambiguous. Spreadsheet systems and programming languages provide some UI-based and programmatic solutions for this problem but they are either insufficient for the user's needs or are beyond their expertise. In this paper, we present a programming by example methodology of cleaning data types that learns the desired transformation from a few input-output examples. We propose a domain specific language with probabilistic semantics that is parameterized with declarative data type definitions. The probabilistic semantics is based on three key aspects: (i) approximate predicate matching, (ii) joint learning of data type interpretation, and (iii) weighted branches. This probabilistic semantics enables the language to handle nonuniform, unstructured, and ambiguous data. We then present a synthesis algorithm that learns the desired program in this language from a set of input-output examples. We have implemented our algorithm as an Excel add-in and present its successful evaluation on 55 benchmark problems obtained from online help forums and the Excel product team.
@InProceedings{POPL16p343,
author = {Rishabh Singh and Sumit Gulwani},
title = {Transforming Spreadsheet Data Types using Examples},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {343--356},
doi = {10.1145/2837614.2837668},
year = {2016},
}


Gurfinkel, Arie 
POPL'16: "Maximal Specification Synthesis ..."
Maximal Specification Synthesis
Aws Albarghouthi, Isil Dillig, and Arie Gurfinkel (University of Wisconsin-Madison, USA; University of Texas at Austin, USA; Carnegie Mellon University, USA)
Many problems in program analysis, verification, and synthesis require inferring specifications of unknown procedures. Motivated by a broad range of applications, we formulate the problem of maximal specification inference: Given a postcondition Phi and a program P calling a set of unknown procedures F_1,…,F_n, what are the most permissive specifications of procedures F_i that ensure correctness of P? In other words, we are looking for the smallest number of assumptions we need to make about the behaviours of F_i in order to prove that P satisfies its postcondition. To solve this problem, we present a novel approach that utilizes a counterexample-guided inductive synthesis loop and reduces the maximal specification inference problem to multi-abduction. We formulate the novel notion of multi-abduction as a generalization of classical logical abduction and present an algorithm for solving multi-abduction problems. On the practical side, we evaluate our specification inference technique on a range of benchmarks and demonstrate its ability to synthesize specifications of kernel routines invoked by device drivers.
@InProceedings{POPL16p789,
author = {Aws Albarghouthi and Isil Dillig and Arie Gurfinkel},
title = {Maximal Specification Synthesis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {789--801},
doi = {10.1145/2837614.2837628},
year = {2016},
}


Hague, Matthew 
POPL'16: "Unboundedness and Downward ..."
Unboundedness and Downward Closures of Higher-Order Pushdown Automata
Matthew Hague, Jonathan Kochems, and C.-H. Luke Ong (University of London, UK; University of Oxford, UK)
We show the diagonal problem for higher-order pushdown automata (HOPDA), and hence the simultaneous unboundedness problem, is decidable. From recent work by Zetzsche this means that we can construct the downward closure of the set of words accepted by a given HOPDA. This also means we can construct the downward closure of the Parikh image of a HOPDA. Both of these consequences play an important role in verifying concurrent higher-order programs expressed as HOPDA or safe higher-order recursion schemes.
@InProceedings{POPL16p151,
author = {Matthew Hague and Jonathan Kochems and C.-H. Luke Ong},
title = {Unboundedness and Downward Closures of Higher-Order Pushdown Automata},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {151--163},
doi = {10.1145/2837614.2837627},
year = {2016},
}


Hasheminezhad, Rouzbeh 
POPL'16: "Algorithmic Analysis of Qualitative ..."
Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs
Krishnendu Chatterjee, Hongfei Fu, Petr Novotný, and Rouzbeh Hasheminezhad (IST Austria, Austria; Institute of Software at Chinese Academy of Sciences, China; Sharif University of Technology, Iran)
In this paper, we consider termination of probabilistic programs with real-valued variables. The questions concerned are: 1. qualitative ones that ask (i) whether the program terminates with probability 1 (almost-sure termination) and (ii) whether the expected termination time is finite (finite termination); 2. quantitative ones that ask (i) to approximate the expected termination time (expectation problem) and (ii) to compute a bound B such that the probability to terminate after B steps decreases exponentially (concentration problem). To solve these questions, we utilize the notion of ranking supermartingales which is a powerful approach for proving termination of probabilistic programs. In detail, we focus on algorithmic synthesis of linear ranking-supermartingales over affine probabilistic programs (APP's) with both angelic and demonic non-determinism. An important subclass of APP's is LRAPP which is defined as the class of all APP's over which a linear ranking-supermartingale exists. Our main contributions are as follows. Firstly, we show that the membership problem of LRAPP (i) can be decided in polynomial time for APP's with at most demonic non-determinism, and (ii) is NP-hard and in PSPACE for APP's with angelic non-determinism; moreover, the NP-hardness result holds already for APP's without probability and demonic non-determinism. Secondly, we show that the concentration problem over LRAPP can be solved in the same complexity as for the membership problem of LRAPP. Finally, we show that the expectation problem over LRAPP can be solved in 2EXPTIME and is PSPACE-hard even for APP's without probability and non-determinism (i.e., deterministic programs). Our experimental results demonstrate the effectiveness of our approach to answer the qualitative and quantitative questions over APP's with at most demonic non-determinism.
@InProceedings{POPL16p327,
author = {Krishnendu Chatterjee and Hongfei Fu and Petr Novotný and Rouzbeh Hasheminezhad},
title = {Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {327--342},
doi = {10.1145/2837614.2837639},
year = {2016},
}


Hasuo, Ichiro 
POPL'16: "Memoryful Geometry of Interaction ..."
Memoryful Geometry of Interaction II: Recursion and Adequacy
Koko Muroya, Naohiko Hoshino, and Ichiro Hasuo (University of Tokyo, Japan; Kyoto University, Japan)
A general framework of Memoryful Geometry of Interaction (mGoI) was recently introduced by the authors. It provides a sound translation of lambda-terms (on the high-level) to their realizations by stream transducers (on the low-level), where the internal states of the latter (called memories) are exploited for accommodating algebraic effects of Plotkin and Power. The translation is compositional, hence "denotational," where transducers are inductively composed using an adaptation of Barbosa's coalgebraic component calculus. In the current paper we extend the mGoI framework and provide a systematic treatment of recursion, an essential feature of programming languages that was however missing in our previous work. Specifically, we introduce two new fixed-point operators in the coalgebraic component calculus. The two follow the previous work on recursion in GoI and are called Girard style and Mackie style: the former obviously exhibits some nice domain-theoretic properties, while the latter allows simpler construction. Their equivalence is established on the categorical (or, traced monoidal) level of abstraction, and is therefore generic with respect to the choice of algebraic effects. Our main result is an adequacy theorem of our mGoI translation, against Plotkin and Power's operational semantics for algebraic effects.
@InProceedings{POPL16p748,
author = {Koko Muroya and Naohiko Hoshino and Ichiro Hasuo},
title = {Memoryful Geometry of Interaction II: Recursion and Adequacy},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {748--760},
doi = {10.1145/2837614.2837672},
year = {2016},
}
POPL'16: "Lattice-Theoretic Progress ..."
Lattice-Theoretic Progress Measures and Coalgebraic Model Checking
Ichiro Hasuo, Shunsuke Shimizu, and Corina Cîrstea (University of Tokyo, Japan; University of Southampton, UK)
In the context of formal verification in general and model checking in particular, parity games serve as a mighty vehicle: many problems are encoded as parity games, which are then solved by the seminal algorithm by Jurdzinski. In this paper we identify the essence of this workflow to be the notion of progress measure, and formalize it in general, possibly infinitary, lattice-theoretic terms. Our view on progress measures is that they are to nested/alternating fixed points what invariants are to safety/greatest fixed points, and what ranking functions are to liveness/least fixed points. That is, progress measures are a combination of the latter two notions (invariant and ranking function) that have been extensively studied in the context of (program) verification. We then apply our theory of progress measures to a general model-checking framework, where systems are categorically presented as coalgebras. The framework's theoretical robustness is witnessed by a smooth transfer from the branching-time setting to the linear-time one. Although the framework can be used to derive some decision procedures for finite settings, we also expect the proposed framework to form a basis for sound proof methods for some undecidable/infinitary problems.
@InProceedings{POPL16p718,
author = {Ichiro Hasuo and Shunsuke Shimizu and Corina Cîrstea},
title = {Lattice-Theoretic Progress Measures and Coalgebraic Model Checking},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {718--732},
doi = {10.1145/2837614.2837673},
year = {2016},
}


Henglein, Fritz 
POPL'16: "Kleenex: Compiling Nondeterministic ..."
Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers
Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, and Sebastian Paaske Tørholm (University of Copenhagen, Denmark; Jobindex, Denmark)
We present and illustrate Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with essentially optimal streaming behavior, worst-case linear-time performance and sustained high throughput. Its underlying theory is based on transducer decomposition into oracle and action machines: the oracle machine performs streaming greedy disambiguation of the input; the action machine performs the output actions. In use cases Kleenex achieves consistently high throughput rates around the 1 Gbps range on stock hardware. It performs well, especially in complex use cases, in comparison to both specialized and related tools such as GNUawk, GNUsed, GNUgrep, RE2, Ragel and regular-expression libraries.
@InProceedings{POPL16p284,
author = {Bjørn Bugge Grathwohl and Fritz Henglein and Ulrik Terp Rasmussen and Kristoffer Aalund Søholm and Sebastian Paaske Tørholm},
title = {Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {284--297},
doi = {10.1145/2837614.2837647},
year = {2016},
}


Henzinger, Thomas A. 
POPL'16: "PSync: A Partially Synchronous ..."
PSync: A Partially Synchronous Language for Fault-Tolerant Distributed Algorithms
Cezara Drăgoi, Thomas A. Henzinger, and Damien Zufferey (Inria, France; ENS, France; CNRS, France; IST Austria, Austria; Massachusetts Institute of Technology, USA)
Fault-tolerant distributed algorithms play an important role in many critical/high-availability applications. These algorithms are notoriously difficult to implement correctly, due to asynchronous communication and the occurrence of faults, such as the network dropping messages or computers crashing. We introduce PSync, a domain specific language based on the Heard-Of model, which views asynchronous faulty systems as synchronous ones with an adversarial environment that simulates asynchrony and faults by dropping messages. We define a runtime system for PSync that efficiently executes on asynchronous networks. We formalise the relation between the runtime system and PSync in terms of observational refinement. The high-level lockstep abstraction introduced by PSync simplifies the design and implementation of fault-tolerant distributed algorithms and enables automated formal verification. We have implemented an embedding of PSync in the Scala programming language with a runtime system for partially synchronous networks. We show the applicability of PSync by implementing several important fault-tolerant distributed algorithms and we compare the implementation of consensus algorithms in PSync against implementations in other languages in terms of code size, runtime efficiency, and verification.
@InProceedings{POPL16p400,
author = {Cezara Drăgoi and Thomas A. Henzinger and Damien Zufferey},
title = {PSync: A Partially Synchronous Language for Fault-Tolerant Distributed Algorithms},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {400--415},
doi = {10.1145/2837614.2837650},
year = {2016},
}


Hoshino, Naohiko 
POPL'16: "Memoryful Geometry of Interaction ..."
Memoryful Geometry of Interaction II: Recursion and Adequacy
Koko Muroya, Naohiko Hoshino, and Ichiro Hasuo (University of Tokyo, Japan; Kyoto University, Japan)
A general framework of Memoryful Geometry of Interaction (mGoI) was recently introduced by the authors. It provides a sound translation of lambda-terms (on the high-level) to their realizations by stream transducers (on the low-level), where the internal states of the latter (called memories) are exploited for accommodating algebraic effects of Plotkin and Power. The translation is compositional, hence "denotational," where transducers are inductively composed using an adaptation of Barbosa's coalgebraic component calculus. In the current paper we extend the mGoI framework and provide a systematic treatment of recursion, an essential feature of programming languages that was however missing in our previous work. Specifically, we introduce two new fixed-point operators in the coalgebraic component calculus. The two follow the previous work on recursion in GoI and are called Girard style and Mackie style: the former obviously exhibits some nice domain-theoretic properties, while the latter allows simpler construction. Their equivalence is established on the categorical (or, traced monoidal) level of abstraction, and is therefore generic with respect to the choice of algebraic effects. Our main result is an adequacy theorem of our mGoI translation, against Plotkin and Power's operational semantics for algebraic effects.
@InProceedings{POPL16p748,
author = {Koko Muroya and Naohiko Hoshino and Ichiro Hasuo},
title = {Memoryful Geometry of Interaction II: Recursion and Adequacy},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {748--760},
doi = {10.1145/2837614.2837672},
year = {2016},
}


Hriţcu, Cătălin 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant and as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Hur, Chung-Kil
POPL'16: "Lightweight Verification of ..."
Lightweight Verification of Separate Compilation
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis (Seoul National University, South Korea; MPI-SWS, Germany)
Major compiler verification efforts, such as the CompCert project, have traditionally simplified the verification problem by restricting attention to the correctness of whole-program compilation, leaving open the question of how to verify the correctness of separate compilation. Recently, a number of sophisticated techniques have been proposed for proving more flexible, compositional notions of compiler correctness, but these approaches tend to be quite heavyweight compared to the simple "closed simulations" used in verifying whole-program compilation. Applying such techniques to a compiler like CompCert, as Stewart et al. have done, involves major changes and extensions to its original verification. In this paper, we show that if we aim somewhat lower (to prove correctness of separate compilation, but only for a *single* compiler) we can drastically simplify the proof effort. Toward this end, we develop several lightweight techniques that recast the compositional verification problem in terms of whole-program compilation, thereby enabling us to largely reuse the closed-simulation proofs from existing compiler verifications. We demonstrate the effectiveness of these techniques by applying them to CompCert 2.4, converting its verification of whole-program compilation into a verification of separate compilation in less than two person-months. This conversion only required a small number of changes to the original proofs, and uncovered two compiler bugs along the way. The result is SepCompCert, the first verification of separate compilation for the full CompCert compiler.
@InProceedings{POPL16p178,
author = {Jeehoon Kang and Yoonseung Kim and Chung-Kil Hur and Derek Dreyer and Viktor Vafeiadis},
title = {Lightweight Verification of Separate Compilation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {178--190},
doi = {10.1145/2837614.2837642},
year = {2016},
}


Ibsen-Jensen, Rasmus
POPL'16: "Algorithms for Algebraic Path ..."
Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components
Krishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen, and Andreas Pavlogiannis (IST Austria, Austria)
We study algorithmic questions for concurrent systems where the transitions are labeled from a complete, closed semiring, and path properties are algebraic with semiring operations. The algebraic path properties can model dataflow analysis problems, the shortest path problem, and many other natural problems that arise in program analysis. We consider that each component of the concurrent system is a graph with constant treewidth, a property satisfied by the control-flow graphs of most programs. We allow for multiple possible queries, which arise naturally in demand-driven dataflow analysis. The study of multiple queries allows us to consider the tradeoff between the resource usage of the one-time preprocessing and that of each individual query. The traditional approach constructs the product graph of all components and applies the best-known graph algorithm on the product. In this approach, even the answer to a single query requires the transitive closure (i.e., the results of all possible queries), which provides no room for tradeoff between preprocessing and query time. Our main contributions are algorithms that significantly improve the worst-case running time of the traditional approach, and provide various tradeoffs depending on the number of queries. For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time. Furthermore, we establish conditional optimality results showing that the worst-case running time of our algorithms cannot be improved without achieving major breakthroughs in graph algorithms (i.e., improving the worst-case bound for the shortest path problem in general graphs).
Preliminary experimental results show that our algorithms perform favorably on several benchmarks.
@InProceedings{POPL16p733,
author = {Krishnendu Chatterjee and Amir Kafshdar Goharshady and Rasmus Ibsen-Jensen and Andreas Pavlogiannis},
title = {Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {733--747},
doi = {10.1145/2837614.2837624},
year = {2016},
}


Immerman, Neil 
POPL'16: "Decidability of Inferring ..."
Decidability of Inferring Inductive Invariants
Oded Padon, Neil Immerman, Sharon Shoham, Aleksandr Karbyshev, and Mooly Sagiv (Tel Aviv University, Israel; University of Massachusetts at Amherst, USA; Academic College of Tel Aviv-Yaffo, Israel)
Induction is a successful approach for verification of hardware and software systems. A common practice is to model a system using logical formulas, and then use a decision procedure to verify that some logical formula is an inductive safety invariant for the system. A key ingredient in this approach is coming up with the inductive invariant, which is known as invariant inference. This is a major difficulty, and it is often left for humans or addressed by sound but incomplete abstract interpretation. This paper is motivated by the problem of inductive invariants in shape analysis and in distributed protocols. This paper approaches the general problem of inferring first-order inductive invariants by restricting the language L of candidate invariants. Notice that the problem of invariant inference in a restricted language L differs from the safety problem, since a system may be safe and still not have any inductive invariant in L that proves safety. Clearly, if L is finite (and if testing an inductive invariant is decidable), then inferring invariants in L is decidable. This paper presents some interesting cases when inferring inductive invariants in L is decidable even when L is an infinite language of universal formulas. Decidability is obtained by restricting L and defining a suitable well-quasi-order on the state space. We also present some undecidability results that show that our restrictions are necessary. We further present a framework for systematically constructing infinite languages while keeping the invariant inference problem decidable. We illustrate our approach by showing the decidability of inferring invariants for programs manipulating linked lists, and for distributed protocols.
@InProceedings{POPL16p217,
author = {Oded Padon and Neil Immerman and Sharon Shoham and Aleksandr Karbyshev and Mooly Sagiv},
title = {Decidability of Inferring Inductive Invariants},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {217--231},
doi = {10.1145/2837614.2837640},
year = {2016},
}


Jha, Somesh 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Jhala, Ranjit 
POPL'16: "Printing Floating-Point Numbers: ..."
Printing Floating-Point Numbers: A Faster, Always Correct Method
Marc Andrysco, Ranjit Jhala, and Sorin Lerner (University of California at San Diego, USA)
Floating-point numbers are an essential part of modern software, recently gaining particular prominence on the web as the exclusive numeric format of JavaScript. To use floating-point numbers, we require a way to convert binary machine representations into human-readable decimal outputs. Existing conversion algorithms make tradeoffs between completeness and performance. The classic Dragon4 algorithm by Steele and White and its later refinements achieve completeness, i.e., produce correct and optimal outputs on all inputs, by using arbitrary-precision integer (bignum) arithmetic, which leads to a high performance cost. On the other hand, the recent Grisu3 algorithm by Loitsch shows how to recover performance by using native integer arithmetic but sacrifices optimality for 0.5% of all inputs. We present Errol, a new complete algorithm that is guaranteed to produce correct and optimal results for all inputs while simultaneously being 2x faster than the incomplete Grisu3 and 4x faster than previous complete methods.
@InProceedings{POPL16p555,
author = {Marc Andrysco and Ranjit Jhala and Sorin Lerner},
title = {Printing Floating-Point Numbers: A Faster, Always Correct Method},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {555--567},
doi = {10.1145/2837614.2837654},
year = {2016},
}


Jia, Limin 
POPL'16: "Monitors and Blame Assignment ..."
Monitors and Blame Assignment for Higher-Order Session Types
Limin Jia, Hannah Gommerstadt, and Frank Pfenning (Carnegie Mellon University, USA)
Session types provide a means to prescribe the communication behavior between concurrent message-passing processes. However, in a distributed setting, some processes may be written in languages that do not support static typing of sessions or may be compromised by a malicious intruder, violating invariants of the session types. In such a setting, dynamically monitoring communication between processes becomes a necessity for identifying undesirable actions. In this paper, we show how to dynamically monitor communication to enforce adherence to session types in a higher-order setting. We present a system of blame assignment in the case when the monitor detects an undesirable action and an alarm is raised. We prove that dynamic monitoring does not change system behavior for well-typed processes, and that one of an indicated set of possible culprits must have been compromised in case of an alarm.
@InProceedings{POPL16p582,
author = {Limin Jia and Hannah Gommerstadt and Frank Pfenning},
title = {Monitors and Blame Assignment for Higher-Order Session Types},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {582--594},
doi = {10.1145/2837614.2837662},
year = {2016},
}


Kang, Jeehoon 
POPL'16: "Lightweight Verification of ..."
Lightweight Verification of Separate Compilation
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis (Seoul National University, South Korea; MPI-SWS, Germany)
Major compiler verification efforts, such as the CompCert project, have traditionally simplified the verification problem by restricting attention to the correctness of whole-program compilation, leaving open the question of how to verify the correctness of separate compilation. Recently, a number of sophisticated techniques have been proposed for proving more flexible, compositional notions of compiler correctness, but these approaches tend to be quite heavyweight compared to the simple "closed simulations" used in verifying whole-program compilation. Applying such techniques to a compiler like CompCert, as Stewart et al. have done, involves major changes and extensions to its original verification. In this paper, we show that if we aim somewhat lower (to prove correctness of separate compilation, but only for a *single* compiler) we can drastically simplify the proof effort. Toward this end, we develop several lightweight techniques that recast the compositional verification problem in terms of whole-program compilation, thereby enabling us to largely reuse the closed-simulation proofs from existing compiler verifications. We demonstrate the effectiveness of these techniques by applying them to CompCert 2.4, converting its verification of whole-program compilation into a verification of separate compilation in less than two person-months. This conversion only required a small number of changes to the original proofs, and uncovered two compiler bugs along the way. The result is SepCompCert, the first verification of separate compilation for the full CompCert compiler.
@InProceedings{POPL16p178,
author = {Jeehoon Kang and Yoonseung Kim and Chung-Kil Hur and Derek Dreyer and Viktor Vafeiadis},
title = {Lightweight Verification of Separate Compilation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {178--190},
doi = {10.1145/2837614.2837642},
year = {2016},
}


Kanovich, Max 
POPL'16: "Model Checking for Symbolic-Heap ..."
Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates
James Brotherston, Nikos Gorogiannis, Max Kanovich, and Reuben Rowe (University College London, UK; Middlesex University, UK; National Research University Higher School of Economics, Russia)
We investigate the *model checking* problem for symbolic-heap separation logic with user-defined inductive predicates, i.e., the problem of checking that a given stack-heap memory state satisfies a given formula in this language, as arises e.g. in software testing or runtime verification. First, we show that the problem is *decidable*; specifically, we present a bottom-up fixed-point algorithm that decides the problem and runs in exponential time in the size of the problem instance. Second, we show that, while model checking for the full language is EXPTIME-complete, the problem becomes NP-complete or PTIME-solvable when we impose natural syntactic restrictions on the schemata defining the inductive predicates. We additionally present NP and PTIME algorithms for these restricted fragments. Finally, we report on the experimental performance of our procedures on a variety of specifications extracted from programs, exercising multiple combinations of syntactic restrictions.
@InProceedings{POPL16p84,
author = {James Brotherston and Nikos Gorogiannis and Max Kanovich and Reuben Rowe},
title = {Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {84--96},
doi = {10.1145/2837614.2837621},
year = {2016},
}


Kaposi, Ambrus 
POPL'16: "Type Theory in Type Theory ..."
Type Theory in Type Theory using Quotient Inductive Types
Thorsten Altenkirch and Ambrus Kaposi (University of Nottingham, UK)
We present an internal formalisation of a type theory with dependent types in Type Theory using a special case of higher inductive types from Homotopy Type Theory which we call quotient inductive types (QITs). Our formalisation of type theory avoids referring to preterms or a typability relation but defines directly well typed objects by an inductive definition. We use the elimination principle to define the set-theoretic and logical predicate interpretation. The work has been formalized using the Agda system extended with QITs using postulates.
@InProceedings{POPL16p18,
author = {Thorsten Altenkirch and Ambrus Kaposi},
title = {Type Theory in Type Theory using Quotient Inductive Types},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {18--29},
doi = {10.1145/2837614.2837638},
year = {2016},
}


Karbyshev, Aleksandr 
POPL'16: "Decidability of Inferring ..."
Decidability of Inferring Inductive Invariants
Oded Padon, Neil Immerman, Sharon Shoham, Aleksandr Karbyshev, and Mooly Sagiv (Tel Aviv University, Israel; University of Massachusetts at Amherst, USA; Academic College of Tel Aviv-Yaffo, Israel)
Induction is a successful approach for verification of hardware and software systems. A common practice is to model a system using logical formulas, and then use a decision procedure to verify that some logical formula is an inductive safety invariant for the system. A key ingredient in this approach is coming up with the inductive invariant, which is known as invariant inference. This is a major difficulty, and it is often left for humans or addressed by sound but incomplete abstract interpretation. This paper is motivated by the problem of inductive invariants in shape analysis and in distributed protocols. This paper approaches the general problem of inferring first-order inductive invariants by restricting the language L of candidate invariants. Notice that the problem of invariant inference in a restricted language L differs from the safety problem, since a system may be safe and still not have any inductive invariant in L that proves safety. Clearly, if L is finite (and if testing an inductive invariant is decidable), then inferring invariants in L is decidable. This paper presents some interesting cases when inferring inductive invariants in L is decidable even when L is an infinite language of universal formulas. Decidability is obtained by restricting L and defining a suitable well-quasi-order on the state space. We also present some undecidability results that show that our restrictions are necessary. We further present a framework for systematically constructing infinite languages while keeping the invariant inference problem decidable. We illustrate our approach by showing the decidability of inferring invariants for programs manipulating linked lists, and for distributed protocols.
@InProceedings{POPL16p217,
author = {Oded Padon and Neil Immerman and Sharon Shoham and Aleksandr Karbyshev and Mooly Sagiv},
title = {Decidability of Inferring Inductive Invariants},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {217--231},
doi = {10.1145/2837614.2837640},
year = {2016},
}


Katz, Omer 
POPL'16: "Estimating Types in Binaries ..."
Estimating Types in Binaries using Predictive Modeling
Omer Katz, Ran El-Yaniv, and Eran Yahav (Technion, Israel)
Reverse engineering is an important tool in mitigating vulnerabilities in binaries. As a lot of software is developed in object-oriented languages, reverse engineering of object-oriented code is of critical importance. One of the major hurdles in reverse engineering binaries compiled from object-oriented code is the use of dynamic dispatch. In the absence of debug information, any dynamic dispatch may seem to jump to many possible targets, posing a significant challenge to a reverse engineer trying to track the program flow. We present a novel technique that allows us to statically determine the likely targets of virtual function calls. Our technique uses object tracelets – statically constructed sequences of operations performed on an object – to capture potential runtime behaviors of the object. Our analysis automatically pre-labels some of the object tracelets by relying on instances where the type of an object is known. The resulting type-labeled tracelets are then used to train a statistical language model (SLM) for each type. We then use the resulting ensemble of SLMs over unlabeled tracelets to generate a ranking of their most likely types, from which we deduce the likely targets of dynamic dispatches. We have implemented our technique and evaluated it over real-world C++ binaries. Our evaluation shows that when there are multiple alternative targets, our approach can drastically reduce the number of targets that have to be considered by a reverse engineer.
@InProceedings{POPL16p313,
author = {Omer Katz and Ran El-Yaniv and Eran Yahav},
title = {Estimating Types in Binaries using Predictive Modeling},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {313--326},
doi = {10.1145/2837614.2837674},
year = {2016},
}


Keller, Chantal 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant and as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Kim, Yoonseung 
POPL'16: "Lightweight Verification of ..."
Lightweight Verification of Separate Compilation
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis (Seoul National University, South Korea; MPI-SWS, Germany)
Major compiler verification efforts, such as the CompCert project, have traditionally simplified the verification problem by restricting attention to the correctness of whole-program compilation, leaving open the question of how to verify the correctness of separate compilation. Recently, a number of sophisticated techniques have been proposed for proving more flexible, compositional notions of compiler correctness, but these approaches tend to be quite heavyweight compared to the simple "closed simulations" used in verifying whole-program compilation. Applying such techniques to a compiler like CompCert, as Stewart et al. have done, involves major changes and extensions to its original verification. In this paper, we show that if we aim somewhat lower (to prove correctness of separate compilation, but only for a *single* compiler) we can drastically simplify the proof effort. Toward this end, we develop several lightweight techniques that recast the compositional verification problem in terms of whole-program compilation, thereby enabling us to largely reuse the closed-simulation proofs from existing compiler verifications. We demonstrate the effectiveness of these techniques by applying them to CompCert 2.4, converting its verification of whole-program compilation into a verification of separate compilation in less than two person-months. This conversion only required a small number of changes to the original proofs, and uncovered two compiler bugs along the way. The result is SepCompCert, the first verification of separate compilation for the full CompCert compiler.
@InProceedings{POPL16p178,
author = {Jeehoon Kang and Yoonseung Kim and Chung-Kil Hur and Derek Dreyer and Viktor Vafeiadis},
title = {Lightweight Verification of Separate Compilation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {178--190},
doi = {10.1145/2837614.2837642},
year = {2016},
}


King, Andy 
POPL'16: "From MinX to MinC: Semantics-Driven ..."
From MinX to MinC: Semantics-Driven Decompilation of Recursive Datatypes
Ed Robbins, Andy King, and Tom Schrijvers (University of Kent, UK; KU Leuven, Belgium)
Reconstructing the meaning of a program from its binary executable is known as reverse engineering; it has a wide range of applications in software security, exposing piracy, legacy systems, etc. Since reversing is ultimately a search for meaning, there is much interest in inferring a type (a meaning) for the elements of a binary in a consistent way. Unfortunately, existing approaches do not guarantee any semantic relevance for their reconstructed types. This paper presents a new and semantically-founded approach that provides strong guarantees for the reconstructed types. Key to our approach is the derivation of a witness program in a high-level language alongside the reconstructed types. This witness has the same semantics as the binary, is type correct by construction, and it induces a (justifiable) type assignment on the binary. Moreover, the approach effectively yields a type-directed decompiler. We formalise and implement the approach for reversing MinX, an abstraction of x86, to MinC, a type-safe dialect of C with recursive datatypes. Our evaluation compiles a range of textbook C algorithms to MinX and then recovers the original structures.
@InProceedings{POPL16p191,
author = {Ed Robbins and Andy King and Tom Schrijvers},
title = {From MinX to MinC: Semantics-Driven Decompilation of Recursive Datatypes},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {191--203},
doi = {10.1145/2837614.2837633},
year = {2016},
}


Klein, Jacques 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable MarketScale Android Intercomponent Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying missioncritical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer InterComponent Communication (ICC) yield many potential intercomponent and interapplication links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the intercomponent links is a prerequisite to intercomponent analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static interapplication analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Kobayashi, Naoki 
POPL'16: "Temporal Verification of Higher-Order ..."
Temporal Verification of Higher-Order Functional Programs
Akihiro Murase, Tachio Terauchi, Naoki Kobayashi, Ryosuke Sato, and Hiroshi Unno (Nagoya University, Japan; JAIST, Japan; University of Tokyo, Japan; University of Tsukuba, Japan)
We present an automated approach to verifying arbitrary omega-regular
properties of higher-order functional programs. Previous automated
methods proposed for this class of programs could only handle safety
properties or termination, and our approach is the first to be able
to verify arbitrary omega-regular liveness properties.
Our approach is automata-theoretic, and extends to fair termination
our recent work, published in ESOP 2014, on a
binary-reachability-based approach to automated termination
verification of higher-order functional programs. In that work, we have shown that checking
disjunctive well-foundedness of (the transitive closure of) the
``calling relation'' is sound and complete for termination. The
extension to fair termination is tricky, however, because the
straightforward extension that checks disjunctive well-foundedness of
the fair calling relation turns out to be unsound, as we shall show in
the paper. Roughly, our solution is to check fairness on the
transition relation instead of the calling relation, and propagate the
information to determine when it is necessary and sufficient to check
for disjunctive well-foundedness on the calling relation. We prove
that our approach is sound and complete. We have implemented
a prototype of our approach, and confirmed that it is able to
automatically verify liveness properties of some nontrivial
higher-order programs.
@InProceedings{POPL16p57,
author = {Akihiro Murase and Tachio Terauchi and Naoki Kobayashi and Ryosuke Sato and Hiroshi Unno},
title = {Temporal Verification of Higher-Order Functional Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {57--68},
doi = {10.1145/2837614.2837667},
year = {2016},
}


Kochems, Jonathan 
POPL'16: "Unboundedness and Downward ..."
Unboundedness and Downward Closures of Higher-Order Pushdown Automata
Matthew Hague, Jonathan Kochems, and C.-H. Luke Ong (University of London, UK; University of Oxford, UK)
We show the diagonal problem for higher-order pushdown automata (HOPDA), and hence the simultaneous unboundedness problem, is decidable. From recent work by Zetzsche this means that we can construct the downward closure of the set of words accepted by a given HOPDA. This also means we can construct the downward closure of the Parikh image of a HOPDA. Both of these consequences play an important role in verifying concurrent higher-order programs expressed as HOPDA or safe higher-order recursion schemes.
@InProceedings{POPL16p151,
author = {Matthew Hague and Jonathan Kochems and C.-H. Luke Ong},
title = {Unboundedness and Downward Closures of Higher-Order Pushdown Automata},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {151--163},
doi = {10.1145/2837614.2837627},
year = {2016},
}


Kohlweiss, Markulf 
POPL'16: "Dependent Types and Multi-Monadic ..."
Dependent Types and Multi-Monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with primitive effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-Monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Koskinen, Eric 
POPL'16: "Reducing Crash Recoverability ..."
Reducing Crash Recoverability to Reachability
Eric Koskinen and Junfeng Yang (Yale University, USA; Columbia University, USA)
Software applications run on a variety of platforms (filesystems, virtual slices, mobile hardware, etc.) that do not provide 100% uptime. As such, these applications may crash at any unfortunate moment losing volatile data and, when relaunched, they must be able to correctly recover from potentially inconsistent states left on persistent storage. From a verification perspective, crash recovery bugs can be particularly frustrating because, even when it has been formally proved for a program that it satisfies a property, the proof is foiled by these external events that crash and restart the program. In this paper we first provide a hierarchical formal model of what it means for a program to be crash recoverable. Our model captures the recoverability of many real world programs, including those in our evaluation which use sophisticated recovery algorithms such as shadow paging and write-ahead logging. Next, we introduce a novel technique capable of automatically proving that a program correctly recovers from a crash via a reduction to reachability. Our technique takes an input control-flow automaton and transforms it into an encoding that blends the capture of snapshots of pre-crash states into a symbolic search for a proof that recovery terminates and every recovered execution simulates some crash-free execution. Our encoding is designed to enable one to apply existing abstraction techniques in order to do the work that is necessary to prove recoverability. We have implemented our technique in a tool called Eleven82, capable of analyzing C programs to detect recoverability bugs or prove their absence. We have applied our tool to benchmark examples drawn from industrial file systems and databases, including GDBM, LevelDB, LMDB, PostgreSQL, SQLite, VMware and ZooKeeper. Within minutes, our tool is able to discover bugs or prove that these fragments are crash recoverable.
@InProceedings{POPL16p97,
author = {Eric Koskinen and Junfeng Yang},
title = {Reducing Crash Recoverability to Reachability},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {97--108},
doi = {10.1145/2837614.2837648},
year = {2016},
}


Krause, Andreas 
POPL'16: "Learning Programs from Noisy ..."
Learning Programs from Noisy Data
Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause (ETH Zurich, Switzerland)
We present a new approach for learning programs from noisy datasets. Our approach is based on two new concepts: a regularized program generator which produces a candidate program based on a small sample of the entire dataset while avoiding overfitting, and a dataset sampler which carefully samples the dataset by leveraging the candidate program's score on that dataset. The two components are connected in a continuous feedback-directed loop. We show how to apply this approach to two settings: one where the dataset has a bound on the noise, and another without a noise bound. The second setting leads to a new way of performing approximate empirical risk minimization on hypotheses classes formed by a discrete search space. We then present two new kinds of program synthesizers which target the two noise settings. First, we introduce a novel regularized bitstream synthesizer that successfully generates programs even in the presence of incorrect examples. We show that the synthesizer can detect errors in the examples while combating overfitting, a major problem in existing synthesis techniques. We also show how the approach can be used in a setting where the dataset grows dynamically via new examples (e.g., provided by a human). Second, we present a novel technique for constructing statistical code completion systems. These are systems trained on massive datasets of open source programs, also known as ``Big Code''. The key idea is to introduce a domain specific language (DSL) over trees and to learn functions in that DSL directly from the dataset. These learned functions then condition the predictions made by the system. This is a flexible and powerful technique which generalizes several existing works as we no longer need to decide a priori on what the prediction should be conditioned (another benefit is that the learned functions are a natural mechanism for explaining the prediction).
As a result, our code completion system surpasses the prediction capabilities of existing, hardwired systems.
@InProceedings{POPL16p761,
author = {Veselin Raychev and Pavol Bielik and Martin Vechev and Andreas Krause},
title = {Learning Programs from Noisy Data},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {761--774},
doi = {10.1145/2837614.2837671},
year = {2016},
}


Krishnamoorthy, Sriram 
POPL'16: "PolyCheck: Dynamic Verification ..."
PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs
Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, and P. Sadayappan (Ohio State University, USA; Pacific Northwest National Laboratory, USA; Inria, France)
High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations to restructure programs to improve data locality and expose parallelism. Guaranteeing the correctness of program transformations is essential, and to date three main approaches have been developed: proof of equivalence of affine programs, matching the execution traces of programs, and checking bit-by-bit equivalence of program outputs. Each technique suffers from limitations in the kind of transformations supported, space complexity, or the sensitivity to the testing dataset. In this paper, we take a novel approach that addresses all three limitations to provide an automatic bug checker to verify any iteration reordering transformations on affine programs, including non-affine transformations, with space consumption proportional to the original program data and robust to arbitrary datasets of a given size. We achieve this by exploiting the structure of affine program control- and data-flow to generate at compile-time lightweight checker code to be executed within the transformed program. Experimental results assess the correctness and effectiveness of our method and its increased coverage over previous approaches.
@InProceedings{POPL16p539,
author = {Wenlei Bao and Sriram Krishnamoorthy and Louis-Noël Pouchet and Fabrice Rastello and P. Sadayappan},
title = {PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {539--554},
doi = {10.1145/2837614.2837656},
year = {2016},
}


Lahav, Ori 
POPL'16: "Taming Release-Acquire Consistency ..."
Taming Release-Acquire Consistency
Ori Lahav, Nick Giannarakis, and Viktor Vafeiadis (MPI-SWS, Germany)
We introduce a strengthening of the release-acquire fragment of the C11 memory model that (i) forbids dubious behaviors that are not observed in any implementation; (ii) supports fence instructions that restore sequential consistency; and (iii) admits an equivalent intuitive operational semantics based on point-to-point communication. This strengthening has no additional implementation cost: it allows the same local optimizations as C11 release and acquire accesses, and has exactly the same compilation schemes to the x86-TSO and Power architectures. In fact, the compilation to Power is complete with respect to a recent axiomatic model of Power; that is, the compiled program exhibits exactly the same behaviors as the source one. Moreover, we provide criteria for placing enough fence instructions to ensure sequential consistency, and apply them to an efficient RCU implementation.
@InProceedings{POPL16p649,
author = {Ori Lahav and Nick Giannarakis and Viktor Vafeiadis},
title = {Taming Release-Acquire Consistency},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {649--662},
doi = {10.1145/2837614.2837643},
year = {2016},
}


Lavaee, Rahman 
POPL'16: "The Hardness of Data Packing ..."
The Hardness of Data Packing
Rahman Lavaee (University of Rochester, USA)
A program can benefit from improved cache block utilization when contemporaneously accessed data elements are placed in the same memory block. This can reduce the program's memory block working set and thereby reduce the capacity miss rate. We formally define the problem of data packing for an arbitrary number of blocks in the cache and packing factor (the number of data objects fitting in a cache block) and study how well the optimal solution can be approximated for two dual problems. On the one hand, we show that the cache hit maximization problem is approximable within a constant factor, for every fixed number of blocks in the cache. On the other hand, we show that unless P=NP, the cache miss minimization problem cannot be efficiently approximated.
@InProceedings{POPL16p232,
author = {Rahman Lavaee},
title = {The Hardness of Data Packing},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {232--242},
doi = {10.1145/2837614.2837669},
year = {2016},
}


Lerner, Sorin 
POPL'16: "Printing Floating-Point Numbers: ..."
Printing Floating-Point Numbers: A Faster, Always Correct Method
Marc Andrysco, Ranjit Jhala, and Sorin Lerner (University of California at San Diego, USA)
Floating-point numbers are an essential part of modern software, recently gaining particular prominence on the web as the exclusive numeric format of JavaScript. To use floating-point numbers, we require a way to convert binary machine representations into human readable decimal outputs. Existing conversion algorithms make tradeoffs between completeness and performance. The classic Dragon4 algorithm by Steele and White and its later refinements achieve completeness (i.e., produce correct and optimal outputs on all inputs) by using arbitrary precision integer (bignum) arithmetic, which leads to a high performance cost. On the other hand, the recent Grisu3 algorithm by Loitsch shows how to recover performance by using native integer arithmetic but sacrifices optimality for 0.5% of all inputs. We present Errol, a new complete algorithm that is guaranteed to produce correct and optimal results for all inputs while simultaneously being 2x faster than the incomplete Grisu3 and 4x faster than previous complete methods.
@InProceedings{POPL16p555,
author = {Marc Andrysco and Ranjit Jhala and Sorin Lerner},
title = {Printing Floating-Point Numbers: A Faster, Always Correct Method},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {555--567},
doi = {10.1145/2837614.2837654},
year = {2016},
}


Lesani, Mohsen 
POPL'16: "Chapar: Certified Causally ..."
Chapar: Certified Causally Consistent Distributed Key-Value Stores
Mohsen Lesani, Christian J. Bell, and Adam Chlipala (Massachusetts Institute of Technology, USA)
Today’s Internet services are often expected to stay available and render high responsiveness even in the face of site crashes and network partitions. Theoretical results state that causal consistency is one of the strongest consistency guarantees that is possible under these requirements, and many practical systems provide causally consistent key-value stores. In this paper, we present a framework called Chapar for modular verification of causal consistency for replicated key-value store implementations and their client programs. Specifically, we formulate separate correctness conditions for key-value store implementations and for their clients. The interface between the two is a novel operational semantics for causal consistency. We have verified the causal consistency of two key-value store implementations from the literature using a novel proof technique. We have also implemented a simple automatic model checker for the correctness of client programs. The two independently verified results for the implementations and clients can be composed to conclude the correctness of any of the programs when executed with any of the implementations. We have developed and checked our framework in Coq, extracted it to OCaml, and built executable stores.
@InProceedings{POPL16p357,
author = {Mohsen Lesani and Christian J. Bell and Adam Chlipala},
title = {Chapar: Certified Causally Consistent Distributed Key-Value Stores},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {357--370},
doi = {10.1145/2837614.2837622},
year = {2016},
}


Le Traon, Yves 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Li, Li 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Liang, Hongjin 
POPL'16: "A Program Logic for Concurrent ..."
A Program Logic for Concurrent Objects under Fair Scheduling
Hongjin Liang and Xinyu Feng (University of Science and Technology of China, China)
Existing work on verifying concurrent objects is mostly concerned with safety only, e.g., partial correctness or linearizability. Although there has been recent work verifying lock-freedom of non-blocking objects, much less effort is focused on deadlock-freedom and starvation-freedom, progress properties of blocking objects. These properties are more challenging to verify than lock-freedom because they allow the progress of one thread to depend on the progress of another, assuming fair scheduling.
We propose LiLi, a new rely-guarantee style program logic for verifying linearizability and progress together for concurrent objects under fair scheduling. The rely-guarantee style logic unifies thread-modular reasoning about both starvation-freedom and deadlock-freedom in one framework. It also establishes progress-aware abstraction for concurrent objects, which can be applied when verifying safety and liveness of client code. We have successfully applied the logic to verify starvation-freedom or deadlock-freedom of representative algorithms such as ticket locks, queue locks, lock-coupling lists, optimistic lists and lazy lists.
@InProceedings{POPL16p385,
author = {Hongjin Liang and Xinyu Feng},
title = {A Program Logic for Concurrent Objects under Fair Scheduling},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {385--399},
doi = {10.1145/2837614.2837635},
year = {2016},
}


Lin, Anthony W. 
POPL'16: "String Solving with Word Equations ..."
String Solving with Word Equations and Transducers: Towards a Logic for Analysing Mutation XSS
Anthony W. Lin and Pablo Barceló (Yale-NUS College, Singapore; University of Chile, Chile)
We study the fundamental issue of decidability of satisfiability over string logics with concatenations and finite-state transducers as atomic operations. Although restricting to one type of operations yields decidability, little is known about the decidability of their combined theory, which is especially relevant when analysing security vulnerabilities of dynamic web pages in a more realistic browser model. On the one hand, word equations (string logic with concatenations) cannot precisely capture sanitisation functions (e.g. htmlescape) and implicit browser transductions (e.g. innerHTML mutations). On the other hand, transducers suffer from the reverse problem of being able to model sanitisation functions and browser transductions, but not string concatenations. Naively combining word equations and transducers easily leads to an undecidable logic. Our main contribution is to show that the "straight-line fragment" of the logic is decidable (complexity ranges from PSPACE to EXPSPACE). The fragment can express the program logics of straight-line string-manipulating programs with concatenations and transductions as atomic operations, which arise when performing bounded model checking or dynamic symbolic executions. We demonstrate that the logic can naturally express constraints required for analysing mutation XSS in web applications. Finally, the logic remains decidable in the presence of length, letter-counting, regular, indexOf, and disequality constraints.
@InProceedings{POPL16p123,
author = {Anthony W. Lin and Pablo Barceló},
title = {String Solving with Word Equations and Transducers: Towards a Logic for Analysing Mutation XSS},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {123--136},
doi = {10.1145/2837614.2837641},
year = {2016},
}


Long, Fan 
POPL'16: "Automatic Patch Generation ..."
Automatic Patch Generation by Learning Correct Code
Fan Long and Martin Rinard (Massachusetts Institute of Technology, USA)
We present Prophet, a novel patch generation system that works with a set of successful human patches obtained from open source software repositories to learn a probabilistic, application-independent model of correct code. It generates a space of candidate patches, uses the model to rank the candidate patches in order of likely correctness, and validates the ranked patches against a suite of test cases to find correct patches. Experimental results show that, on a benchmark set of 69 real-world defects drawn from eight open-source projects, Prophet significantly outperforms the previous state-of-the-art patch generation system.
@InProceedings{POPL16p298,
author = {Fan Long and Martin Rinard},
title = {Automatic Patch Generation by Learning Correct Code},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {298--312},
doi = {10.1145/2837614.2837617},
year = {2016},
}


Lopes, Nuno P. 
POPL'16: "Scaling Network Verification ..."
Scaling Network Verification using Symmetry and Surgery
Gordon D. Plotkin, Nikolaj Bjørner, Nuno P. Lopes, Andrey Rybalchenko, and George Varghese (University of Edinburgh, UK; Microsoft Research, USA; Microsoft Research, UK)
On the surface, large data centers with about 100,000 stations and nearly a million routing rules are complex and hard to verify. However, these networks are highly regular by design; for example they employ fat tree topologies with backup routers interconnected by redundant patterns. To exploit these regularities, we introduce network transformations: given a reachability formula and a network, we transform the network into a simpler to verify network and a corresponding transformed formula, such that the original formula is valid in the network if and only if the transformed formula is valid in the transformed network. Our network transformations exploit network surgery (in which irrelevant or redundant sets of nodes, headers, ports, or rules are ``sliced'' away) and network symmetry (say between backup routers). The validity of these transformations is established using a formal theory of networks. In particular, using Van Benthem-Hennessy-Milner style bisimulation, we show that one can generally associate bisimulations to transformations connecting networks and formulas with their transforms. Our work is a development in an area of current wide interest: applying programming language techniques (in our case bisimulation and modal logic) to problems in switching networks. We provide experimental evidence that our network transformations can speed up by 65x the task of verifying the communication between all pairs of Virtual Machines in a large datacenter network with about 100,000 VMs. An all-pair reachability calculation, which formerly took 5.5 days, can be done in 2 hours, and can be easily parallelized to complete in
@InProceedings{POPL16p69,
author = {Gordon D. Plotkin and Nikolaj Bjørner and Nuno P. Lopes and Andrey Rybalchenko and George Varghese},
title = {Scaling Network Verification using Symmetry and Surgery},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {69--83},
doi = {10.1145/2837614.2837657},
year = {2016},
}


Lorenzen, Florian 
POPL'16: "Sound Type-Dependent Syntactic ..."
Sound Type-Dependent Syntactic Language Extension
Florian Lorenzen and Sebastian Erdweg (TU Berlin, Germany; TU Darmstadt, Germany)
Syntactic language extensions can introduce new facilities into a programming language while requiring little implementation effort and modest changes to the compiler. It is typical to desugar language extensions in a distinguished compiler phase after parsing or type checking, not affecting any of the later compiler phases. If desugaring happens before type checking, the desugaring cannot depend on typing information and type errors are reported in terms of the generated code. If desugaring happens after type checking, the code generated by the desugaring is not type checked and may introduce vulnerabilities. Both options are undesirable. We propose a system for syntactic extensibility where desugaring happens after type checking and desugarings are guaranteed to only generate well-typed code. A major novelty of our work is that desugarings operate on typing derivations instead of plain syntax trees. This provides desugarings access to typing information and forms the basis for the soundness guarantee we provide, namely that a desugaring generates a valid typing derivation. We have implemented our system for syntactic extensibility in a language-independent fashion and instantiated it for a substantial subset of Java, including generics and inheritance. We provide a sound Java extension for Scala-like for-comprehensions.
@InProceedings{POPL16p204,
author = {Florian Lorenzen and Sebastian Erdweg},
title = {Sound Type-Dependent Syntactic Language Extension},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {204--216},
doi = {10.1145/2837614.2837644},
year = {2016},
}


Lyde, Steven 
POPL'16: "Pushdown Control-Flow Analysis ..."
Pushdown Control-Flow Analysis for Free
Thomas Gilray, Steven Lyde, Michael D. Adams, Matthew Might, and David Van Horn (University of Utah, USA; University of Maryland, USA)
Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been published that provide perfect call-stack precision in a computable manner: CFA2, PDCFA, and AAC. Unfortunately, implementing CFA2 and PDCFA requires significant engineering effort. Furthermore, all three are computationally expensive. For a monovariant analysis, CFA2 is in O(2^n), PDCFA is in O(n^6), and AAC is in O(n^8).
In this paper, we describe a new technique that builds on these but is both straightforward to implement and computationally inexpensive. The crucial insight is an unusual state-dependent allocation strategy for the addresses of continuations. Our technique imposes only a constant-factor overhead on the underlying analysis and costs only O(n^3) in the monovariant case. We present the intuitions behind this development, benchmarks demonstrating its efficacy, and a proof of the precision of this analysis.
@InProceedings{POPL16p691,
author = {Thomas Gilray and Steven Lyde and Michael D. Adams and Matthew Might and David Van Horn},
title = {Pushdown Control-Flow Analysis for Free},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {691--704},
doi = {10.1145/2837614.2837631},
year = {2016},
}


Madhusudan, P. 
POPL'16: "Learning Invariants using ..."
Learning Invariants using Decision Trees and Implication Counterexamples
Pranav Garg, Daniel Neider, P. Madhusudan, and Dan Roth (University of Illinois at Urbana-Champaign, USA)
Inductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first learning algorithms in this model with implication counterexamples that are based on machine learning techniques. In particular, we extend classical decision-tree learning algorithms in machine learning to handle implication samples, building new scalable ways to construct small decision trees using statistical measures. We also develop a decision-tree learning algorithm in this model that is guaranteed to converge to the right concept (invariant) if one exists. We implement the learners and an appropriate teacher, and show that the resulting invariant synthesis is efficient and convergent for a large suite of programs.
@InProceedings{POPL16p499,
author = {Pranav Garg and Daniel Neider and P. Madhusudan and Dan Roth},
title = {Learning Invariants using Decision Trees and Implication Counterexamples},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {499--512},
doi = {10.1145/2837614.2837664},
year = {2016},
}
Artifacts Available


Mangal, Ravi 
POPL'16: "Query-Guided Maximum Satisfiability ..."
Query-Guided Maximum Satisfiability
Xin Zhang, Ravi Mangal, Aditya V. Nori, and Mayur Naik (Georgia Institute of Technology, USA; Microsoft Research, UK)
We propose a new optimization problem "Q-MaxSAT", an extension of the well-known Maximum Satisfiability or MaxSAT problem. In contrast to MaxSAT, which aims to find an assignment to all variables in the formula, Q-MaxSAT computes an assignment to a desired subset of variables (or queries) in the formula. Indeed, many problems in diverse domains such as program reasoning, information retrieval, and mathematical optimization can be naturally encoded as Q-MaxSAT instances. We describe an iterative algorithm for solving Q-MaxSAT. In each iteration, the algorithm solves a subproblem that is relevant to the queries, and applies a novel technique to check whether the partial assignment found is a solution to the Q-MaxSAT problem. If the check fails, the algorithm grows the subproblem with a new set of clauses identified as relevant to the queries. Our empirical evaluation shows that our Q-MaxSAT solver Pilot achieves significant improvements in runtime and memory consumption over conventional MaxSAT solvers on several Q-MaxSAT instances generated from real-world problems in program analysis and information retrieval.
@InProceedings{POPL16p109,
author = {Xin Zhang and Ravi Mangal and Aditya V. Nori and Mayur Naik},
title = {Query-Guided Maximum Satisfiability},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {109--122},
doi = {10.1145/2837614.2837658},
year = {2016},
}


Maranget, Luc 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t.~the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


McDaniel, Patrick 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


McKinley, Kathryn S. 
POPL'16KEY: "Programming the World of Uncertain ..."
Programming the World of Uncertain Things (Keynote)
Kathryn S. McKinley (Microsoft Research, USA)
Computing has entered the era of uncertain data, in which hardware and software generate and reason about estimates. Applications use estimates from sensors, machine learning, big data, humans, and approximate hardware and software. Unfortunately, developers face pervasive correctness, programmability, and optimization problems due to estimates. Most programming languages unfortunately make these problems worse. We propose a new programming abstraction called Uncertain<T> embedded into languages, such as C#, C++, Java, Python, and JavaScript. Applications that consume estimates use familiar discrete operations for their estimates; overloaded conditional operators specify hypothesis tests and applications use them to control false positives and negatives; and new compositional operators express domain knowledge. By carefully restricting the expressiveness, the runtime automatically implements correct statistical reasoning at conditionals, relieving developers of the need to implement or deeply understand statistics. We demonstrate substantial programmability, correctness, and efficiency benefits of this programming model for GPS sensor navigation, approximate computing, machine learning, and Xbox.
@InProceedings{POPL16p1,
author = {Kathryn S. McKinley},
title = {Programming the World of Uncertain Things (Keynote)},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {1--2},
doi = {10.1145/2837614.2843895},
year = {2016},
}


Might, Matthew 
POPL'16: "Pushdown Control-Flow Analysis ..."
Pushdown Control-Flow Analysis for Free
Thomas Gilray, Steven Lyde, Michael D. Adams, Matthew Might, and David Van Horn (University of Utah, USA; University of Maryland, USA)
Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been published that provide perfect call-stack precision in a computable manner: CFA2, PDCFA, and AAC. Unfortunately, implementing CFA2 and PDCFA requires significant engineering effort. Furthermore, all three are computationally expensive. For a monovariant analysis, CFA2 is in O(2^n), PDCFA is in O(n^6), and AAC is in O(n^8).
In this paper, we describe a new technique that builds on these but is both straightforward to implement and computationally inexpensive. The crucial insight is an unusual state-dependent allocation strategy for the addresses of continuations. Our technique imposes only a constant-factor overhead on the underlying analysis and costs only O(n^3) in the monovariant case. We present the intuitions behind this development, benchmarks demonstrating its efficacy, and a proof of the precision of this analysis.
@InProceedings{POPL16p691,
author = {Thomas Gilray and Steven Lyde and Michael D. Adams and Matthew Might and David Van Horn},
title = {Pushdown Control-Flow Analysis for Free},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {691--704},
doi = {10.1145/2837614.2837631},
year = {2016},
}


Moser, Georg 
POPL'16: "The Complexity of Interaction ..."
The Complexity of Interaction
Stéphane Gimenez and Georg Moser (University of Innsbruck, Austria)
In this paper, we analyze the complexity of functional programs written in the interaction-net computation model, an asynchronous, parallel and confluent model that generalizes linear-logic proof nets. Employing user-defined sized and scheduled types, we certify concrete time, space and space-time complexity bounds for both sequential and parallel reductions of interaction-net programs by suitably assigning complexity potentials to typed nodes. The relevance of this approach is illustrated on archetypal programming examples. The provided analysis is precise, compositional and is, in theory, not restricted to particular complexity classes.
@InProceedings{POPL16p243,
author = {Stéphane Gimenez and Georg Moser},
title = {The Complexity of Interaction},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {243--255},
doi = {10.1145/2837614.2837646},
year = {2016},
}


Munch-Maccagnoni, Guillaume
POPL'16: "A Theory of Effects and Resources: ..."
A Theory of Effects and Resources: Adjunction Models and Polarised Calculi
Pierre-Louis Curien, Marcelo Fiore, and Guillaume Munch-Maccagnoni (University of Paris Diderot, France; Inria, France; University of Cambridge, UK)
We consider the Curry-Howard-Lambek correspondence for effectful computation and resource management, specifically proposing polarised calculi together with presheaf-enriched adjunction models as the starting point for a comprehensive semantic theory relating logical systems, typed calculi, and categorical models in this context. Our thesis is that the combination of effects and resources should be considered orthogonally. Model theoretically, this leads to an understanding of our categorical models from two complementary perspectives: (i) as a linearisation of CBPV (Call-by-Push-Value) adjunction models, and (ii) as an extension of linear/nonlinear adjunction models with an adjoint resolution of computational effects. When the linear structure is cartesian and the resource structure is trivial we recover Levy’s notion of CBPV adjunction model, while when the effect structure is trivial we have Benton’s linear/nonlinear adjunction models. Further instances of our model theory include the dialogue categories with a resource modality of Melliès and Tabareau, and the [E]EC ([Enriched] Effect Calculus) models of Egger, Møgelberg and Simpson. Our development substantiates the approach by providing a lifting theorem of linear models into cartesian ones. To each of our categorical models we systematically associate a typed term calculus, each of which corresponds to a variant of the sequent calculi LJ (Intuitionistic Logic) or ILL (Intuitionistic Linear Logic). The adjoint resolution of effects corresponds to polarisation whereby, syntactically, types locally determine a strict or lazy evaluation order and, semantically, the associativity of cuts is relaxed. In particular, our results show that polarisation provides a computational interpretation of CBPV in direct style. Further, we characterise depolarised models: those where the cut is associative, and where the evaluation order is unimportant.
We explain possible advantages of this style of calculi for the operational semantics of effects.
@InProceedings{POPL16p44,
author = {Pierre-Louis Curien and Marcelo Fiore and Guillaume Munch-Maccagnoni},
title = {A Theory of Effects and Resources: Adjunction Models and Polarised Calculi},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {44--56},
doi = {10.1145/2837614.2837652},
year = {2016},
}


Murase, Akihiro 
POPL'16: "Temporal Verification of Higher-Order ..."
Temporal Verification of Higher-Order Functional Programs
Akihiro Murase, Tachio Terauchi, Naoki Kobayashi, Ryosuke Sato, and Hiroshi Unno (Nagoya University, Japan; JAIST, Japan; University of Tokyo, Japan; University of Tsukuba, Japan)
We present an automated approach to verifying arbitrary omega-regular
properties of higher-order functional programs. Previous automated
methods proposed for this class of programs could only handle safety
properties or termination, and our approach is the first to be able
to verify arbitrary omega-regular liveness properties.
Our approach is automata-theoretic, and extends our recent work on
binary-reachability-based approach to automated termination
verification of higher-order functional programs to fair termination
published in ESOP 2014. In that work, we have shown that checking
disjunctive well-foundedness of (the transitive closure of) the
``calling relation'' is sound and complete for termination. The
extension to fair termination is tricky, however, because the
straightforward extension that checks disjunctive well-foundedness of
the fair calling relation turns out to be unsound, as we shall show in
the paper. Roughly, our solution is to check fairness on the
transition relation instead of the calling relation, and propagate the
information to determine when it is necessary and sufficient to check
for disjunctive well-foundedness on the calling relation. We prove
that our approach is sound and complete. We have implemented
a prototype of our approach, and confirmed that it is able to
automatically verify liveness properties of some nontrivial
higher-order programs.
@InProceedings{POPL16p57,
author = {Akihiro Murase and Tachio Terauchi and Naoki Kobayashi and Ryosuke Sato and Hiroshi Unno},
title = {Temporal Verification of Higher-Order Functional Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {57--68},
doi = {10.1145/2837614.2837667},
year = {2016},
}


Muroya, Koko 
POPL'16: "Memoryful Geometry of Interaction ..."
Memoryful Geometry of Interaction II: Recursion and Adequacy
Koko Muroya, Naohiko Hoshino, and Ichiro Hasuo (University of Tokyo, Japan; Kyoto University, Japan)
A general framework of Memoryful Geometry of Interaction (mGoI) was recently introduced by the authors. It provides a sound translation of lambda-terms (on the high-level) to their realizations by stream transducers (on the low-level), where the internal states of the latter (called memories) are exploited for accommodating algebraic effects of Plotkin and Power. The translation is compositional, hence ``denotational,'' where transducers are inductively composed using an adaptation of Barbosa's coalgebraic component calculus. In the current paper we extend the mGoI framework and provide a systematic treatment of recursion, an essential feature of programming languages that was, however, missing in our previous work. Specifically, we introduce two new fixed-point operators in the coalgebraic component calculus. The two follow the previous work on recursion in GoI and are called Girard style and Mackie style: the former obviously exhibits some nice domain-theoretic properties, while the latter allows simpler construction. Their equivalence is established on the categorical (or, traced monoidal) level of abstraction, and is therefore generic with respect to the choice of algebraic effects. Our main result is an adequacy theorem of our mGoI translation, against Plotkin and Power's operational semantics for algebraic effects.
@InProceedings{POPL16p748,
author = {Koko Muroya and Naohiko Hoshino and Ichiro Hasuo},
title = {Memoryful Geometry of Interaction II: Recursion and Adequacy},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {748--760},
doi = {10.1145/2837614.2837672},
year = {2016},
}


Murray, Richard M. 
POPL'16KEY: "Synthesis of Reactive Controllers ..."
Synthesis of Reactive Controllers for Hybrid Systems (Keynote)
Richard M. Murray (California Institute of Technology, USA)
Decision-making logic in hybrid systems is responsible for selecting
modes of operation for the underlying (continuous) control system,
reacting to external events and failures in the system, and ensuring
that the overall control system is satisfying safety and performance
specifications. Tools from computer science, such as model-checking
and logic synthesis, combined with design patterns from feedback
control theory provide new approaches to solving these problems. A
major shift is the move from ``design then verify'' to ``specify then
synthesize'' approaches to controller design that allow simultaneous
synthesis of high-performance, robust control laws and
correct-by-construction decision-making logic.
@InProceedings{POPL16p3,
author = {Richard M. Murray},
title = {Synthesis of Reactive Controllers for Hybrid Systems (Keynote)},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {3--3},
doi = {10.1145/2837614.2843894},
year = {2016},
}


Naik, Mayur 
POPL'16: "Query-Guided Maximum Satisfiability ..."
Query-Guided Maximum Satisfiability
Xin Zhang, Ravi Mangal, Aditya V. Nori, and Mayur Naik (Georgia Institute of Technology, USA; Microsoft Research, UK)
We propose a new optimization problem "Q-MaxSAT", an extension of the well-known Maximum Satisfiability or MaxSAT problem. In contrast to MaxSAT, which aims to find an assignment to all variables in the formula, Q-MaxSAT computes an assignment to a desired subset of variables (or queries) in the formula. Indeed, many problems in diverse domains such as program reasoning, information retrieval, and mathematical optimization can be naturally encoded as Q-MaxSAT instances. We describe an iterative algorithm for solving Q-MaxSAT. In each iteration, the algorithm solves a subproblem that is relevant to the queries, and applies a novel technique to check whether the partial assignment found is a solution to the Q-MaxSAT problem. If the check fails, the algorithm grows the subproblem with a new set of clauses identified as relevant to the queries. Our empirical evaluation shows that our Q-MaxSAT solver Pilot achieves significant improvements in runtime and memory consumption over conventional MaxSAT solvers on several Q-MaxSAT instances generated from real-world problems in program analysis and information retrieval.
@InProceedings{POPL16p109,
author = {Xin Zhang and Ravi Mangal and Aditya V. Nori and Mayur Naik},
title = {Query-Guided Maximum Satisfiability},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {109--122},
doi = {10.1145/2837614.2837658},
year = {2016},
}


Najafzadeh, Mahsa 
POPL'16: "'Cause I'm Strong ..."
'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems
Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro (IMDEA Software Institute, Spain; University of Oxford, UK; Universidade Nova Lisboa, Portugal; Sorbonne, France; Inria, France; UPMC, France)
Large-scale distributed systems often rely on replicated databases that allow a
programmer to request different data consistency guarantees for different
operations, and thereby control their performance. Using such databases is far
from trivial: requesting stronger consistency in too many places may hurt
performance, and requesting it in too few places may violate correctness. To
help programmers in this task, we propose the first proof rule for establishing
that a particular choice of consistency guarantees for various operations on a
replicated database is enough to ensure the preservation of a given data
integrity invariant. Our rule is modular: it allows reasoning about the
behaviour of every operation separately under some assumption on the behaviour
of other operations. This leads to simple reasoning, which we have automated in
an SMT-based tool. We present a nontrivial proof of soundness of our rule and
illustrate its use on several examples.
@InProceedings{POPL16p371,
author = {Alexey Gotsman and Hongseok Yang and Carla Ferreira and Mahsa Najafzadeh and Marc Shapiro},
title = {'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {371--384},
doi = {10.1145/2837614.2837625},
year = {2016},
}


Neider, Daniel 
POPL'16: "Learning Invariants using ..."
Learning Invariants using Decision Trees and Implication Counterexamples
Pranav Garg, Daniel Neider, P. Madhusudan, and Dan Roth (University of Illinois at Urbana-Champaign, USA)
Inductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first learning algorithms in this model with implication counterexamples that are based on machine learning techniques. In particular, we extend classical decision-tree learning algorithms in machine learning to handle implication samples, building new scalable ways to construct small decision trees using statistical measures. We also develop a decision-tree learning algorithm in this model that is guaranteed to converge to the right concept (invariant) if one exists. We implement the learners and an appropriate teacher, and show that the resulting invariant synthesis is efficient and convergent for a large suite of programs.
@InProceedings{POPL16p499,
author = {Pranav Garg and Daniel Neider and P. Madhusudan and Dan Roth},
title = {Learning Invariants using Decision Trees and Implication Counterexamples},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {499--512},
doi = {10.1145/2837614.2837664},
year = {2016},
}
Artifacts Available


New, Max S. 
POPL'16: "Is Sound Gradual Typing Dead? ..."
Is Sound Gradual Typing Dead?
Asumu Takikawa, Daniel Feltey, Ben Greenman, Max S. New, Jan Vitek, and Matthias Felleisen (Northeastern University, USA)
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert runtime checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear from pathological cases.
@InProceedings{POPL16p456,
author = {Asumu Takikawa and Daniel Feltey and Ben Greenman and Max S. New and Jan Vitek and Matthias Felleisen},
title = {Is Sound Gradual Typing Dead?},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {456--468},
doi = {10.1145/2837614.2837630},
year = {2016},
}
Artifacts Available


Nori, Aditya V. 
POPL'16: "Query-Guided Maximum Satisfiability ..."
Query-Guided Maximum Satisfiability
Xin Zhang, Ravi Mangal, Aditya V. Nori, and Mayur Naik (Georgia Institute of Technology, USA; Microsoft Research, UK)
We propose a new optimization problem "Q-MaxSAT", an extension of the well-known Maximum Satisfiability or MaxSAT problem. In contrast to MaxSAT, which aims to find an assignment to all variables in the formula, Q-MaxSAT computes an assignment to a desired subset of variables (or queries) in the formula. Indeed, many problems in diverse domains such as program reasoning, information retrieval, and mathematical optimization can be naturally encoded as Q-MaxSAT instances. We describe an iterative algorithm for solving Q-MaxSAT. In each iteration, the algorithm solves a subproblem that is relevant to the queries, and applies a novel technique to check whether the partial assignment found is a solution to the Q-MaxSAT problem. If the check fails, the algorithm grows the subproblem with a new set of clauses identified as relevant to the queries. Our empirical evaluation shows that our Q-MaxSAT solver Pilot achieves significant improvements in runtime and memory consumption over conventional MaxSAT solvers on several Q-MaxSAT instances generated from real-world problems in program analysis and information retrieval.
@InProceedings{POPL16p109,
author = {Xin Zhang and Ravi Mangal and Aditya V. Nori and Mayur Naik},
title = {Query-Guided Maximum Satisfiability},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {109--122},
doi = {10.1145/2837614.2837658},
year = {2016},
}


Novotný, Petr 
POPL'16: "Algorithmic Analysis of Qualitative ..."
Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs
Krishnendu Chatterjee, Hongfei Fu, Petr Novotný, and Rouzbeh Hasheminezhad (IST Austria, Austria; Institute of Software at Chinese Academy of Sciences, China; Sharif University of Technology, Iran)
In this paper, we consider termination of probabilistic programs with real-valued variables. The questions concerned are: 1. qualitative ones that ask (i) whether the program terminates with probability 1 (almost-sure termination) and (ii) whether the expected termination time is finite (finite termination); 2. quantitative ones that ask (i) to approximate the expected termination time (expectation problem) and (ii) to compute a bound B such that the probability to terminate after B steps decreases exponentially (concentration problem). To solve these questions, we utilize the notion of ranking supermartingales which is a powerful approach for proving termination of probabilistic programs. In detail, we focus on algorithmic synthesis of linear ranking-supermartingales over affine probabilistic programs (APP's) with both angelic and demonic non-determinism. An important subclass of APP's is LRAPP which is defined as the class of all APP's over which a linear ranking-supermartingale exists. Our main contributions are as follows. Firstly, we show that the membership problem of LRAPP (i) can be decided in polynomial time for APP's with at most demonic non-determinism, and (ii) is NP-hard and in PSPACE for APP's with angelic non-determinism; moreover, the NP-hardness result holds already for APP's without probability and demonic non-determinism. Secondly, we show that the concentration problem over LRAPP can be solved in the same complexity as for the membership problem of LRAPP. Finally, we show that the expectation problem over LRAPP can be solved in 2EXPTIME and is PSPACE-hard even for APP's without probability and non-determinism (i.e., deterministic programs). Our experimental results demonstrate the effectiveness of our approach to answer the qualitative and quantitative questions over APP's with at most demonic non-determinism.
@InProceedings{POPL16p327,
author = {Krishnendu Chatterjee and Hongfei Fu and Petr Novotný and Rouzbeh Hasheminezhad},
title = {Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {327--342},
doi = {10.1145/2837614.2837639},
year = {2016},
}


Octeau, Damien 
POPL'16: "Combining Static Analysis ..."
Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre Bartel, Li Li, Jacques Klein, and Yves Le Traon (University of Wisconsin, USA; Pennsylvania State University, USA; IMDEA Software Institute, Spain; TU Darmstadt, Germany; University of Luxembourg, Luxembourg)
Static analysis has been successfully used in many areas, from verifying mission-critical software to malware detection. Unfortunately, static analysis often produces false positives, which require significant manual effort to resolve. In this paper, we show how to overlay a probabilistic model, trained using domain knowledge, on top of static analysis results, in order to triage static analysis results. We apply this idea to analyzing mobile applications. Android application components can communicate with each other, both within single applications and between different applications. Unfortunately, techniques to statically infer Inter-Component Communication (ICC) yield many potential inter-component and inter-application links, most of which are false positives. At large scales, scrutinizing all potential links is simply not feasible. We therefore overlay a probabilistic model of ICC on top of static analysis results. Since computing the inter-component links is a prerequisite to inter-component analysis, we introduce a formalism for inferring ICC links based on set constraints. We design an efficient algorithm for performing link resolution. We compute all potential links in a corpus of 11,267 applications in 30 minutes and triage them using our probabilistic approach. We find that over 95.1% of all 636 million potential links are associated with probability values below 0.01 and are thus likely unfeasible links. Thus, it is possible to consider only a small subset of all links without significant loss of information. This work is the first significant step in making static inter-application analysis more tractable, even at large scales.
@InProceedings{POPL16p469,
author = {Damien Octeau and Somesh Jha and Matthew Dering and Patrick McDaniel and Alexandre Bartel and Li Li and Jacques Klein and Yves Le Traon},
title = {Combining Static Analysis with Probabilistic Models to Enable Market-Scale Android Inter-component Analysis},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/2837614.2837661},
year = {2016},
}


Ong, C.-H. Luke 
POPL'16: "Unboundedness and Downward ..."
Unboundedness and Downward Closures of Higher-Order Pushdown Automata
Matthew Hague, Jonathan Kochems, and C.-H. Luke Ong (University of London, UK; University of Oxford, UK)
We show the diagonal problem for higher-order pushdown automata (HOPDA), and hence the simultaneous unboundedness problem, is decidable. From recent work by Zetzsche this means that we can construct the downward closure of the set of words accepted by a given HOPDA. This also means we can construct the downward closure of the Parikh image of a HOPDA. Both of these consequences play an important role in verifying concurrent higher-order programs expressed as HOPDA or safe higher-order recursion schemes.
@InProceedings{POPL16p151,
author = {Matthew Hague and Jonathan Kochems and C.-H. Luke Ong},
title = {Unboundedness and Downward Closures of Higher-Order Pushdown Automata},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {151--163},
doi = {10.1145/2837614.2837627},
year = {2016},
}


Orchard, Dominic 
POPL'16: "Effects as Sessions, Sessions ..."
Effects as Sessions, Sessions as Effects
Dominic Orchard and Nobuko Yoshida (Imperial College London, UK)
Effect and session type systems are two expressive behavioural type systems. The former is usually developed in the context of the lambda-calculus and its variants, the latter for the pi-calculus. In this paper we explore their relative expressive power. Firstly, we give an embedding from PCF, augmented with a parameterised effect system, into a session-typed pi-calculus (session calculus), showing that session types are powerful enough to express effects. Secondly, we give a reverse embedding, from the session calculus back into PCF, by instantiating PCF with concurrency primitives and its effect system with a session-like effect algebra; effect systems are powerful enough to express sessions. The embedding of session types into an effect system is leveraged to give a new implementation of session types in Haskell, via an effect system encoding. The correctness of this implementation follows from the second embedding result. We also discuss various extensions to our embeddings.
@InProceedings{POPL16p568,
author = {Dominic Orchard and Nobuko Yoshida},
title = {Effects as Sessions, Sessions as Effects},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {568--581},
doi = {10.1145/2837614.2837634},
year = {2016},
}


Osera, Peter-Michael 
POPL'16: "Example-Directed Synthesis: ..."
Example-Directed Synthesis: A Type-Theoretic Interpretation
Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic (Princeton University, USA; Grinnell College, USA; University of Pennsylvania, USA)
Input-output examples have emerged as a practical and user-friendly specification mechanism for program synthesis in many environments. While example-driven tools have demonstrated tangible impact that has inspired adoption in industry, their underlying semantics are less well-understood: what are "examples" and how do they relate to other kinds of specifications? This paper demonstrates that examples can, in general, be interpreted as refinement types. Seen in this light, program synthesis is the task of finding an inhabitant of such a type. This insight provides an immediate semantic interpretation for examples. Moreover, it enables us to exploit decades of research in type theory as well as its correspondence with intuitionistic logic rather than designing ad hoc theoretical frameworks for synthesis from scratch. We put this observation into practice by formalizing synthesis as proof search in a sequent calculus with intersection and union refinements that we prove to be sound with respect to a conventional type system. In addition, we show how to handle negative examples, which arise from user feedback or counterexample-guided loops. This theory serves as the basis for a prototype implementation that extends our core language to support ML-style algebraic data types and structurally inductive functions. Users can also specify synthesis goals using polymorphic refinements and import monomorphic libraries. The prototype serves as a vehicle for empirically evaluating a number of different strategies for resolving the nondeterminism of the sequent calculus (bottom-up theorem-proving, term enumeration with refinement type checking, and combinations of both), the results of which classify, explain, and validate the design choices of existing synthesis systems. It also provides a platform for measuring the practical value of a specification language that combines "examples" with the more general expressiveness of refinements.
@InProceedings{POPL16p802,
author = {Jonathan Frankle and Peter-Michael Osera and David Walker and Steve Zdancewic},
title = {Example-Directed Synthesis: A Type-Theoretic Interpretation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {802--815},
doi = {10.1145/2837614.2837629},
year = {2016},
}


Ostermann, Klaus 
POPL'16: "System F-omega with Equirecursive ..."
System F-omega with Equirecursive Types for Datatype-Generic Programming
Yufei Cai, Paolo G. Giarrusso, and Klaus Ostermann (University of Tübingen, Germany)
Traversing an algebraic datatype by hand requires boilerplate code which duplicates the structure of the datatype. Datatype-generic programming (DGP) aims to eliminate such boilerplate code by decomposing algebraic datatypes into type constructor applications from which generic traversals can be synthesized. However, different traversals require different decompositions, which yield isomorphic but unequal types. This hinders the interoperability of different DGP techniques. In this paper, we propose Fωμ, an extension of the higher-order polymorphic lambda calculus Fω with records, variants, and equirecursive types. We prove the soundness of the type system, and show that type checking for first-order recursive types is decidable with a practical type checking algorithm. In our soundness proof we define type equality by interpreting types as infinitary λ-terms (in particular, Berarducci-trees). To decide type equality we β-normalize types, and then use an extension of equivalence checking for usual equirecursive types. Thanks to equirecursive types, new decompositions for a datatype can be added modularly and still interoperate with each other, allowing multiple DGP techniques to work together. We sketch how generic traversals can be synthesized, and apply these components to some examples. Since the set of datatype decompositions becomes extensible, System Fωμ enables using DGP techniques incrementally, instead of planning for them upfront or doing invasive refactoring.
@InProceedings{POPL16p30,
author = {Yufei Cai and Paolo G. Giarrusso and Klaus Ostermann},
title = {System F-omega with Equirecursive Types for Datatype-Generic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {30--43},
doi = {10.1145/2837614.2837660},
year = {2016},
}


Ouyang, Long 
POPL'16: "Fabular: Regression Formulas ..."
Fabular: Regression Formulas as Probabilistic Programming
Johannes Borgström, Andrew D. Gordon, Long Ouyang, Claudio Russo, Adam Ścibior, and Marcin Szymczak (Uppsala University, Sweden; Microsoft Research, UK; University of Edinburgh, UK; Stanford University, USA; University of Cambridge, UK; MPI Tübingen, Germany)
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
@InProceedings{POPL16p271,
author = {Johannes Borgström and Andrew D. Gordon and Long Ouyang and Claudio Russo and Adam Ścibior and Marcin Szymczak},
title = {Fabular: Regression Formulas as Probabilistic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {271--283},
doi = {10.1145/2837614.2837653},
year = {2016},
}


Padon, Oded 
POPL'16: "Decidability of Inferring ..."
Decidability of Inferring Inductive Invariants
Oded Padon, Neil Immerman, Sharon Shoham, Aleksandr Karbyshev, and Mooly Sagiv (Tel Aviv University, Israel; University of Massachusetts at Amherst, USA; Academic College of Tel Aviv Yaffo, Israel)
Induction is a successful approach for verification of hardware and software systems. A common practice is to model a system using logical formulas, and then use a decision procedure to verify that some logical formula is an inductive safety invariant for the system. A key ingredient in this approach is coming up with the inductive invariant, which is known as invariant inference. This is a major difficulty, and it is often left for humans or addressed by sound but incomplete abstract interpretation. This paper is motivated by the problem of inductive invariants in shape analysis and in distributed protocols. This paper approaches the general problem of inferring first-order inductive invariants by restricting the language L of candidate invariants. Notice that the problem of invariant inference in a restricted language L differs from the safety problem, since a system may be safe and still not have any inductive invariant in L that proves safety. Clearly, if L is finite (and if testing an inductive invariant is decidable), then inferring invariants in L is decidable. This paper presents some interesting cases when inferring inductive invariants in L is decidable even when L is an infinite language of universal formulas. Decidability is obtained by restricting L and defining a suitable well-quasi-order on the state space. We also present some undecidability results that show that our restrictions are necessary. We further present a framework for systematically constructing infinite languages while keeping the invariant inference problem decidable. We illustrate our approach by showing the decidability of inferring invariants for programs manipulating linked-lists, and for distributed protocols.
@InProceedings{POPL16p217,
author = {Oded Padon and Neil Immerman and Sharon Shoham and Aleksandr Karbyshev and Mooly Sagiv},
title = {Decidability of Inferring Inductive Invariants},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {217--231},
doi = {10.1145/2837614.2837640},
year = {2016},
}


Palsberg, Jens 
POPL'16: "Breaking through the Normalization ..."
Breaking through the Normalization Barrier: A Self-Interpreter for F-omega
Matt Brown and Jens Palsberg (University of California at Los Angeles, USA)
According to conventional wisdom, a self-interpreter for a strongly normalizing lambda-calculus is impossible. We call this the normalization barrier. The normalization barrier stems from a theorem in computability theory that says that a total universal function for the total computable functions is impossible. In this paper we break through the normalization barrier and define a self-interpreter for System F_omega, a strongly normalizing lambda-calculus. After a careful analysis of the classical theorem, we show that static type checking in F_omega can exclude the proof's diagonalization gadget, leaving open the possibility for a self-interpreter. Along with the self-interpreter, we program four other operations in F_omega, including a continuation-passing style transformation. Our operations rely on a new approach to program representation that may be useful in theorem provers and compilers.
@InProceedings{POPL16p5,
author = {Matt Brown and Jens Palsberg},
title = {Breaking through the Normalization Barrier: A Self-Interpreter for F-omega},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {5--17},
doi = {10.1145/2837614.2837623},
year = {2016},
}


Patrignani, Marco 
POPL'16: "Fully-Abstract Compilation ..."
Fully-Abstract Compilation by Approximate Back-Translation
Dominique Devriese, Marco Patrignani, and Frank Piessens (KU Leuven, Belgium)
A compiler is fully-abstract if the compilation from source language programs to target language programs reflects and preserves behavioural equivalence. Such compilers have important security benefits, as they limit the power of an attacker interacting with the program in the target language to that of an attacker interacting with the program in the source language. Proving compiler full-abstraction is, however, rather complicated. A common proof technique is based on the back-translation of target-level program contexts to behaviourally-equivalent source-level contexts. However, constructing such a back-translation is problematic when the source language is not strong enough to embed an encoding of the target language. For instance, when compiling from the simply-typed λ-calculus (λτ) to the untyped λ-calculus (λu), the lack of recursive types in λτ prevents such a back-translation. We propose a general and elegant solution for this problem. The key insight is that it suffices to construct an approximate back-translation. The approximation is only accurate up to a certain number of steps and conservative beyond that, in the sense that the context generated by the back-translation may diverge when the original would not, but not vice versa. Based on this insight, we describe a general technique for proving compiler full-abstraction and demonstrate it on a compiler from λτ to λu. The proof uses asymmetric cross-language logical relations and makes innovative use of step-indexing to express the relation between a context and its approximate back-translation. We believe this proof technique can scale to challenging settings and enable simpler, more scalable proofs of compiler full-abstraction.
@InProceedings{POPL16p164,
author = {Dominique Devriese and Marco Patrignani and Frank Piessens},
title = {Fully-Abstract Compilation by Approximate Back-Translation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {164--177},
doi = {10.1145/2837614.2837618},
year = {2016},
}


Pavlogiannis, Andreas 
POPL'16: "Algorithms for Algebraic Path ..."
Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components
Krishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen, and Andreas Pavlogiannis (IST Austria, Austria)
We study algorithmic questions for concurrent systems where the transitions are labeled from a complete, closed semiring, and path properties are algebraic with semiring operations. The algebraic path properties can model dataflow analysis problems, the shortest path problem, and many other natural problems that arise in program analysis. We consider that each component of the concurrent system is a graph with constant treewidth, a property satisfied by the control-flow graphs of most programs. We allow for multiple possible queries, which arise naturally in demand driven dataflow analysis. The study of multiple queries allows us to consider the tradeoff between the resource usage of the one-time preprocessing and for each individual query. The traditional approach constructs the product graph of all components and applies the best-known graph algorithm on the product. In this approach, even the answer to a single query requires the transitive closure (i.e., the results of all possible queries), which provides no room for tradeoff between preprocessing and query time. Our main contributions are algorithms that significantly improve the worst-case running time of the traditional approach, and provide various tradeoffs depending on the number of queries. For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time. Furthermore, we establish conditional optimality results showing that the worst-case running time of our algorithms cannot be improved without achieving major breakthroughs in graph algorithms (i.e., improving the worst-case bound for the shortest path problem in general graphs). Preliminary experimental results show that our algorithms perform favorably on several benchmarks.
@InProceedings{POPL16p733,
author = {Krishnendu Chatterjee and Amir Kafshdar Goharshady and Rasmus Ibsen-Jensen and Andreas Pavlogiannis},
title = {Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {733--747},
doi = {10.1145/2837614.2837624},
year = {2016},
}


Pfenning, Frank 
POPL'16: "Monitors and Blame Assignment ..."
Monitors and Blame Assignment for Higher-Order Session Types
Limin Jia, Hannah Gommerstadt, and Frank Pfenning (Carnegie Mellon University, USA)
Session types provide a means to prescribe the communication behavior between concurrent message-passing processes. However, in a distributed setting, some processes may be written in languages that do not support static typing of sessions or may be compromised by a malicious intruder, violating invariants of the session types. In such a setting, dynamically monitoring communication between processes becomes a necessity for identifying undesirable actions. In this paper, we show how to dynamically monitor communication to enforce adherence to session types in a higher-order setting. We present a system of blame assignment in the case when the monitor detects an undesirable action and an alarm is raised. We prove that dynamic monitoring does not change system behavior for well-typed processes, and that one of an indicated set of possible culprits must have been compromised in case of an alarm.
@InProceedings{POPL16p582,
author = {Limin Jia and Hannah Gommerstadt and Frank Pfenning},
title = {Monitors and Blame Assignment for Higher-Order Session Types},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {582--594},
doi = {10.1145/2837614.2837662},
year = {2016},
}


Pichon-Pharabod, Jean 
POPL'16: "A Concurrency Semantics for ..."
A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions
Jean Pichon-Pharabod and Peter Sewell (University of Cambridge, UK)
Despite much research on concurrent programming languages, especially for Java and C/C++, we still do not have a satisfactory definition of their semantics, one that admits all common optimisations without also admitting undesired behaviour. Especially problematic are the "thin-air" examples involving high-performance concurrent accesses, such as C/C++11 relaxed atomics. The C/C++11 model is in a per-candidate-execution style, and previous work has identified a tension between that and the fact that compiler optimisations do not operate over single candidate executions in isolation; rather, they operate over syntactic representations that represent all executions. In this paper we propose a novel approach that circumvents this difficulty. We define a concurrency semantics for a core calculus, including relaxed-atomic and non-atomic accesses, and locks, that admits a wide range of optimisation while still forbidding the classic thin-air examples. It also addresses other problems relating to undefined behaviour. The basic idea is to use an event-structure representation of the current state of each thread, capturing all of its potential executions, and to permit interleaving of execution and transformation steps over that to reflect optimisation (possibly dynamic) of the code. These are combined with a non-multicopy-atomic storage subsystem, to reflect common hardware behaviour. The semantics is defined in a mechanised and executable form, and designed to be implementable above current relaxed hardware and strong enough to support the programming idioms that C/C++11 does for this fragment. It offers a potential way forward for concurrent programming language semantics, beyond the current C/C++11 and Java models.
@InProceedings{POPL16p622,
author = {Jean Pichon-Pharabod and Peter Sewell},
title = {A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {622--633},
doi = {10.1145/2837614.2837616},
year = {2016},
}


Piessens, Frank 
POPL'16: "Fully-Abstract Compilation ..."
Fully-Abstract Compilation by Approximate Back-Translation
Dominique Devriese, Marco Patrignani, and Frank Piessens (KU Leuven, Belgium)
A compiler is fully-abstract if the compilation from source language programs to target language programs reflects and preserves behavioural equivalence. Such compilers have important security benefits, as they limit the power of an attacker interacting with the program in the target language to that of an attacker interacting with the program in the source language. Proving compiler full-abstraction is, however, rather complicated. A common proof technique is based on the back-translation of target-level program contexts to behaviourally-equivalent source-level contexts. However, constructing such a back-translation is problematic when the source language is not strong enough to embed an encoding of the target language. For instance, when compiling from the simply-typed λ-calculus (λτ) to the untyped λ-calculus (λu), the lack of recursive types in λτ prevents such a back-translation. We propose a general and elegant solution for this problem. The key insight is that it suffices to construct an approximate back-translation. The approximation is only accurate up to a certain number of steps and conservative beyond that, in the sense that the context generated by the back-translation may diverge when the original would not, but not vice versa. Based on this insight, we describe a general technique for proving compiler full-abstraction and demonstrate it on a compiler from λτ to λu. The proof uses asymmetric cross-language logical relations and makes innovative use of step-indexing to express the relation between a context and its approximate back-translation. We believe this proof technique can scale to challenging settings and enable simpler, more scalable proofs of compiler full-abstraction.
@InProceedings{POPL16p164,
author = {Dominique Devriese and Marco Patrignani and Frank Piessens},
title = {Fully-Abstract Compilation by Approximate Back-Translation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {164--177},
doi = {10.1145/2837614.2837618},
year = {2016},
}


Plotkin, Gordon D. 
POPL'16: "Scaling Network Verification ..."
Scaling Network Verification using Symmetry and Surgery
Gordon D. Plotkin, Nikolaj Bjørner, Nuno P. Lopes, Andrey Rybalchenko, and George Varghese (University of Edinburgh, UK; Microsoft Research, USA; Microsoft Research, UK)
On the surface, large data centers with about 100,000 stations and nearly a million routing rules are complex and hard to verify. However, these networks are highly regular by design; for example they employ fat tree topologies with backup routers interconnected by redundant patterns. To exploit these regularities, we introduce network transformations: given a reachability formula and a network, we transform the network into a simpler to verify network and a corresponding transformed formula, such that the original formula is valid in the network if and only if the transformed formula is valid in the transformed network. Our network transformations exploit network surgery (in which irrelevant or redundant sets of nodes, headers, ports, or rules are "sliced" away) and network symmetry (say between backup routers). The validity of these transformations is established using a formal theory of networks. In particular, using Van Benthem-Hennessy-Milner style bisimulation, we show that one can generally associate bisimulations to transformations connecting networks and formulas with their transforms. Our work is a development in an area of current wide interest: applying programming language techniques (in our case bisimulation and modal logic) to problems in switching networks. We provide experimental evidence that our network transformations can speed up by 65x the task of verifying the communication between all pairs of Virtual Machines in a large datacenter network with about 100,000 VMs. An all-pair reachability calculation, which formerly took 5.5 days, can be done in 2 hours, and can be easily parallelized to complete in
@InProceedings{POPL16p69,
author = {Gordon D. Plotkin and Nikolaj Bjørner and Nuno P. Lopes and Andrey Rybalchenko and George Varghese},
title = {Scaling Network Verification using Symmetry and Surgery},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {69--83},
doi = {10.1145/2837614.2837657},
year = {2016},
}


Pouchet, Louis-Noël 
POPL'16: "PolyCheck: Dynamic Verification ..."
PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs
Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, and P. Sadayappan (Ohio State University, USA; Pacific Northwest National Laboratory, USA; Inria, France)
High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations to restructure programs to improve data locality and expose parallelism. Guaranteeing the correctness of program transformations is essential, and to date three main approaches have been developed: proof of equivalence of affine programs, matching the execution traces of programs, and checking bit-by-bit equivalence of program outputs. Each technique suffers from limitations in the kind of transformations supported, space complexity, or the sensitivity to the testing dataset. In this paper, we take a novel approach that addresses all three limitations to provide an automatic bug checker to verify any iteration reordering transformations on affine programs, including non-affine transformations, with space consumption proportional to the original program data and robust to arbitrary datasets of a given size. We achieve this by exploiting the structure of affine program control and dataflow to generate at compile-time lightweight checker code to be executed within the transformed program. Experimental results assess the correctness and effectiveness of our method and its increased coverage over previous approaches.
@InProceedings{POPL16p539,
author = {Wenlei Bao and Sriram Krishnamoorthy and Louis-Noël Pouchet and Fabrice Rastello and P. Sadayappan},
title = {PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {539--554},
doi = {10.1145/2837614.2837656},
year = {2016},
}


Prabhu, Prathmesh 
POPL'16: "Newtonian Program Analysis ..."
Newtonian Program Analysis via Tensor Product
Thomas Reps, Emma Turetsky, and Prathmesh Prabhu (University of Wisconsin-Madison, USA; GrammaTech, USA; Google, USA)
Recently, Esparza et al. generalized Newton's method (a numerical-analysis algorithm for finding roots of real-valued functions) to a method for finding fixed-points of systems of equations over semirings. Their method provides a new way to solve interprocedural dataflow-analysis problems. As in its real-valued counterpart, each iteration of their method solves a simpler "linearized" problem. One of the reasons this advance is exciting is that some numerical analysts have claimed that "'all' effective and fast iterative [numerical] methods are forms (perhaps very disguised) of Newton's method." However, there is an important difference between the dataflow-analysis and numerical-analysis contexts: when Newton's method is used on numerical-analysis problems, multiplicative commutativity is relied on to rearrange expressions of the form "c*X + X*d" into "(c+d) * X." Such equations correspond to path problems described by regular languages. In contrast, when Newton's method is used for interprocedural dataflow analysis, the "multiplication" operation involves function composition, and hence is noncommutative: "c*X + X*d" cannot be rearranged into "(c+d) * X." Such equations correspond to path problems described by linear context-free languages (LCFLs). In this paper, we present an improved technique for solving the LCFL subproblems produced during successive rounds of Newton's method. Our method applies to predicate abstraction, on which most of today's software model checkers rely.
@InProceedings{POPL16p663,
author = {Thomas Reps and Emma Turetsky and Prathmesh Prabhu},
title = {Newtonian Program Analysis via Tensor Product},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {663--677},
doi = {10.1145/2837614.2837659},
year = {2016},
}


Pulte, Christopher 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


Rasmussen, Ulrik Terp 
POPL'16: "Kleenex: Compiling Nondeterministic ..."
Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers
Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, and Sebastian Paaske Tørholm (University of Copenhagen, Denmark; Jobindex, Denmark)
We present and illustrate Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with essentially optimal streaming behavior, worst-case linear-time performance and sustained high throughput. Its underlying theory is based on transducer decomposition into oracle and action machines: the oracle machine performs streaming greedy disambiguation of the input; the action machine performs the output actions. In use cases Kleenex achieves consistently high throughput rates around the 1 Gbps range on stock hardware. It performs well, especially in complex use cases, in comparison to both specialized and related tools such as GNUawk, GNUsed, GNUgrep, RE2, Ragel and regular-expression libraries.
@InProceedings{POPL16p284,
author = {Bjørn Bugge Grathwohl and Fritz Henglein and Ulrik Terp Rasmussen and Kristoffer Aalund Søholm and Sebastian Paaske Tørholm},
title = {Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {284--297},
doi = {10.1145/2837614.2837647},
year = {2016},
}


Rastello, Fabrice 
POPL'16: "PolyCheck: Dynamic Verification ..."
PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs
Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, and P. Sadayappan (Ohio State University, USA; Pacific Northwest National Laboratory, USA; Inria, France)
High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations to restructure programs to improve data locality and expose parallelism. Guaranteeing the correctness of program transformations is essential, and to date three main approaches have been developed: proof of equivalence of affine programs, matching the execution traces of programs, and checking bit-by-bit equivalence of program outputs. Each technique suffers from limitations in the kind of transformations supported, space complexity, or the sensitivity to the testing dataset. In this paper, we take a novel approach that addresses all three limitations to provide an automatic bug checker to verify any iteration reordering transformations on affine programs, including non-affine transformations, with space consumption proportional to the original program data and robust to arbitrary datasets of a given size. We achieve this by exploiting the structure of affine program control and dataflow to generate at compile-time lightweight checker code to be executed within the transformed program. Experimental results assess the correctness and effectiveness of our method and its increased coverage over previous approaches.
@InProceedings{POPL16p539,
author = {Wenlei Bao and Sriram Krishnamoorthy and Louis-Noël Pouchet and Fabrice Rastello and P. Sadayappan},
title = {PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {539--554},
doi = {10.1145/2837614.2837656},
year = {2016},
}


Rastogi, Aseem 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Raychev, Veselin 
POPL'16: "Learning Programs from Noisy ..."
Learning Programs from Noisy Data
Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause (ETH Zurich, Switzerland)
We present a new approach for learning programs from noisy datasets. Our approach is based on two new concepts: a regularized program generator which produces a candidate program based on a small sample of the entire dataset while avoiding overfitting, and a dataset sampler which carefully samples the dataset by leveraging the candidate program's score on that dataset. The two components are connected in a continuous feedback-directed loop. We show how to apply this approach to two settings: one where the dataset has a bound on the noise, and another without a noise bound. The second setting leads to a new way of performing approximate empirical risk minimization on hypotheses classes formed by a discrete search space. We then present two new kinds of program synthesizers which target the two noise settings. First, we introduce a novel regularized bitstream synthesizer that successfully generates programs even in the presence of incorrect examples. We show that the synthesizer can detect errors in the examples while combating overfitting, a major problem in existing synthesis techniques. We also show how the approach can be used in a setting where the dataset grows dynamically via new examples (e.g., provided by a human). Second, we present a novel technique for constructing statistical code completion systems. These are systems trained on massive datasets of open source programs, also known as ``Big Code''. The key idea is to introduce a domain specific language (DSL) over trees and to learn functions in that DSL directly from the dataset. These learned functions then condition the predictions made by the system. This is a flexible and powerful technique which generalizes several existing works as we no longer need to decide a priori on what the prediction should be conditioned (another benefit is that the learned functions are a natural mechanism for explaining the prediction).
As a result, our code completion system surpasses the prediction capabilities of existing, hardwired systems.
@InProceedings{POPL16p761,
author = {Veselin Raychev and Pavol Bielik and Martin Vechev and Andreas Krause},
title = {Learning Programs from Noisy Data},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {761--774},
doi = {10.1145/2837614.2837671},
year = {2016},
}


Reps, Thomas 
POPL'16: "Newtonian Program Analysis ..."
Newtonian Program Analysis via Tensor Product
Thomas Reps, Emma Turetsky, and Prathmesh Prabhu (University of Wisconsin-Madison, USA; GrammaTech, USA; Google, USA)
Recently, Esparza et al. generalized Newton's method, a numerical-analysis algorithm for finding roots of real-valued functions, to a method for finding fixed-points of systems of equations over semirings. Their method provides a new way to solve interprocedural dataflow-analysis problems. As in its real-valued counterpart, each iteration of their method solves a simpler ``linearized'' problem. One of the reasons this advance is exciting is that some numerical analysts have claimed that ```all' effective and fast iterative [numerical] methods are forms (perhaps very disguised) of Newton's method.'' However, there is an important difference between the dataflow-analysis and numerical-analysis contexts: when Newton's method is used on numerical-analysis problems, multiplicative commutativity is relied on to rearrange expressions of the form ``c*X + X*d'' into ``(c+d) * X.'' Such equations correspond to path problems described by regular languages. In contrast, when Newton's method is used for interprocedural dataflow analysis, the ``multiplication'' operation involves function composition, and hence is non-commutative: ``c*X + X*d'' cannot be rearranged into ``(c+d) * X.'' Such equations correspond to path problems described by linear context-free languages (LCFLs). In this paper, we present an improved technique for solving the LCFL subproblems produced during successive rounds of Newton's method. Our method applies to predicate abstraction, on which most of today's software model checkers rely.
@InProceedings{POPL16p663,
author = {Thomas Reps and Emma Turetsky and Prathmesh Prabhu},
title = {Newtonian Program Analysis via Tensor Product},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {663--677},
doi = {10.1145/2837614.2837659},
year = {2016},
}


Rinard, Martin 
POPL'16: "Automatic Patch Generation ..."
Automatic Patch Generation by Learning Correct Code
Fan Long and Martin Rinard (Massachusetts Institute of Technology, USA)
We present Prophet, a novel patch generation system that works with a set of successful human patches obtained from open source software repositories to learn a probabilistic, application-independent model of correct code. It generates a space of candidate patches, uses the model to rank the candidate patches in order of likely correctness, and validates the ranked patches against a suite of test cases to find correct patches. Experimental results show that, on a benchmark set of 69 real-world defects drawn from eight open-source projects, Prophet significantly outperforms the previous state-of-the-art patch generation system.
@InProceedings{POPL16p298,
author = {Fan Long and Martin Rinard},
title = {Automatic Patch Generation by Learning Correct Code},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {298--312},
doi = {10.1145/2837614.2837617},
year = {2016},
}


Robbins, Ed 
POPL'16: "From MinX to MinC: Semantics-Driven ..."
From MinX to MinC: Semantics-Driven Decompilation of Recursive Datatypes
Ed Robbins, Andy King, and Tom Schrijvers (University of Kent, UK; KU Leuven, Belgium)
Reconstructing the meaning of a program from its binary executable is known as reverse engineering; it has a wide range of applications in software security, exposing piracy, legacy systems, etc. Since reversing is ultimately a search for meaning, there is much interest in inferring a type (a meaning) for the elements of a binary in a consistent way. Unfortunately existing approaches do not guarantee any semantic relevance for their reconstructed types. This paper presents a new and semantically-founded approach that provides strong guarantees for the reconstructed types. Key to our approach is the derivation of a witness program in a high-level language alongside the reconstructed types. This witness has the same semantics as the binary, is type correct by construction, and it induces a (justifiable) type assignment on the binary. Moreover, the approach effectively yields a type-directed decompiler. We formalise and implement the approach for reversing MinX, an abstraction of x86, to MinC, a type-safe dialect of C with recursive datatypes. Our evaluation compiles a range of textbook C algorithms to MinX and then recovers the original structures.
@InProceedings{POPL16p191,
author = {Ed Robbins and Andy King and Tom Schrijvers},
title = {From MinX to MinC: Semantics-Driven Decompilation of Recursive Datatypes},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {191--203},
doi = {10.1145/2837614.2837633},
year = {2016},
}


Roth, Dan 
POPL'16: "Learning Invariants using ..."
Learning Invariants using Decision Trees and Implication Counterexamples
Pranav Garg, Daniel Neider, P. Madhusudan, and Dan Roth (University of Illinois at Urbana-Champaign, USA)
Inductive invariants can be robustly synthesized using a learning model where the teacher is a program verifier who instructs the learner through concrete program configurations, classified as positive, negative, and implications. We propose the first learning algorithms in this model with implication counterexamples that are based on machine learning techniques. In particular, we extend classical decision-tree learning algorithms in machine learning to handle implication samples, building new scalable ways to construct small decision trees using statistical measures. We also develop a decision-tree learning algorithm in this model that is guaranteed to converge to the right concept (invariant) if one exists. We implement the learners and an appropriate teacher, and show that the resulting invariant synthesis is efficient and convergent for a large suite of programs.
@InProceedings{POPL16p499,
author = {Pranav Garg and Daniel Neider and P. Madhusudan and Dan Roth},
title = {Learning Invariants using Decision Trees and Implication Counterexamples},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {499--512},
doi = {10.1145/2837614.2837664},
year = {2016},
}
Artifacts Available


Rowe, Reuben 
POPL'16: "Model Checking for Symbolic-Heap ..."
Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates
James Brotherston, Nikos Gorogiannis, Max Kanovich, and Reuben Rowe (University College London, UK; Middlesex University, UK; National Research University Higher School of Economics, Russia)
We investigate the *model checking* problem for symbolic-heap separation logic with user-defined inductive predicates, i.e., the problem of checking that a given stack-heap memory state satisfies a given formula in this language, as arises e.g. in software testing or runtime verification. First, we show that the problem is *decidable*; specifically, we present a bottom-up fixed point algorithm that decides the problem and runs in exponential time in the size of the problem instance. Second, we show that, while model checking for the full language is EXPTIME-complete, the problem becomes NP-complete or PTIME-solvable when we impose natural syntactic restrictions on the schemata defining the inductive predicates. We additionally present NP and PTIME algorithms for these restricted fragments. Finally, we report on the experimental performance of our procedures on a variety of specifications extracted from programs, exercising multiple combinations of syntactic restrictions.
@InProceedings{POPL16p84,
author = {James Brotherston and Nikos Gorogiannis and Max Kanovich and Reuben Rowe},
title = {Model Checking for Symbolic-Heap Separation Logic with Inductive Predicates},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {84--96},
doi = {10.1145/2837614.2837621},
year = {2016},
}


Russo, Claudio 
POPL'16: "Fabular: Regression Formulas ..."
Fabular: Regression Formulas as Probabilistic Programming
Johannes Borgström, Andrew D. Gordon, Long Ouyang, Claudio Russo, Adam Ścibior, and Marcin Szymczak (Uppsala University, Sweden; Microsoft Research, UK; University of Edinburgh, UK; Stanford University, USA; University of Cambridge, UK; MPI Tübingen, Germany)
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
@InProceedings{POPL16p271,
author = {Johannes Borgström and Andrew D. Gordon and Long Ouyang and Claudio Russo and Adam Ścibior and Marcin Szymczak},
title = {Fabular: Regression Formulas as Probabilistic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {271--283},
doi = {10.1145/2837614.2837653},
year = {2016},
}


Rybalchenko, Andrey 
POPL'16: "Scaling Network Verification ..."
Scaling Network Verification using Symmetry and Surgery
Gordon D. Plotkin, Nikolaj Bjørner, Nuno P. Lopes, Andrey Rybalchenko, and George Varghese (University of Edinburgh, UK; Microsoft Research, USA; Microsoft Research, UK)
On the surface, large data centers with about 100,000 stations and nearly a million routing rules are complex and hard to verify. However, these networks are highly regular by design; for example they employ fat tree topologies with backup routers interconnected by redundant patterns. To exploit these regularities, we introduce network transformations: given a reachability formula and a network, we transform the network into a simpler to verify network and a corresponding transformed formula, such that the original formula is valid in the network if and only if the transformed formula is valid in the transformed network. Our network transformations exploit network surgery (in which irrelevant or redundant sets of nodes, headers, ports, or rules are ``sliced'' away) and network symmetry (say between backup routers). The validity of these transformations is established using a formal theory of networks. In particular, using Van Benthem-Hennessy-Milner style bisimulation, we show that one can generally associate bisimulations to transformations connecting networks and formulas with their transforms. Our work is a development in an area of current wide interest: applying programming language techniques (in our case bisimulation and modal logic) to problems in switching networks. We provide experimental evidence that our network transformations can speed up by 65x the task of verifying the communication between all pairs of Virtual Machines in a large datacenter network with about 100,000 VMs. An all-pair reachability calculation, which formerly took 5.5 days, can be done in 2 hours, and can be easily parallelized to complete in
@InProceedings{POPL16p69,
author = {Gordon D. Plotkin and Nikolaj Bjørner and Nuno P. Lopes and Andrey Rybalchenko and George Varghese},
title = {Scaling Network Verification using Symmetry and Surgery},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {69--83},
doi = {10.1145/2837614.2837657},
year = {2016},
}


Sadayappan, P. 
POPL'16: "PolyCheck: Dynamic Verification ..."
PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs
Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, and P. Sadayappan (Ohio State University, USA; Pacific Northwest National Laboratory, USA; Inria, France)
High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations to restructure programs to improve data locality and expose parallelism. Guaranteeing the correctness of program transformations is essential, and to date three main approaches have been developed: proof of equivalence of affine programs, matching the execution traces of programs, and checking bit-by-bit equivalence of program outputs. Each technique suffers from limitations in the kind of transformations supported, space complexity, or the sensitivity to the testing dataset. In this paper, we take a novel approach that addresses all three limitations to provide an automatic bug checker to verify any iteration reordering transformations on affine programs, including non-affine transformations, with space consumption proportional to the original program data and robust to arbitrary datasets of a given size. We achieve this by exploiting the structure of affine program control and dataflow to generate at compile-time lightweight checker code to be executed within the transformed program. Experimental results assess the correctness and effectiveness of our method and its increased coverage over previous approaches.
@InProceedings{POPL16p539,
author = {Wenlei Bao and Sriram Krishnamoorthy and Louis-Noël Pouchet and Fabrice Rastello and P. Sadayappan},
title = {PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {539--554},
doi = {10.1145/2837614.2837656},
year = {2016},
}


Sagiv, Mooly 
POPL'16: "Decidability of Inferring ..."
Decidability of Inferring Inductive Invariants
Oded Padon, Neil Immerman, Sharon Shoham, Aleksandr Karbyshev, and Mooly Sagiv (Tel Aviv University, Israel; University of Massachusetts at Amherst, USA; Academic College of Tel Aviv Yaffo, Israel)
Induction is a successful approach for verification of hardware and software systems. A common practice is to model a system using logical formulas, and then use a decision procedure to verify that some logical formula is an inductive safety invariant for the system. A key ingredient in this approach is coming up with the inductive invariant, which is known as invariant inference. This is a major difficulty, and it is often left for humans or addressed by sound but incomplete abstract interpretation. This paper is motivated by the problem of inductive invariants in shape analysis and in distributed protocols. This paper approaches the general problem of inferring first-order inductive invariants by restricting the language L of candidate invariants. Notice that the problem of invariant inference in a restricted language L differs from the safety problem, since a system may be safe and still not have any inductive invariant in L that proves safety. Clearly, if L is finite (and if testing an inductive invariant is decidable), then inferring invariants in L is decidable. This paper presents some interesting cases when inferring inductive invariants in L is decidable even when L is an infinite language of universal formulas. Decidability is obtained by restricting L and defining a suitable well-quasi-order on the state space. We also present some undecidability results that show that our restrictions are necessary. We further present a framework for systematically constructing infinite languages while keeping the invariant inference problem decidable. We illustrate our approach by showing the decidability of inferring invariants for programs manipulating linked-lists, and for distributed protocols.
@InProceedings{POPL16p217,
author = {Oded Padon and Neil Immerman and Sharon Shoham and Aleksandr Karbyshev and Mooly Sagiv},
title = {Decidability of Inferring Inductive Invariants},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {217--231},
doi = {10.1145/2837614.2837640},
year = {2016},
}


Sangiorgi, Davide 
POPL'16: "Environmental Bisimulations ..."
Environmental Bisimulations for Probabilistic Higher-Order Languages
Davide Sangiorgi and Valeria Vignudelli (University of Bologna, Italy; Inria, France)
Environmental bisimulations for probabilistic higher-order languages are studied. In contrast with applicative bisimulations, environmental bisimulations are known to be more robust and do not require sophisticated techniques such as Howe’s in the proofs of congruence. As representative calculi, call-by-name and call-by-value λ-calculus, and a (call-by-value) λ-calculus extended with references (i.e., a store) are considered. In each case full abstraction results are derived for probabilistic environmental similarity and bisimilarity with respect to contextual preorder and contextual equivalence, respectively. Some possible enhancements of the (bi)simulations, as ‘up-to techniques’, are also presented. Probabilities force a number of modifications to the definition of environmental bisimulations in non-probabilistic languages. Some of these modifications are specific to probabilities, others may be seen as general refinements of environmental bisimulations, applicable also to non-probabilistic languages. Several examples are presented, to illustrate the modifications and the differences.
@InProceedings{POPL16p595,
author = {Davide Sangiorgi and Valeria Vignudelli},
title = {Environmental Bisimulations for Probabilistic Higher-Order Languages},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {595--607},
doi = {10.1145/2837614.2837651},
year = {2016},
}


Sarkar, Susmit 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


Sato, Ryosuke 
POPL'16: "Temporal Verification of Higher-Order ..."
Temporal Verification of Higher-Order Functional Programs
Akihiro Murase, Tachio Terauchi, Naoki Kobayashi, Ryosuke Sato, and Hiroshi Unno (Nagoya University, Japan; JAIST, Japan; University of Tokyo, Japan; University of Tsukuba, Japan)
We present an automated approach to verifying arbitrary omega-regular
properties of higher-order functional programs. Previous automated
methods proposed for this class of programs could only handle safety
properties or termination, and our approach is the first to be able
to verify arbitrary omega-regular liveness properties.
Our approach is automata-theoretic, and extends to fair termination our
recent work, published in ESOP 2014, on a binary-reachability-based
approach to automated termination verification of higher-order
functional programs. In that work, we have shown that checking
disjunctive well-foundedness of (the transitive closure of) the
"calling relation" is sound and complete for termination. The
extension to fair termination is tricky, however, because the
straightforward extension that checks disjunctive well-foundedness of
the fair calling relation turns out to be unsound, as we shall show in
the paper. Roughly, our solution is to check fairness on the
transition relation instead of the calling relation, and propagate the
information to determine when it is necessary and sufficient to check
for disjunctive well-foundedness on the calling relation. We prove
that our approach is sound and complete. We have implemented
a prototype of our approach, and confirmed that it is able to
automatically verify liveness properties of some nontrivial
higher-order programs.
@InProceedings{POPL16p57,
author = {Akihiro Murase and Tachio Terauchi and Naoki Kobayashi and Ryosuke Sato and Hiroshi Unno},
title = {Temporal Verification of Higher-Order Functional Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {57--68},
doi = {10.1145/2837614.2837667},
year = {2016},
}


Schrijvers, Tom 
POPL'16: "From MinX to MinC: Semantics-Driven ..."
From MinX to MinC: Semantics-Driven Decompilation of Recursive Datatypes
Ed Robbins, Andy King, and Tom Schrijvers (University of Kent, UK; KU Leuven, Belgium)
Reconstructing the meaning of a program from its binary executable is known as reverse engineering; it has a wide range of applications in software security, exposing piracy, legacy systems, etc. Since reversing is ultimately a search for meaning, there is much interest in inferring a type (a meaning) for the elements of a binary in a consistent way. Unfortunately, existing approaches do not guarantee any semantic relevance for their reconstructed types. This paper presents a new and semantically-founded approach that provides strong guarantees for the reconstructed types. Key to our approach is the derivation of a witness program in a high-level language alongside the reconstructed types. This witness has the same semantics as the binary, is type correct by construction, and it induces a (justifiable) type assignment on the binary. Moreover, the approach effectively yields a type-directed decompiler. We formalise and implement the approach for reversing MinX, an abstraction of x86, to MinC, a type-safe dialect of C with recursive datatypes. Our evaluation compiles a range of textbook C algorithms to MinX and then recovers the original structures.
@InProceedings{POPL16p191,
author = {Ed Robbins and Andy King and Tom Schrijvers},
title = {From MinX to MinC: Semantics-Driven Decompilation of Recursive Datatypes},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {191--203},
doi = {10.1145/2837614.2837633},
year = {2016},
}


Ścibior, Adam 
POPL'16: "Fabular: Regression Formulas ..."
Fabular: Regression Formulas as Probabilistic Programming
Johannes Borgström, Andrew D. Gordon, Long Ouyang, Claudio Russo, Adam Ścibior, and Marcin Szymczak (Uppsala University, Sweden; Microsoft Research, UK; University of Edinburgh, UK; Stanford University, USA; University of Cambridge, UK; MPI Tübingen, Germany)
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
@InProceedings{POPL16p271,
author = {Johannes Borgström and Andrew D. Gordon and Long Ouyang and Claudio Russo and Adam Ścibior and Marcin Szymczak},
title = {Fabular: Regression Formulas as Probabilistic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {271--283},
doi = {10.1145/2837614.2837653},
year = {2016},
}


Sewell, Peter 
POPL'16: "A Concurrency Semantics for ..."
A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions
Jean Pichon-Pharabod and Peter Sewell (University of Cambridge, UK)
Despite much research on concurrent programming languages, especially for Java and C/C++, we still do not have a satisfactory definition of their semantics, one that admits all common optimisations without also admitting undesired behaviour. Especially problematic are the "thin-air" examples involving high-performance concurrent accesses, such as C/C++11 relaxed atomics. The C/C++11 model is in a per-candidate-execution style, and previous work has identified a tension between that and the fact that compiler optimisations do not operate over single candidate executions in isolation; rather, they operate over syntactic representations that represent all executions. In this paper we propose a novel approach that circumvents this difficulty. We define a concurrency semantics for a core calculus, including relaxed-atomic and non-atomic accesses, and locks, that admits a wide range of optimisation while still forbidding the classic thin-air examples. It also addresses other problems relating to undefined behaviour. The basic idea is to use an event-structure representation of the current state of each thread, capturing all of its potential executions, and to permit interleaving of execution and transformation steps over that to reflect optimisation (possibly dynamic) of the code. These are combined with a non-multi-copy-atomic storage subsystem, to reflect common hardware behaviour. The semantics is defined in a mechanised and executable form, and designed to be implementable above current relaxed hardware and strong enough to support the programming idioms that C/C++11 does for this fragment. It offers a potential way forward for concurrent programming language semantics, beyond the current C/C++11 and Java models.
@InProceedings{POPL16p622,
author = {Jean Pichon-Pharabod and Peter Sewell},
title = {A Concurrency Semantics for Relaxed Atomics that Permits Optimisation and Avoids Thin-Air Executions},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {622--633},
doi = {10.1145/2837614.2837616},
year = {2016},
}
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single-instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


Sezgin, Ali 
POPL'16: "Modelling the ARMv8 Architecture, ..."
Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA
Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will Deacon, and Peter Sewell (University of Cambridge, UK; University of St. Andrews, UK; Inria, France; ARM, UK)
In this paper we develop semantics for key aspects of the ARMv8 multiprocessor architecture: the concurrency model and much of the 64-bit application-level instruction set (ISA). Our goal is to clarify what the range of architecturally allowable behaviour is, and thereby to support future work on formal verification, analysis, and testing of concurrent ARM software and hardware. Establishing such models with high confidence is intrinsically difficult: it involves capturing the vendor's architectural intent, aspects of which (especially for concurrency) have not previously been precisely defined. We therefore first develop a concurrency model with a microarchitectural flavour, abstracting from many hardware implementation concerns but still close to hardware-designer intuition. This means it can be discussed in detail with ARM architects. We then develop a more abstract model, better suited for use as an architectural specification, which we prove sound w.r.t. the first. The instruction semantics involves further difficulties, handling the mass of detail and the subtle intensional information required to interface to the concurrency model. We have a novel ISA description language, with a lightweight dependent type system, letting us do both with a rather direct representation of the ARM reference manual instruction descriptions. We build a tool from the combined semantics that lets one explore, either interactively or exhaustively, the full range of architecturally allowed behaviour, for litmus tests and (small) ELF executables. We prove correctness of some optimisations needed for tool performance. We validate the models by discussion with ARM staff, and by comparison against ARM hardware behaviour, for ISA single-instruction tests and concurrent litmus tests.
@InProceedings{POPL16p608,
author = {Shaked Flur and Kathryn E. Gray and Christopher Pulte and Susmit Sarkar and Ali Sezgin and Luc Maranget and Will Deacon and Peter Sewell},
title = {Modelling the ARMv8 Architecture, Operationally: Concurrency and ISA},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {608--621},
doi = {10.1145/2837614.2837615},
year = {2016},
}


Shapiro, Marc 
POPL'16: "'Cause I'm Strong ..."
'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems
Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro (IMDEA Software Institute, Spain; University of Oxford, UK; Universidade Nova Lisboa, Portugal; Sorbonne, France; Inria, France; UPMC, France)
Large-scale distributed systems often rely on replicated databases that allow a
programmer to request different data consistency guarantees for different
operations, and thereby control their performance. Using such databases is far
from trivial: requesting stronger consistency in too many places may hurt
performance, and requesting it in too few places may violate correctness. To
help programmers in this task, we propose the first proof rule for establishing
that a particular choice of consistency guarantees for various operations on a
replicated database is enough to ensure the preservation of a given data
integrity invariant. Our rule is modular: it allows reasoning about the
behaviour of every operation separately under some assumption on the behaviour
of other operations. This leads to simple reasoning, which we have automated in
an SMT-based tool. We present a nontrivial proof of soundness of our rule and
illustrate its use on several examples.
@InProceedings{POPL16p371,
author = {Alexey Gotsman and Hongseok Yang and Carla Ferreira and Mahsa Najafzadeh and Marc Shapiro},
title = {'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {371--384},
doi = {10.1145/2837614.2837625},
year = {2016},
}


Shimizu, Shunsuke 
POPL'16: "Lattice-Theoretic Progress ..."
Lattice-Theoretic Progress Measures and Coalgebraic Model Checking
Ichiro Hasuo, Shunsuke Shimizu, and Corina Cîrstea (University of Tokyo, Japan; University of Southampton, UK)
In the context of formal verification in general and model checking in particular, parity games serve as a mighty vehicle: many problems are encoded as parity games, which are then solved by the seminal algorithm by Jurdzinski. In this paper we identify the essence of this workflow to be the notion of progress measure, and formalize it in general, possibly infinitary, lattice-theoretic terms. Our view on progress measures is that they are to nested/alternating fixed points what invariants are to safety/greatest fixed points, and what ranking functions are to liveness/least fixed points. That is, progress measures are a combination of the latter two notions (invariant and ranking function) that have been extensively studied in the context of (program) verification. We then apply our theory of progress measures to a general model-checking framework, where systems are categorically presented as coalgebras. The framework's theoretical robustness is witnessed by a smooth transfer from the branching-time setting to the linear-time one. Although the framework can be used to derive some decision procedures for finite settings, we also expect the proposed framework to form a basis for sound proof methods for some undecidable/infinitary problems.
@InProceedings{POPL16p718,
author = {Ichiro Hasuo and Shunsuke Shimizu and Corina Cîrstea},
title = {Lattice-Theoretic Progress Measures and Coalgebraic Model Checking},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {718--732},
doi = {10.1145/2837614.2837673},
year = {2016},
}


Shoham, Sharon 
POPL'16: "Decidability of Inferring ..."
Decidability of Inferring Inductive Invariants
Oded Padon, Neil Immerman, Sharon Shoham, Aleksandr Karbyshev, and Mooly Sagiv (Tel Aviv University, Israel; University of Massachusetts at Amherst, USA; Academic College of Tel Aviv Yaffo, Israel)
Induction is a successful approach for verification of hardware and software systems. A common practice is to model a system using logical formulas, and then use a decision procedure to verify that some logical formula is an inductive safety invariant for the system. A key ingredient in this approach is coming up with the inductive invariant, which is known as invariant inference. This is a major difficulty, and it is often left for humans or addressed by sound but incomplete abstract interpretation. This paper is motivated by the problem of inductive invariants in shape analysis and in distributed protocols. This paper approaches the general problem of inferring first-order inductive invariants by restricting the language L of candidate invariants. Notice that the problem of invariant inference in a restricted language L differs from the safety problem, since a system may be safe and still not have any inductive invariant in L that proves safety. Clearly, if L is finite (and if testing an inductive invariant is decidable), then inferring invariants in L is decidable. This paper presents some interesting cases when inferring inductive invariants in L is decidable even when L is an infinite language of universal formulas. Decidability is obtained by restricting L and defining a suitable well-quasi-order on the state space. We also present some undecidability results that show that our restrictions are necessary. We further present a framework for systematically constructing infinite languages while keeping the invariant inference problem decidable. We illustrate our approach by showing the decidability of inferring invariants for programs manipulating linked lists, and for distributed protocols.
@InProceedings{POPL16p217,
author = {Oded Padon and Neil Immerman and Sharon Shoham and Aleksandr Karbyshev and Mooly Sagiv},
title = {Decidability of Inferring Inductive Invariants},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {217--231},
doi = {10.1145/2837614.2837640},
year = {2016},
}


Siek, Jeremy G. 
POPL'16: "The Gradualizer: A Methodology ..."
The Gradualizer: A Methodology and Algorithm for Generating Gradual Type Systems
Matteo Cimini and Jeremy G. Siek (Indiana University, USA)
Many languages are beginning to integrate dynamic and static typing. Siek and Taha offered gradual typing as an approach to this integration that provides a coherent and full-span migration between the two disciplines. However, the literature lacks a general methodology for designing gradually typed languages. Our first contribution is to provide a methodology for deriving the gradual type system and the compilation to the cast calculus. Based on this methodology, we present the Gradualizer, an algorithm that generates a gradual type system from a well-formed type system and also generates a compiler to the cast calculus. Our algorithm handles a large class of type systems and generates systems that are correct with respect to the formal criteria of gradual typing. We also report on an implementation of the Gradualizer that takes a type system expressed in lambda-prolog and outputs its gradually typed version and a compiler to the cast calculus in lambda-prolog.
@InProceedings{POPL16p443,
author = {Matteo Cimini and Jeremy G. Siek},
title = {The Gradualizer: A Methodology and Algorithm for Generating Gradual Type Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {443--455},
doi = {10.1145/2837614.2837632},
year = {2016},
}


Singh, Rishabh 
POPL'16: "Transforming Spreadsheet Data ..."
Transforming Spreadsheet Data Types using Examples
Rishabh Singh and Sumit Gulwani (Microsoft Research, USA)
Cleaning spreadsheet data types is a common problem faced by millions of spreadsheet users. Data types such as date, time, name, and units are ubiquitous in spreadsheets, and cleaning transformations on these data types involve parsing and pretty printing their string representations. This presents many challenges to users because cleaning such data requires some background knowledge about the data itself and moreover this data is typically non-uniform, unstructured, and ambiguous. Spreadsheet systems and Programming Languages provide some UI-based and programmatic solutions for this problem but they are either insufficient for the user's needs or are beyond their expertise. In this paper, we present a programming by example methodology of cleaning data types that learns the desired transformation from a few input-output examples. We propose a domain specific language with probabilistic semantics that is parameterized with declarative data type definitions. The probabilistic semantics is based on three key aspects: (i) approximate predicate matching, (ii) joint learning of data type interpretation, and (iii) weighted branches. This probabilistic semantics enables the language to handle non-uniform, unstructured, and ambiguous data. We then present a synthesis algorithm that learns the desired program in this language from a set of input-output examples. We have implemented our algorithm as an Excel add-in and present its successful evaluation on 55 benchmark problems obtained from online help forums and the Excel product team.
@InProceedings{POPL16p343,
author = {Rishabh Singh and Sumit Gulwani},
title = {Transforming Spreadsheet Data Types using Examples},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {343--356},
doi = {10.1145/2837614.2837668},
year = {2016},
}


Søholm, Kristoffer Aalund 
POPL'16: "Kleenex: Compiling Nondeterministic ..."
Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers
Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, and Sebastian Paaske Tørholm (University of Copenhagen, Denmark; Jobindex, Denmark)
We present and illustrate Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with essentially optimal streaming behavior, worst-case linear-time performance and sustained high throughput. Its underlying theory is based on transducer decomposition into oracle and action machines: the oracle machine performs streaming greedy disambiguation of the input; the action machine performs the output actions. In use cases Kleenex achieves consistently high throughput rates around the 1 Gbps range on stock hardware. It performs well, especially in complex use cases, in comparison to both specialized and related tools such as GNU-awk, GNU-sed, GNU-grep, RE2, Ragel and regular-expression libraries.
@InProceedings{POPL16p284,
author = {Bjørn Bugge Grathwohl and Fritz Henglein and Ulrik Terp Rasmussen and Kristoffer Aalund Søholm and Sebastian Paaske Tørholm},
title = {Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {284--297},
doi = {10.1145/2837614.2837647},
year = {2016},
}


Strub, Pierre-Yves
POPL'16: "Dependent Types and Multimonadic ..."
Dependent Types and Multimonadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multimonadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Swamy, Nikhil 
POPL'16: "Dependent Types and Multimonadic ..."
Dependent Types and Multimonadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multimonadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Szymczak, Marcin 
POPL'16: "Fabular: Regression Formulas ..."
Fabular: Regression Formulas as Probabilistic Programming
Johannes Borgström, Andrew D. Gordon, Long Ouyang, Claudio Russo, Adam Ścibior, and Marcin Szymczak (Uppsala University, Sweden; Microsoft Research, UK; University of Edinburgh, UK; Stanford University, USA; University of Cambridge, UK; MPI Tübingen, Germany)
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
@InProceedings{POPL16p271,
author = {Johannes Borgström and Andrew D. Gordon and Long Ouyang and Claudio Russo and Adam Ścibior and Marcin Szymczak},
title = {Fabular: Regression Formulas as Probabilistic Programming},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {271--283},
doi = {10.1145/2837614.2837653},
year = {2016},
}


Takikawa, Asumu 
POPL'16: "Is Sound Gradual Typing Dead? ..."
Is Sound Gradual Typing Dead?
Asumu Takikawa, Daniel Feltey, Ben Greenman, Max S. New, Jan Vitek, and Matthias Felleisen (Northeastern University, USA)
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert runtime checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear from pathological cases.
@InProceedings{POPL16p456,
author = {Asumu Takikawa and Daniel Feltey and Ben Greenman and Max S. New and Jan Vitek and Matthias Felleisen},
title = {Is Sound Gradual Typing Dead?},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {456--468},
doi = {10.1145/2837614.2837630},
year = {2016},
}
Artifacts Available


Tanter, Éric 
POPL'16: "Abstracting Gradual Typing ..."
Abstracting Gradual Typing
Ronald Garcia, Alison M. Clark, and Éric Tanter (University of British Columbia, Canada; University of Chile, Chile)
Language researchers and designers have extended a wide variety of type systems to support gradual typing, which enables languages to seamlessly combine dynamic and static checking. These efforts consistently demonstrate that designing a satisfactory gradual counterpart to a static type system is challenging, and this challenge only increases with the sophistication of the type system. Gradual type system designers need more formal tools to help them conceptualize, structure, and evaluate their designs. In this paper, we propose a new formal foundation for gradual typing, drawing on principles from abstract interpretation to give gradual types a semantics in terms of pre-existing static types. Abstracting Gradual Typing (AGT for short) yields a formal account of consistency, one of the cornerstones of the gradual typing approach, that subsumes existing notions of consistency, which were developed through intuition and ad hoc reasoning. Given a syntax-directed static typing judgment, the AGT approach induces a corresponding gradual typing judgment. Then the type safety proof for the underlying static discipline induces a dynamic semantics for gradual programs defined over source-language typing derivations. The AGT approach does not resort to an externally justified cast calculus: instead, runtime checks naturally arise by deducing evidence for consistent judgments during proof reduction. To illustrate the approach, we develop a novel gradually-typed counterpart for a language with record subtyping. Gradual languages designed with the AGT approach satisfy by construction the refined criteria for gradual typing set forth by Siek and colleagues.
@InProceedings{POPL16p429,
author = {Ronald Garcia and Alison M. Clark and Éric Tanter},
title = {Abstracting Gradual Typing},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {429--442},
doi = {10.1145/2837614.2837670},
year = {2016},
}


Terauchi, Tachio 
POPL'16: "Temporal Verification of Higher-Order ..."
Temporal Verification of Higher-Order Functional Programs
Akihiro Murase, Tachio Terauchi, Naoki Kobayashi, Ryosuke Sato, and Hiroshi Unno (Nagoya University, Japan; JAIST, Japan; University of Tokyo, Japan; University of Tsukuba, Japan)
We present an automated approach to verifying arbitrary omega-regular properties of higher-order functional programs. Previous automated methods proposed for this class of programs could only handle safety properties or termination, and our approach is the first to be able to verify arbitrary omega-regular liveness properties.
Our approach is automata-theoretic, and extends to fair termination our recent work, published in ESOP 2014, on a binary-reachability-based approach to automated termination verification of higher-order functional programs. In that work, we showed that checking disjunctive well-foundedness of (the transitive closure of) the ``calling relation'' is sound and complete for termination. The extension to fair termination is tricky, however, because the straightforward extension that checks disjunctive well-foundedness of the fair calling relation turns out to be unsound, as we shall show in the paper. Roughly, our solution is to check fairness on the transition relation instead of the calling relation, and propagate the information to determine when it is necessary and sufficient to check for disjunctive well-foundedness on the calling relation. We prove that our approach is sound and complete. We have implemented a prototype of our approach, and confirmed that it is able to automatically verify liveness properties of some nontrivial higher-order programs.
@InProceedings{POPL16p57,
author = {Akihiro Murase and Tachio Terauchi and Naoki Kobayashi and Ryosuke Sato and Hiroshi Unno},
title = {Temporal Verification of Higher-Order Functional Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {57--68},
doi = {10.1145/2837614.2837667},
year = {2016},
}


Tørholm, Sebastian Paaske 
POPL'16: "Kleenex: Compiling Nondeterministic ..."
Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers
Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, and Sebastian Paaske Tørholm (University of Copenhagen, Denmark; Jobindex, Denmark)
We present and illustrate Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with essentially optimal streaming behavior, worst-case linear-time performance and sustained high throughput. Its underlying theory is based on transducer decomposition into oracle and action machines: the oracle machine performs streaming greedy disambiguation of the input; the action machine performs the output actions. In use cases Kleenex achieves consistently high throughput rates around the 1 Gbps range on stock hardware. It performs well, especially in complex use cases, in comparison to both specialized and related tools such as GNU awk, GNU sed, GNU grep, RE2, Ragel and regular-expression libraries.
@InProceedings{POPL16p284,
author = {Bjørn Bugge Grathwohl and Fritz Henglein and Ulrik Terp Rasmussen and Kristoffer Aalund Søholm and Sebastian Paaske Tørholm},
title = {Kleenex: Compiling Nondeterministic Transducers to Deterministic Streaming Transducers},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {284--297},
doi = {10.1145/2837614.2837647},
year = {2016},
}


Torlak, Emina 
POPL'16: "Optimizing Synthesis with ..."
Optimizing Synthesis with Metasketches
James Bornholt, Emina Torlak, Dan Grossman, and Luis Ceze (University of Washington, USA)
Many advanced programming tools, for both end-users and expert developers, rely on program synthesis to automatically generate implementations from high-level specifications. These tools often need to employ tricky, custom-built synthesis algorithms because they require synthesized programs to be not only correct, but also optimal with respect to a desired cost metric, such as program size. Finding these optimal solutions efficiently requires domain-specific search strategies, but existing synthesizers hard-code the strategy, making them difficult to reuse. This paper presents metasketches, a general framework for specifying and solving optimal synthesis problems. Metasketches make the search strategy a part of the problem definition by specifying a fragmentation of the search space into an ordered set of classic sketches. We provide two cooperating search algorithms to effectively solve metasketches. A global optimizing search coordinates the activities of local searches, informing them of the costs of potentially-optimal solutions as they explore different regions of the candidate space in parallel. The local searches execute an incremental form of counterexample-guided inductive synthesis to incorporate information sent from the global search. We present Synapse, an implementation of these algorithms, and show that it effectively solves optimal synthesis problems with a variety of different cost functions. In addition, metasketches can be used to accelerate classic (non-optimal) synthesis by explicitly controlling the search strategy, and we show that Synapse solves classic synthesis problems that state-of-the-art tools cannot.
@InProceedings{POPL16p775,
author = {James Bornholt and Emina Torlak and Dan Grossman and Luis Ceze},
title = {Optimizing Synthesis with Metasketches},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {775--788},
doi = {10.1145/2837614.2837666},
year = {2016},
}


Tribastone, Mirco 
POPL'16: "Symbolic Computation of Differential ..."
Symbolic Computation of Differential Equivalences
Luca Cardelli, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin (Microsoft Research, UK; University of Oxford, UK; IMT Lucca, Italy)
Ordinary differential equations (ODEs) are widespread in many natural sciences including chemistry, ecology, and systems biology, and in disciplines such as control theory and electrical engineering. Building on the celebrated molecules-as-processes paradigm, they have become increasingly popular in computer science, with high-level languages and formal methods such as Petri nets, process algebra, and rule-based systems that are interpreted as ODEs. We consider the problem of comparing and minimizing ODEs automatically. Influenced by traditional approaches in the theory of programming, we propose differential equivalence relations. We study them for a basic intermediate language, for which we have decidability results, that can be targeted by a class of high-level specifications. An ODE implicitly represents an uncountable state space, hence reasoning techniques cannot be borrowed from established domains such as probabilistic programs with finite-state Markov chain semantics. We provide novel symbolic procedures to check an equivalence and compute the largest one via partition refinement algorithms that use satisfiability modulo theories. We illustrate the generality of our framework by showing that differential equivalences include (i) well-known notions for the minimization of continuous-time Markov chains (lumpability), (ii) bisimulations for chemical reaction networks recently proposed by Cardelli et al., and (iii) behavioral relations for process algebra with ODE semantics. With a prototype implementation we are able to detect equivalences in biochemical models from the literature that cannot be reduced using competing automatic techniques.
@InProceedings{POPL16p137,
author = {Luca Cardelli and Mirco Tribastone and Max Tschaikowski and Andrea Vandin},
title = {Symbolic Computation of Differential Equivalences},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {137--150},
doi = {10.1145/2837614.2837649},
year = {2016},
}


Tschaikowski, Max 
POPL'16: "Symbolic Computation of Differential ..."
Symbolic Computation of Differential Equivalences
Luca Cardelli, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin (Microsoft Research, UK; University of Oxford, UK; IMT Lucca, Italy)
Ordinary differential equations (ODEs) are widespread in many natural sciences including chemistry, ecology, and systems biology, and in disciplines such as control theory and electrical engineering. Building on the celebrated molecules-as-processes paradigm, they have become increasingly popular in computer science, with high-level languages and formal methods such as Petri nets, process algebra, and rule-based systems that are interpreted as ODEs. We consider the problem of comparing and minimizing ODEs automatically. Influenced by traditional approaches in the theory of programming, we propose differential equivalence relations. We study them for a basic intermediate language, for which we have decidability results, that can be targeted by a class of high-level specifications. An ODE implicitly represents an uncountable state space, hence reasoning techniques cannot be borrowed from established domains such as probabilistic programs with finite-state Markov chain semantics. We provide novel symbolic procedures to check an equivalence and compute the largest one via partition refinement algorithms that use satisfiability modulo theories. We illustrate the generality of our framework by showing that differential equivalences include (i) well-known notions for the minimization of continuous-time Markov chains (lumpability), (ii) bisimulations for chemical reaction networks recently proposed by Cardelli et al., and (iii) behavioral relations for process algebra with ODE semantics. With a prototype implementation we are able to detect equivalences in biochemical models from the literature that cannot be reduced using competing automatic techniques.
@InProceedings{POPL16p137,
author = {Luca Cardelli and Mirco Tribastone and Max Tschaikowski and Andrea Vandin},
title = {Symbolic Computation of Differential Equivalences},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {137--150},
doi = {10.1145/2837614.2837649},
year = {2016},
}


Turetsky, Emma 
POPL'16: "Newtonian Program Analysis ..."
Newtonian Program Analysis via Tensor Product
Thomas Reps, Emma Turetsky, and Prathmesh Prabhu (University of Wisconsin-Madison, USA; GrammaTech, USA; Google, USA)
Recently, Esparza et al. generalized Newton's method, a numerical-analysis algorithm for finding roots of real-valued functions, to a method for finding fixed-points of systems of equations over semirings. Their method provides a new way to solve interprocedural dataflow-analysis problems. As in its real-valued counterpart, each iteration of their method solves a simpler ``linearized'' problem. One of the reasons this advance is exciting is that some numerical analysts have claimed that ```all' effective and fast iterative [numerical] methods are forms (perhaps very disguised) of Newton's method.'' However, there is an important difference between the dataflow-analysis and numerical-analysis contexts: when Newton's method is used on numerical-analysis problems, multiplicative commutativity is relied on to rearrange expressions of the form ``c*X + X*d'' into ``(c+d) * X.'' Such equations correspond to path problems described by regular languages. In contrast, when Newton's method is used for interprocedural dataflow analysis, the ``multiplication'' operation involves function composition, and hence is non-commutative: ``c*X + X*d'' cannot be rearranged into ``(c+d) * X.'' Such equations correspond to path problems described by linear context-free languages (LCFLs). In this paper, we present an improved technique for solving the LCFL sub-problems produced during successive rounds of Newton's method. Our method applies to predicate abstraction, on which most of today's software model checkers rely.
@InProceedings{POPL16p663,
author = {Thomas Reps and Emma Turetsky and Prathmesh Prabhu},
title = {Newtonian Program Analysis via Tensor Product},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {663--677},
doi = {10.1145/2837614.2837659},
year = {2016},
}


Unno, Hiroshi 
POPL'16: "Temporal Verification of Higher-Order ..."
Temporal Verification of Higher-Order Functional Programs
Akihiro Murase, Tachio Terauchi, Naoki Kobayashi, Ryosuke Sato, and Hiroshi Unno (Nagoya University, Japan; JAIST, Japan; University of Tokyo, Japan; University of Tsukuba, Japan)
We present an automated approach to verifying arbitrary omega-regular properties of higher-order functional programs. Previous automated methods proposed for this class of programs could only handle safety properties or termination, and our approach is the first to be able to verify arbitrary omega-regular liveness properties.
Our approach is automata-theoretic, and extends to fair termination our recent work, published in ESOP 2014, on a binary-reachability-based approach to automated termination verification of higher-order functional programs. In that work, we showed that checking disjunctive well-foundedness of (the transitive closure of) the ``calling relation'' is sound and complete for termination. The extension to fair termination is tricky, however, because the straightforward extension that checks disjunctive well-foundedness of the fair calling relation turns out to be unsound, as we shall show in the paper. Roughly, our solution is to check fairness on the transition relation instead of the calling relation, and propagate the information to determine when it is necessary and sufficient to check for disjunctive well-foundedness on the calling relation. We prove that our approach is sound and complete. We have implemented a prototype of our approach, and confirmed that it is able to automatically verify liveness properties of some nontrivial higher-order programs.
@InProceedings{POPL16p57,
author = {Akihiro Murase and Tachio Terauchi and Naoki Kobayashi and Ryosuke Sato and Hiroshi Unno},
title = {Temporal Verification of Higher-Order Functional Programs},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {57--68},
doi = {10.1145/2837614.2837667},
year = {2016},
}


Vafeiadis, Viktor 
POPL'16: "Lightweight Verification of ..."
Lightweight Verification of Separate Compilation
Jeehoon Kang, Yoonseung Kim, Chung-Kil Hur, Derek Dreyer, and Viktor Vafeiadis (Seoul National University, South Korea; MPI-SWS, Germany)
Major compiler verification efforts, such as the CompCert project, have traditionally simplified the verification problem by restricting attention to the correctness of whole-program compilation, leaving open the question of how to verify the correctness of separate compilation. Recently, a number of sophisticated techniques have been proposed for proving more flexible, compositional notions of compiler correctness, but these approaches tend to be quite heavyweight compared to the simple "closed simulations" used in verifying whole-program compilation. Applying such techniques to a compiler like CompCert, as Stewart et al. have done, involves major changes and extensions to its original verification. In this paper, we show that if we aim somewhat lower (to prove correctness of separate compilation, but only for a *single* compiler), we can drastically simplify the proof effort. Toward this end, we develop several lightweight techniques that recast the compositional verification problem in terms of whole-program compilation, thereby enabling us to largely reuse the closed-simulation proofs from existing compiler verifications. We demonstrate the effectiveness of these techniques by applying them to CompCert 2.4, converting its verification of whole-program compilation into a verification of separate compilation in less than two person-months. This conversion only required a small number of changes to the original proofs, and uncovered two compiler bugs along the way. The result is SepCompCert, the first verification of separate compilation for the full CompCert compiler.
@InProceedings{POPL16p178,
author = {Jeehoon Kang and Yoonseung Kim and Chung-Kil Hur and Derek Dreyer and Viktor Vafeiadis},
title = {Lightweight Verification of Separate Compilation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {178--190},
doi = {10.1145/2837614.2837642},
year = {2016},
}
POPL'16: "Taming Release-Acquire Consistency ..."
Taming Release-Acquire Consistency
Ori Lahav, Nick Giannarakis, and Viktor Vafeiadis (MPI-SWS, Germany)
We introduce a strengthening of the release-acquire fragment of the C11 memory model that (i) forbids dubious behaviors that are not observed in any implementation; (ii) supports fence instructions that restore sequential consistency; and (iii) admits an equivalent intuitive operational semantics based on point-to-point communication. This strengthening has no additional implementation cost: it allows the same local optimizations as C11 release and acquire accesses, and has exactly the same compilation schemes to the x86-TSO and Power architectures. In fact, the compilation to Power is complete with respect to a recent axiomatic model of Power; that is, the compiled program exhibits exactly the same behaviors as the source one. Moreover, we provide criteria for placing enough fence instructions to ensure sequential consistency, and apply them to an efficient RCU implementation.
@InProceedings{POPL16p649,
author = {Ori Lahav and Nick Giannarakis and Viktor Vafeiadis},
title = {Taming Release-Acquire Consistency},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {649--662},
doi = {10.1145/2837614.2837643},
year = {2016},
}


Vandin, Andrea 
POPL'16: "Symbolic Computation of Differential ..."
Symbolic Computation of Differential Equivalences
Luca Cardelli, Mirco Tribastone, Max Tschaikowski, and Andrea Vandin (Microsoft Research, UK; University of Oxford, UK; IMT Lucca, Italy)
Ordinary differential equations (ODEs) are widespread in many natural sciences including chemistry, ecology, and systems biology, and in disciplines such as control theory and electrical engineering. Building on the celebrated molecules-as-processes paradigm, they have become increasingly popular in computer science, with high-level languages and formal methods such as Petri nets, process algebra, and rule-based systems that are interpreted as ODEs. We consider the problem of comparing and minimizing ODEs automatically. Influenced by traditional approaches in the theory of programming, we propose differential equivalence relations. We study them for a basic intermediate language, for which we have decidability results, that can be targeted by a class of high-level specifications. An ODE implicitly represents an uncountable state space, hence reasoning techniques cannot be borrowed from established domains such as probabilistic programs with finite-state Markov chain semantics. We provide novel symbolic procedures to check an equivalence and compute the largest one via partition refinement algorithms that use satisfiability modulo theories. We illustrate the generality of our framework by showing that differential equivalences include (i) well-known notions for the minimization of continuous-time Markov chains (lumpability), (ii) bisimulations for chemical reaction networks recently proposed by Cardelli et al., and (iii) behavioral relations for process algebra with ODE semantics. With a prototype implementation we are able to detect equivalences in biochemical models from the literature that cannot be reduced using competing automatic techniques.
@InProceedings{POPL16p137,
author = {Luca Cardelli and Mirco Tribastone and Max Tschaikowski and Andrea Vandin},
title = {Symbolic Computation of Differential Equivalences},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {137--150},
doi = {10.1145/2837614.2837649},
year = {2016},
}


Van Horn, David 
POPL'16: "Pushdown Control-Flow Analysis ..."
Pushdown Control-Flow Analysis for Free
Thomas Gilray, Steven Lyde, Michael D. Adams, Matthew Might, and David Van Horn (University of Utah, USA; University of Maryland, USA)
Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been published that provide perfect call-stack precision in a computable manner: CFA2, PDCFA, and AAC. Unfortunately, implementing CFA2 and PDCFA requires significant engineering effort. Furthermore, all three are computationally expensive. For a monovariant analysis, CFA2 is in O(2^n), PDCFA is in O(n^6), and AAC is in O(n^8).
In this paper, we describe a new technique that builds on these but is both straightforward to implement and computationally inexpensive. The crucial insight is an unusual state-dependent allocation strategy for the addresses of continuations. Our technique imposes only a constant-factor overhead on the underlying analysis and costs only O(n^3) in the monovariant case. We present the intuitions behind this development, benchmarks demonstrating its efficacy, and a proof of the precision of this analysis.
@InProceedings{POPL16p691,
author = {Thomas Gilray and Steven Lyde and Michael D. Adams and Matthew Might and David Van Horn},
title = {Pushdown Control-Flow Analysis for Free},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {691--704},
doi = {10.1145/2837614.2837631},
year = {2016},
}


Varghese, George 
POPL'16: "Scaling Network Verification ..."
Scaling Network Verification using Symmetry and Surgery
Gordon D. Plotkin, Nikolaj Bjørner, Nuno P. Lopes, Andrey Rybalchenko, and George Varghese (University of Edinburgh, UK; Microsoft Research, USA; Microsoft Research, UK)
On the surface, large data centers with about 100,000 stations and nearly a million routing rules are complex and hard to verify. However, these networks are highly regular by design; for example they employ fat tree topologies with backup routers interconnected by redundant patterns. To exploit these regularities, we introduce network transformations: given a reachability formula and a network, we transform the network into a simpler-to-verify network and a corresponding transformed formula, such that the original formula is valid in the network if and only if the transformed formula is valid in the transformed network. Our network transformations exploit network surgery (in which irrelevant or redundant sets of nodes, headers, ports, or rules are ``sliced'' away) and network symmetry (say between backup routers). The validity of these transformations is established using a formal theory of networks. In particular, using Van Benthem-Hennessy-Milner style bisimulation, we show that one can generally associate bisimulations to transformations connecting networks and formulas with their transforms. Our work is a development in an area of current wide interest: applying programming language techniques (in our case bisimulation and modal logic) to problems in switching networks. We provide experimental evidence that our network transformations can speed up by 65x the task of verifying the communication between all pairs of Virtual Machines in a large datacenter network with about 100,000 VMs. An all-pair reachability calculation, which formerly took 5.5 days, can be done in 2 hours, and can be easily parallelized to complete in
@InProceedings{POPL16p69,
author = {Gordon D. Plotkin and Nikolaj Bjørner and Nuno P. Lopes and Andrey Rybalchenko and George Varghese},
title = {Scaling Network Verification using Symmetry and Surgery},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {69--83},
doi = {10.1145/2837614.2837657},
year = {2016},
}


Vechev, Martin 
POPL'16: "Learning Programs from Noisy ..."
Learning Programs from Noisy Data
Veselin Raychev, Pavol Bielik, Martin Vechev, and Andreas Krause (ETH Zurich, Switzerland)
We present a new approach for learning programs from noisy datasets. Our approach is based on two new concepts: a regularized program generator which produces a candidate program based on a small sample of the entire dataset while avoiding overfitting, and a dataset sampler which carefully samples the dataset by leveraging the candidate program's score on that dataset. The two components are connected in a continuous feedback-directed loop. We show how to apply this approach to two settings: one where the dataset has a bound on the noise, and another without a noise bound. The second setting leads to a new way of performing approximate empirical risk minimization on hypotheses classes formed by a discrete search space. We then present two new kinds of program synthesizers which target the two noise settings. First, we introduce a novel regularized bitstream synthesizer that successfully generates programs even in the presence of incorrect examples. We show that the synthesizer can detect errors in the examples while combating overfitting, a major problem in existing synthesis techniques. We also show how the approach can be used in a setting where the dataset grows dynamically via new examples (e.g., provided by a human). Second, we present a novel technique for constructing statistical code completion systems. These are systems trained on massive datasets of open source programs, also known as ``Big Code''. The key idea is to introduce a domain specific language (DSL) over trees and to learn functions in that DSL directly from the dataset. These learned functions then condition the predictions made by the system. This is a flexible and powerful technique which generalizes several existing works as we no longer need to decide a priori on what the prediction should be conditioned (another benefit is that the learned functions are a natural mechanism for explaining the prediction).
As a result, our code completion system surpasses the prediction capabilities of existing, hard-wired systems.
@InProceedings{POPL16p761,
author = {Veselin Raychev and Pavol Bielik and Martin Vechev and Andreas Krause},
title = {Learning Programs from Noisy Data},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {761--774},
doi = {10.1145/2837614.2837671},
year = {2016},
}


Vignudelli, Valeria 
POPL'16: "Environmental Bisimulations ..."
Environmental Bisimulations for Probabilistic Higher-Order Languages
Davide Sangiorgi and Valeria Vignudelli (University of Bologna, Italy; Inria, France)
Environmental bisimulations for probabilistic higher-order languages are studied. In contrast with applicative bisimulations, environmental bisimulations are known to be more robust and do not require sophisticated techniques such as Howe’s in the proofs of congruence. As representative calculi, call-by-name and call-by-value λ-calculus, and a (call-by-value) λ-calculus extended with references (i.e., a store) are considered. In each case full abstraction results are derived for probabilistic environmental similarity and bisimilarity with respect to contextual preorder and contextual equivalence, respectively. Some possible enhancements of the (bi)simulations, as ‘up-to techniques’, are also presented. Probabilities force a number of modifications to the definition of environmental bisimulations in non-probabilistic languages. Some of these modifications are specific to probabilities, others may be seen as general refinements of environmental bisimulations, applicable also to non-probabilistic languages. Several examples are presented, to illustrate the modifications and the differences.
@InProceedings{POPL16p595,
author = {Davide Sangiorgi and Valeria Vignudelli},
title = {Environmental Bisimulations for Probabilistic Higher-Order Languages},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {595--607},
doi = {10.1145/2837614.2837651},
year = {2016},
}


Vitek, Jan 
POPL'16: "Is Sound Gradual Typing Dead? ..."
Is Sound Gradual Typing Dead?
Asumu Takikawa, Daniel Feltey, Ben Greenman, Max S. New, Jan Vitek, and Matthias Felleisen (Northeastern University, USA)
Programmers have come to embrace dynamically-typed languages for prototyping and delivering large and complex systems. When it comes to maintaining and evolving these systems, the lack of explicit static typing becomes a bottleneck. In response, researchers have explored the idea of gradually-typed programming languages which allow the incremental addition of type annotations to software written in one of these untyped languages. Some of these new, hybrid languages insert runtime checks at the boundary between typed and untyped code to establish type soundness for the overall system. With sound gradual typing, programmers can rely on the language implementation to provide meaningful error messages when type invariants are violated. While most research on sound gradual typing remains theoretical, the few emerging implementations suffer from performance overheads due to these checks. None of the publications on this topic comes with a comprehensive performance evaluation. Worse, a few report disastrous numbers. In response, this paper proposes a method for evaluating the performance of gradually-typed programming languages. The method hinges on exploring the space of partial conversions from untyped to typed. For each benchmark, the performance of the different versions is reported in a synthetic metric that associates runtime overhead to conversion effort. The paper reports on the results of applying the method to Typed Racket, a mature implementation of sound gradual typing, using a suite of real-world programs of various sizes and complexities. Based on these results the paper concludes that, given the current state of implementation technologies, sound gradual typing faces significant challenges. Conversely, it raises the question of how implementations could reduce the overheads associated with soundness and how tools could be used to steer programmers clear from pathological cases.
@InProceedings{POPL16p456,
author = {Asumu Takikawa and Daniel Feltey and Ben Greenman and Max S. New and Jan Vitek and Matthias Felleisen},
title = {Is Sound Gradual Typing Dead?},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {456--468},
doi = {10.1145/2837614.2837630},
year = {2016},
}
Artifacts Available


Walker, David 
POPL'16KEY: "Confluences in Programming ..."
Confluences in Programming Languages Research (Keynote)
David Walker (Princeton University, USA)
A confluence occurs when two rivers flow together; downstream the combined forces gather strength and propel their waters forward with increased vigor. In academic research, according to Varghese, a confluence occurs after some trigger, perhaps a discovery or a change in technology, and brings two previously separate branches of research together. In this talk, I will discuss confluences in programming languages research. Here, confluences often occur when basic research finds application in some important new domain. Two prime examples from my own career involve the confluence of research in type theory and systems security, triggered by new theoretical tools for reasoning about programming language safety, and the confluence of formal methods and networking, triggered by the rise of data centers. These experiences may shed light on what to teach our students and what is next for programming languages research.
@InProceedings{POPL16p4,
author = {David Walker},
title = {Confluences in Programming Languages Research (Keynote)},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {4--4},
doi = {10.1145/2837614.2843896},
year = {2016},
}
POPL'16: "Example-Directed Synthesis: ..."
Example-Directed Synthesis: A Type-Theoretic Interpretation
Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic (Princeton University, USA; Grinnell College, USA; University of Pennsylvania, USA)
Input-output examples have emerged as a practical and user-friendly specification mechanism for program synthesis in many environments. While example-driven tools have demonstrated tangible impact that has inspired adoption in industry, their underlying semantics are less well-understood: what are "examples" and how do they relate to other kinds of specifications? This paper demonstrates that examples can, in general, be interpreted as refinement types. Seen in this light, program synthesis is the task of finding an inhabitant of such a type. This insight provides an immediate semantic interpretation for examples. Moreover, it enables us to exploit decades of research in type theory as well as its correspondence with intuitionistic logic rather than designing ad hoc theoretical frameworks for synthesis from scratch. We put this observation into practice by formalizing synthesis as proof search in a sequent calculus with intersection and union refinements that we prove to be sound with respect to a conventional type system. In addition, we show how to handle negative examples, which arise from user feedback or counterexample-guided loops. This theory serves as the basis for a prototype implementation that extends our core language to support ML-style algebraic data types and structurally inductive functions. Users can also specify synthesis goals using polymorphic refinements and import monomorphic libraries. The prototype serves as a vehicle for empirically evaluating a number of different strategies for resolving the nondeterminism of the sequent calculus (bottom-up theorem-proving, term enumeration with refinement type checking, and combinations of both), the results of which classify, explain, and validate the design choices of existing synthesis systems. It also provides a platform for measuring the practical value of a specification language that combines "examples" with the more general expressiveness of refinements.
@InProceedings{POPL16p802,
author = {Jonathan Frankle and Peter-Michael Osera and David Walker and Steve Zdancewic},
title = {Example-Directed Synthesis: A Type-Theoretic Interpretation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {802--815},
doi = {10.1145/2837614.2837629},
year = {2016},
}


Wickerson, John 
POPL'16: "Overhauling SC Atomics in ..."
Overhauling SC Atomics in C11 and OpenCL
Mark Batty, Alastair F. Donaldson, and John Wickerson (University of Kent, UK; Imperial College London, UK)
Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A consequence of our simplification is that the SC operations in an execution no longer need to be totally ordered. This relaxation enables, for the first time, efficient and exhaustive simulation of litmus tests that use SC atomics. We extend our improved C11 model to obtain the first rigorous memory model formalisation for OpenCL (which extends C11 with support for heterogeneous many-core programming). In the OpenCL setting, we refine the SC axioms still further to give a sensible semantics to SC operations that employ a ‘memory scope’ to restrict their visibility to specific threads. Our overhaul requires slight strengthenings of both the C11 and the OpenCL memory models, causing some behaviours to become disallowed. We argue that these strengthenings are natural, and that all of the formalised C11 and OpenCL compilation schemes of which we are aware (Power and x86 CPUs for C11, AMD GPUs for OpenCL) remain valid in our revised models. Using the HERD memory model simulator, we show that our overhaul leads to an exponential improvement in simulation time for C11 litmus tests compared with the original model, making *exhaustive* simulation competitive, time-wise, with the *non-exhaustive* CDSChecker tool.
@InProceedings{POPL16p634,
author = {Mark Batty and Alastair F. Donaldson and John Wickerson},
title = {Overhauling SC Atomics in C11 and OpenCL},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {634--648},
doi = {10.1145/2837614.2837637},
year = {2016},
}


Wu, Rongxin 
POPL'16: "Casper: An Efficient Approach ..."
Casper: An Efficient Approach to Call Trace Collection
Rongxin Wu, Xiao Xiao, Shing-Chi Cheung, Hongyu Zhang, and Charles Zhang (Hong Kong University of Science and Technology, China; Microsoft Research, China)
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overheads by instrumenting only a small number of call sites while keeping the capability of recovering the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria of a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach causes significantly lower runtime (and space) overhead than two state-of-the-art approaches.
@InProceedings{POPL16p678,
author = {Rongxin Wu and Xiao Xiao and Shing-Chi Cheung and Hongyu Zhang and Charles Zhang},
title = {Casper: An Efficient Approach to Call Trace Collection},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {678--690},
doi = {10.1145/2837614.2837619},
year = {2016},
}


Xiao, Xiao 
POPL'16: "Casper: An Efficient Approach ..."
Casper: An Efficient Approach to Call Trace Collection
Rongxin Wu, Xiao Xiao, Shing-Chi Cheung, Hongyu Zhang, and Charles Zhang (Hong Kong University of Science and Technology, China; Microsoft Research, China)
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overheads by instrumenting only a small number of call sites while keeping the capability of recovering the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria of a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach causes significantly lower runtime (and space) overhead than two state-of-the-art approaches.
@InProceedings{POPL16p678,
author = {Rongxin Wu and Xiao Xiao and Shing-Chi Cheung and Hongyu Zhang and Charles Zhang},
title = {Casper: An Efficient Approach to Call Trace Collection},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {678--690},
doi = {10.1145/2837614.2837619},
year = {2016},
}


Yahav, Eran 
POPL'16: "Estimating Types in Binaries ..."
Estimating Types in Binaries using Predictive Modeling
Omer Katz, Ran El-Yaniv, and Eran Yahav (Technion, Israel)
Reverse engineering is an important tool in mitigating vulnerabilities in binaries. As a lot of software is developed in object-oriented languages, reverse engineering of object-oriented code is of critical importance. One of the major hurdles in reverse engineering binaries compiled from object-oriented code is the use of dynamic dispatch. In the absence of debug information, any dynamic dispatch may seem to jump to many possible targets, posing a significant challenge to a reverse engineer trying to track the program flow. We present a novel technique that allows us to statically determine the likely targets of virtual function calls. Our technique uses object tracelets – statically constructed sequences of operations performed on an object – to capture potential runtime behaviors of the object. Our analysis automatically pre-labels some of the object tracelets by relying on instances where the type of an object is known. The resulting type-labeled tracelets are then used to train a statistical language model (SLM) for each type. We then use the resulting ensemble of SLMs over unlabeled tracelets to generate a ranking of their most likely types, from which we deduce the likely targets of dynamic dispatches. We have implemented our technique and evaluated it over real-world C++ binaries. Our evaluation shows that when there are multiple alternative targets, our approach can drastically reduce the number of targets that have to be considered by a reverse engineer.
@InProceedings{POPL16p313,
author = {Omer Katz and Ran El-Yaniv and Eran Yahav},
title = {Estimating Types in Binaries using Predictive Modeling},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {313--326},
doi = {10.1145/2837614.2837674},
year = {2016},
}


Yang, Hongseok 
POPL'16: "'Cause I'm Strong ..."
'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems
Alexey Gotsman, Hongseok Yang, Carla Ferreira, Mahsa Najafzadeh, and Marc Shapiro (IMDEA Software Institute, Spain; University of Oxford, UK; Universidade Nova Lisboa, Portugal; Sorbonne, France; Inria, France; UPMC, France)
Large-scale distributed systems often rely on replicated databases that allow a programmer to request different data consistency guarantees for different operations, and thereby control their performance. Using such databases is far from trivial: requesting stronger consistency in too many places may hurt performance, and requesting it in too few places may violate correctness. To help programmers in this task, we propose the first proof rule for establishing that a particular choice of consistency guarantees for various operations on a replicated database is enough to ensure the preservation of a given data integrity invariant. Our rule is modular: it allows reasoning about the behaviour of every operation separately under some assumption on the behaviour of other operations. This leads to simple reasoning, which we have automated in an SMT-based tool. We present a non-trivial proof of soundness of our rule and illustrate its use on several examples.
@InProceedings{POPL16p371,
author = {Alexey Gotsman and Hongseok Yang and Carla Ferreira and Mahsa Najafzadeh and Marc Shapiro},
title = {'Cause I'm Strong Enough: Reasoning about Consistency Choices in Distributed Systems},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {371--384},
doi = {10.1145/2837614.2837625},
year = {2016},
}
POPL'16: "Abstraction Refinement Guided ..."
Abstraction Refinement Guided by a Learnt Probabilistic Model
Radu Grigore and Hongseok Yang (University of Oxford, UK)
The core challenge in designing an effective static program analysis is to find a good program abstraction, one that retains only details relevant to a given query. In this paper, we present a new approach for automatically finding such an abstraction. Our approach uses a pessimistic strategy, which can optionally use guidance from a probabilistic model. Our approach applies to parametric static analyses implemented in Datalog, and is based on counterexample-guided abstraction refinement. For each untried abstraction, our probabilistic model provides a probability of success, while the size of the abstraction provides an estimate of its cost in terms of analysis time. Combining these two metrics, probability and cost, our refinement algorithm picks an optimal abstraction. Our probabilistic model is a variant of the Erdős-Rényi random graph model, and it is tunable by what we call hyperparameters. We present a method to learn good values for these hyperparameters, by observing past runs of the analysis on an existing codebase. We evaluate our approach on an object sensitive pointer analysis for Java programs, with two client analyses (PolySite and Downcast).
@InProceedings{POPL16p485,
author = {Radu Grigore and Hongseok Yang},
title = {Abstraction Refinement Guided by a Learnt Probabilistic Model},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {485--498},
doi = {10.1145/2837614.2837663},
year = {2016},
}


Yang, Junfeng 
POPL'16: "Reducing Crash Recoverability ..."
Reducing Crash Recoverability to Reachability
Eric Koskinen and Junfeng Yang (Yale University, USA; Columbia University, USA)
Software applications run on a variety of platforms (filesystems, virtual slices, mobile hardware, etc.) that do not provide 100% uptime. As such, these applications may crash at any unfortunate moment losing volatile data and, when relaunched, they must be able to correctly recover from potentially inconsistent states left on persistent storage. From a verification perspective, crash recovery bugs can be particularly frustrating because, even when it has been formally proved for a program that it satisfies a property, the proof is foiled by these external events that crash and restart the program. In this paper we first provide a hierarchical formal model of what it means for a program to be crash recoverable. Our model captures the recoverability of many real world programs, including those in our evaluation which use sophisticated recovery algorithms such as shadow paging and write-ahead logging. Next, we introduce a novel technique capable of automatically proving that a program correctly recovers from a crash via a reduction to reachability. Our technique takes an input control-flow automaton and transforms it into an encoding that blends the capture of snapshots of pre-crash states into a symbolic search for a proof that recovery terminates and every recovered execution simulates some crash-free execution. Our encoding is designed to enable one to apply existing abstraction techniques in order to do the work that is necessary to prove recoverability. We have implemented our technique in a tool called Eleven82, capable of analyzing C programs to detect recoverability bugs or prove their absence. We have applied our tool to benchmark examples drawn from industrial file systems and databases, including GDBM, LevelDB, LMDB, PostgreSQL, SQLite, VMware and ZooKeeper. Within minutes, our tool is able to discover bugs or prove that these fragments are crash recoverable.
@InProceedings{POPL16p97,
author = {Eric Koskinen and Junfeng Yang},
title = {Reducing Crash Recoverability to Reachability},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {97--108},
doi = {10.1145/2837614.2837648},
year = {2016},
}


Yoshida, Nobuko 
POPL'16: "Effects as Sessions, Sessions ..."
Effects as Sessions, Sessions as Effects
Dominic Orchard and Nobuko Yoshida (Imperial College London, UK)
Effect and session type systems are two expressive behavioural type systems. The former is usually developed in the context of the lambda-calculus and its variants, the latter for the pi-calculus. In this paper we explore their relative expressive power. Firstly, we give an embedding from PCF, augmented with a parameterised effect system, into a session-typed pi-calculus (session calculus), showing that session types are powerful enough to express effects. Secondly, we give a reverse embedding, from the session calculus back into PCF, by instantiating PCF with concurrency primitives and its effect system with a session-like effect algebra; effect systems are powerful enough to express sessions. The embedding of session types into an effect system is leveraged to give a new implementation of session types in Haskell, via an effect system encoding. The correctness of this implementation follows from the second embedding result. We also discuss various extensions to our embeddings.
@InProceedings{POPL16p568,
author = {Dominic Orchard and Nobuko Yoshida},
title = {Effects as Sessions, Sessions as Effects},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {568--581},
doi = {10.1145/2837614.2837634},
year = {2016},
}


Zanella-Béguelin, Santiago 
POPL'16: "Dependent Types and Multi-monadic ..."
Dependent Types and Multi-monadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella-Béguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant as well as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although primitive, programmers choose the granularity at which to specify effects by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations using F* than in a prior verified implementation of TLS-1.2. Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Zdancewic, Steve 
POPL'16: "Example-Directed Synthesis: ..."
Example-Directed Synthesis: A Type-Theoretic Interpretation
Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic (Princeton University, USA; Grinnell College, USA; University of Pennsylvania, USA)
Input-output examples have emerged as a practical and user-friendly specification mechanism for program synthesis in many environments. While example-driven tools have demonstrated tangible impact that has inspired adoption in industry, their underlying semantics are less well-understood: what are "examples" and how do they relate to other kinds of specifications? This paper demonstrates that examples can, in general, be interpreted as refinement types. Seen in this light, program synthesis is the task of finding an inhabitant of such a type. This insight provides an immediate semantic interpretation for examples. Moreover, it enables us to exploit decades of research in type theory as well as its correspondence with intuitionistic logic rather than designing ad hoc theoretical frameworks for synthesis from scratch. We put this observation into practice by formalizing synthesis as proof search in a sequent calculus with intersection and union refinements that we prove to be sound with respect to a conventional type system. In addition, we show how to handle negative examples, which arise from user feedback or counterexample-guided loops. This theory serves as the basis for a prototype implementation that extends our core language to support ML-style algebraic data types and structurally inductive functions. Users can also specify synthesis goals using polymorphic refinements and import monomorphic libraries. The prototype serves as a vehicle for empirically evaluating a number of different strategies for resolving the nondeterminism of the sequent calculus (bottom-up theorem-proving, term enumeration with refinement type checking, and combinations of both), the results of which classify, explain, and validate the design choices of existing synthesis systems. It also provides a platform for measuring the practical value of a specification language that combines "examples" with the more general expressiveness of refinements.
@InProceedings{POPL16p802,
author = {Jonathan Frankle and Peter-Michael Osera and David Walker and Steve Zdancewic},
title = {Example-Directed Synthesis: A Type-Theoretic Interpretation},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {802--815},
doi = {10.1145/2837614.2837629},
year = {2016},
}


Zhang, Charles 
POPL'16: "Casper: An Efficient Approach ..."
Casper: An Efficient Approach to Call Trace Collection
Rongxin Wu, Xiao Xiao, Shing-Chi Cheung, Hongyu Zhang, and Charles Zhang (Hong Kong University of Science and Technology, China; Microsoft Research, China)
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overheads by instrumenting only a small number of call sites while keeping the capability of recovering the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria of a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach causes significantly lower runtime (and space) overhead than two state-of-the-art approaches.
@InProceedings{POPL16p678,
author = {Rongxin Wu and Xiao Xiao and Shing-Chi Cheung and Hongyu Zhang and Charles Zhang},
title = {Casper: An Efficient Approach to Call Trace Collection},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {678--690},
doi = {10.1145/2837614.2837619},
year = {2016},
}


Zhang, Hongyu 
POPL'16: "Casper: An Efficient Approach ..."
Casper: An Efficient Approach to Call Trace Collection
Rongxin Wu, Xiao Xiao, Shing-Chi Cheung, Hongyu Zhang, and Charles Zhang (Hong Kong University of Science and Technology, China; Microsoft Research, China)
Call traces, i.e., sequences of function calls and returns, are fundamental to a wide range of program analyses such as bug reproduction, fault diagnosis, performance analysis, and many others. The conventional approach to collecting call traces, which instruments each function call and return site, incurs large space and time overhead. Our approach aims at reducing the recording overheads by instrumenting only a small number of call sites while keeping the capability of recovering the full trace. We propose a call trace model and a logged call trace model based on an LL(1) grammar, which enables us to define the criteria of a feasible solution to call trace collection. Based on the two models, we prove that collecting call traces with minimal instrumentation is an NP-hard problem. We then propose an efficient approach to obtaining a suboptimal solution. We implemented our approach as a tool, Casper, and evaluated it using the DaCapo benchmark suite. The experimental results show that our approach causes significantly lower runtime (and space) overhead than two state-of-the-art approaches.
@InProceedings{POPL16p678,
author = {Rongxin Wu and Xiao Xiao and Shing-Chi Cheung and Hongyu Zhang and Charles Zhang},
title = {Casper: An Efficient Approach to Call Trace Collection},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {678--690},
doi = {10.1145/2837614.2837619},
year = {2016},
}


Zhang, Xin 
POPL'16: "Query-Guided Maximum Satisfiability ..."
Query-Guided Maximum Satisfiability
Xin Zhang, Ravi Mangal, Aditya V. Nori, and Mayur Naik (Georgia Institute of Technology, USA; Microsoft Research, UK)
We propose a new optimization problem "Q-MaxSAT", an extension of the well-known Maximum Satisfiability or MaxSAT problem. In contrast to MaxSAT, which aims to find an assignment to all variables in the formula, Q-MaxSAT computes an assignment to a desired subset of variables (or queries) in the formula. Indeed, many problems in diverse domains such as program reasoning, information retrieval, and mathematical optimization can be naturally encoded as Q-MaxSAT instances. We describe an iterative algorithm for solving Q-MaxSAT. In each iteration, the algorithm solves a subproblem that is relevant to the queries, and applies a novel technique to check whether the partial assignment found is a solution to the Q-MaxSAT problem. If the check fails, the algorithm grows the subproblem with a new set of clauses identified as relevant to the queries. Our empirical evaluation shows that our Q-MaxSAT solver Pilot achieves significant improvements in runtime and memory consumption over conventional MaxSAT solvers on several Q-MaxSAT instances generated from real-world problems in program analysis and information retrieval.
@InProceedings{POPL16p109,
author = {Xin Zhang and Ravi Mangal and Aditya V. Nori and Mayur Naik},
title = {Query-Guided Maximum Satisfiability},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {109--122},
doi = {10.1145/2837614.2837658},
year = {2016},
}


Zinzindohoue, JeanKarim 
POPL'16: "Dependent Types and Multimonadic ..."
Dependent Types and Multimonadic Effects in F*
Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine DelignatLavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, PierreYves Strub, Markulf Kohlweiss, JeanKarim Zinzindohoue, and Santiago ZanellaBéguelin (Microsoft Research, USA; Inria, France; University of Maryland, USA; ENS, France; IMDEA Software Institute, Spain; Microsoft Research, UK)
We present a new, completely redesigned, version of F*, a language that works both as a proof assistant and as a general-purpose, verification-oriented, effectful programming language. In support of these complementary roles, F* is a dependently typed, higher-order, call-by-value language with _primitive_ effects including state, exceptions, divergence and IO. Although the effects are primitive, programmers choose the granularity at which to specify them by equipping each effect with a monadic, predicate transformer semantics. F* uses this to efficiently compute weakest preconditions and discharges the resulting proof obligations using a combination of SMT solving and manual proofs. Isolated from the effects, the core of F* is a language of pure functions used to write specifications and proof terms; its consistency is maintained by a semantic termination check based on a well-founded order. We evaluate our design on more than 55,000 lines of F* we have authored in the last year, focusing on three main case studies. Showcasing its use as a general-purpose programming language, F* is programmed (but not verified) in F*, and bootstraps in both OCaml and F#. Our experience confirms F*'s pay-as-you-go cost model: writing idiomatic ML-like code with no finer specifications imposes no user burden. As a verification-oriented language, our most significant evaluation of F* is in verifying several key modules in an implementation of the TLS-1.2 protocol standard. For the modules we considered, we are able to prove more properties, with fewer annotations, using F* than in a prior verified implementation of TLS-1.2.
Finally, as a proof assistant, we discuss our use of F* in mechanizing the metatheory of a range of lambda calculi, starting from the simply typed lambda calculus to System F-omega and even micro-F*, a sizeable fragment of F* itself; these proofs make essential use of F*'s flexible combination of SMT automation and constructive proofs, enabling a tactic-free style of programming and proving at a relatively large scale.
@InProceedings{POPL16p256,
author = {Nikhil Swamy and Cătălin Hriţcu and Chantal Keller and Aseem Rastogi and Antoine Delignat-Lavaud and Simon Forest and Karthikeyan Bhargavan and Cédric Fournet and Pierre-Yves Strub and Markulf Kohlweiss and Jean-Karim Zinzindohoue and Santiago Zanella-Béguelin},
title = {Dependent Types and Multi-monadic Effects in F*},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {256--270},
doi = {10.1145/2837614.2837655},
year = {2016},
}


Zufferey, Damien 
POPL'16: "PSync: A Partially Synchronous ..."
PSync: A Partially Synchronous Language for Fault-Tolerant Distributed Algorithms
Cezara Drăgoi, Thomas A. Henzinger, and Damien Zufferey (Inria, France; ENS, France; CNRS, France; IST Austria, Austria; Massachusetts Institute of Technology, USA)
Fault-tolerant distributed algorithms play an important role in many critical/high-availability applications. These algorithms are notoriously difficult to implement correctly, due to asynchronous communication and the occurrence of faults, such as the network dropping messages or computers crashing. We introduce PSync, a domain-specific language based on the Heard-Of model, which views asynchronous faulty systems as synchronous ones with an adversarial environment that simulates asynchrony and faults by dropping messages. We define a runtime system for PSync that efficiently executes on asynchronous networks. We formalise the relation between the runtime system and PSync in terms of observational refinement. The high-level lockstep abstraction introduced by PSync simplifies the design and implementation of fault-tolerant distributed algorithms and enables automated formal verification. We have implemented an embedding of PSync in the Scala programming language with a runtime system for partially synchronous networks. We show the applicability of PSync by implementing several important fault-tolerant distributed algorithms and we compare the implementation of consensus algorithms in PSync against implementations in other languages in terms of code size, runtime efficiency, and verification.
@InProceedings{POPL16p400,
author = {Cezara Drăgoi and Thomas A. Henzinger and Damien Zufferey},
title = {PSync: A Partially Synchronous Language for Fault-Tolerant Distributed Algorithms},
booktitle = {Proc.\ POPL},
publisher = {ACM},
pages = {400--415},
doi = {10.1145/2837614.2837650},
year = {2016},
}

206 authors