Powered by
Conference Publishing Consulting
Proceedings of the ACM on Programming Languages, Volume 2, Number POPL,
January 7–13, 2018,
Los Angeles, CA, USA
Frontmatter
Strings
Synthesizing Bijective Lenses
Anders Miltner, Kathleen Fisher, Benjamin C. Pierce, David Walker, and Steve Zdancewic
(Princeton University, USA; Tufts University, USA; University of Pennsylvania, USA)
Bidirectional transformations between different data representations
occur frequently in modern software systems. They appear as serializers and
deserializers, as parsers and pretty printers, as database views and view
updaters, and as a multitude of different kinds of ad hoc data converters.
Manually building bidirectional transformations---by writing two separate
functions that are intended to be inverses---is tedious and error prone.
A better approach is to use a domain-specific language in
which both directions can be written as a single expression. However,
these domain-specific languages can be difficult to program in, requiring
programmers to manage fiddly details while working in a complex type system.
We present an alternative approach.
Instead of coding transformations manually, we synthesize them from
declarative format descriptions and examples.
Specifically,
we present Optician, a tool for type-directed synthesis of bijective
string transformers. The inputs to Optician are a pair of ordinary
regular expressions representing two data formats
and a few concrete examples for disambiguation. The output is a well-typed
program in Boomerang (a bidirectional language based on the
theory of lenses).
The main technical challenge involves navigating the vast program search
space efficiently. In particular, and unlike most prior work on type-directed
synthesis, our system operates in the context of a language with a rich equivalence
relation on types (the theory of regular expressions). Consequently, program
synthesis requires search in two dimensions: First, our synthesis algorithm must
find a pair of "syntactically compatible types," and second, using the structure
of those types, it must find a type- and example-compliant term. Our key insight is
that it is possible to
reduce the size of this search space without losing any computational power
by defining a new language of lenses designed
specifically for synthesis. The new language is free from arbitrary function
composition and operates only over types and terms in a new disjunctive normal form.
We prove (1) our new language
is just as powerful as a more natural, compositional, and declarative
language and (2) our synthesis algorithm is sound and complete with respect to the
new language. We also demonstrate empirically
that our new language changes the synthesis problem from
one that admits intractable solutions to one that admits
highly efficient solutions, able to synthesize intricate lenses
between complex file formats in seconds.
We evaluate Optician on a benchmark suite of 39 examples
that includes both microbenchmarks and realistic examples derived from
other data management systems including Flash Fill, a tool
for synthesizing string transformations in spreadsheets, and Augeas, a tool
for bidirectional processing of Linux system configuration files.
Preprint
Archive submitted (24 MB)
Artifacts Available
Artifacts Functional
WebRelate: Integrating Web Data with Spreadsheets using Examples
Jeevana Priya Inala and Rishabh Singh
(Massachusetts Institute of Technology, USA; Microsoft Research, USA)
Data integration between web sources and relational data is a key challenge faced by data scientists and spreadsheet users. There are two main challenges in programmatically joining web data with relational data. First, most websites do not expose a direct interface to obtain tabular data, so the user needs to formulate a logic to get to different webpages for each input row in the relational table. Second, after reaching the desired webpage, the user needs to write complex scripts to extract the relevant data, which is often conditioned on the input data. Since many data scientists and end-users come from diverse backgrounds, writing such complex regular-expression based logical scripts to perform data integration tasks is unfortunately often beyond their programming expertise.
We present WebRelate, a system that allows users to join semi-structured web data with relational data in spreadsheets using input-output examples. WebRelate decomposes the web data integration task into two sub-tasks of i) URL learning and ii) input-dependent web extraction. We introduce a novel synthesis paradigm called "Output-constrained Programming By Examples", which allows us to use the finite set of possible outputs for the new inputs to efficiently constrain the search in the synthesis algorithm. We instantiate this paradigm for the two sub-tasks in WebRelate. The first sub-task generates the URLs for the webpages containing the desired data for all rows in the relational table. WebRelate achieves this by learning a string transformation program using a few example URLs. The second sub-task uses examples of desired data to be extracted from the corresponding webpages and learns a program to extract the data for the other rows. We design expressive domain-specific languages for URL generation and web data extraction, and present efficient synthesis algorithms for learning programs in these DSLs from few input-output examples. We evaluate WebRelate on 88 real-world web data integration tasks taken from online help forums and Excel product team, and show that WebRelate can learn the desired programs within few seconds using only 1 example for the majority of the tasks.
Article Search
Artifacts Functional
What Is Decidable about String Constraints with the ReplaceAll Function
Taolue Chen, Yan Chen, Matthew Hague, Anthony W. Lin, and Zhilin Wu
(University of London, UK; Institute of Software at Chinese Academy of Sciences, China; University at Chinese Academy of Sciences, China; Royal Holloway University of London, UK; University of Oxford, UK)
The theory of strings with concatenation has been widely argued as the basis of
constraint solving for verifying string-manipulating programs. However, this
theory is far from adequate for expressing many string constraints that are
also needed in practice; for example, the use of regular constraints (pattern
matching against a regular expression), and the string-replace function
(replacing either the first occurrence or all occurrences of a ``pattern''
string constant/variable/regular expression by a ``replacement'' string
constant/variable), among many others. Both regular constraints and the
string-replace function are crucial for such applications as analysis of
JavaScript (or more generally HTML5 applications) against cross-site scripting
(XSS) vulnerabilities, which motivates us to consider a richer class of string
constraints. The importance of the string-replace function (especially the
replace-all facility) is increasingly recognised, which can be witnessed by the
incorporation of the function in the input languages of several string
constraint solvers.
Recently, it was shown that any theory of strings containing the string-replace
function (even the most restricted version where pattern/replacement strings
are both constant strings) becomes undecidable if we do not impose some kind of
straight-line (aka acyclicity) restriction on the formulas. Despite this, the
straight-line restriction is still practically sensible since this condition is
typically met by string constraints that are generated by symbolic execution.
In this paper, we provide the first systematic study of straight-line string
constraints with the string-replace function and the regular constraints as the
basic operations. We show that a large class of such constraints (i.e. when
only a constant string or a regular expression is permitted in the pattern) is
decidable. We note that the string-replace function, even under this
restriction, is sufficiently powerful for expressing the concatenation operator
and much more (e.g. extensions of regular expressions with string variables).
This gives us the most expressive decidable logic containing concatenation,
replace, and regular constraints under the same umbrella. Our decision
procedure for the straight-line fragment follows an automata-theoretic
approach, and is modular in the sense that the string-replace terms are removed
one by one to generate more and more regular constraints, which can then be
discharged by the state-of-the-art string constraint solvers. We also show
that this fragment is, in a way, a maximal decidable subclass of the
straight-line fragment with string-replace and regular constraints. To this
end, we show undecidability results for the following two extensions:
(1) variables are permitted in the pattern parameter of the replace function,
(2) length constraints are permitted.
Article Search
String Constraints with Concatenation and Transducers Solved Efficiently
Lukáš Holík, Petr Janků, Anthony W. Lin,
Philipp Rümmer, and
Tomáš Vojnar
(Brno University of Technology, Czechia; University of Oxford, UK; Uppsala University, Sweden)
String analysis is the problem of reasoning about how strings are manipulated by a program. It has numerous applications including automatic detection of cross-site scripting, and automatic test-case generation. A popular string analysis technique includes symbolic executions, which at their core use constraint solvers over the string domain, a.k.a. string solvers. Such solvers typically reason about constraints expressed in theories over strings with the concatenation operator as an atomic constraint. In recent years, researchers started to recognise the importance of incorporating the replace-all operator (i.e. replace all occurrences of a string by another string) and, more generally, finite-state transductions in the theories of strings with concatenation. Such string operations are typically crucial for reasoning about XSS vulnerabilities in web applications, especially for modelling sanitisation functions and implicit browser transductions (e.g. innerHTML). Although this results in an undecidable theory in general, it was recently shown that the straight-line fragment of the theory is decidable, and is sufficiently expressive in practice. In this paper, we provide the first string solver that can reason about constraints involving both concatenation and finite-state transductions. Moreover, it has a completeness and termination guarantee for several important fragments (e.g. straight-line fragment). The main challenge addressed in the paper is the prohibitive worst-case complexity of the theory (double-exponential time), which is exponentially harder than the case without finite-state transductions. To this end, we propose a method that exploits succinct alternating finite-state automata as concise symbolic representations of string constraints. In contrast to previous approaches using nondeterministic automata, alternation offers not only exponential savings in space when representing Boolean combinations of transducers, but also a possibility of succinct representation of otherwise costly combinations of transducers and concatenation. Reasoning about the emptiness of the AFA language requires a state-space exploration in an exponential-sized graph, for which we use model checking algorithms (e.g. IC3). We have implemented our algorithm and demonstrated its efficacy on benchmarks that are derived from cross-site scripting analysis and other examples in the literature.
Preprint
Info
Artifacts Functional
Types and Effects 1
Linear Haskell: Practical Linearity in a Higher-Order Polymorphic Language
Jean-Philippe Bernardy, Mathieu Boespflug, Ryan R. Newton, Simon Peyton Jones, and Arnaud Spiwack
(University of Gothenburg, Sweden; Tweag I/O, France; Indiana University, USA; Microsoft Research, UK)
Linear type systems have a long and storied history, but not a clear path forward to integrate with existing languages such as OCaml or Haskell. In this paper, we study a linear type system designed with two crucial properties in mind: backwards-compatibility and code reuse across linear and non-linear users of a library. Only then can the benefits of linear types permeate conventional functional programming. Rather than bifurcate types into linear and non-linear counterparts, we instead attach linearity to function arrows. Linear functions can receive inputs from linearly-bound values, but can also operate over unrestricted, regular values.
To demonstrate the efficacy of our linear type system — both how easy it can be integrated in an existing language implementation and how streamlined it makes it to write programs with linear types — we implemented our type system in ghc, the leading Haskell compiler, and demonstrate two kinds of applications of linear types: mutable data with pure interfaces; and enforcing protocols in I/O-performing functions.
Article Search
Artifacts Functional
Polyadic Approximations, Fibrations and Intersection Types
Damiano Mazza, Luc Pellissier, and Pierre Vial
(CNRS, France; University of Paris 13, France; University of Paris Diderot, France)
Starting from an exact correspondence between linear approximations and non-idempotent intersection types, we develop a general framework for building systems of intersection types characterizing normalization properties. We show how this construction, which uses in a fundamental way Melliès and Zeilberger's ``type systems as functors'' viewpoint, allows us to recover equivalent versions of every well known intersection type system (including Coppo and Dezani's original system, as well as its non-idempotent variants independently introduced by Gardner and de Carvalho). We also show how new systems of intersection types may be built almost automatically in this way.
Article Search
Handling Fibred Algebraic Effects
Danel Ahman
(Inria, France)
We study algebraic computational effects and their handlers in the dependently typed setting. We describe computational effects using a generalisation of Plotkin and Pretnar's effect theories, whose dependently typed operations allow us to capture precise notions of computation, e.g., state with location-dependent store types and dependently typed update monads. Our treatment of handlers is based on an observation that their conventional term-level definition leads to unsound program equivalences being derivable in languages that include a notion of homomorphism. We solve this problem by giving handlers a novel type-based treatment via a new computation type, the user-defined algebra type, which pairs a value type (the carrier) with a set of value terms (the operations), capturing Plotkin and Pretnar's insight that effect handlers denote algebras. We then show that the conventional presentation of handlers can be routinely derived, and demonstrate that this type-based treatment of handlers provides a useful mechanism for reasoning about effectful computations. We also equip the resulting language with a sound denotational semantics based on families fibrations.
Article Search
Handle with Care: Relational Interpretation of Algebraic Effects and Handlers
Dariusz Biernacki, Maciej Piróg, Piotr Polesiuk, and Filip Sieczkowski
(University of Wrocław, Poland)
Algebraic effects and handlers have received a lot of attention recently, both from the
theoretical point of view and in practical language design. This stems from the fact
that algebraic effects give the programmer unprecedented freedom to define, combine, and
interpret computational effects. This plenty-of-rope, however, demands not only a deep
understanding of the underlying semantics, but also access to practical means of
reasoning about effectful code, including correctness and program equivalence. In this
paper we tackle this problem by constructing a step-indexed relational interpretation of
a call-by-value calculus with algebraic effect handlers and a row-based polymorphic
type-and-effect system. Our calculus, while striving for simplicity, enjoys desirable
theoretical properties, and is close to the cores of programming languages with
algebraic effects used in the wild, while the logical relation we build for it can be
used to reason about non-trivial properties, such as contextual equivalence and
contextual approximation of programs. Our development has been fully formalised in the
Coq proof assistant.
Article Search
Archive submitted (0 MB)
Info
Artifacts Available
Artifacts Functional
Verification
Automated Lemma Synthesis in Symbolic-Heap Separation Logic
Quang-Trung Ta, Ton Chanh Le, Siau-Cheng Khoo, and Wei-Ngan Chin
(National University of Singapore, Singapore)
The symbolic-heap fragment of separation logic has been actively developed and advocated for verifying the memory-safety property of computer programs. At present, one of its biggest challenges is to effectively prove entailments containing inductive heap predicates. These entailments are usually proof obligations generated when verifying programs that manipulate complex data structures like linked lists, trees, or graphs.
To assist in proving such entailments, this paper introduces a lemma synthesis framework, which automatically discovers lemmas to serve as eureka steps in the proofs. Mathematical induction and template-based constraint solving are two pillars of our framework. To derive the supporting lemmas for a given entailment, the framework firstly identifies possible lemma templates from the entailment's heap structure. It then sets up unknown relations among each template's variables and conducts structural induction proof to generate constraints about these relations. Finally, it solves the constraints to find out actual definitions of the unknown relations, thus discovers the lemmas. We have integrated this framework into a prototype prover and have experimented it on various entailment benchmarks. The experimental results show that our lemma-synthesis-assisted prover can prove many entailments that could not be handled by existing techniques. This new proposal opens up more opportunities to automatically reason with complex inductive heap predicates.
Preprint
Archive submitted (9 MB)
Info
Artifacts Available
Artifacts Functional
Higher-Order Constrained Horn Clauses for Verification
Toby Cathcart Burn, C.-H. Luke Ong, and Steven J. Ramsay
(University of Oxford, UK; University of Bristol, UK)
Motivated by applications in automated verification of higher-order functional programs, we develop a notion of constrained Horn clauses in higher-order logic and a decision problem concerning their satisfiability. We show that, although satisfiable systems of higher-order clauses do not generally have least models, there is a notion of canonical model obtained through a reduction to a problem concerning a kind of monotone logic program. Following work in higher-order program verification, we develop a refinement type system in order to reason about and automate the search for models. This provides a sound but incomplete method for solving the decision problem. Finally, we show that there is a sense in which we can use refinement types to express properties of terms whilst staying within the higher-order constrained Horn clause framework.
Article Search
Artifacts Functional
Relatively Complete Refinement Type System for Verification of Higher-Order Non-deterministic Programs
Hiroshi Unno, Yuki Satake, and Tachio Terauchi
(University of Tsukuba, Japan; Waseda University, Japan)
This paper considers verification of non-deterministic higher-order functional programs. Our contribution is a novel type system in which the types are used to express and verify (conditional) safety, termination, non-safety, and non-termination properties in the presence of ∀-∃ branching behavior due to non-determinism. For instance, the judgement ⊢ e:{u: int | φ(u) }^{∀∀} says that every evaluation of e either diverges or reduces to some integer u satisfying φ(u), whereas ⊢ e:{u: int | ψ(u) }^{∃∀} says that there exists an evaluation of e that either diverges or reduces to some integer u satisfying ψ(u). Note that the former is a safety property whereas the latter is a counterexample to a (conditional) termination property. Following the recent work on type-based verification methods for deterministic higher-order functional programs, we formalize the idea on the foundation of dependent refinement types, thereby allowing the type system to express and verify rich properties involving program values, branching behaviors, and the combination thereof.
Our type system is able to seamlessly combine deductions of both universal and existential facts within a unified framework, paving the way for an exciting opportunity for new type-based verification methods that combine both universal and existential reasoning. For example, our system can prove the existence of a path violating some safety property from a proof of termination that uses a well-foundedness termination argument. We prove that our type system is sound and relatively complete, and further, thanks to having both modes of non-determinism, we show that our types are closed under complement.
Article Search
Interpretation and Evaluation
Unifying Analytic and Statically-Typed Quasiquotes
Lionel Parreaux, Antoine Voizard, Amir Shaikhha, and Christoph E. Koch
(EPFL, Switzerland; University of Pennsylvania, USA)
Metaprograms are programs that manipulate (generate, analyze and evaluate) other programs. These tasks are greatly facilitated by quasiquotation, a technique to construct and deconstruct program fragments using quoted code templates expressed in the syntax of the manipulated language. We argue that two main flavors of quasiquotes have existed so far: Lisp-style quasiquotes, which can both construct and deconstruct programs but may produce code that contains type mismatches and unbound variables; and MetaML-style quasiquotes, which rely on static typing to prevent these errors, but can only construct programs. In this paper, we show how to combine the advantages of both flavors into a unified framework: we allow the construction, deconstruction and evaluation of program fragments while ensuring that generated programs are well-typed and well-scoped, a combination unseen in previous work. We formalize our approach as λ^{{}}, a multi-stage calculus with code pattern matching and rewriting, and prove its type safety. We also present its realization in Squid, a metaprogramming framework for Scala, leveraging Scala’s expressive type system. To demonstrate the usefulness of our approach, we introduce speculative rewrite rules, a novel code transformation technique that makes decisive use of these capabilities, and we outline how it simplifies the design of some crucial query compiler optimizations.
Article Search
Artifacts Functional
Jones-Optimal Partial Evaluation by Specialization-Safe Normalization
Matt Brown and Jens Palsberg
(University of California at Los Angeles, USA)
We present partial evaluation by specialization-safe normalization, a novel partial evaluation technique that is Jones-optimal, that can be self-applied to achieve the Futamura projections and that can be type-checked to ensure it always generates code with the correct type. Jones-optimality is the gold-standard for nontrivial partial evaluation and guarantees that a specializer can remove an entire layer of interpretation. We achieve Jones-optimality by using a novel affine-variable static analysis that directs specialization-safe normalization to always decrease a program’s runtime.
We demonstrate the robustness of our approach by showing Jones-optimality in a variety of settings. We have formally proved that our partial evaluator is Jones-optimal for call-by-value reduction, and we have experimentally shown that it is Jones-optimal for call-by-value, normal-order, and memoized normal-order. Each of our experiments tests Jones-optimality with three different self-interpreters.
We implemented our partial evaluator in F_{ω}^{µ i}, a recent language for typed self-applicable meta-programming. It is the first Jones-optimal and self-applicable partial evaluator whose type guarantees that it always generates type-correct code.
Article Search
Info
Artifacts Functional
Migrating Gradual Types
John Peter Campora, Sheng Chen, Martin Erwig, and Eric Walkingshaw
(University of Louisiana at Lafayette, USA; Oregon State University, USA)
Gradual typing allows programs to enjoy the benefits of both static typing and dynamic typing. While it is often desirable to migrate a program from more dynamically-typed to more statically-typed or vice versa, gradual typing itself does not provide a way to facilitate this migration. This places the burden on programmers who have to manually add or remove type annotations. Besides the general challenge of adding type annotations to dynamically typed code, there are subtle interactions between these annotations in gradually typed code that exacerbate the situation. For example, to migrate a program to be as static as possible, in general, all possible combinations of adding or removing type annotations from parameters must be tried out and compared.
In this paper, we address this problem by developing migrational typing, which efficiently types all possible ways of adding or removing type annotations from a gradually typed program. The typing result supports automatically migrating a program to be as static as possible, or introducing the least number of dynamic types necessary to remove a type error. The approach can be extended to support user-defined criteria about which annotations to modify. We have implemented migrational typing and evaluated it on large programs. The results show that migrational typing scales linearly with the size of the program and takes only 2–4 times longer than plain gradual typing.
Article Search
Archive submitted (0 MB)
Artifacts Available
Artifacts Functional
Intrinsically-Typed Definitional Interpreters for Imperative Languages
Casper Bach Poulsen, Arjen Rouvoet, Andrew Tolmach, Robbert Krebbers, and Eelco Visser
(Delft University of Technology, Netherlands; Portland State University, USA)
A definitional interpreter defines the semantics of an object language in terms of the (well-known) semantics of a host language, enabling understanding and validation of the semantics through execution. Combining a definitional interpreter with a separate type system requires a separate type safety proof. An alternative approach, at least for pure object languages, is to use a dependently-typed language to encode the object language type system in the definition of the abstract syntax. Using such intrinsically-typed abstract syntax definitions allows the host language type checker to verify automatically that the interpreter satisfies type safety. Does this approach scale to larger and more realistic object languages, and in particular to languages with mutable state and objects?
In this paper, we describe and demonstrate techniques and libraries in Agda that successfully scale up intrinsically-typed definitional interpreters to handle rich object languages with non-trivial binding structures and mutable state. While the resulting interpreters are certainly more complex than the simply-typed λ-calculus interpreter we start with, we claim that they still meet the goals of being concise, comprehensible, and executable, while guaranteeing type safety for more elaborate object languages. We make the following contributions: (1) A dependent-passing style technique for hiding the weakening of indexed values as they propagate through monadic code. (2) An Agda library for programming with scope graphs and frames, which provides a uniform approach to dealing with name binding in intrinsically-typed interpreters. (3) Case studies of intrinsically-typed definitional interpreters for the simply-typed λ-calculus with references (STLC+Ref) and for a large subset of Middleweight Java (MJ).
Article Search
Archive submitted (1 MB)
Artifacts Available
Artifacts Functional
Memory and Concurrency
Effective Stateless Model Checking for C/C++ Concurrency
Michalis Kokologiannakis, Ori Lahav, Konstantinos Sagonas, and
Viktor Vafeiadis
(National Technical University of Athens, Greece; Tel Aviv University, Israel; Uppsala University, Sweden; MPI-SWS, Germany)
We present a stateless model checking algorithm for verifying concurrent programs running under
RC11, a repaired version of the C/C++11 memory model without dependency cycles.
Unlike most previous approaches, which enumerate thread interleavings up to some partial order reduction improvements,
our approach works directly on execution graphs and (in the absence of RMW instructions and SC atomics) avoids redundant exploration by construction.
We have implemented a model checker, called RCMC,
based on this approach and applied it to a number of challenging concurrent programs.
Our experiments confirm that RCMC is significantly faster, scales better than other model checking tools, and is also more resilient to small changes in the benchmarks.
Preprint
Archive submitted (2232 MB)
Info
Artifacts Available
Artifacts Functional
Transactions in Relaxed Memory Architectures
Brijesh Dongol, Radha Jagadeesan, and James Riely
(Brunel University London, UK; DePaul University, USA)
The integration of transactions into hardware relaxed memory
architectures is a topic of current research both in industry and
academia. In this paper, we provide a general architectural framework
for the introduction of transactions into models of relaxed memory in
hardware, including the SC, TSO, ARMv8 and PPC models.
Our framework incorporates flexible and expressive forms of
transaction aborts and execution that have hitherto been in the realm
of software transactional memory. In contrast to software
transactional memory, we account for the characteristics of relaxed
memory as a restricted form of distributed system, without a notion of
global time. We prove abstraction theorems to demonstrate that the
programmer API matches the intuitions and expectations about
transactions.
Preprint
Simplifying ARM Concurrency: Multicopy-Atomic Axiomatic and Operational Models for ARMv8
Christopher Pulte, Shaked Flur, Will Deacon, Jon French, Susmit Sarkar, and Peter Sewell
(University of Cambridge, UK; ARM, UK; University of St. Andrews, UK)
ARM has a relaxed memory model, previously specified in informal prose for ARMv7 and ARMv8. Over time, and partly due to work building formal semantics for ARM concurrency, it has become clear that some of the complexity of the model is not justified by the potential benefits. In particular, the model was originally non-multicopy-atomic: writes could become visible to some other threads before becoming visible to all — but this has not been exploited in production implementations, the corresponding potential hardware optimisations are thought to have insufficient benefits in the ARM context, and it gives rise to subtle complications when combined with other ARMv8 features. The ARMv8 architecture has therefore been revised: it now has a multicopy-atomic model. It has also been simplified in other respects, including more straightforward notions of dependency, and the architecture now includes a formal concurrency model.
In this paper we detail these changes and discuss their motivation. We define two formal concurrency models: an operational one, simplifying the Flowing model of Flur et al., and the axiomatic model of the revised ARMv8 specification. The models were developed by an academic group and by ARM staff, respectively, and this extended collaboration partly motivated the above changes. We prove the equivalence of the two models. The operational model is integrated into an executable exploration tool with new web interface, demonstrated by exhaustively checking the possible behaviours of a loop-unrolled version of a Linux kernel lock implementation, a previously known bug due to unprevented speculation, and a fixed version.
Article Search
Artifacts Functional
Progress of Concurrent Objects with Partial Methods
Hongjin Liang and Xinyu Feng
(Nanjing University, China; University of Science and Technology of China, China)
Various progress properties have been proposed for concurrent objects, such as wait-freedom, lock-freedom, starvation-freedom and deadlock-freedom. However, none of them applies to concurrent objects with partial methods, i.e., methods that are supposed not to return under certain circumstances. A typical example is the lock_acquire method, which must not return when the lock has already been acquired.
In this paper we propose two new progress properties, partial starvation-freedom (PSF) and partial deadlock-freedom (PDF), for concurrent objects with partial methods. We also design four patterns to write abstract specifications for PSF or PDF objects under strongly or weakly fair scheduling, so that these objects contextually refine the abstract specifications. Our Abstraction Theorem shows the equivalence between PSF (or PDF) and the progress-aware contextual refinement. Finally, we generalize the program logic LiLi to have a new logic to verify the PSF (or PDF) property and linearizability of concurrent objects.
Article Search
Types
A Principled Approach to Ornamentation in ML
Thomas Williams and Didier Rémy
(Inria, France)
Ornaments are a way to describe changes in datatype definitions
reorganizing, adding, or dropping some
pieces of data so that functions operating on the bare definition can be
partially and sometimes totally lifted into functions operating on the
ornamented structure.
We propose an extension of ML with higher-order ornaments, demonstrate its
expressiveness with a few typical examples,
including code refactoring, study the metatheoretical
properties of ornaments, and describe their elaboration process.
We formalize ornamentation via an a posteriori abstraction of the bare code,
returning a generic term,
which lives in a meta-language above ML. The lifted code is obtained by
application of the generic term to well-chosen arguments, followed by
staged reduction, and some remaining simplifications.
We use logical relations to closely relate the lifted code to the bare
code.
Article Search
Info
Artifacts Functional
Type-Preserving CPS Translation of Σ and Π Types Is Not Not Possible
William J. Bowman, Youyou Cong, Nick Rioux, and Amal Ahmed
(Northeastern University, USA; Ochanomizu University, Japan)
Dependently typed languages such as Coq are used to specify and prove functional correctness of source programs, but what we ultimately need are guarantees about correctness of compiled code. By preserving dependent types through each compiler pass, we could preserve source-level specifications and correctness proofs into the generated target-language programs. Unfortunately, type-preserving compilation of dependent types is hard. In 2002, Barthe and Uustalu showed that type-preserving CPS is not possible for languages such as Coq. Specifically, they showed that for strong dependent pairs (Σ types), the standard typed call-by-name CPS is not type preserving. They further proved that for dependent case analysis on sums, a class of typed CPS translations—including the standard translation—is not possible. In 2016, Morrisett noticed a similar problem with the standard call-by-value CPS translation for dependent functions (Π types). In essence, the problem is that the standard typed CPS translation by double-negation, in which computations are assigned types of the form (A → ⊥) → ⊥, disrupts the term/type equivalence that is used during type checking in a dependently typed language.
In this paper, we prove that type-preserving CPS translation for dependently typed languages is not not possible. We develop both call-by-name and call-by-value CPS translations from the Calculus of Constructions with both Π and Σ types (CC) to a dependently typed target language, and prove type preservation and compiler correctness of each translation. Our target language is CC extended with an additional equivalence rule and an additional typing rule, which we prove consistent by giving a model in the extensional Calculus of Constructions. Our key observation is that we can use a CPS translation that employs answer-type polymorphism, where CPS-translated computations have type ∀ α. (A → α) → α. This type justifies, by a free theorem, the new equality rule in our target language and allows us to recover the term/type equivalences that CPS translation disrupts. Finally, we conjecture that our translation extends to dependent case analysis on sums, despite the impossibility result, and provide a proof sketch.
Preprint
Archive submitted (1 MB)
Info
Decidability of Conversion for Type Theory in Type Theory
Andreas Abel, Joakim Öhman, and Andrea Vezzosi
(University of Gothenburg, Sweden; IMDEA Software Institute, Spain; Chalmers University of Technology, Sweden)
Type theory should be able to handle its own meta-theory, both to justify its foundational claims and to obtain a verified implementation. At the core of a type checker for intensional type theory lies an algorithm to check equality of types, or in other words, to check whether two types are convertible. We have formalized in Agda a practical conversion checking algorithm for a dependent type theory with one universe à la Russell, natural numbers, and η-equality for Π types. We prove the algorithm correct via a Kripke logical relation parameterized by a suitable notion of equivalence of terms. We then instantiate the parameterized fundamental lemma twice: once to obtain canonicity and injectivity of type formers, and once again to prove the completeness of the algorithm. Our proof relies on inductive-recursive definitions, but not on the uniqueness of identity proofs. Thus, it is valid in variants of intensional Martin-Löf Type Theory as long as they support induction-recursion, for instance, Extensional, Observational, or Homotopy Type Theory.
Article Search
Artifacts Functional
Safety and Conservativity of Definitions in HOL and Isabelle/HOL
Ondřej Kunčar and
Andrei Popescu
(TU Munich, Germany; Middlesex University, UK; Institute of Mathematics at Romanian Academy, Romania)
Definitions are traditionally considered to be a safe mechanism for
introducing concepts on top of a logic known to be consistent. In contrast to
arbitrary axioms, definitions should
in principle be treatable as a form of abbreviation, and thus compiled away from
the theory without losing provability. In particular, definitions should form
a conservative extension of the pure logic.
These properties are crucial for modern interactive theorem provers, since they
ensure the consistency of the logic, as well as a valid environment for
total/certified functional programming.
We prove these properties,
namely, safety and conservativity, for Higher-Order Logic (HOL),
a logic implemented in several mainstream
theorem provers and relied upon by thousands of users.
Some unique features of HOL, such as the requirement to
give non-emptiness proofs when defining new types and the impossibility
to
unfold type definitions, make the proof of these properties, and
also the very formulation of safety, nontrivial.
Our study also factors in the essential variation
of HOL definitions featured by Isabelle/HOL, a popular member
of the HOL-based provers family. The current work improves on
recent
results which showed a weaker property, consistency of Isabelle/HOL's definitions.
Preprint
Info
Consistency
Sound, Complete, and Tractable Linearizability Monitoring for Concurrent Collections
Michael Emmi and Constantin Enea
(SRI International, USA; University of Paris Diderot, France; CNRS, France)
While many program properties like the validity of assertions and in-bounds array accesses admit nearly-trivial monitoring algorithms, the standard correctness criterion for concurrent data structures does not. Given an implementation of an arbitrary abstract data type, checking whether the operations invoked in one single concurrent execution are linearizable, i.e., indistinguishable from an execution where the same operations are invoked atomically, requires exponential time in the number of operations.
In this work we identify a class of collection abstract data types which admit polynomial-time linearizability monitors. Collections capture the majority of concurrent data structures available in practice, including stacks, queues, sets, and maps. Although monitoring executions of arbitrary abstract data types requires enumerating exponentially-many possible linearizations, collections enjoy combinatorial properties which avoid the enumeration. We leverage these properties to reduce linearizability to Horn satisfiability. As far as we know, ours is the first sound, complete, and tractable algorithm for monitoring linearizability for types beyond single-value registers.
Article Search
Reducing Liveness to Safety in First-Order Logic
Oded Padon, Jochen Hoenicke, Giuliano Losa, Andreas Podelski, Mooly Sagiv, and Sharon Shoham
(Tel Aviv University, Israel; University of Freiburg, Germany; University of California at Los Angeles, USA)
We develop a new technique for verifying temporal properties of
infinite-state (distributed) systems. The main idea is to reduce the
temporal verification problem to the problem of verifying the safety
of infinite-state systems expressed in first-order logic. This allows
to leverage existing techniques for safety verification to verify
temporal properties of interesting distributed protocols, including
some that have not been mechanically verified before. We model
infinite-state systems using first-order logic, and use first-order
temporal logic (FO-LTL) to specify temporal properties. This general
formalism allows to naturally model distributed systems, while
supporting both unbounded-parallelism (where the system is allowed to
dynamically create processes), and infinite-state per process.
The traditional approach for verifying temporal properties of
infinite-state systems employs well-founded relations (e.g. using
linear arithmetic ranking functions). In contrast, our approach is
based the idea of fair cycle detection. In finite-state systems,
temporal verification can always be reduced to fair cycle detection (a
system contains a fair cycle if it revisits a state after satisfying
all fairness constraints). However, with both infinitely many states
and infinitely many fairness constraints, a straightforward reduction
to fair cycle detection is unsound. To regain soundness, we augment
the infinite-state transition system by a dynamically computed finite
set, that exploits the locality of transitions. This set lets us
define a form of fair cycle detection that is sound in the presence of
both infinitely many states, and infinitely many fairness constraints.
Our approach allows a new style of temporal verification that does not
explicitly involve ranking functions. This fits well with pure
first-order verification which does not explicitly reason about
numerical values. In particular, it can be used with effectively
propositional first-order logic (EPR), in which case checking
verification conditions is decidable. We applied our technique to
verify temporal properties of several interesting protocols. To the
best of our knowledge, we have obtained the first mechanized liveness
proof for both TLB Shootdown, and Stoppable Paxos.
Article Search
Archive submitted (3466 MB)
Info
Artifacts Available
Artifacts Functional
Alone Together: Compositional Reasoning and Inference for Weak Isolation
Gowtham Kaki, Kartik Nagar, Mahsa Najafzadeh, and Suresh Jagannathan
(Purdue University, USA)
Serializability is a well-understood correctness criterion that simplifies reasoning about the behavior of concurrent transactions by ensuring they are isolated from each other while they execute. However, enforcing serializable isolation comes at a steep cost in performance because it necessarily restricts opportunities to exploit concurrency even when such opportunities would not violate application-specific invariants. As a result, database systems in practice support, and often encourage, developers to implement transactions using weaker alternatives. These alternatives break the strong isolation guarantees offered by serializable transactions to permit greater concurrency. Unfortunately, the semantics of weak isolation is poorly understood, and usually explained only informally in terms of low-level implementation artifacts. Consequently, verifying high-level correctness properties in such environments remains a challenging problem. To address this issue, we present a novel program logic that enables compositional reasoning about the behavior of concurrently executing weakly-isolated transactions. Recognizing that the proof burden necessary to use this logic may dissuade application developers, we also describe an inference procedure based on this foundation that ascertains the weakest isolation level that still guarantees the safety of high-level consistency assertions associated with such transactions. The key to effective inference is the observation that weakly-isolated transactions can be viewed as functional (monadic) computations over an abstract database state, allowing us to treat their operations as state transformers over the database. This interpretation enables automated verification using off-the-shelf SMT solvers. Our development is parametric over a transaction’s specific isolation semantics, allowing it to be applicable over a range of concurrency control mechanisms. Case studies and experiments on real-world applications (written in an embedded DSL in OCaml) demonstrate the utility of our approach, and provide strong evidence that automated verification of weakly-isolated transactions can be placed on the same formal footing as their strongly-isolated serializable counterparts.
Article Search
Programming and Proving with Distributed Protocols
Ilya Sergey, James R. Wilcox, and Zachary Tatlock
(University College London, UK; University of Washington, USA)
Distributed systems play a crucial role in modern infrastructure, but are notoriously difficult to implement correctly. This difficulty arises from two main challenges: (a) correctly implementing core system components (e.g., two-phase commit), so all their internal invariants hold, and (b) correctly composing standalone system components into functioning trustworthy applications (e.g., persistent storage built on top of a two-phase commit instance). Recent work has developed several approaches for addressing (a) by means of mechanically verifying implementations of core distributed components, but no methodology exists to address (b) by composing such verified components into larger verified applications. As a result, expensive verification efforts for key system components are not easily reusable, which hinders further verification efforts.
In this paper, we present Disel, the first framework for implementation and compositional verification of distributed systems and their clients, all within the mechanized, foundational context of the Coq proof assistant. In Disel, users implement distributed systems using a domain specific language shallowly embedded in Coq and providing both high-level programming constructs as well as low-level communication primitives. Components of composite systems are specified in Disel as protocols, which capture system-specific logic and disentangle system definitions from implementation details. By virtue of Disel's dependent type system, well-typed implementations always satisfy their protocols' invariants and never go wrong, allowing users to verify system implementations interactively using Disel's Hoare-style program logic, which extends state-of-the-art techniques for concurrency verification to the distributed setting. By virtue of the substitution principle and frame rule provided by Disel's logic, system components can be composed leading to modular, reusable verified distributed systems.
We describe Disel, illustrate its use with a series of examples, outline its logic and metatheory, and report on our experience using it as a framework for implementing, specifying, and verifying distributed systems.
Preprint
Archive submitted (0 MB)
Info
Artifacts Available
Artifacts Functional
Types and Effects 2
Inference of Static Semantics for Incomplete C Programs
Leandro T. C. Melo, Rodrigo G. Ribeiro, Marcus R. de Araújo, and Fernando Magno Quintão Pereira
(Federal University of Minas Gerais, Brazil; Federal University of Ouro Preto, Brazil)
Incomplete source code naturally emerges in software development: during the design phase, while evolving, testing and analyzing programs. Therefore, the ability to understand partial programs is a valuable asset. However, this problem is still unsolved in the C programming language. Difficulties stem from the fact that parsing C requires, not only syntax, but also semantic information. Furthermore, inferring types so that they respect C's type system is a challenging task. In this paper we present a technique that lets us solve these problems. We provide a unification-based type inference capable of dealing with C intricacies. The ideas we present let us reconstruct partial C programs into complete well-typed ones. Such program reconstruction has several applications: enabling static analysis tools in scenarios where software components may be absent; improving static analysis tools that do not rely on build-specifications; allowing stub-generation and testing tools to work on snippets; and assisting programmers on the extraction of reusable data-structures out of the program parts that use them. Our evaluation is performed on source code from a variety of C libraries such as GNU's Coreutils, GNULib, GNOME's GLib, and GDSL; on implementations from Sedgewick's books; and on snippets from popular open-source projects like CPython, FreeBSD, and Git.
Preprint
Archive submitted (9 MB)
Info
Artifacts Available
Artifacts Functional
Optimal Dyck Reachability for Data-Dependence and Alias Analysis
Krishnendu Chatterjee, Bhavya Choudhary, and Andreas Pavlogiannis
(IST Austria, Austria; IIT Bombay, India)
A fundamental algorithmic problem at the heart of static analysis is Dyck reachability. The input is a graph where the edges are labeled with different types of opening and closing parentheses, and the reachability information is computed via paths whose parentheses are properly matched. We present new results for Dyck reachability problems with applications to alias analysis and data-dependence analysis. Our main contributions, that include improved upper bounds as well as lower bounds that establish optimality guarantees, are as follows:
First, we consider Dyck reachability on bidirected graphs, which is the standard way of performing field-sensitive points-to analysis. Given a bidirected graph with n nodes and m edges, we present: (i) an algorithm with worst-case running time O(m + n · α(n)), where α(n) is the inverse Ackermann function, improving the previously known O(n^{2}) time bound; (ii) a matching lower bound that shows that our algorithm is optimal wrt to worst-case complexity; and (iii) an optimal average-case upper bound of O(m) time, improving the previously known O(m · logn) bound.
Second, we consider the problem of context-sensitive data-dependence analysis, where the task is to obtain analysis summaries of library code in the presence of callbacks. Our algorithm preprocesses libraries in almost linear time, after which the contribution of the library in the complexity of the client analysis is only linear, and only wrt the number of call sites.
Third, we prove that combinatorial algorithms for Dyck reachability on general graphs with truly sub-cubic bounds cannot be obtained without obtaining sub-cubic combinatorial algorithms for Boolean Matrix Multiplication, which is a long-standing open problem. Thus we establish that the existing combinatorial algorithms for Dyck reachability are (conditionally) optimal for general graphs. We also show that the same hardness holds for graphs of constant treewidth.
Finally, we provide a prototype implementation of our algorithms for both alias analysis and data-dependence analysis. Our experimental evaluation demonstrates that the new algorithms significantly outperform all existing methods on the two problems, over real-world benchmarks.
Article Search
Artifacts Functional
Data-Centric Dynamic Partial Order Reduction
Marek Chalupa, Krishnendu Chatterjee, Andreas Pavlogiannis, Nishant Sinha, and Kapil Vaidya
(Masaryk University, Czechia; IST Austria, Austria; Kena Labs, India; IIT Bombay, India)
We present a new dynamic partial-order reduction method for stateless model checking of concurrent programs. A common approach for exploring program behaviors relies on enumerating the traces of the program, without storing the visited states (aka stateless exploration). As the number of distinct traces grows exponentially, dynamic partial-order reduction (DPOR) techniques have been successfully used to partition the space of traces into equivalence classes (Mazurkiewicz partitioning), with the goal of exploring only few representative traces from each class.
We introduce a new equivalence on traces under sequential consistency semantics, which we call the observation equivalence. Two traces are observationally equivalent if every read event observes the same write event in both traces. While the traditional Mazurkiewicz equivalence is control-centric, our new definition is data-centric. We show that our observation equivalence is coarser than the Mazurkiewicz equivalence, and in many cases even exponentially coarser. We devise a DPOR exploration of the trace space, called data-centric DPOR, based on the observation equivalence.
- For acyclic architectures, our algorithm is guaranteed to explore exactly one representative trace from each observation class, while spending polynomial time per class. Hence, our algorithm is optimal wrt the observation equivalence, and in several cases explores exponentially fewer traces than any enumerative method based on the Mazurkiewicz equivalence.
- For cyclic architectures, we consider an equivalence between traces which is finer than the observation equivalence; but coarser than the Mazurkiewicz equivalence, and in some cases is exponentially coarser. Our data-centric DPOR algorithm remains optimal under this trace equivalence.
Finally, we perform a basic experimental comparison between the existing Mazurkiewicz-based DPOR and our data-centric DPOR on a set of academic benchmarks. Our results show a significant reduction in both running time and the number of explored equivalence classes.
Article Search
Analytical Modeling of Cache Behavior for Affine Programs
Wenlei Bao, Sriram Krishnamoorthy, Louis-Noel Pouchet, and P. Sadayappan
(Ohio State University, USA; Pacific Northwest National Laboratory, USA; Colorado State University, USA)
Optimizing compilers implement program transformation strategies aimed at reducing data movement to or from main memory by exploiting the data-cache hierarchy. However, instead of attempting to minimize the number of cache misses, very approximate cost models are used, due to the lack of precise compile-time models for misses for hierarchical caches. The current state of practice for cache miss analysis is based on accurate simulation. However, simulation requires time proportional to the dataset/problem size, as well as the number of distinct cache configurations of interest to be evaluated.
This paper takes a fundamentally different approach, by focusing on polyhedral programs with static control flow. Instead of relying on costly simulation, a closed-form solution for modeling of misses in a set associative cache hierarchy is developed. This solution can enable program transformation choice at compile time to optimize cache misses. A tool implementing the approach has been developed and used for validation of the framework.
Article Search
Termination
A New Proof Rule for Almost-Sure Termination
Annabelle McIver, Carroll Morgan, Benjamin Lucien Kaminski, and Joost-Pieter Katoen
(Macquarie University, Australia; UNSW, Australia; Data61 at CSIRO, Australia; RWTH Aachen University, Germany; University College London, UK; IST Austria, Austria)
We present a new proof rule for proving almost-sure termination of probabilistic programs, including those that contain demonic non-determinism.
An important question for a probabilistic program is whether the probability mass of all its diverging runs is zero, that is that it terminates "almost surely". Proving that can be hard, and this paper presents a new method for doing so. It applies directly to the program's source code, even if the program contains demonic choice.
Like others, we use variant functions (a.k.a. "super-martingales") that are real-valued and decrease randomly on each loop iteration; but our key innovation is that the amount as well as the probability of the decrease are parametric. We prove the soundness of the new rule, indicate where its applicability goes beyond existing rules, and explain its connection to classical results on denumerable (non-demonic) Markov chains.
Article Search
Lexicographic Ranking Supermartingales: An Efficient Approach to Termination of Probabilistic Programs
Sheshansh Agrawal, Krishnendu Chatterjee, and Petr Novotný
(IIT Bombay, India; IST Austria, Austria)
Probabilistic programs extend classical imperative programs with
real-valued random variables and random branching.
The most basic liveness property for such programs is the termination
property.
The qualitative (aka almost-sure) termination problem asks whether a given
program program terminates with probability 1.
While ranking functions provide a sound and complete method for
non-probabilistic
programs, the extension of them to probabilistic programs is achieved
via ranking supermartingales (RSMs).
Although deep theoretical results have been established about RSMs,
their application to probabilistic programs with nondeterminism has been limited
only to programs of restricted control-flow structure.
For non-probabilistic programs, lexicographic ranking functions provide a
compositional
and practical approach for termination analysis of real-world programs.
In this work we introduce lexicographic RSMs and show that they present a sound
method for almost-sure termination of probabilistic programs with nondeterminism.
We show that lexicographic RSMs provide a tool for compositional reasoning
about almost-sure termination,
and for probabilistic programs with linear arithmetic they can be synthesized
efficiently (in polynomial time).
We also show that with additional restrictions even asymptotic bounds on expected
termination time can be obtained through lexicographic RSMs.
Finally, we present experimental results on benchmarks adapted from previous work
to demonstrate the effectiveness of our approach.
Article Search
Artifacts Functional
Algorithmic Analysis of Termination Problems for Quantum Programs
Yangjia Li and Mingsheng Ying
(Institute of Software at Chinese Academy of Sciences, China; University of Technology Sydney, Australia; Tsinghua University, China)
We introduce the notion of linear ranking super-martingale (LRSM) for quantum programs (with nondeterministic choices, namely angelic and demonic choices). Several termination theorems are established showing that the existence of the LRSMs of a quantum program implies its termination. Thus, the termination problems of quantum programs is reduced to realisability and synthesis of LRSMs. We further show that the realisability and synthesis problem of LRSMs for quantum programs can be reduced to an SDP (Semi-Definite Programming) problem, which can be settled with the existing SDP solvers. The techniques developed in this paper are used to analyse the termination of several example quantum programs, including quantum random walks and quantum Bernoulli factory for random number generation. This work is essentially a generalisation of constraint-based approach to the corresponding problems for probabilistic programs developed in the recent literature by adding two novel ideas: (1) employing the fundamental Gleason's theorem in quantum mechanics to guide the choices of templates; and (2) a generalised Farkas' lemma in terms of observables (Hermitian operators) in quantum physics.
Article Search
Monadic Refinements for Relational Cost Analysis
Ivan Radiček, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Florian Zuleger
(Vienna University of Technology, Austria; IMDEA Software Institute, Spain; SUNY Buffalo, USA; MPI-SWS, Germany)
Formal frameworks for cost analysis of programs have been widely studied in the unary setting and, to a limited extent, in the relational setting. However, many of these frameworks focus only on the cost aspect, largely side-lining functional properties that are often a prerequisite for cost analysis, thus leaving many interesting programs out of their purview. In this paper, we show that elegant, simple, expressive proof systems combining cost analysis and functional properties can be built by combining already known ingredients: higher-order refinements and cost monads. Specifically, we derive two syntax-directed proof systems, U^{C} and R^{C}, for unary and relational cost analysis, by adding a cost monad to a (syntax-directed) logic of higher-order programs. We study the metatheory of the systems, show that several nontrivial examples can be verified in them, and prove that existing frameworks for cost analysis (RelCost and RAML) can be embedded in them.
Article Search
Outside the Box
Go with the Flow: Compositional Abstractions for Concurrent Data Structures
Siddharth Krishna, Dennis Shasha, and Thomas Wies
(New York University, USA)
Concurrent separation logics have helped to significantly simplify correctness proofs for concurrent data structures. However, a recurring problem in such proofs is that data structure abstractions that work well in the sequential setting are much harder to reason about in a concurrent setting due to complex sharing and overlays. To solve this problem, we propose a novel approach to abstracting regions in the heap by encoding the data structure invariant into a local condition on each individual node. This condition may depend on a quantity associated with the node that is computed as a fixpoint over the entire heap graph. We refer to this quantity as a flow. Flows can encode both structural properties of the heap (e.g. the reachable nodes from the root form a tree) as well as data invariants (e.g. sortedness). We then introduce the notion of a flow interface, which expresses the relies and guarantees that a heap region imposes on its context to maintain the local flow invariant with respect to the global heap. Our main technical result is that this notion leads to a new semantic model of separation logic. In this model, flow interfaces provide a general abstraction mechanism for describing complex data structures. This abstraction mechanism admits proof rules that generalize over a wide variety of data structures. To demonstrate the versatility of our approach, we show how to extend the logic RGSep with flow interfaces. We have used this new logic to prove linearizability and memory safety of nontrivial concurrent data structures. In particular, we obtain parametric linearizability proofs for concurrent dictionary algorithms that abstract from the details of the underlying data structure representation. These proofs cannot be easily expressed using the abstraction mechanisms provided by existing separation logics.
Article Search
Parametricity versus the Universal Type
Dominique Devriese, Marco Patrignani, and
Frank Piessens
(KU Leuven, Belgium; MPI-SWS, Germany; CISPA, Germany)
There has long been speculation in the scientific literature on how to dynamically enforce parametricity such as that yielded by System F.
Almost 20 years ago, Sumii and Pierce proposed a formal compiler from System F into the cryptographic lambda calculus: an untyped lambda calculus extended with an idealised model of encryption.
They conjectured that this compiler was fully abstract, i.e. that compiled terms are contextually equivalent if and only if the original terms were, a property that can be seen as a form of secure compilation.
The conjecture has received attention in several other publications since then, but remains open to this day.
More recently, several researchers have been looking at gradually-typed languages that extend System F.
In this setting it is natural to wonder whether embedding System F into these gradually-typed languages preserves contextual equivalence and thus parametricity.
In this paper, we answer both questions negatively.
We provide a concrete counterexample: two System F terms whose contextual equivalence is not preserved by the Sumii-Pierce compiler, nor the embedding into the polymorphic blame calculus.
This counterexample relies on the absence in System F of what we call a universal type, i.e., a type that all other types can be injected into and extracted from.
As the languages in which System F is compiled have a universal type, the compilation cannot be fully abstract; this paper explains why.
We believe this paper thus sheds light on recent results in the field of gradually typed languages and it provides a perspective for further research into secure compilation of polymorphic languages.
Article Search
Archive submitted (0 MB)
Linearity in Higher-Order Recursion Schemes
Pierre Clairambault, Charles Grellois, and Andrzej S. Murawski
(University of Lyon, France; CNRS, France; ENS Lyon, France; Claude Bernard University Lyon 1, France; LIP, France; Inria, France; Aix-Marseille University, France; ENSAM, France; University of Toulon, France; University of Oxford, UK)
Higher-order recursion schemes (HORS) have recently emerged as a promising foundation for higher-order program verification. We examine the impact of enriching HORS with linear types. To that end, we introduce two frameworks that blend non-linear and linear types: a variant of the λY -calculus and an extension of HORS, called linear HORS (LHORS).
First we prove that the two formalisms are equivalent and there exist polynomial-time translations between them. Then, in order to support model-checking of (trees generated by) LHORS, we propose a refined version of alternating parity tree automata, called LNAPTA, whose behaviour depends on information about linearity. We show that the complexity of LNAPTA model-checking for LHORS depends on two type-theoretic parameters: linear order and linear depth. The former is in general smaller than the standard notion of order and ignores linear function spaces. In contrast, the latter measures the depth of linear clusters inside a type. Our main result states that LNAPTA model-checking of LHORS of linear order n is n-EXPTIME-complete, when linear depth is fixed. This generalizes and improves upon the classic result of Ong, which relies on the standard notion of order.
To illustrate the significance of the result, we consider two applications: the MSO model-checking problem on variants of HORS with case distinction (RSFD and HORSC) on a finite domain and a call-by-value resource verification problem. In both cases, decidability can be established by translation into HORS, but the implied complexity bounds will be suboptimal due to increases in type order. In contrast, we show that the complexity bounds derived by translations into LHORS and appealing to our result are optimal in that they match the respective hardness results.
Article Search
Symbolic Types for Lenient Symbolic Execution
Stephen Chang, Alex Knauth, and Emina Torlak
(Northeastern University, USA; University of Washington, USA)
We present lambda_sym, a typed λ-calculus for lenient symbolic execution, where some language constructs do not recognize symbolic values. Its type system, however, ensures safe behavior of all symbolic values in a program. Our calculus extends a base occurrence typing system with symbolic types and mutable state, making it a suitable model for both functional and imperative symbolically executed languages. Naively allowing mutation in this mixed setting introduces soundness issues, however, so we further add concreteness polymorphism, which restores soundness without rejecting too many valid programs. To show that our calculus is a useful model for a real language, we implemented Typed Rosette, a typed extension of the solver-aided Rosette language. We evaluate Typed Rosette by porting a large code base, demonstrating that our type system accommodates a wide variety of symbolically executed programs.
Article Search
Artifacts Functional
Language Design
Simplicitly: Foundations and Applications of Implicit Function Types
Martin Odersky, Olivier Blanvillain, Fengyun Liu, Aggelos Biboudis, Heather Miller, and Sandro Stucki
(EPFL, Switzerland; Northeastern University, USA)
Understanding a program entails understanding its context; dependencies, configurations and even implementations are all forms of contexts. Modern programming languages and theorem provers offer an array of constructs to define contexts, implicitly. Scala offers implicit parameters which are used pervasively, but which cannot be abstracted over.
This paper describes a generalization of implicit parameters to implicit function types, a powerful way to abstract over the context in which some piece of code is run. We provide a formalization based on bidirectional type-checking that closely follows the semantics implemented by the Scala compiler.
To demonstrate their range of abstraction capabilities, we present several applications that make use of implicit function types. We show how to encode the builder pattern, tagless interpreters, reader and free monads and we assess the performance of the monadic structures presented.
Article Search
Archive submitted (0 MB)
Artifacts Available
Artifacts Functional
Dependent Types
Univalent Higher Categories via Complete Semi-Segal Types
Paolo Capriotti and
Nicolai Kraus
(University of Nottingham, UK)
Category theory in homotopy type theory is intricate as categorical laws can only be stated "up to homotopy", and thus require coherences. The established notion of a univalent category (as introduced by Ahrens et al.) solves this by considering only truncated types, roughly corresponding to an ordinary category. This fails to capture many naturally occurring structures, stemming from the fact that the naturally occurring structures in homotopy type theory are not ordinary, but rather higher categories.
Out of the large variety of approaches to higher category theory that mathematicians have proposed, we believe that, for type theory, the simplicial strategy is best suited. Work by Lurie and Harpaz motivates the following definition. Given the first (n+3) levels of a semisimplicial type S, we can equip S with three properties: first, contractibility of the types of certain horn fillers; second, a completeness property; and third, a truncation condition. We call this a complete semi-Segal n-type. This is very similar to an earlier suggestion by Schreiber.
The definition of a univalent (1-) category by Ahrens et al. can easily be extended or restricted to the definition of a univalent n-category (more precisely, (n,1)-category) for n ∈ {0,1,2}, and we show that the type of complete semi-Segal n-types is equivalent to the type of univalent n-categories in these cases. Thus, we believe that the notion of a complete semi-Segal n-type can be taken as the definition of a univalent n-category.
We provide a formalisation in the proof assistant Agda using a completely explicit representation of semi-simplicial types for levels up to 4.
Preprint
Info
Artifacts Functional
Testing and Verification
Generating Good Generators for Inductive Relations
Leonidas Lampropoulos, Zoe Paraskevopoulou, and Benjamin C. Pierce
(University of Pennsylvania, USA; Princeton University, USA)
Property-based random testing (PBRT) is widely used in the functional programming and verification communities. For testing simple properties, PBRT tools such as QuickCheck can automatically generate random inputs of a given type. But for more complex properties, effective testing often demands generators for random inputs that belong to a given type and satisfy some logical condition. QuickCheck provides a library of combinators for building such generators by hand, but this can be tedious for simple conditions and error prone for more complex ones. Fortunately, the process can often be automated. The most prominent method, narrowing, works by traversing the structure of the condition, lazily instantiating parts of the data structure as constraints involving them are met.
We show how to use ideas from narrowing to compile a large subclass of Coq's inductive relations into efficient generators, avoiding the interpretive overhead of previous implementations. More importantly, the same compilation technique allows us to produce proof terms certifying that each derived generator is good---i.e., sound and complete with respect to the inductive relation it was derived from. We implement our algorithm as an extension of QuickChick, an existing tool for property-based testing in Coq. We evaluate our method by automatically deriving good generators for the majority of the specifications in Software Foundations, a formalized textbook on programming language foundations.
Article Search
Info
Artifacts Functional
Why Is Random Testing Effective for Partition Tolerance Bugs?
Rupak Majumdar and Filip Niksic
(MPI-SWS, Germany)
Random testing has proven to be an effective way to catch bugs in distributed systems in the presence of network partition faults. This is surprising, as the space of potentially faulty executions is enormous, and the bugs depend on a subtle interplay between sequences of operations and faults.
We provide a theoretical justification of the effectiveness of random testing in this context. First, we show a general construction, using the probabilistic method from combinatorics, that shows that whenever a random test covers a fixed coverage goal with sufficiently high probability, a small randomly-chosen set of tests achieves full coverage with high probability. In particular, we show that our construction can give test sets exponentially smaller than systematic enumeration. Second, based on an empirical study of many bugs found by random testing in production distributed systems, we introduce notions of test coverage relating to network partition faults which are effective in finding bugs. Finally, we show using combinatorial arguments that for these notions of test coverage we introduce, we can find a lower bound on the probability that a random test covers a given goal. Our general construction then explains why random testing tools achieve good coverage---and hence, find bugs---quickly.
While we formulate our results in terms of network partition faults, our construction provides a step towards rigorous analysis of random testing algorithms, and can be applicable in other scenarios.
Article Search
On Automatically Proving the Correctness of math.h Implementations
Wonyeol Lee, Rahul Sharma, and Alex Aiken
(Stanford University, USA; Microsoft Research, India)
Industry standard implementations of math.h claim (often without formal proof) tight bounds on floating-point errors. We demonstrate a novel static analysis that proves these bounds and verifies the correctness of these implementations. Our key insight is a reduction of this verification task to a set of mathematical optimization problems that can be solved by off-the-shelf computer algebra systems. We use this analysis to prove the correctness of implementations in Intel's math library automatically. Prior to this work, these implementations could only be verified with significant manual effort.
Article Search
Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts
Shelly Grossman, Ittai Abraham, Guy Golan-Gueta, Yan Michalevsky, Noam Rinetzky, Mooly Sagiv, and Yoni Zohar
(Tel Aviv University, Israel; VMware, USA; Stanford University, USA)
Callbacks are essential in many programming environments, but drastically complicate program understanding and reasoning because they allow to mutate object's local states by external objects in unexpected fashions, thus breaking modularity.
The famous DAO bug in the cryptocurrency framework Ethereum, employed callbacks to steal $150M.
We define the notion of Effectively Callback Free (ECF) objects in order to allow callbacks without preventing modular reasoning.
An object is ECF in a given execution trace if there exists an equivalent execution trace without callbacks to this object.
An object is ECF if it is ECF in every possible execution trace.
We study the decidability of dynamically checking ECF in a given execution trace and statically checking if an object is ECF.
We also show that dynamically checking ECF in Ethereum is feasible and can be done online.
By running the history of all execution traces in Ethereum, we were able to verify that virtually all existing contract executions, excluding these of the DAO or of contracts with similar known vulnerabilities, are ECF.
Finally, we show that ECF, whether it is verified dynamically or statically, enables modular reasoning about objects with encapsulated state.
Article Search
Artifacts Functional
Dynamic Languages
Correctness of Speculative Optimizations with Dynamic Deoptimization
Olivier Flückiger, Gabriel Scherer, Ming-Ho Yee, Aviral Goel, Amal Ahmed, and Jan Vitek
(Northeastern University, USA; Inria, France; Czech Technical University, Czechia)
High-performance dynamic language implementations make heavy use of
speculative optimizations to achieve speeds close to statically compiled
languages. These optimizations are typically performed by a just-in-time
compiler that generates code under a set of assumptions about the state of
the program and its environment. In certain cases, a program may execute
code compiled under assumptions that are no longer valid. The implementation
must then deoptimize the program on-the-fly; this entails finding
semantically equivalent code that does not rely on invalid
assumptions, translating program state to that expected by the target code,
and transferring control. This paper looks at the interaction between
optimization and deoptimization, and shows that reasoning about speculation
is surprisingly easy when assumptions are made explicit in the program
representation. This insight is demonstrated on a compiler intermediate
representation, named sourir, modeled after the high-level representation
for a dynamic language. Traditional compiler optimizations such as constant
folding, unreachable code elimination, and function inlining are shown to be correct
in the presence of assumptions. Furthermore, the paper establishes the
correctness of compiler transformations specific to deoptimization: namely
unrestricted deoptimization, predicate hoisting, and assume composition.
Preprint
JaVerT: JavaScript Verification Toolchain
José Fragoso Santos, Petar Maksimović, Daiva Naudžiūnienė, Thomas Wood, and Philippa Gardner
(Imperial College London, UK; Mathematical Institute SASA, Serbia)
The dynamic nature of JavaScript and its complex semantics make it a difficult target for logic-based verification. We introduce JaVerT, a semi-automatic JavaScript Verification Toolchain, based on separation logic and aimed at the specialist developer wanting rich, mechanically verified specifications of critical JavaScript code. To specify JavaScript programs, we design abstractions that capture its key heap structures (for example, prototype chains and function closures), allowing the developer to write clear and succinct specifications with minimal knowledge of the JavaScript internals. To verify JavaScript programs, we develop JaVerT, a verification pipeline consisting of: JS-2-JSIL, a well-tested compiler from JavaScript to JSIL, an intermediate goto language capturing the fundamental dynamic features of JavaScript; JSIL Verify, a semi-automatic verification tool based on a sound JSIL separation logic; and verified axiomatic specifications of the JavaScript internal functions. Using JaVerT, we verify functional correctness properties of: data-structure libraries (key-value map, priority queue) written in an object-oriented style; operations on data structures such as binary search trees (BSTs) and lists; examples illustrating function closures; and test cases from the official ECMAScript test suite. The verification times suggest that reasoning about larger, more complex code using JaVerT is feasible.
Article Search
Archive submitted (2813 MB)
Artifacts Available
Artifacts Functional
Soft Contract Verification for Higher-Order Stateful Programs
Phúc C. Nguyễn, Thomas Gilray, Sam Tobin-Hochstadt, and David Van Horn
(University of Maryland, USA; Indiana University, USA)
Software contracts allow programmers to state rich program properties
using the full expressive power of an object language. However, since
they are enforced at runtime, monitoring contracts imposes significant
overhead and delays error discovery. So contract veri cation aims to
guarantee all or most of these properties ahead of time, enabling
valuable optimizations and yielding a more general assurance of
correctness. Existing methods for static contract verification satisfy
the needs of more restricted target languages, but fail to address the
challenges unique to those conjoining untyped, dynamic programming,
higher-order functions, modularity, and statefulness. Our approach
tackles all these features at once, in the context of the full Racket
system—a mature environment for stateful, higher-order, multi-paradigm
programming with or with- out types. Evaluating our method using a set
of both pure and stateful benchmarks, we are able to verify 99.94% of
checks statically (all but 28 of 49, 861).
Stateful, higher-order functions pose significant challenges for
static contract verification in particular. In the presence of these
features, a modular analysis must permit code from the current module
to escape permanently to an opaque context (unspecified code from
outside the current module) that may be stateful and therefore store a
reference to the escaped closure. Also, contracts themselves, being
predicates wri en in unrestricted Racket, may exhibit stateful
behavior; a sound approach must be robust to contracts which are
arbitrarily expressive and interwoven with the code they monitor. In
this paper, we present and evaluate our solution based on higher-order
symbolic execution, explain the techniques we used to address such
thorny issues, formalize a notion of behavioral approximation, and use
it to provide a mechanized proof of soundness.
Preprint
Archive submitted (7 MB)
Info
Artifacts Available
Artifacts Functional
Collapsing Towers of Interpreters
Nada Amin and Tiark Rompf
(University of Cambridge, UK; Purdue University, USA)
Given a tower of interpreters, i.e., a sequence of multiple interpreters interpreting one another as input programs, we aim to collapse this tower into a compiler that removes all interpretive overhead and runs in a single pass. In the real world, a use case might be Python code executed by an x86 runtime, on a CPU emulated in a JavaScript VM, running on an ARM CPU. Collapsing such a tower can not only exponentially improve runtime performance, but also enable the use of base-language tools for interpreted programs, e.g., for analysis and verification. In this paper, we lay the foundations in an idealized but realistic setting.
We present a multi-level lambda calculus that features staging constructs and stage polymorphism: based on runtime parameters, an evaluator either executes source code (thereby acting as an interpreter) or generates code (thereby acting as a compiler). We identify stage polymorphism, a programming model from the domain of high-performance program generators, as the key mechanism to make such interpreters compose in a collapsible way.
We present Pink, a meta-circular Lisp-like evaluator on top of this calculus, and demonstrate that we can collapse arbitrarily many levels of self-interpretation, including levels with semantic modifications. We discuss several examples: compiling regular expressions through an interpreter to base code, building program transformers from modi ed interpreters, and others. We develop these ideas further to include reflection and reification, culminating in Purple, a reflective language inspired by Brown, Blond, and Black, which realizes a conceptually infinite tower, where every aspect of the semantics can change dynamically. Addressing an open challenge, we show how user programs can be compiled and recompiled under user-modified semantics.
Article Search
Info
Artifacts Functional
Program Analysis
Refinement Reflection: Complete Verification with SMT
Niki Vazou, Anish Tondwalkar, Vikraman Choudhury, Ryan G. Scott, Ryan R. Newton, Philip Wadler, and Ranjit Jhala
(University of Maryland, USA; University of California at San Diego, USA; Indiana University, USA; University of Edinburgh, UK; Input Output HK, UK)
We introduce Refinement Reflection, a new framework for building SMT-based deductive verifiers. The key idea is to reflect the code implementing a user-defined function into the function’s (output) refinement type. As a consequence, at uses of the function, the function definition is instantiated in the SMT logic in a precise fashion that permits decidable verification. Reflection allows the user to write equational proofs of programs just by writing other programs using pattern-matching and recursion to perform case-splitting and induction. Thus, via the propositions-as-types principle, we show that reflection permits the specification of arbitrary functional correctness properties. Finally, we introduce a proof-search algorithm called Proof by Logical Evaluation that uses techniques from model checking and abstract interpretation, to completely automate equational reasoning. We have implemented reflection in Liquid Haskell and used it to verify that the widely used instances of the Monoid, Applicative, Functor, and Monad typeclasses actually satisfy key algebraic laws required to make the clients safe, and have used reflection to build the first library that actually verifies assumptions about associativity and ordering that are crucial for safe deterministic parallelism.
Article Search
Archive submitted (54 MB)
Artifacts Available
Artifacts Functional
Non-linear Reasoning for Invariant Synthesis
Zachary Kincaid, John Cyphert, Jason Breck, and
Thomas Reps
(Princeton University, USA; University of Wisconsin-Madison, USA; GrammaTech, USA)
Automatic generation of non-linear loop invariants is a long-standing
challenge in program analysis, with many applications. For instance, reasoning
about exponentials provides a way to find invariants of digital-filter
programs, and reasoning about polynomials and/or logarithms is needed for
establishing invariants that describe the time or memory usage of many
well-known algorithms. An appealing approach to this challenge is to exploit
the powerful recurrence-solving techniques that have been developed in the
field of computer algebra, which can compute exact characterizations of
non-linear repetitive behavior. However, there is a gap between the
capabilities of recurrence solvers and the needs of program analysis: (1) loop
bodies are not merely systems of recurrence relations---they may contain
conditional branches, nested loops, non-deterministic assignments, etc., and
(2) a client program analyzer must be able to reason about the closed-form
solutions produced by a recurrence solver (e.g., to prove assertions).
This paper presents a method for generating non-linear invariants of general
loops based on analyzing recurrence relations. The key components are an
abstract domain for reasoning about non-linear arithmetic, a semantics-based
method for extracting recurrence relations from loop bodies, and a recurrence
solver that avoids closed forms that involve complex or irrational numbers.
Our technique has been implemented in a program analyzer that can analyze
general loops and mutually recursive procedures. Our experiments show that
our technique shows promise for non-linear assertion-checking and
resource-bound generation.
Article Search
Archive submitted (3187 MB)
Artifacts Available
Artifacts Functional
A Practical Construction for Decomposing Numerical Abstract Domains
Gagandeep Singh, Markus Püschel, and Martin Vechev
(ETH Zurich, Switzerland)
Numerical abstract domains such as Polyhedra, Octahedron, Octagon, Interval, and others are an essential component of static program analysis. The choice of domain offers a performance/precision tradeoff ranging from cheap and imprecise (Interval) to expensive and precise (Polyhedra). Recently, significant speedups were achieved for Octagon and Polyhedra by manually decomposing their transformers to work with the Cartesian product of projections associated with partitions of the variable set. While practically useful, this manual process is time consuming, error-prone, and has to be applied from scratch for every domain.
In this paper, we present a generic approach for decomposing the transformers of sub-polyhedra domains along with conditions for checking whether the decomposed transformers lose precision with respect to the original transformers. These conditions are satisfied by most practical transformers, thus our approach is suitable for increasing the performance of these transformers without compromising their precision. Furthermore, our approach is ``black box:'' it does not require changes to the internals of the original non-decomposed transformers or additional manual effort per domain.
We implemented our approach and applied it to the domains of Zones, Octagon, and Polyhedra. We then compared the performance of the decomposed transformers obtained with our generic method versus the state of the art: the (non-decomposed) PPL for Polyhedra and the much faster ELINA (which uses manual decomposition) for Polyhedra and Octagon. Against ELINA we demonstrate finer partitions and an associated speedup of about 2x on average. Our results indicate that the general construction presented in this work is a viable method for improving the performance of sub-polyhedra domains. It enables designers of abstract domains to benefit from decomposition without re-writing all of their transformers from scratch as required by prior work.
Article Search
Info
Artifacts Functional
Verifying Equivalence of Database-Driven Applications
Yuepeng Wang, Isil Dillig,
Shuvendu K. Lahiri, and William R. Cook
(University of Texas at Austin, USA; Microsoft Research, USA)
This paper addresses the problem of verifying equivalence between a pair of programs that operate over databases with different schemas. This problem is particularly important in the context of web applications, which typically undergo database refactoring either for performance or maintainability reasons. While web applications should have the same externally observable behavior before and after schema migration, there are no existing tools for proving equivalence of such programs. This paper takes a first step towards solving this problem by formalizing the equivalence and refinement checking problems for database-driven applications. We also propose a proof methodology based on the notion of bisimulation invariants over relational algebra with updates and describe a technique for synthesizing such bisimulation invariants. We have implemented the proposed technique in a tool called Mediator for verifying equivalence between database-driven applications written in our intermediate language and evaluate our tool on 21 benchmarks extracted from textbooks and real-world web applications. Our results show that the proposed methodology can successfully verify 20 of these benchmarks.
Article Search
Probability
Proving Expected Sensitivity of Probabilistic Programs
Gilles Barthe, Thomas Espitau, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub
(IMDEA Software Institute, Spain; UPMC, France; Inria, France; University College London, UK; École Polytechnique, France)
Program sensitivity, also known as Lipschitz continuity, describes how small changes in a program’s input lead to bounded changes in the output. We propose an average notion of program sensitivity for probabilistic programs—expected sensitivity—that averages a distance function over a probabilistic coupling of two output distributions from two similar inputs. By varying the distance, expected sensitivity recovers useful notions of probabilistic function sensitivity, including stability of machine learning algorithms and convergence of Markov chains.
Furthermore, expected sensitivity satisfies clean compositional properties and is amenable to formal verification. We develop a relational program logic called EpRHL for proving expected sensitivity properties. Our logic features two key ideas. First, relational pre-conditions and post-conditions are expressed using distances, a real-valued generalization of typical boolean-valued (relational) assertions. Second, judgments are interpreted in terms of expectation coupling, a novel, quantitative generalization of probabilistic couplings which supports compositional reasoning.
We demonstrate our logic on examples beyond the reach of prior relational logics. Our main example formalizes uniform stability of the stochastic gradient method. Furthermore, we prove rapid mixing for a probabilistic model of population dynamics. We also extend our logic with a transitivity principle for expectation couplings to capture the path coupling proof technique by Bubley and Dyer, and formalize rapid mixing of the Glauber dynamics from statistical physics.
Preprint
Artifacts Functional
Synthesizing Coupling Proofs of Differential Privacy
Aws Albarghouthi and Justin Hsu
(University of Wisconsin-Madison, USA; University College London, UK)
Differential privacy has emerged as a promising probabilistic formulation of privacy, generating intense interest within academia and industry. We present a push-button, automated technique for verifying ε-differential privacy of sophisticated randomized algorithms. We make several conceptual, algorithmic, and practical contributions: (i) Inspired by the recent advances on approximate couplings and randomness alignment, we present a new proof technique called coupling strategies, which casts differential privacy proofs as a winning strategy in a game where we have finite privacy resources to expend. (ii) To discover a winning strategy, we present a constraint-based formulation of the problem as a set of Horn modulo couplings (HMC) constraints, a novel combination of first-order Horn clauses and probabilistic constraints. (iii) We present a technique for solving HMC constraints by transforming probabilistic constraints into logical constraints with uninterpreted functions. (iv) Finally, we implement our technique in the FairSquare verifier and provide the first automated privacy proofs for a number of challenging algorithms from the differential privacy literature, including Report Noisy Max, the Exponential Mechanism, and the Sparse Vector Mechanism.
Preprint
Measurable Cones and Stable, Measurable Functions: A Model for Probabilistic Higher-Order Programming
Thomas Ehrhard, Michele Pagani, and Christine Tasson
(University of Paris Diderot, France; CNRS, France)
We define a notion of stable and measurable map between cones endowed with measurability tests and show that it forms a cpo-enriched cartesian closed category. This category gives a denotational model of an extension of PCF supporting the main primitives of probabilistic functional programming, like continuous and discrete probabilistic distributions, sampling, conditioning and full recursion. We prove the soundness and adequacy of this model with respect to a call-by-name operational semantics and give some examples of its denotations.
Article Search
Info
Denotational Validation of Higher-Order Bayesian Inference
Adam Ścibior, Ohad Kammar, Matthijs Vákár,
Sam Staton, Hongseok Yang, Yufei Cai, Klaus Ostermann, Sean K. Moss, Chris Heunen, and Zoubin Ghahramani
(University of Cambridge, UK; MPI Tübingen, Germany; University of Oxford, UK; KAIST, South Korea; University of Tübingen, Germany; University of Edinburgh, UK; Uber AI Labs, USA)
We present a modular semantic account of Bayesian inference algorithms for
probabilistic programming languages, as used in data
science and machine learning. Sophisticated inference algorithms are
often explained in terms of composition of smaller parts. However,
neither their theoretical justification nor their implementation
reflects this modularity. We show how to conceptualise and analyse
such inference algorithms as manipulating intermediate
representations of probabilistic programs using higher-order
functions and inductive types, and their denotational semantics.
Semantic accounts of continuous distributions use measurable
spaces. However, our use of higher-order functions presents a
substantial technical difficulty: it is impossible to define a
measurable space structure over the collection of measurable
functions between arbitrary measurable spaces that is compatible
with standard operations on those functions, such as function
application. We overcome this difficulty using quasi-Borel spaces,
a recently proposed mathematical structure that supports both
function spaces and continuous distributions.
We define a class of semantic structures for representing
probabilistic programs, and semantic validity criteria for
transformations of these representations in terms of distribution
preservation. We develop a collection of building blocks for
composing representations. We use these building blocks to validate
common inference algorithms such as Sequential Monte Carlo and
Markov Chain Monte Carlo. To emphasize the connection between the semantic manipulation and its traditional measure theoretic origins, we use Kock's
synthetic measure theory. We demonstrate its usefulness by proving a
quasi-Borel counterpart to the Metropolis-Hastings-Green
theorem.
Article Search
Info
Synthesis
Bonsai: Synthesis-Based Reasoning for Type Systems
Kartik Chandra and Rastislav Bodik
(Henry M. Gunn High School, USA; Stanford University, USA; University of Washington, USA)
When designing a type system, we may want to mechanically check the design to guide its further development. We describe algorithms that perform symbolic reasoning about executable models of type systems. The algorithms support three queries. First, they check type soundness and synthesize a counterexample program if such a soundness bug is found. Second, they compare two versions of a type system, synthesizing a program accepted by one but rejected by the other. Third, they minimize the size of synthesized counterexample programs.
These algorithms symbolically evaluate typecheckers and interpreters, producing formulas that characterize the set of programs that fail or succeed in the typechecker and the interpreter. However, symbolically evaluating interpreters poses efficiency challenges, which are caused by having to merge execution paths of the various possible input programs. Our main contribution is the bonsai tree, a novel symbolic representation of programs and program states that addresses these challenges. Bonsai trees encode complex syntactic information in terms of logical constraints, enabling more efficient merging.
We implement these algorithms in the Bonsai tool, an assistant for type system designers. We perform case studies on how Bonsai helps test and explore a variety of type systems. Bonsai efficiently synthesizes counterexamples for soundness bugs previously inaccessible to automatic tools and is the first automated tool to find a counterexample for the recently discovered Scala soundness bug SI-9633.
Article Search
Artifacts Functional
Program Synthesis using Abstraction Refinement
Xinyu Wang, Isil Dillig, and Rishabh Singh
(University of Texas at Austin, USA; Microsoft Research, USA)
We present a new approach to example-guided program synthesis based on counterexample-guided abstraction refinement. Our method uses the abstract semantics of the underlying DSL to find a program P whose abstract behavior satisfies the examples. However, since program P may be spurious with respect to the concrete semantics, our approach iteratively refines the abstraction until we either find a program that satisfies the examples or prove that no such DSL program exists. Because many programs have the same input-output behavior in terms of their abstract semantics, this synthesis methodology significantly reduces the search space compared to existing techniques that use purely concrete semantics.
While synthesis using abstraction refinement (SYNGAR) could be implemented in different settings, we propose a refinement-based synthesis algorithm that uses abstract finite tree automata (AFTA). Our technique uses a coarse initial program abstraction to construct an initial AFTA, which is iteratively refined by constructing a proof of incorrectness of any spurious program. In addition to ruling out the spurious program accepted by the previous AFTA, proofs of incorrectness are also useful for ruling out many other spurious programs.
We implement these ideas in a framework called Blaze, which can be instantiated in different domains by providing a suitable DSL and its corresponding concrete and abstract semantics. We have used the Blaze framework to build synthesizers for string and matrix transformations, and we compare Blaze with existing techniques. Our results for the string domain show that Blaze compares favorably with FlashFill, a domain-specific synthesizer that is now deployed in Microsoft PowerShell. In the context of matrix manipulations, we compare Blaze against Prose, a state-of-the-art general-purpose VSA-based synthesizer, and show that Blaze results in a 90x speed-up over Prose. In both application domains, Blaze also consistently improves upon the performance of two other existing techniques by at least an order of magnitude.
Article Search
Types for State
A Logical Relation for Monadic Encapsulation of State: Proving Contextual Equivalences in the Presence of runST
Amin Timany, Léo Stefanesco, Morten Krogh-Jespersen, and
Lars Birkedal
(KU Leuven, Belgium; IRIF, France; CNRS, France; University of Paris Diderot, France; Aarhus University, Denmark)
We present a logical relations model of a higher-order functional programming language with impredicative polymorphism, recursive types, and a Haskell-style ST monad type with runST. We use our logical relations model to show that runST provides proper encapsulation of state, by showing that effectful computations encapsulated by runST are heap independent. Furthermore, we show that contextual refinements and equivalences that are expected to hold for pure computations do indeed hold in the presence of runST. This is the first time such relational results have been proven for a language with monadic encapsulation of state. We have formalized all the technical development and results in Coq.
Article Search
Archive submitted (1 MB)
Artifacts Available
Artifacts Functional
Recalling a Witness: Foundations and Applications of Monotonic State
Danel Ahman, Cédric Fournet, Cătălin Hriţcu, Kenji Maillard, Aseem Rastogi, and Nikhil Swamy
(Inria, France; Microsoft Research, UK; ENS Paris, France; Microsoft Research, India; Microsoft Research, USA)
We provide a way to ease the verification of programs whose state evolves monotonically. The main idea is that a property witnessed in a prior state can be soundly recalled in the current state, provided (1) state evolves according to a given preorder, and (2) the property is preserved by this preorder. In many scenarios, such monotonic reasoning yields concise modular proofs, saving the need for explicit program invariants. We distill our approach into the monotonic-state monad, a general yet compact interface for Hoare-style reasoning about monotonic state in a dependently typed language. We prove the soundness of the monotonic-state monad and use it as a unified foundation for reasoning about monotonic state in the F^{⋆} verification system. Based on this foundation, we build libraries for various mutable data structures like monotonic references and apply these libraries at scale to the verification of several distributed applications.
Preprint
Archive submitted (0 MB)
Info
Artifacts Available
Artifacts Functional
RustBelt: Securing the Foundations of the Rust Programming Language
Ralf Jung,
Jacques-Henri Jourdan, Robbert Krebbers, and
Derek Dreyer
(MPI-SWS, Germany; Delft University of Technology, Netherlands)
Rust is a new systems programming language that promises to overcome
the seemingly fundamental tradeoff between high-level safety
guarantees and low-level control over resource management.
Unfortunately, none of Rust's safety claims have been formally proven,
and there is good reason to question whether they actually hold.
Specifically, Rust employs a strong, ownership-based type system, but
then extends the expressive power of this core type system through
libraries that internally use unsafe features. In this paper, we give
the first formal (and machine-checked) safety proof for a language
representing a realistic subset of Rust. Our proof is extensible in
the sense that, for each new Rust library that uses unsafe features,
we can say what verification condition it must satisfy in order for it
to be deemed a safe extension to the language. We have carried out
this verification for some of the most important libraries that are
used throughout the Rust ecosystem.
Preprint
Archive submitted (0 MB)
Info
Artifacts Available
Artifacts Functional
proc time: 5.38