OOPSLA2 2023
Proceedings of the ACM on Programming Languages, Volume 7, Number OOPSLA2

Proceedings of the ACM on Programming Languages, Volume 7, Number OOPSLA2, October 22–27, 2023, Cascais, Portugal

OOPSLA2 – Journal Issue


Frontmatter

Title Page


Editorial Message
The Proceedings of the ACM series presents the highest-quality research conducted in diverse areas of computer science, as represented by the ACM Special Interest Groups (SIGs).

Papers

Hardware-Aware Static Optimization of Hyperdimensional Computations
Pu (Luke) Yi and Sara Achour
(Stanford University, USA)
Binary spatter code (BSC)-based hyperdimensional computing (HDC) is a highly error-resilient approximate computational paradigm suited for error-prone, emerging hardware platforms. In BSC HDC, the basic datatype is a hypervector, a typically large binary vector, where the size of the hypervector has a significant impact on the fidelity and resource usage of the computation. Typically, the hypervector size is dynamically tuned to deliver the desired accuracy; this process is time-consuming and often produces hypervector sizes that lack accuracy guarantees and produce poor results when reused for very similar workloads. We present Heim, a hardware-aware static analysis and optimization framework for BSC HD computations. Heim analytically derives the minimum hypervector size that minimizes resource usage and meets the target accuracy requirement. Heim guarantees the optimized computation converges to the user-provided accuracy target in expectation, even in the presence of hardware error. Heim deploys a novel static analysis procedure that unifies theoretical results from the neuroscience community to systematically optimize HD computations.
We evaluate Heim against dynamic tuning-based optimization on 25 benchmark data structures. Given a 99% accuracy requirement, Heim-optimized computations achieve a 99.2%-100.0% median accuracy, up to 49.5% higher than dynamic tuning-based optimization, while achieving 1.15x-7.14x reductions in hypervector size compared to HD computations that achieve comparable query accuracy and finding parametrizations 30.0x-100167.4x faster than dynamic tuning-based approaches. We also use Heim to systematically evaluate the performance benefits of using analog CAMs and multiple-bit-per-cell ReRAM over conventional hardware, while maintaining iso-accuracy – for both emerging technologies, we find usages where the emerging hardware imparts significant benefits.
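As background for the datatype described above, the following is a minimal sketch in plain NumPy (not Heim itself) of the core BSC operations and of why the hypervector size n governs query fidelity: binding via XOR, bundling via bitwise majority, and similarity via normalized Hamming distance.

```python
import numpy as np

def random_hv(n, rng):
    return rng.integers(0, 2, size=n, dtype=np.uint8)

def bind(a, b):                  # XOR binding, self-inverse: bind(bind(a, b), a) == b
    return a ^ b

def bundle(*hvs):                # bitwise majority vote over an odd number of vectors
    return (np.sum(hvs, axis=0) * 2 > len(hvs)).astype(np.uint8)

def similarity(a, b):            # 1.0 = identical, ~0.5 = unrelated
    return 1.0 - np.mean(a != b)

rng = np.random.default_rng(0)
n = 10_000                                       # the size a tool like Heim would derive statically
key, val, noise1, noise2 = (random_hv(n, rng) for _ in range(4))
memory = bundle(bind(key, val), noise1, noise2)  # store key -> val among other content
recovered = bind(memory, key)                    # unbind with the key
print(similarity(recovered, val))                # about 0.75: clearly above the 0.5 noise floor
```

With smaller n the recovered similarity fluctuates more, which is exactly the accuracy-versus-resource trade-off that the hypervector size controls.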

Publisher's Version Published Artifact Archive submitted (1.3 MB) Artifacts Available Artifacts Functional
Leaf: Modularity for Temporary Sharing in Separation Logic
Travis Hance, Jon Howell, Oded Padon, and Bryan Parno
(Carnegie Mellon University, USA; VMware Research, USA)
In concurrent verification, separation logic provides a strong story for handling both resources that are owned exclusively and resources that are shared persistently (i.e., forever). However, the situation is more complicated for temporarily shared state, where state might be shared and then later reclaimed as exclusive. We believe that a framework for temporarily-shared state should meet two key goals not adequately met by existing techniques. One, it should allow and encourage users to verify new sharing strategies. Two, it should provide an abstraction where users manipulate shared state in a way agnostic to the means with which it is shared.
We present Leaf, a library in the Iris separation logic which accomplishes both of these goals by introducing a novel operator, which we call guarding, that allows one proposition to represent a shared version of another. We demonstrate that Leaf meets these two goals through a modular case study: we verify a reader-writer lock that supports shared state, and a hash table built on top of it that uses shared state.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Formally Verifying Optimizations with Block Simulations
Léo Gourdin, Benjamin Bonneau, Sylvain Boulmé, David Monniaux, and Alexandre Bérard
(Université Grenoble Alpes - CNRS - Grenoble INP - Verimag, France)
CompCert (ACM Software System Award 2021) is the first industrial-strength compiler with a mechanically checked proof of correctness. Yet, CompCert remains a moderately optimizing C compiler. Indeed, some optimizations of “gcc -O1” such as Lazy Code Motion (LCM) or Strength Reduction (SR) were still missing: developing these efficient optimizations together with their formal proofs remained a challenge.
Cyril Six et al. have developed efficient formally verified translation validators for certifying the results of superblock schedulers and peephole optimizations. We revisit and generalize their approach into a framework (integrated into CompCert) able to validate many more optimizations: an enhanced superblock scheduler, but also Dead Code Elimination (DCE), Constant Propagation (CP), and more noticeably, LCM and SR. In contrast to other approaches to translation validation, we co-design our untrusted optimizations and their validators. Our optimizations provide hints, in the form of invariants or CFG morphisms, that help keep the formally verified validators both simple and efficient. Such designs seem applicable beyond CompCert.

Publisher's Version Published Artifact Info Artifacts Available Artifacts Reusable
Synthesizing Efficient Memoization Algorithms
Yican Sun, Xuanyu Peng, and Yingfei Xiong
(Peking University, China)
In this paper, we propose an automated approach to finding correct and efficient memoization algorithms from a given declarative specification. This problem has two major challenges: (i) a memoization algorithm is too large to be handled by conventional program synthesizers; (ii) we need to guarantee the efficiency of the memoization algorithm. To address these challenges, we structure the synthesis of memoization algorithms by introducing the local objective function and the memoization partition function and reduce the synthesis task to two smaller independent program synthesis tasks. Moreover, the number of distinct outputs of the function synthesized in the second synthesis task decides the efficiency of the synthesized memoization algorithm, so we only need to minimize the number of different output values of the synthesized function. However, the generated synthesis task is still too complex for existing synthesizers. Thus, we propose a novel synthesis algorithm that combines deductive and inductive methods to solve these tasks. To evaluate our algorithm, we collect 42 real-world benchmarks from Leetcode, the National Olympiad in Informatics in Provinces-Junior (a nationwide algorithmic programming contest in China), and previous approaches. Our approach successfully synthesizes 39/42 problems in a reasonable time, outperforming the baselines.
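To make the shape of the synthesis target concrete, here is a hand-written (not synthesized) example in the structure the abstract describes: a declarative specification that enumerates candidate solutions, and a memoization algorithm whose cache is keyed by a partition function, so the number of distinct key values bounds the cost.

```python
from functools import lru_cache

def spec(xs):
    """Declarative spec: best total over all subsets with no two adjacent picks."""
    best = 0
    for mask in range(1 << len(xs)):
        if mask & (mask << 1):            # two adjacent elements picked: invalid
            continue
        best = max(best, sum(x for i, x in enumerate(xs) if mask >> i & 1))
    return best

def memoized(xs):
    # Partition function: a suffix is fully described by its start index,
    # so the cache holds at most len(xs) + 1 distinct keys.
    @lru_cache(maxsize=None)
    def best_from(i):
        if i >= len(xs):
            return 0
        return max(best_from(i + 1),              # skip xs[i]
                   xs[i] + best_from(i + 2))      # take xs[i], skip its neighbor
    return best_from(0)

xs = [5, 1, 2, 10, 6, 2]
assert spec(xs) == memoized(xs) == 17
```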

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
AtomiS: Data-Centric Synchronization Made Practical
Hervé Paulino, Ana Almeida Matos, Jan Cederquist, Marco Giunti, João Matos, and António Ravara
(Nova University of Lisbon, Portugal; University of Lisbon, Portugal)
Data-Centric Synchronization (DCS) shifts the reasoning about concurrency restrictions from control structures to data declaration. It is a high-level declarative approach that abstracts away from the actual concurrency control mechanism(s) in use. Despite its advantages, the practical use of DCS is hindered by the fact that it may require many annotations and/or multiple implementations of the same method to cope with differently qualified parameters.
To overcome these limitations, in this paper we present AtomiS, a new DCS approach that requires only qualifying types of parameters and return values in interface definitions, and of fields in class definitions. The latter may also be abstracted away in type parameters, rendering class implementations virtually annotation-free. From this high level specification, a static analysis infers the atomicity constraints that are local to each method, considering valid only the method variants that are consistent with the specification, and performs code generation for all valid variants of each method. The generated code is then the target for automatic injection of concurrency control primitives that are responsible for ensuring the absence of data-races, atomicity-violations and deadlocks.
We provide a Java implementation and showcase the applicability of AtomiS in real-life code. For the benchmarks analysed, AtomiS requires fewer annotations than the original number of regions requiring locks, as well as fewer annotations than Atomic Sets (a reference DCS proposal).

Publisher's Version
Secure RDTs: Enforcing Access Control Policies for Offline Available JSON Data
Thierry Renaux, Sam Van den Vonder, and Wolfgang De Meuter
(Vrije Universiteit Brussel, Belgium)
Replicated Data Types (RDTs) are a type of data structure that can be replicated over a network, where each replica can be kept (eventually) consistent with the other replicas. They are used in applications with intermittent network connectivity, since local (offline) edits can later be merged with the other replicas. Applications that want to use RDTs often have an inherent security component that restricts data access for certain clients. However, access control for RDTs is difficult to enforce for clients that are not running within a secure environment, e.g., web applications where the client-side software can be freely tampered with. In essence, an application cannot prevent a client from reading data which they are not supposed to read, and any malicious changes will also affect well-behaved clients.
This paper proposes Secure RDTs (SRDTs), a data type that specifies role-based access control for offline-available JSON data. In brief, a trusted application server specifies a security policy based on roles with read and write privileges for certain fields of an SRDT. The server enforces read privileges by projecting the data and security policy to omit any non-readable fields for the user's given role, and it acts as an intermediary to enforce write privileges. The approach is presented as an operational semantics engineered in PLT Redex, which is validated by formal proofs and randomised testing in Redex to ensure that the formal specification is secure.
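A minimal sketch, in Python rather than PLT Redex, of the server-side enforcement pattern described above; the policy, field names, and API are illustrative only.

```python
# Hypothetical role-based policy over JSON fields: role -> field -> privileges.
POLICY = {
    "doctor":  {"name": {"read", "write"}, "diagnosis": {"read", "write"}},
    "visitor": {"name": {"read"}},
}

def project_for_read(doc, role):
    """Replica sent to the client: only the fields the role may read."""
    allowed = POLICY.get(role, {})
    return {k: v for k, v in doc.items() if "read" in allowed.get(k, set())}

def merge_write(doc, role, edits):
    """Server-side merge: drop edits to fields the role may not write."""
    allowed = POLICY.get(role, {})
    for k, v in edits.items():
        if "write" in allowed.get(k, set()):
            doc[k] = v
    return doc

record = {"name": "Alice", "diagnosis": "flu"}
print(project_for_read(record, "visitor"))                          # {'name': 'Alice'}
print(merge_write(dict(record), "visitor", {"diagnosis": "none"}))  # diagnosis unchanged
```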

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Beacons: An End-to-End Compiler Framework for Predicting and Utilizing Dynamic Loop Characteristics
Girish Mururu, Sharjeel Khan, Bodhisatwa Chatterjee, Chao Chen, Chris Porter, Ada Gavrilovska, and Santosh Pande
(Georgia Institute of Technology, USA; IBM Research, USA)
Efficient management of shared resources is a critical problem in high-performance computing (HPC) environments. Existing workload management systems often promote non-sharing of resources among different co-executing applications to achieve performance isolation. Such schemes lead to poor resource utilization and suboptimal process throughput, adversely affecting user productivity. Tackling this problem in a scalable fashion is extremely challenging, since it requires the workload scheduler to possess an in-depth knowledge about various application resource requirements and runtime phases at fine granularities within individual applications.
In this work, we show that applications’ resource requirements and execution phase behaviour can be captured in a scalable and lightweight manner at runtime by estimating important program artifacts termed “dynamic loop characteristics”. Specifically, we propose a solution to the problem of efficient workload scheduling by designing a compiler and runtime cooperative framework that leverages novel loop-based compiler analysis for resource allocation.
We present Beacons Framework, an end-to-end compiler and scheduling framework, that estimates dynamic loop characteristics, encapsulates them in compiler-instrumented beacons in an application, and broadcasts them during application runtime, for proactive workload scheduling. We focus on estimating four important loop characteristics: loop trip-count, loop timing, loop memory footprint, and loop data-reuse behaviour, through a combination of compiler analysis and machine learning.
The novelty of the Beacons Framework also lies in its ability to tackle irregular loops that exhibit complex control flow with indeterminate loop bounds involving structure fields, aliased variables and function calls, which are highly prevalent in modern workloads. At the backend, Beacons Framework entails a proactive workload scheduler that leverages the runtime information to orchestrate aggressive process co-locations, for maximizing resource concurrency, without causing cache thrashing. Our results show that Beacons Framework can predict different loop characteristics with an accuracy of 85% to 95% on average, and the proactive scheduler obtains an average throughput improvement of 1.9x (up to 3.2x) over the state-of-the-art schedulers on an Amazon Graviton2 machine on consolidated workloads involving 1000-10000 co-executing processes, across 51 benchmarks.

Publisher's Version Published Artifact Archive submitted (3.8 MB) Artifacts Available
Compiling Structured Tensor Algebra
Mahdi Ghorbani, Mathieu Huot, Shideh Hashemian, and Amir Shaikhha
(University of Edinburgh, UK; University of Oxford, UK)
Tensor algebra is essential for data-intensive workloads in various computational domains. Computational scientists face a trade-off between the specialization degree provided by dense tensor algebra and the algorithmic efficiency that leverages the structure provided by sparse tensors. This paper presents StructTensor, a framework that symbolically computes structure at compilation time. This is enabled by Structured Tensor Unified Representation (STUR), an intermediate language that can capture tensor computations as well as their sparsity and redundancy structures. Through a mathematical view of lossless tensor computations, we show that our symbolic structure computation and the related optimizations are sound. Finally, for different tensor computation workloads and structures, we experimentally show how capturing the symbolic structure can result in outperforming state-of-the-art frameworks for both dense and sparse tensor algebra.

Publisher's Version Artifacts Functional
The Essence of Verilog: A Tractable and Tested Operational Semantics for Verilog
Qinlin Chen, Nairen Zhang, Jinpeng Wang, Tian Tan, Chang Xu, Xiaoxing Ma, and Yue Li
(Nanjing University, China)
With the increasing need to apply modern software techniques to hardware design, Verilog, the most popular Hardware Description Language (HDL), plays an infrastructure role. However, Verilog has several semantic pitfalls that often confuse software and hardware developers. Although prior research on formal semantics for Verilog exists, it is not comprehensive and has not fully addressed these issues. In this work, we present a novel scheme inspired by previous work on defining core languages for software languages like JavaScript and Python. Specifically, we define the formal semantics of Verilog using a core language called λV, which captures the essence of Verilog using as few language structures as possible. λV not only covers the most complete set of language features to date, but also addresses the aforementioned pitfalls. We implemented λV with about 27,000 lines of Java code, and comprehensively tested its totality and conformance with Verilog. As a reliable reference semantics, λV can detect semantic bugs in real-world Verilog simulators and expose ambiguities in Verilog’s standard specification. Moreover, as a useful core language, λV has the potential to facilitate the development of tools such as a state-space explorer and a concolic execution tool for Verilog.

Publisher's Version Published Artifact Info Artifacts Available Artifacts Reusable
Run-Time Prevention of Software Integration Failures of Machine Learning APIs
Chengcheng Wan, Yuhan Liu, Kuntai Du, Henry Hoffmann, Junchen Jiang, Michael Maire, and Shan Lu
(East China Normal University, China; University of Chicago, USA; Microsoft, Redmond, USA)
Due to under-specified interfaces, developers face challenges in correctly integrating machine learning (ML) APIs in software. Even when the ML API and the software are well designed on their own, the resulting application misbehaves when the API output is incompatible with the software. It is desirable to have an adapter that converts ML API output at runtime to better fit the software's needs and prevent integration failures. In this paper, we conduct an empirical study to understand ML API integration problems in real-world applications. Guided by this study, we present SmartGear, a tool that automatically detects and converts mismatching or incorrect ML API output at run time, serving as a middle layer between ML API and software. Our evaluation on a variety of open-source applications shows that SmartGear detects 70% of incompatible API outputs and prevents 67% of potential integration failures, outperforming alternative solutions.
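A hypothetical adapter in the spirit of the middle layer described above (the function names, labels, and threshold are illustrative, not SmartGear's API): it normalizes the ML API's output shape and surfaces low-confidence results instead of letting them silently flow into the application.

```python
EXPECTED_LABELS = {"cat", "dog", "other"}

def adapt_classification(api_output, confidence_floor=0.6):
    """Convert a raw API response into the (label, ok) shape the application expects."""
    label = str(api_output.get("label", "")).strip().lower()
    score = float(api_output.get("score", 0.0))
    if label not in EXPECTED_LABELS:
        label = "other"                      # map unknown labels to a safe default
    if score < confidence_floor:
        return label, False                  # let the caller handle the uncertainty
    return label, True

print(adapt_classification({"label": "Dog ", "score": 0.93}))    # ('dog', True)
print(adapt_classification({"label": "giraffe", "score": 0.4}))  # ('other', False)
```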

Publisher's Version
The Bounded Pathwidth of Control-Flow Graphs
Giovanna Kobus Conrado, Amir Kafshdar Goharshady, and Chun Kit Lam
(Hong Kong University of Science and Technology, Hong Kong)
Pathwidth and treewidth are standard and well-studied graph sparsity parameters which intuitively model the degree to which a given graph resembles a path or a tree, respectively. It is well-known that the control-flow graphs of structured goto-free programs have a tree-like shape and bounded treewidth. This fact has been exploited to design considerably more efficient algorithms for a wide variety of static analysis and compiler optimization problems, such as register allocation, µ-calculus model-checking and parity games, data-flow analysis, cache management, and lifetime-optimal redundancy elimination. However, there is no bound in the literature for the pathwidth of programs, except the general inequality that the pathwidth of a graph is at most O(log n) times its treewidth, where n is the number of vertices of the graph.
In this work, we prove that control-flow graphs of structured programs have bounded pathwidth and provide a linear-time algorithm to obtain a path decomposition of small width. Specifically, we establish a bound of 2 · d on the pathwidth of programs with nesting depth d. Since real-world programs have small nesting depth, they also have bounded pathwidth. This is significant for a number of reasons: (i) pathwidth is a strictly stronger parameter than treewidth, i.e. any graph family with bounded pathwidth has bounded treewidth, but the converse does not hold; (ii) any algorithm that is designed with treewidth in mind can be applied to bounded-pathwidth graphs with no change; (iii) there are problems that are fixed-parameter tractable with respect to pathwidth but not treewidth; (iv) verification algorithms that are designed based on treewidth would become significantly faster when using pathwidth as the parameter; and (v) it is easier to design algorithms based on bounded pathwidth since one does not have to consider the often-challenging case of merge nodes in treewidth-based dynamic programming. Thus, we invite the static analysis and compiler optimization communities to adopt pathwidth as their parameter of choice instead of, or in addition to, treewidth. Intuitively, control-flow graphs are not only tree-like, but also path-like and one can obtain simpler and more scalable algorithms by relying on path-likeness instead of tree-likeness.
As a motivating example, we provide a simpler and more efficient algorithm for spill-free register allocation using bounded pathwidth instead of treewidth. Our algorithm reduces the runtime from O(n · r^(2 · tw · r + 2 · r)) to O(n · pw · r^(pw · r + r + 1)), where n is the number of lines of code, r is the number of registers, pw is the pathwidth of the control-flow graph and tw is its treewidth. We provide extensive experimental results showing that our approach is applicable to a wide variety of real-world embedded benchmarks from SDCC and obtains runtime improvements of 2-3 orders of magnitude. This is because the pathwidth is equal to the treewidth, or one more, in the overwhelming majority of real-world CFGs and thus our algorithm provides an exponential runtime improvement. As such, the benefits of using pathwidth are not limited to the theoretical side and simplicity in algorithm design, but are also apparent in practice.
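To make the algorithmic payoff concrete, here is a compact sketch (not the paper's implementation) of the dynamic program a path decomposition enables: spill-free allocation with r registers is r-coloring of the interference graph, processed bag by bag from left to right, with roughly r^(pw+1) states per bag. It assumes a valid decomposition, i.e. every interference edge lies inside some bag.

```python
from itertools import product

def colorable_along_path(bags, edges, r):
    """Decide r-colorability of an interference graph given a path decomposition."""
    prev_bag, states = [], {()}
    for bag in bags:
        new_states = set()
        for coloring in states:
            assign = dict(zip(prev_bag, coloring))
            shared = {v: assign[v] for v in bag if v in assign}   # colors carried over
            fresh = [v for v in bag if v not in assign]           # newly introduced vars
            for choice in product(range(r), repeat=len(fresh)):
                cand = {**shared, **dict(zip(fresh, choice))}
                ok = all(cand[u] != cand[v]
                         for u, v in edges if u in cand and v in cand)
                if ok:
                    new_states.add(tuple(cand[v] for v in bag))
        prev_bag, states = bag, new_states
        if not states:
            return False
    return True

# Interference graph of a tiny straight-line program: a-b, b-c, c-d.
bags = [["a", "b"], ["b", "c"], ["c", "d"]]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(colorable_along_path(bags, edges, r=2))   # True: two registers suffice
```

Because the bags form a path, there are no merge nodes to reconcile, which is the simplification the abstract highlights over treewidth-based dynamic programming.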

Publisher's Version Published Artifact Artifacts Available
AST vs. Bytecode: Interpreters in the Age of Meta-Compilation
Octave Larose, Sophie Kaleba, Humphrey Burchell, and Stefan Marr
(University of Kent, UK)
Thanks to partial evaluation and meta-tracing, it became practical to build language implementations that reach state-of-the-art peak performance by implementing only an interpreter. Systems such as RPython and GraalVM provide components such as a garbage collector and just-in-time compiler in a language-agnostic manner, greatly reducing implementation effort.
However, meta-compilation-based language implementations still need to improve further to reach the low memory use and fast warmup behavior that custom-built systems provide. A key element in this endeavor is interpreter performance. Folklore tells us that bytecode interpreters are superior to abstract-syntax-tree (AST) interpreters both in terms of memory use and run-time performance.
This work assesses the trade-offs between AST and bytecode interpreters to verify whether common assumptions hold in the context of meta-compilation systems. We implemented four interpreters: an AST and a bytecode interpreter on each of RPython and GraalVM. We keep the differences between the interpreters as small as feasible so that we can evaluate interpreter performance, peak performance, warmup, memory use, and the impact of individual optimizations.
Our results show that both systems indeed reach performance close to Node.js/V8. Looking at interpreter-only performance, our AST interpreters are on par with, or even slightly faster than, their bytecode counterparts. After just-in-time compilation, the results are roughly on par. This means bytecode interpreters do not have their widely assumed performance advantage. We can confirm, however, that bytecodes are more compact in memory than ASTs, which becomes relevant for larger applications. For smaller applications, we noticed that bytecode interpreters allocate more memory, because boxing avoidance is not as applicable and because the bytecode interpreter structure itself requires memory, e.g., for a reified stack.
Our results show AST interpreters to be competitive on top of meta-compilation systems. Together with possible engineering benefits, they should thus not be discounted so easily in favor of bytecode interpreters.
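For readers unfamiliar with the two interpreter styles being compared, here is a toy version of both for the same arithmetic language (far smaller than the paper's interpreters, and without any meta-compilation machinery): the AST interpreter walks a tree of nodes, while the bytecode interpreter dispatches over a flat instruction list with an explicit operand stack.

```python
# --- AST interpreter --------------------------------------------------------
def eval_ast(node):
    op = node[0]
    if op == "lit":
        return node[1]
    if op == "add":
        return eval_ast(node[1]) + eval_ast(node[2])
    if op == "mul":
        return eval_ast(node[1]) * eval_ast(node[2])
    raise ValueError(op)

# --- Bytecode interpreter ---------------------------------------------------
def eval_bytecode(code):
    stack = []
    for instr, arg in code:
        if instr == "PUSH":
            stack.append(arg)
        elif instr == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif instr == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

ast = ("mul", ("add", ("lit", 1), ("lit", 2)), ("lit", 4))
code = [("PUSH", 1), ("PUSH", 2), ("ADD", None), ("PUSH", 4), ("MUL", None)]
assert eval_ast(ast) == eval_bytecode(code) == 12
```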

Publisher's Version Published Artifact Archive submitted (360 kB) Artifacts Available Artifacts Reusable
Mutually Iso-Recursive Subtyping
Andreas Rossberg
(Independent Researcher, Munich, Germany)
Iso-recursive types are often taken as a type-theoretic model for type recursion as present in many programming languages, e.g., classes in object-oriented languages or algebraic datatypes in functional languages. Their main advantage over an equi-recursive semantics is that they are simpler and algorithmically less expensive, which is an important consideration when the cost of type checking matters, such as for intermediate or low-level code representations, virtual machines, or runtime casts. However, a closer look reveals that iso-recursion cannot, in its standard form, efficiently express essential type system features like mutual recursion or non-uniform recursion. While it has been folklore that mutual recursion and non-uniform type parameterisation can nicely be handled by generalising to higher kinds, this encoding breaks down when combined with subtyping: the classic “Amber” rule for subtyping iso-recursive types is too weak to express mutual recursion without falling back to encodings of quadratic size.
We present a foundational core calculus of iso-recursive types with declared subtyping that can express both inter- and intra-recursion subtyping without such blowup, including subtyping between constructors of higher or mixed kind. In a second step, we identify a syntactic fragment of this general calculus that allows for more efficient type checking without “deep” substitutions, by observing that higher-kinded iso-recursive types can be inserted to “guard” against unwanted β-reductions. This fragment closely resembles the structure of typical nominal subtype systems, but without requiring nominal semantics. It has been used as the basis for a proposed extension of WebAssembly with recursive types.
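For reference, the classic Amber rule mentioned above, in one standard formulation (the paper's calculus generalizes beyond it): subtyping of recursive types is reduced to subtyping of their bodies under an assumption relating the bound variables,

\[
\frac{\Gamma,\; \alpha <: \beta \;\vdash\; S <: T}
     {\Gamma \;\vdash\; \mu\alpha.\,S \;<:\; \mu\beta.\,T}
\qquad (\alpha, \beta\ \text{fresh}).
\]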

Publisher's Version
Mechanizing Session-Types using a Structural View: Enforcing Linearity without Linearity
Chuta Sano, Ryan Kavanagh, and Brigitte Pientka
(McGill University, Canada)
Session types employ a linear type system that ensures that communication channels cannot be implicitly copied or discarded. As a result, many mechanizations of these systems require modeling channel contexts and carefully ensuring that they treat channels linearly. We demonstrate a technique that localizes linearity conditions as additional predicates embedded within type judgments, which allows us to use structural typing contexts instead of linear ones. This technique is especially relevant when leveraging (weak) higher-order abstract syntax to handle channel mobility and the intricate binding structures that arise in session-typed systems. Following this approach, we mechanize a session-typed system based on classical linear logic and its type preservation proof in the proof assistant Beluga, which uses the logical framework LF as its encoding language. We also prove adequacy for our encoding. This shows the tractability and effectiveness of our approach in modelling substructural systems such as session-typed languages.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Graph IRs for Impure Higher-Order Languages: Making Aggressive Optimizations Affordable with Precise Effect Dependencies
Oliver Bračevac, Guannan Wei, Songlin Jia, Supun Abeysinghe, Yuxuan Jiang, Yuyan Bao, and Tiark Rompf
(Purdue University, USA; Galois, USA; Augusta University, USA)
Graph-based intermediate representations (IRs) are widely used for powerful compiler optimizations, either interprocedurally in pure functional languages, or intraprocedurally in imperative languages. Yet so far, no suitable graph IR exists for aggressive global optimizations in languages with both effects and higher-order functions: aliasing and indirect control transfers make it difficult to maintain sufficiently granular dependency information for optimizations to be effective. To close this long-standing gap, we propose a novel typed graph IR combining a notion of reachability types with an expressive effect system to compute precise and granular effect dependencies at an affordable cost while supporting local reasoning and separate compilation. Our high-level graph IR imposes lexical structure to represent structured control flow and nesting, enabling aggressive and yet inexpensive code motion and other optimizations for impure higher-order programs. We formalize the new graph IR based on a λ-calculus with a reachability type-and-effect system along with a specification of various optimizations. We present performance case studies for tensor loop fusion, CUDA kernel fusion, symbolic execution of LLVM IR, and SQL query compilation in the Scala LMS compiler framework using the new graph IR. We observe significant speedups of up to 21x.

Publisher's Version
Getting into the Flow: Towards Better Type Error Messages for Constraint-Based Type Inference
Ishan Bhanuka, Lionel Parreaux, David Binder, and Jonathan Immanuel Brachthäuser
(Hong Kong University of Science and Technology, China; University of Tübingen, Germany)
Creating good type error messages for constraint-based type inference systems is difficult. Typical type error messages reflect implementation details of the underlying constraint-solving algorithms rather than the specific factors leading to type mismatches. We propose using subtyping constraints that capture data flow to classify and explain type errors. Our algorithm explains type errors as faulty data flows, which programmers are already used to reasoning about, and illustrates these data flows as sequences of relevant program locations. We show that our ideas and algorithm are not limited to languages with subtyping, as they can be readily integrated with Hindley-Milner type inference. In addition to these core contributions, we present the results of a user study to evaluate the quality of our messages compared to other implementations. While the quantitative evaluation does not show that flow-based messages improve the localization or understanding of the causes of type errors, the qualitative evaluation suggests a real need and demand for flow-based messages.

Publisher's Version Published Artifact Info Artifacts Available Artifacts Reusable
Continuing WebAssembly with Effect Handlers
Luna Phipps-Costin, Andreas Rossberg, Arjun Guha, Daan Leijen, Daniel Hillerström, KC Sivaramakrishnan, Matija Pretnar, and Sam Lindley
(Northeastern University, USA; Independent, Germany; Roblox, USA; Microsoft Research, USA; Huawei Zurich Research Center, Switzerland; Tarides, India; IIT Madras, India; University of Ljubljana, Slovenia; Institute of Mathematics, Physics, and Mechanics, Ljubljana, Slovenia; University of Edinburgh, UK)
WebAssembly (Wasm) is a low-level portable code format offering near native performance. It is intended as a compilation target for a wide variety of source languages. However, Wasm provides no direct support for non-local control flow features such as async/await, generators/iterators, lightweight threads, first-class continuations, etc. This means that compilers for source languages with such features must ceremoniously transform whole source programs in order to target Wasm.
We present WasmFX, an extension to Wasm which provides a universal target for non-local control features via effect handlers, enabling compilers to translate such features directly into Wasm. Our extension is minimal and only adds three main instructions for creating, suspending, and resuming continuations. Moreover, our primitive instructions are type-safe, providing typed continuations which are well-aligned with the design principles of Wasm whose stacks are typed. We present a formal specification of WasmFX and show that the extension is sound. We have implemented WasmFX as an extension to the Wasm reference interpreter and also built a prototype WasmFX extension for Wasmtime, a production-grade Wasm engine, piggybacking on Wasmtime's existing fibers API. The preliminary performance results for our prototype are encouraging, and we outline future plans to realise a native implementation.
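The paper gives the actual instruction set and typing; purely as intuition, the suspend/resume pattern that effect handlers expose can be imitated with Python generators, where a yield suspends the running computation up to its handler and send resumes it with a value.

```python
# Intuition only (Python generators, not Wasm): yield suspends the running
# computation with an effect request; the handler answers and resumes it.

def ask(question):
    return (yield ("ask", question))        # suspend with an effect request

def client():
    name = yield from ask("name?")
    color = yield from ask("favorite color?")
    return f"{name} likes {color}"

def handle(computation, answers):
    """A handler loop: run until the next suspension, answer it, resume."""
    try:
        request = next(computation)
        while True:
            kind, payload = request
            request = computation.send(answers[payload])
    except StopIteration as done:
        return done.value

print(handle(client(), {"name?": "Ada", "favorite color?": "green"}))
```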

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Two Birds with One Stone: Boosting Code Generation and Code Search via a Generative Adversarial Network
Shangwen Wang, Bo Lin, Zhensu Sun, Ming Wen, Yepang Liu, Yan Lei, and Xiaoguang Mao
(National University of Defense Technology, China; Singapore Management University, Singapore; Huazhong University of Science and Technology, China; Southern University of Science and Technology, China; Chongqing University, China)
Automatically transforming developers' natural language descriptions into source code has been a longstanding goal in software engineering research. Two types of approaches have been proposed in the literature to achieve this: code generation, which involves generating a new code snippet, and code search, which involves reusing existing code. However, despite existing efforts, the effectiveness of the state-of-the-art techniques remains limited. To seek further advancement, our insight is that code generation and code search can help overcome the limitation of each other: the code generator can benefit from feedback on the quality of its generated code, which can be provided by the code searcher, while the code searcher can benefit from the additional training data augmented by the code generator to better understand code semantics. Drawing on this insight, we propose a novel approach that combines code generation and code search techniques using a generative adversarial network (GAN), enabling mutual improvement through the adversarial training. Specifically, we treat code generation and code search as the generator and discriminator in the GAN framework, respectively, and incorporate several customized designs for our tasks. We evaluate our approach in eight different settings, and consistently observe significant performance improvements for both code generation and code search. For instance, when using NatGen, a state-of-the-art code generator, as the generator and GraphCodeBERT, a state-of-the-art code searcher, as the discriminator, we achieve a 32% increase in CodeBLEU score for code generation, and a 12% increase in mean reciprocal rank for code search on a large-scale Python dataset, compared to their original performances.

Publisher's Version Published Artifact Artifacts Available
Fast and Efficient Boolean Unification for Hindley-Milner-Style Type and Effect Systems
Magnus Madsen, Jaco van de Pol, and Troels Henriksen
(Aarhus University, Denmark; University of Copenhagen, Denmark)
As type and effect systems become more expressive there is an increasing need for efficient type inference. We consider a polymorphic effect system based on Boolean formulas where inference requires Boolean unification. Since Boolean unification involves semantic equivalence, conventional syntax-driven unification is insufficient. At the same time, existing Boolean unification techniques are ill-suited for type inference. We propose a hybrid algorithm for solving Boolean unification queries based on Boole’s Successive Variable Elimination (SVE) algorithm. The proposed approach builds on several key observations regarding the Boolean unification queries encountered in practice, including: (i) most queries are simple, (ii) most queries involve a few flexible variables, (iii) queries are likely to repeat due to similar programming patterns, and (iv) there is a long tail of complex queries. We exploit these observations to implement several strategies for formula minimization, including ones based on tabling and binary decision diagrams. We implement the new hybrid approach in the Flix programming language. Experimental results show that by reducing the overhead of Boolean unification, the compilation throughput increases from 8,580 lines/sec to 15,917 lines/sec corresponding to a 1.8x speed-up. Further, the overhead on type and effect inference time is only 16% which corresponds to an overhead of less than 7% on total compilation time. We study the hybrid approach and demonstrate that each design choice improves performance.
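As background, a small sketch of the textbook procedure the hybrid algorithm builds on, Boole's Successive Variable Elimination for solving t = false (without the paper's minimization, tabling, or BDD strategies): eliminating a variable x from t leaves t[x:=0] AND t[x:=1], and the back-substitution keeps x as a free parameter of the most general unifier.

```python
from functools import reduce

# Terms: ("const", b), ("var", x), ("not", t), ("and", a, b), ("or", a, b).
def subst(t, x, r):
    if t[0] == "var":
        return r if t[1] == x else t
    if t[0] == "const":
        return t
    return (t[0],) + tuple(subst(s, x, r) for s in t[1:])

def free_vars(t):
    if t[0] == "var":   return {t[1]}
    if t[0] == "const": return set()
    return set().union(*(free_vars(s) for s in t[1:]))

def evaluate(t, env):
    tag = t[0]
    if tag == "const": return t[1]
    if tag == "var":   return env[t[1]]
    if tag == "not":   return not evaluate(t[1], env)
    if tag == "and":   return evaluate(t[1], env) and evaluate(t[2], env)
    if tag == "or":    return evaluate(t[1], env) or evaluate(t[2], env)

def sve(t):
    """Solve t =?= false; return a substitution {x: term} or None if not unifiable."""
    vs = sorted(free_vars(t))
    if not vs:                                    # ground: unifiable iff it is false
        return {} if not evaluate(t, {}) else None
    x = vs[0]
    t0 = subst(t, x, ("const", False))
    t1 = subst(t, x, ("const", True))
    sub = sve(("and", t0, t1))                    # eliminate x and recurse
    if sub is None:
        return None
    apply = lambda s: reduce(lambda u, kv: subst(u, *kv), sub.items(), s)
    # most general solution for x: t0 \/ (x /\ ~t1), with x kept as a parameter
    sub[x] = ("or", apply(t0), ("and", ("var", x), ("not", apply(t1))))
    return sub

# Unify x /\ y with false: prints unsimplified terms meaning x -> x /\ ~y, y -> y.
print(sve(("and", ("var", "x"), ("var", "y"))))
```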

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
How Profilers Can Help Navigate Type Migration
Ben Greenman, Matthias Felleisen, and Christos Dimoulas
(University of Utah, USA; Northeastern University, USA; Northwestern University, USA)
Sound migratory typing envisions a safe and smooth refactoring of untyped code bases to typed ones. However, the cost of enforcing safety with run-time checks is often prohibitively high, thus performance regressions are a likely occurrence. Additional types can often recover performance, but choosing the right components to type is difficult because of the exponential size of the migratory typing lattice. In principle, though, migration could be guided by off-the-shelf profiling tools. To examine this hypothesis, this paper follows the rational programmer method and reports on the results of an experiment on tens of thousands of performance-debugging scenarios via seventeen strategies for turning profiler output into an actionable next step. The most effective strategy is the use of deep types to eliminate the most costly boundaries between typed and untyped components; this strategy succeeds in more than 50% of scenarios if two performance degradations are tolerable along the way.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Rhombus: A New Spin on Macros without All the Parentheses
Matthew Flatt, Taylor Allred, Nia Angle, Stephen De Gabrielle, Robert Bruce Findler, Jack Firth, Kiran Gopinathan, Ben Greenman, Siddhartha Kasivajhula, Alex Knauth, Jay McCarthy, Sam Phillips, Sorawee Porncharoenwase, Jens Axel Søgaard, and Sam Tobin-Hochstadt
(University of Utah, USA; independent, USA; independent, UK; Northwestern University, USA; National University of Singapore, Singapore; Brown University, USA; Reach, USA; University of Washington, USA; independent, Denmark; Indiana University, USA)
Rhombus is a new language that is built on Racket. It offers the same kind of language extensibility as Racket itself, but using traditional (infix) notation. Although Rhombus is far from the first language to support Lisp-style macros without Lisp-style parentheses, Rhombus offers a novel synthesis of macro technology that is practical and expressive. A key element is the use of multiple binding spaces for context-specific sublanguages. For example, expressions and pattern-matching forms can use the same operators with different meanings and without creating conflicts. Context-sensitive bindings, in turn, facilitate a language design that reduces the notational distance between the core language and macro facilities. For example, repetitions can be defined and used in binding and expression contexts generally, which enables a smoother transition from programming to metaprogramming. Finally, since handling static information (such as types) is also a necessary part of growing macros beyond Lisp, Rhombus includes support in its expansion protocol for communicating static information among bindings and expressions. The Rhombus implementation demonstrates that all of these pieces can work together in a coherent and user-friendly language.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Towards Better Semantics Exploration for Browser Fuzzing
Chijin Zhou, Quan Zhang, Lihua Guo, Mingzhe Wang, Yu Jiang, Qing Liao, Zhiyong Wu, Shanshan Li, and Bin Gu
(Tsinghua University, China; Harbin Institute of Technology, China; National University of Defense Technology, China; Beijing Institute of Control Engineering, China)
Web browsers exhibit rich semantics that enable a plethora of web-based functionalities. However, these intricate semantics present significant challenges for the implementation and testing of browsers. For example, fuzzing, a widely adopted testing technique, typically relies on handwritten context-free grammars (CFGs) for automatically generating inputs. However, these CFGs fall short in adequately modeling the complex semantics of browsers, resulting in generated inputs that cover only a portion of the semantics and are prone to semantic errors. In this paper, we present SaGe, an automated method that enhances browser fuzzing through the use of production-context sensitive grammars (PCSGs) incorporating semantic information. Our approach begins by extracting a rudimentary CFG from W3C standards and iteratively enhancing it to create a PCSG. The resulting PCSG enables our fuzzer to generate inputs that explore a broader range of browser semantics with a higher proportion of semantically-correct inputs. To evaluate the efficacy of SaGe, we conducted 24-hour fuzzing campaigns on mainstream browsers, including Chrome, Safari, and Firefox. Our approach demonstrated better performance compared to existing browser fuzzers, with a 6.03%-277.80% improvement in edge coverage, a 3.56%-161.71% boost in semantic correctness rate, and twice the number of bugs discovered. Moreover, we identified 62 bugs across the three browsers, with 40 confirmed and 10 assigned CVEs.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Spirea: A Mechanized Concurrent Separation Logic for Weak Persistent Memory
Simon Friis Vindum and Lars Birkedal
(Aarhus University, Denmark)
Weak persistent memory (a.k.a. non-volatile memory) is an emerging technology that offers fast byte-addressable durable main memory. A wealth of algorithms and libraries has been developed to explore this exciting technology. As noted by others, this has led to a significant verification gap. Towards closing this gap, we present Spirea, the first concurrent separation logic for verification of programs under a weak persistent memory model. Spirea is based on the Iris and Perennial verification frameworks, and by combining features from these logics with novel techniques it supports high-level modular reasoning about crash-safe and thread-safe programs and libraries. Spirea is fully mechanized in the Coq proof assistant and allows for interactive development of proofs with the Iris Proof Mode. We use Spirea to verify several challenging examples with modular specifications. We show how our logic can verify thread-safety and crash-safety of non-blocking durable data structures with null-recovery, in particular the Treiber stack and the Michael-Scott queue adapted to persistent memory. This is the first time durable data structures have been verified with a program logic.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Adventure of a Lifetime: Extract Method Refactoring for Rust
Sewen Thy, Andreea Costea, Kiran Gopinathan, and Ilya Sergey
(Yale-NUS College, Singapore; National University of Singapore, Singapore)
We present a design and implementation of the automated "Extract Method" refactoring for Rust programs. Even though Extract Method is one of the most well-studied automated refactorings and one of the most widely used in practice, featured in all major IDEs for all popular programming languages, implementing it soundly for Rust is surprisingly non-trivial due to the restrictions of Rust's ownership and lifetime-based type system.
In this work, we provide a systematic decomposition of the Extract Method refactoring for Rust programs into a series of program transformations, each concerned with satisfying a particular aspect of Rust type safety, eventually producing a well-typed Rust program. Our key discovery is the formulation of Extract Method as a composition of naive function hoisting and a series of automated program repair procedures that progressively make the resulting program "more well-typed" by relying on the corresponding repair oracles. Those oracles include a novel static intra-procedural ownership analysis that infers correct sharing annotations for the extracted function's parameters, and the lifetime checker of rustc, Rust's reference compiler.
We implemented our approach in a tool called REM, an automated Extract Method refactoring built on top of the IntelliJ IDEA plugin for Rust. Our extensive evaluation on a corpus of changes in five popular Rust projects shows that REM (a) can extract a larger class of feature-rich code fragments into semantically correct functions than other existing refactoring tools, (b) can reproduce method extractions performed manually by human developers in the past, and (c) is efficient enough to be used in interactive development.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Mat2Stencil: A Modular Matrix-Based DSL for Explicit and Implicit Matrix-Free PDE Solvers on Structured Grid
Huanqi Cao, Shizhi Tang, Qianchao Zhu, Bowen Yu, and Wenguang Chen
(Tsinghua University, China; Peking University, China; Pengcheng Laboratory, China)
Partial differential equation (PDE) solvers are extensively utilized across numerous scientific and engineering fields. However, achieving high performance and scalability often necessitates intricate and low-level programming, particularly when leveraging deterministic sparsity patterns in structured grids.
In this paper, we propose an innovative domain-specific language (DSL), Mat2Stencil, with its compiler, for PDE solvers on structured grids. Mat2Stencil introduces a structured sparse matrix abstraction, facilitating modular, flexible, and easy-to-use expression of solvers across a broad spectrum, encompassing components such as Jacobi or Gauss-Seidel preconditioners, incomplete LU or Cholesky decompositions, and multigrid methods built upon them. Our DSL compiler subsequently generates matrix-free code consisting of generalized stencils through multi-stage programming. The code allows spatial loop-carried dependence in the form of quasi-affine loops, in addition to the Jacobi-style stencil’s embarrassingly parallel on spatial dimensions. We further propose a novel automatic parallelization technique for the spatially dependent loops, which offers a compile-time deterministic task partitioning for threading, calculates necessary inter-thread synchronization automatically, and generates an efficient multi-threaded implementation with fine-grained synchronization.
We implement 4 benchmark programs: 3 are pseudo-applications from the NAS Parallel Benchmarks, written in 6.3% of the lines of code of the reference implementations, and 1 is matrix-free High Performance Conjugate Gradients, written in 16.4% of the lines of code. We achieve up to 1.67× and on average 1.03× the performance of the manual implementations.
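To illustrate what "matrix-free" means in this setting, here is a plain NumPy example (not Mat2Stencil output): one Jacobi sweep for the 2D Poisson problem applies the 5-point stencil directly on the structured grid rather than assembling and multiplying the sparse matrix.

```python
import numpy as np

def jacobi_sweep(u, f, h):
    """One Jacobi iteration for -Laplace(u) = f with zero boundary values."""
    new = u.copy()
    new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                              u[1:-1, :-2] + u[1:-1, 2:] +
                              h * h * f[1:-1, 1:-1])
    return new

n, h = 64, 1.0 / 65
u = np.zeros((n + 2, n + 2))      # interior n x n grid plus boundary
f = np.ones((n + 2, n + 2))
for _ in range(100):
    u = jacobi_sweep(u, f, h)
print(u[n // 2, n // 2])          # interior value slowly approaching the solution
```

A DSL in the style described above generates such stencil loops, including the harder cases with loop-carried dependences (e.g. Gauss-Seidel), from the matrix-level description.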

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Melocoton: A Program Logic for Verified Interoperability Between OCaml and C
Armaël Guéneau, Johannes Hostert, Simon Spies, Michael Sammler, Lars Birkedal, and Derek Dreyer
(Université Paris-Saclay - CNRS - ENS Paris-Saclay - Inria, France; Saarland University, Germany; MPI-SWS, Germany; Aarhus University, Denmark)
In recent years, there has been tremendous progress on developing program logics for verifying the correctness of programs in a rich and diverse array of languages. Thus far, however, such logics have assumed that programs are written entirely in a single programming language. In practice, this assumption rarely holds since programs are often composed of components written in different programming languages, which interact with one another via some kind of foreign function interface (FFI). In this paper, we take the first steps towards the goal of developing program logics for multi-language verification. Specifically, we present Melocoton, a multi-language program verification system for reasoning about OCaml, C, and their interactions through the OCaml FFI. Melocoton consists of the first formal semantics of (a large subset of) the OCaml FFI—previously only described in prose in the OCaml manual—as well as the first program logic to reason about the interactions of program components written in OCaml and C. Melocoton is fully mechanized in Coq on top of the Iris separation logic framework.

Publisher's Version Published Artifact Info Artifacts Available Artifacts Reusable
Interactive Debugging of Datalog Programs
André Pacak and Sebastian Erdweg
(JGU Mainz, Germany)
Datalog is nowadays used for complex programming tasks, with programs consisting of numerous inter-dependent predicates. But Datalog lacks interactive debugging techniques that support the stepwise execution and inspection of the execution state. In this paper, we propose interactive debugging of Datalog programs following a top-down evaluation strategy called recursive query/subquery. While the recursive query/subquery approach is well-known in the literature, we are the first to provide a complete programming-language semantics based on it. Specifically, we develop the first small-step operational semantics for top-down Datalog, where subqueries occur as nested intermediate terms. The small-step semantics forms the basis of step-into interactions in the debugger. Moreover, we show how step-over interactions can be realized efficiently based on a hybrid Datalog semantics that adds a bottom-up database to our top-down operational semantics. We implemented a debugger for core Datalog following these semantics and explain how to adapt it for debugging the frontend languages of Soufflé and IncA. Our evaluation shows that our hybrid Datalog semantics can be used to debug real-world Datalog programs with realistic workloads.
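Only as background for the hybrid semantics: the bottom-up database side can be pictured as a naive fixpoint over the rules, as in the sketch below; the paper's actual contribution, the top-down query/subquery small-step semantics layered on top, is not reproduced here.

```python
def bottom_up(facts, rules):
    """Naive fixpoint. Atoms are tuples like ("edge", "a", "b");
    in rules, upper-case arguments are variables."""
    db = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            derived = {tuple(env.get(t, t) for t in head)
                       for env in match_all(body, db, {})}
            new = derived - db
            if new:
                db |= new
                changed = True
    return db

def match_all(body, db, env):
    if not body:
        yield env
        return
    first, rest = body[0], body[1:]
    for fact in db:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        new_env, ok = dict(env), True
        for pat, val in zip(first[1:], fact[1:]):
            if pat.isupper():                         # variable
                if new_env.setdefault(pat, val) != val:
                    ok = False
                    break
            elif pat != val:                          # constant mismatch
                ok = False
                break
        if ok:
            yield from match_all(rest, db, new_env)

edges = {("edge", "a", "b"), ("edge", "b", "c")}
rules = [
    (("path", "X", "Y"), [("edge", "X", "Y")]),
    (("path", "X", "Z"), [("edge", "X", "Y"), ("path", "Y", "Z")]),
]
print(sorted(f for f in bottom_up(edges, rules) if f[0] == "path"))
# [('path', 'a', 'b'), ('path', 'a', 'c'), ('path', 'b', 'c')]
```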

Publisher's Version
Concrete Type Inference for Code Optimization using Machine Learning with SMT Solving
Fangke Ye, Jisheng Zhao, Jun Shirako, and Vivek Sarkar
(Georgia Institute of Technology, USA)
Despite the widespread popularity of dynamically typed languages such as Python, it is well known that they pose significant challenges to code optimization due to the lack of concrete type information. To overcome this limitation, many ahead-of-time optimizing compiler approaches for Python rely on programmers to provide optional type information as a prerequisite for extensive code optimization. Since few programmers provide this information, a large majority of Python applications are executed without the benefit of code optimization, thereby contributing collectively to a significant worldwide wastage of compute and energy resources.
In this paper, we introduce a new approach to concrete type inference that is shown to be effective in enabling code optimization for dynamically typed languages, without requiring the programmer to provide any type information. We explore three kinds of type inference algorithms in our approach based on: 1) machine learning models including GPT-4, 2) constraint-based inference based on SMT solving, and 3) a combination of 1) and 2). Our approach then uses the output from type inference to generate multi-version code for a bounded number of concrete type options, while also including a catch-all untyped version for the case when no match is found. The typed versions are then amenable to code optimization. Experimental results show that the combined algorithm in 3) delivers far superior precision and performance than the separate algorithms for 1) and 2). The performance improvement due to type inference, in terms of geometric mean speedup across all benchmarks compared to standard Python, when using 3) is 26.4× with Numba as an AOT optimizing back-end and 62.2× with the Intrepydd optimizing compiler as a back-end. These vast performance improvements can have a significant impact on programmers’ productivity, while also reducing their applications’ use of compute and energy resources.
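A small illustration (not the paper's toolchain) of the multi-version code generation step described above: a variant specialized under concrete-type assumptions, guarded by run-time checks, with an untyped catch-all fallback.

```python
def dot_specialized_int(xs, ys):
    # Variant an optimizer could compile assuming homogeneous int lists.
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

def dot_generic(xs, ys):
    # Catch-all version: no type assumptions, no specialization.
    return sum(x * y for x, y in zip(xs, ys))

def dot(xs, ys):
    # Dispatch on the concrete types observed at run time.
    if all(type(v) is int for v in xs) and all(type(v) is int for v in ys):
        return dot_specialized_int(xs, ys)
    return dot_generic(xs, ys)

print(dot([1, 2, 3], [4, 5, 6]))   # 32, via the specialized variant
print(dot([1.5, 2], [2, 2]))       # 7.0, via the generic fallback
```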

Publisher's Version Published Artifact Artifacts Available
An Explanation Method for Models of Code
Yu Wang, Ke Wang, and Linzhang Wang
(Nanjing University, China; Visa Research, USA)
This paper introduces a novel method, called WheaCha, for explaining the predictions of code models. Similar to attribution methods, WheaCha seeks to identify input features that are responsible for a particular prediction that models make. On the other hand, it differs from attribution methods in crucial ways. Specifically, WheaCha separates an input program into "wheat" (i.e., defining features that are the reason for which models predict the label that they predict) and the rest "chaff" for any given prediction. We realize WheaCha in a tool, HuoYan, and use it to explain four prominent code models: code2vec, seq-GNN, GGNN, and CodeBERT. Results show that (1) HuoYan is efficient — taking on average under twenty seconds to compute wheat for an input program in an end-to-end fashion (i.e., including model prediction time); (2) the wheat that all models use to make predictions is predominantly comprised of simple syntactic or even lexical properties (i.e., identifier names); (3) neither the latest explainability methods for code models (i.e., SIVAND and CounterFactual Explanations) nor the most noteworthy attribution methods (i.e., Integrated Gradients and SHAP) can precisely capture wheat. Finally, we set out to demonstrate the usefulness of WheaCha, in particular, we assess if WheaCha’s explanations can help end users to identify defective code models (e.g., trained on mislabeled data or learned spurious correlations from biased data). We find that, with WheaCha, users achieve far higher accuracy in identifying faulty models than SIVAND, CounterFactual Explanations, Integrated Gradients and SHAP.

Publisher's Version Archive submitted (570 kB)
Modular Verification of Safe Memory Reclamation in Concurrent Separation Logic
Jaehwang Jung, Janggun Lee, Jaemin Choi, Jaewoo Kim, Sunho Park, and Jeehoon Kang
(KAIST, Republic of Korea)
Formal verification is an effective method to address the challenge of designing correct and efficient concurrent data structures. But verification efforts often ignore memory reclamation, which involves nontrivial synchronization between concurrent accesses and reclamation. When incorrectly implemented, it may lead to critical safety errors such as use-after-free and the ABA problem. Semi-automatic safe memory reclamation schemes such as hazard pointers and RCU encapsulate the complexity of manual memory management in modular interfaces. However, this modularity has not been carried over to formal verification.
We propose modular specifications of hazard pointers and RCU, and formally verify realistic implementations of them in concurrent separation logic. Specifically, we design abstract predicates for hazard pointers that capture the meaning of validating the protection of nodes, and those for RCU that support optimistic traversal to possibly retired nodes. We demonstrate that the specifications indeed facilitate modular verification in three criteria: compositional verification, general applicability, and easy integration. In doing so, we present the first formal verification of Harris’s list, the Harris-Michael list, the Chase-Lev deque, and RDCSS with reclamation. We report the Coq mechanization of all our results in the Iris separation logic framework.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Simple Reference Immutability for System F<:
Edward Lee and Ondřej Lhoták
(University of Waterloo, Canada)
Reference immutability is a type-based technique for taming mutation that has long been studied in the context of object-oriented languages, like Java. Recently, though, languages like Scala have blurred the lines between functional and object-oriented programming languages. We explore how reference immutability interacts with features commonly found in these hybrid languages, in particular with higher-order functions (polymorphism) and subtyping. We construct a calculus System F<:M which encodes a reference immutability system as a simple extension of System F<: and prove that it satisfies the standard soundness and immutability safety properties.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Asparagus: Automated Synthesis of Parametric Gas Upper-Bounds for Smart Contracts
Zhuo Cai, Soroush Farokhnia, Amir Kafshdar Goharshady, and S. Hitarth
(Hong Kong University of Science and Technology, Hong Kong)
Modern programmable blockchains have built-in support for smart contracts, i.e. programs that are stored on the blockchain and whose state is subject to consensus. After a smart contract is deployed on the blockchain, anyone on the network can interact with it and call its functions by creating transactions. The blockchain protocol is then used to reach a consensus about the order of the transactions and, as a direct corollary, the state of every smart contract. Reaching such consensus necessarily requires every node on the network to execute all function calls. Thus, an attacker can mount a DoS attack by creating expensive transactions and function calls that use considerable or even possibly infinite time and space. To avoid this, following Ethereum, virtually all programmable blockchains have introduced the concept of “gas”. A fixed hard-coded gas cost is assigned to every atomic operation and the user who calls a function has to pay for its total gas usage. This technique ensures that the protocol is not vulnerable to DoS attacks, but it has also had significant unintended consequences. Out-of-gas errors, i.e. when a user underestimates the gas usage of their function call and does not allocate enough gas, are a major source of security vulnerabilities in Ethereum.
We focus on the well-studied problem of automatically finding upper-bounds on the gas usage of a smart contract. This is a classical problem in the blockchain community and has also been extensively studied by researchers in programming languages and verification. In this work, we provide a novel approach using theorems from polyhedral geometry and real algebraic geometry, namely Farkas’ Lemma, Handelman’s Theorem, and Putinar’s Positivstellensatz, to automatically synthesize linear and polynomial parametric bounds for the gas usage of smart contracts. Our approach is the first to provide completeness guarantees for the synthesis of such parametric upper-bounds. Moreover, our theoretical results are independent of the underlying consensus protocol and can be applied to smart contracts written in any language and run on any blockchain.
As a proof of concept, we also provide a tool, called “Asparagus” that implements our algorithms for Ethereum contracts written in Solidity. Finally, we provide extensive experimental results over 24,188 real-world smart contracts that are currently deployed on the Ethereum blockchain. We compare Asparagus against GASTAP, which is the only previous tool that could provide parametric bounds, and show that our method significantly outperforms it, both in terms of applicability and the tightness of the resulting bounds. More specifically, our approach can handle 80.56% of the functions (126,269 out of 156,735) in comparison with GASTAP’s 58.62%. Additionally, even on the benchmarks where both approaches successfully synthesize a bound, our bound is tighter in 97.85% of the cases.
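To illustrate what a parametric gas upper-bound is, the following toy Python sketch simulates a function whose gas usage grows linearly in a parameter n and checks a candidate bound g(n) = c0 + c1*n against sampled executions. The gas constants and the function are made up for illustration; Asparagus derives such bounds symbolically rather than by sampling.

    # Illustrative sketch only (not the Asparagus tool): a toy "contract" whose
    # gas usage is linear in a parameter n, and a candidate parametric upper
    # bound checked against simulated executions. Gas costs are invented.

    GAS_ADD = 3          # hypothetical per-operation costs
    GAS_STORE = 100

    def toy_function_gas(n: int) -> int:
        """Simulated gas usage of a loop that stores n values."""
        gas = 0
        for _ in range(n):
            gas += GAS_ADD + GAS_STORE
        return gas

    def parametric_bound(n: int) -> int:
        """Candidate linear bound g(n) = c0 + c1 * n in the parameter n."""
        c0, c1 = 0, GAS_ADD + GAS_STORE
        return c0 + c1 * n

    # A real synthesizer derives (c0, c1) symbolically (e.g., via Farkas' Lemma);
    # here we merely sanity-check the bound on a few sampled inputs.
    assert all(toy_function_gas(n) <= parametric_bound(n) for n in range(50))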

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Inductive Program Synthesis Guided by Observational Program Similarity
Jack Feser, Işıl Dillig, and Armando Solar-Lezama
(Massachusetts Institute of Technology, USA; University of Texas at Austin, USA)
We present a new general-purpose synthesis technique for generating programs from input-output examples. Our method, called metric program synthesis, relaxes the observational equivalence idea (used widely in bottom-up enumerative synthesis) into a weaker notion of observational similarity, with the goal of reducing the search space that the synthesizer needs to explore. Our method clusters programs into equivalence classes based on an expert-provided distance metric and constructs a version space that compactly represents “approximately correct” programs. Then, given a “close enough” program sampled from this version space, our approach uses a distance-guided repair algorithm to find a program that exactly matches the given input-output examples. We have implemented our proposed metric program synthesis technique in a tool called SyMetric and evaluate it in three different domains considered in prior work. Our evaluation shows that SyMetric outperforms other domain-agnostic synthesizers that use observational equivalence and that it achieves results competitive with domain-specific synthesizers that are either designed for or trained on those domains.
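The following toy Python sketch illustrates the observational-similarity idea on a deliberately tiny candidate space; the candidate grammar, distance metric, and examples are invented for illustration and this is not the SyMetric implementation. Candidates are ranked by an expert-provided distance between their outputs and the target outputs, and the nearest one would then be handed to a repair step.

    from itertools import product

    inputs  = [0, 1, 2, 3, 4]
    targets = [1, 3, 5, 7, 9]                  # desired outputs (here, 2*x + 1)

    # A tiny candidate space: programs of the form a*x + b.
    candidates = [(a, b) for a, b in product(range(4), range(4))]

    def run(prog, x):
        a, b = prog
        return a * x + b

    def distance(prog):
        # Expert-provided metric: how far the candidate's outputs are from the targets.
        return sum(abs(run(prog, x) - t) for x, t in zip(inputs, targets))

    # Rank candidates by observational similarity; a "close enough" candidate
    # becomes the starting point for distance-guided repair.
    closest = min(candidates, key=distance)
    print(closest, distance(closest))          # (2, 1) with distance 0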

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
From Capabilities to Regions: Enabling Efficient Compilation of Lexical Effect Handlers
Marius Müller, Philipp Schuster, Jonathan Lindegaard Starup, Klaus Ostermann, and Jonathan Immanuel Brachthäuser
(University of Tübingen, Germany; Aarhus University, Denmark)
Effect handlers are a high-level abstraction that enables programmers to use effects in a structured way. They have gained a lot of popularity within academia and subsequently also in industry. However, the abstraction often comes with a significant runtime cost and there has been intensive research recently on how to reduce this price.
A promising approach in this regard is to implement effect handlers using a CPS translation and to provide sufficient information about the nesting of handlers. With this information the CPS translation can decide how effects have to be lifted through handlers, i.e., which handlers need to be skipped, in order to handle the effect at the correct place. A structured way to make this information available is to use a calculus with a region system and explicit subregion evidence. Such calculi, however, are quite verbose, which makes them impractical to use as a source-level language.
We present a method to infer the lifting information for a calculus underlying a source-level language; this calculus uses second-class capabilities for the safe use of effects. To infer this information, we define a typed translation to a calculus with regions and evidence, and we show that this lift-inference translation is typability- and semantics-preserving. On the one hand, this exposes the precise relation between the second-class property and the structure given by regions. On the other hand, it closes a gap in a compiler pipeline, enabling efficient compilation of the source-level language. We have implemented lift inference in this compiler pipeline and conducted benchmarks which indicate that the approach is effective.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
A Container-Usage-Pattern-Based Context Debloating Approach for Object-Sensitive Pointer Analysis
Dongjie He, Yujiang Gui, Wei Li, Yonggang Tao, Changwei Zou, Yulei Sui, and Jingling Xue
(UNSW, Australia)
In this paper, we introduce DebloaterX, a new approach for automatically identifying context-independent objects to debloat contexts in object-sensitive pointer analysis (kobj). Object sensitivity achieves high precision, but its context construction mechanism combines objects with their contexts indiscriminately. This leads to a combinatorial explosion of contexts in large programs, resulting in inefficiency. Previous research has proposed a context-debloating approach that inhibits a pre-selected set of context-independent objects from forming new contexts, improving the efficiency of kobj. However, this earlier context-debloating approach under-approximates the set of context-independent objects identified, limiting performance speedups.
We introduce a novel context-debloating pre-analysis approach that identifies objects as context-dependent only when they are potentially precision-critical to kobj based on three general container-usage patterns. Our research finds that objects containing no fields of “abstract” (i.e., open) types can be analyzed context-insensitively with negligible precision loss in real-world applications. We provide clear rules and efficient algorithms to recognize these patterns, selecting more context-independent objects for better debloating. We have implemented DebloaterX in the Qilin framework and will release it as an open-source tool. Our experimental results on 12 standard Java benchmarks and real-world programs show that DebloaterX selects 92.4% of objects to be context-independent on average, enabling kobj to run significantly faster (an average of 19.3x when k = 2 and 150.2x when k = 3) and scale up to 8 more programs when k = 3, with only a negligible loss of precision (less than 0.2%). Compared to state-of-the-art alternative pre-analyses in accelerating kobj, DebloaterX outperforms Zipper significantly in both precision and efficiency and outperforms Conch (the earlier context-debloating approach) in efficiency substantially while achieving nearly the same precision.

Publisher's Version
A Cocktail Approach to Practical Call Graph Construction
Yuandao Cai and Charles Zhang
(Hong Kong University of Science and Technology, China)
After decades of research, constructing call graphs for modern C-based software remains either imprecise or inefficient when scaling up to the ever-growing complexity. The main culprit is the difficulty of resolving function pointers, as precise pointer analyses are cubic in nature and become exponential when considering calling contexts. This paper takes a practical stance by first conducting a comprehensive empirical study of function pointer manipulations in the wild. By investigating 5355 indirect calls in five popular open-source systems, we conclude that, instead of the past uniform treatments for function pointers, a cocktail approach can be more effective in “squeezing” the number of difficult pointers to a minimum using a potpourri of cheap methods. In particular, we decompose the costs of constructing highly precise call graphs of big code by tailoring several increasingly precise algorithms and synergizing them into a concerted workflow. As a result, many indirect calls can be precisely resolved in an efficient and principled fashion, thereby reducing the final, expensive refinements. This is, in spirit, similar to the well-known cocktail medical therapy.
The results are encouraging — our implemented prototype called Coral can achieve similar precision versus the previous field-, flow-, and context-sensitive Andersen-style call graph construction, yet scale up to millions of lines of code for the first time, to the best of our knowledge. Moreover, by evaluating the produced call graphs through the lens of downstream clients (i.e., use-after-free detection, thin slicing, and directed grey-box fuzzing), the results show that Coral can dramatically improve their effectiveness for better vulnerability hunting, understanding, and reproduction. More excitingly, we found twelve confirmed bugs (six impacted by indirect calls) in popular systems (e.g., MariaDB), spreading across multiple historical versions.

Publisher's Version
Equality Saturation Theory Exploration à la Carte
Anjali Pal, Brett Saiki, Ryan Tjoa, Cynthia Richey, Amy Zhu, Oliver Flatt, Max Willsey, Zachary Tatlock, and Chandrakana Nandi
(University of Washington, USA; Certora, USA)
Rewrite rules are critical in equality saturation, an increasingly popular technique in optimizing compilers, synthesizers, and verifiers. Unfortunately, developing high-quality rulesets is difficult and error-prone. Recent work on automatically inferring rewrite rules does not scale to large terms or grammars, and existing rule inference tools are monolithic and opaque. Equality saturation users therefore struggle to guide inference and incrementally construct rulesets. As a result, most users still manually develop and maintain rulesets.
This paper proposes Enumo, a new domain-specific language for programmable theory exploration. Enumo provides a small set of core operators that enable users to strategically guide rule inference and incrementally build rulesets. Short Enumo programs easily replicate results from state-of-the-art tools, but Enumo programs can also scale to infer deeper rules from larger grammars than prior approaches. Its composable operators even facilitate developing new strategies for ruleset inference. We introduce a new fast-forwarding strategy that does not require evaluating terms in the target language, and can thus support domains that were out of scope for prior work.
We evaluate Enumo and fast-forwarding across a variety of domains. Compared to state-of-the-art techniques, Enumo can synthesize better rulesets over a diverse set of domains, in some cases matching the effects of manually-developed rulesets in systems driven by equality saturation.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Complete First-Order Reasoning for Properties of Functional Programs
Adithya Murali, Lucas Peña, Ranjit Jhala, and P. Madhusudan
(University of Illinois at Urbana-Champaign, USA; University of California at San Diego, USA)
Several practical tools for automatically verifying functional programs (e.g., Liquid Haskell and Leon for Scala programs) rely on a heuristic based on unrolling recursive function definitions followed by quantifier-free reasoning using SMT solvers. We uncover foundational theoretical properties of this heuristic, revealing that it can be generalized and formalized as a technique that is in fact complete for reasoning with combined first-order theories of algebraic datatypes and background theories, where background theories support decidable quantifier-free reasoning. The theory developed in this paper explains the efficacy of these heuristics when they succeed, explains why they fail when they do, and clarifies the precise role that user help plays in making proofs succeed.

Publisher's Version
Structural Subtyping as Parametric Polymorphism
Wenhao Tang, Daniel Hillerström, James McKinna, Michel Steuwer, Ornela Dardha, Rongxiao Fu, and Sam Lindley
(University of Edinburgh, UK; Huawei Zurich Research Center, Switzerland; Heriot-Watt University, UK; TU Berlin, Germany; University of Glasgow, UK)
Structural subtyping and parametric polymorphism provide similar flexibility and reusability to programmers. For example, both features enable the programmer to provide a wider record as an argument to a function that expects a narrower one. However, the means by which they do so differ substantially, and the precise details of the relationship between them exist, at best, as folklore in the literature.
In this paper, we systematically study the relative expressive power of structural subtyping and parametric polymorphism. We focus our investigation on establishing the extent to which parametric polymorphism, in the form of row and presence polymorphism, can encode structural subtyping for variant and record types. We base our study on various Church-style λ-calculi extended with records and variants, different forms of structural subtyping, and row and presence polymorphism.
We characterise expressiveness by exhibiting compositional translations between calculi. For each translation we prove a type preservation and operational correspondence result. We also prove a number of non-existence results. By imposing restrictions on both source and target types, we reveal further subtleties in the expressiveness landscape, the restrictions enabling otherwise impossible translations to be defined. More specifically, we prove that full subtyping cannot be encoded via polymorphism, but we show that several restricted forms of subtyping can be encoded via particular forms of polymorphism.

Publisher's Version Archive submitted (1.1 MB)
A Pretty Expressive Printer
Sorawee Porncharoenwase, Justin Pombrio, and Emina Torlak
(University of Washington, USA; Unaffiliated, USA)
Pretty printers make trade-offs between the expressiveness of their pretty printing language, the optimality objective that they minimize when choosing between different ways to lay out a document, and the performance of their algorithm. This paper presents a new pretty printer, Πe, that is strictly more expressive than all pretty printers in the literature and provably minimizes an optimality objective. Furthermore, the time complexity of Πe is better than many existing pretty printers. When choosing among different ways to lay out a document, Πe consults a user-supplied cost factory, which determines the optimality objective, giving Πe a unique degree of flexibility. We use the Lean theorem prover to verify the correctness (validity and optimality) of Πe, and implement Πe concretely as a pretty printer that we call PrettyExpressive. To evaluate our pretty printer against others, we develop a formal framework for reasoning about the expressiveness of pretty printing languages, and survey pretty printers in the literature, comparing their expressiveness, optimality, worst-case time complexity, and practical running time. Our evaluation shows that PrettyExpressive is efficient and effective at producing optimal layouts. PrettyExpressive has also seen real-world adoption: it serves as a foundation of a code formatter for Racket.
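The sketch below illustrates what a user-supplied cost factory might look like. The function names and the particular objective (minimize overflow past a width limit, then minimize height) are illustrative assumptions, not PrettyExpressive's actual interface.

    WIDTH = 10

    def layout_cost(layout: str) -> tuple:
        # Cost factory: score a layout by (total overflow past WIDTH, number of
        # lines), compared lexicographically, so overflow is minimized before height.
        lines = layout.split("\n")
        overflow = sum(max(0, len(l) - WIDTH) for l in lines)
        return (overflow, len(lines))

    flat   = "f(alpha, beta, gamma)"
    broken = "f(alpha,\n  beta,\n  gamma)"

    # The printer's optimality objective is determined entirely by the cost factory.
    print(min([flat, broken], key=layout_cost))   # picks the broken layout here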

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Automated Ambiguity Detection in Layout-Sensitive Grammars
Jiangyi Liu, Fengmin Zhu, and Fei He
(Tsinghua University, China; CISPA Helmholtz Center for Information Security, Germany)
Layout-sensitive grammars have been adopted in many modern programming languages. In a serious language design phase, the specified syntax—typically a grammar—must be unambiguous. Although checking ambiguity is undecidable for context-free grammars and (trivially also) layout-sensitive grammars, ambiguity detection, on the other hand, is possible and can benefit language designers by exposing potential design flaws. In this paper, we tackle the ambiguity detection problem in layout-sensitive grammars. Inspired by previous work on checking the bounded ambiguity of context-free grammars via SAT solving, we substantially extend that approach to support layout-sensitive grammars, using SMT solving to express the ordering and quantitative relations over line/column numbers. Our key novelty lies in a reachability condition, which takes the impact of layout constraints on ambiguity into careful account. With this condition in hand, we propose an equivalent ambiguity notion called local ambiguity for the convenience of SMT encoding. We translate local ambiguity into an SMT formula and develop a bounded ambiguity checker that automatically finds a shortest nonempty ambiguous sentence (if one exists) for a user-input grammar. The soundness and completeness of our SMT encoding are mechanized in the Coq proof assistant. We conducted an evaluation on both grammar fragments and full grammars extracted from the language manuals of domain-specific languages like YAML as well as general-purpose languages like Python, which demonstrates the effectiveness of our approach.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Reusing Just-in-Time Compiled Code
Meetesh Kalpesh Mehta, Sebastián Krynski, Hugo Musso Gualandi, Manas Thakur, and Jan Vitek
(IIT Mandi, India; Czech Technical University in Prague, Czechia; IIT Bombay, India; Northeastern University, USA)
Most code is executed more than once. If not entire programs, then at least libraries remain unchanged from one run to the next. Just-in-time compilers expend considerable effort gathering insights about code they compiled many times, and often end up generating the same binary over and over again. We explore how to reuse compiled code across runs of different programs to reduce warm-up costs of dynamic languages. We propose to use speculative contextual dispatch to select versions of functions from an off-line curated code repository. That repository is a persistent database of previously compiled functions indexed by the context under which they were compiled. The repository is curated to remove redundant code and to optimize dispatch. We assess practicality by extending Ř, a compiler for the R language, and evaluating its performance. Our results suggest that the approach improves warmup times while preserving peak performance.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Bring Your Own Data Structures to Datalog
Arash Sahebolamri, Langston Barrett, Scott Moore, and Kristopher Micinski
(Syracuse University, USA; Galois, USA)
The restricted logic programming language Datalog has become a popular implementation target for deductive-analytic workloads including social-media analytics and program analysis. Modern Datalog engines compile Datalog rules to joins over explicit representations of relations—often B-trees or hash maps. While these modern engines have enabled high scalability in many application domains, they have a crucial weakness: achieving the desired algorithmic complexity may be impossible due to representation-imposed overhead of the engine’s data structures. In this paper, we present the "Bring Your Own Data Structures" (Byods) approach, in the form of a DSL embedded in Rust. Using Byods, an engineer writes logical rules which are implicitly parametric on the concrete data structure representation; our implementation provides an interface to enable "bringing their own" data structures to represent relations, which harmoniously interact with code generated by our compiler (implemented as Rust procedural macros). We formalize the semantics of Byods as an extension of Datalog’s; our formalization captures the key properties demanded of data structures compatible with Byods, including properties required for incrementalized (semi-naïve) evaluation. We detail many applications of the Byods approach, implementing analyses requiring specialized data structures for transitive and equivalence relations to scale, including an optimized version of the Rust borrow checker Polonius; highly-parallel PageRank made possible by lattices; and a large-scale analysis of LLVM utilizing index-sharing to scale. Our results show that Byods offers both improved algorithmic scalability (reduced time and/or space complexity) and runtimes competitive with state-of-the-art parallelizing Datalog solvers.
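To make the data-structure requirement concrete, here is a language-neutral Python sketch; Byods itself is a Rust DSL, and the class and method names below are invented for illustration. Semi-naive evaluation of transitive closure only needs the relation container to support insertion and per-iteration delta tracking, so any structure providing those operations can be "brought".

    # Illustrative sketch only (Python, not the Rust Byods DSL): semi-naive
    # evaluation of transitive closure with a pluggable relation container.

    class SetRelation:
        """A 'brought' data structure: here just a set, but it could be a trie,
        union-find, etc., provided insertion and delta tracking are supported."""
        def __init__(self):
            self.all = set()
            self.delta = set()
        def insert(self, fact):
            if fact not in self.all:
                self.all.add(fact)
                self.delta.add(fact)
        def take_delta(self):
            d, self.delta = self.delta, set()
            return d

    def transitive_closure(edges, rel_cls=SetRelation):
        path = rel_cls()
        for e in edges:
            path.insert(e)
        while True:
            delta = path.take_delta()
            if not delta:
                return path.all
            # Semi-naive: only join newly derived facts against the edge relation.
            for (x, y) in delta:
                for (y2, z) in edges:
                    if y == y2:
                        path.insert((x, z))

    print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))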

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
A Grounded Conceptual Model for Ownership Types in Rust
Will Crichton, Gavin Gray, and Shriram Krishnamurthi
(Brown University, USA; ETH Zurich, Switzerland)
Programmers learning Rust struggle to understand ownership types, Rust’s core mechanism for ensuring memory safety without garbage collection. This paper describes our attempt to systematically design a pedagogy for ownership types. First, we studied Rust developers’ misconceptions of ownership to create the Ownership Inventory, a new instrument for measuring a person’s knowledge of ownership. We found that Rust learners could not connect Rust’s static and dynamic semantics, such as determining why an ill-typed program would (or would not) exhibit undefined behavior. Second, we created a conceptual model of Rust’s semantics that explains borrow checking in terms of flow-sensitive permissions on paths into memory. Third, we implemented a Rust compiler plugin that visualizes programs under the model. Fourth, we integrated the permissions model and visualizations into a broader pedagogy of ownership by writing a new ownership chapter for The Rust Programming Language, a popular Rust textbook. Fifth, we evaluated an initial deployment of our pedagogy against the original version, using reader responses to the Ownership Inventory as a point of comparison. Thus far, the new pedagogy has improved learner scores on the Ownership Inventory by an average of 9

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Building Dynamic System Call Sandbox with Partial Order Analysis
Quan Zhang, Chijin Zhou, Yiwen Xu, Zijing Yin, Mingzhe Wang, Zhuo Su, Chengnian Sun, Yu Jiang, and Jiaguang Sun
(Tsinghua University, China; University of Waterloo, Canada)
Attack surface reduction is a security technique that secures the operating system by removing the unnecessary code or features of a program. By restricting the system calls that programs can use, the system call sandbox is able to reduce the exposed attack surface of the operating system and prevent attackers from damaging it through vulnerable programs. Ideally, programs should only retain access to system calls they require for normal execution. Many researchers focus on adopting static analysis to automatically restrict the system calls for each program. However, these methods do not adjust the restriction policy along with program execution. Thus, they need to permit all system calls required for program functionalities.
We observe that some system calls, especially security-sensitive ones, are used only a few times in certain stages of a program’s execution and then never used again. This motivates us to minimize the set of required system calls dynamically. In this paper, we propose a dynamic system call sandbox that gradually disables access to unnecessary system calls throughout the program’s execution. To accomplish this, we utilize partial order analysis to transform the program into a partially ordered graph, which enables efficient identification of the necessary system calls at any given point during program execution. Once a system call is no longer required by the program, our sandbox can restrict it immediately. To evaluate the approach, we applied it to seven widely used programs with an average of 615 KLOC, including web servers and databases. With partial order analysis, our sandbox restricts an average of 23.50, 16.86, and 15.89 more system calls than the state-of-the-art Chestnut, Temporal Specialization, and the configuration-aware sandbox C2C, respectively. For mitigating malicious exploitations, it defeats, on average, 83.42% of 1726 exploitation payloads with only a 5.07% overhead.
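A toy Python sketch of the underlying idea follows; the phase names, successor edges, and system call sets are invented for illustration and this is not the paper's analysis. Execution is modeled as a partially ordered graph of phases, and at any phase only the system calls used by that phase or by some reachable later phase must stay enabled, so everything else can be restricted at that point.

    phases = {                      # phase -> system calls it uses (illustrative)
        "init":  {"openat", "socket", "bind"},
        "serve": {"accept", "read", "write"},
        "log":   {"write"},
        "exit":  {"exit_group"},
    }
    succ = {                        # partial order: which phases can follow which
        "init":  ["serve"],
        "serve": ["serve", "log", "exit"],
        "log":   ["serve", "exit"],
        "exit":  [],
    }

    def still_needed(phase, seen=None):
        """Union of syscalls used at `phase` and at every reachable later phase."""
        seen = seen if seen is not None else set()
        if phase in seen:
            return set()
        seen.add(phase)
        needed = set(phases[phase])
        for nxt in succ[phase]:
            needed |= still_needed(nxt, seen)
        return needed

    # Once 'init' finishes, openat/socket/bind are never needed again and can be
    # restricted immediately.
    print(sorted(still_needed("serve")))   # ['accept', 'exit_group', 'read', 'write']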

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Resource-Aware Soundness for Big-Step Semantics
Riccardo Bianchini, Francesco Dagnino, Paola Giannini, and Elena Zucca
(University of Genoa, Italy; University of Eastern Piedmont, Italy)
We extend the semantics and type system of a lambda calculus equipped with common constructs to be resource-aware. That is, reduction is instrumented to keep track of the usage of resources, and the type system guarantees, besides standard soundness, that for well-typed programs there is a computation where no needed resource gets exhausted. The resource-aware extension is parametric on an arbitrary grade algebra, and does not require ad-hoc changes to the underlying language. To this end, the semantics needs to be formalized in big-step style; as a consequence, expressing and proving (resource-aware) soundness is challenging, and is achieved by applying recent techniques based on coinductive reasoning.

Publisher's Version
Initializing Global Objects: Time and Order
Fengyun Liu, Ondřej Lhoták, David Hua, and Enze Xing
(Oracle Labs, Switzerland; University of Waterloo, Canada)
Object-oriented programming has long been encumbered by an awkward feature: static members. Static members not only compromise the conceptual integrity of object-oriented programming, but also give rise to subtle initialization errors, such as reading uninitialized fields and deadlocks.
The Scala programming language eliminated static members from the language, replacing them with global objects that present a unified object-oriented programming model. However, the problem of global object initialization remains open, and programmers still suffer from initialization errors.
We propose partial ordering and initialization-time irrelevance as two fundamental principles for initializing global objects. Based on these principles, we put forward an effective static analysis to ensure safe initialization of global objects, which eliminates initialization errors at compile time. The analysis also enables static scheduling of global object initialization to avoid runtime overhead. The analysis is modular at the granularity of objects and it avoids whole-program analysis. To make the analysis explainable and tunable, we introduce the concept of regions to make context-sensitivity understandable and customizable by programmers.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Formal Abstractions for Packet Scheduling
Anshuman Mohan, Yunhe Liu, Nate Foster, Tobias Kappé, and Dexter Kozen
(Cornell University, USA; Open University of the Netherlands, Netherlands; University of Amsterdam, Netherlands)
Early programming models for software-defined networking (SDN) focused on basic features for controlling network-wide forwarding paths, but more recent work has considered richer features, such as packet scheduling and queueing, that affect performance. In particular, PIFO trees, proposed by Sivaraman et al., offer a flexible and efficient primitive for programmable packet scheduling. Prior work has shown that PIFO trees can express a wide range of practical algorithms including strict priority, weighted fair queueing, and hierarchical schemes. However, the semantic properties of PIFO trees are not well understood.
This paper studies PIFO trees from a programming language perspective. We formalize the syntax and semantics of PIFO trees in an operational model that decouples the scheduling policy running on a tree from the topology of the tree. Building on this formalization, we develop compilation algorithms that allow the behavior of a PIFO tree written against one topology to be realized using a tree with a different topology. Such a compiler could be used to optimize an implementation of PIFO trees, or realize a logical PIFO tree on a target with a fixed topology baked into the hardware. To support experimentation, we develop a software simulator for PIFO trees, and we present case studies illustrating its behavior on standard and custom algorithms.
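For readers unfamiliar with the primitive, the sketch below is a minimal single PIFO (push-in first-out queue) in Python; it is illustrative only, not the paper's simulator. Elements are pushed into a position determined by a programmable rank and always popped from the head; a PIFO tree composes such queues, with inner nodes scheduling references to their children.

    import bisect
    from itertools import count

    class PIFO:
        def __init__(self, rank):
            self.rank = rank            # programmable scheduling policy
            self.buf = []               # kept sorted by rank
            self._seq = count()         # tie-breaker so equal ranks stay FIFO
        def enqueue(self, pkt):
            bisect.insort(self.buf, (self.rank(pkt), next(self._seq), pkt))
        def dequeue(self):
            return self.buf.pop(0)[2]

    # Strict priority scheduling: lower priority number is served first.
    q = PIFO(rank=lambda pkt: pkt["prio"])
    q.enqueue({"prio": 2, "id": "a"})
    q.enqueue({"prio": 0, "id": "b"})
    q.enqueue({"prio": 1, "id": "c"})
    print([q.dequeue()["id"] for _ in range(3)])   # ['b', 'c', 'a']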

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Reference Capabilities for Flexible Memory Management
Ellen Arvidsson, Elias Castegren, Sylvan Clebsch, Sophia Drossopoulou, James Noble, Matthew J. Parkinson, and Tobias Wrigstad
(Uppsala University, Sweden; Microsoft Azure Research, USA; Imperial College London, UK; Research & Programming, New Zealand; Microsoft Azure Research, UK)
Verona is a concurrent object-oriented programming language that organises all the objects in a program into a forest of isolated regions. Memory is managed locally for each region, so programmers can control a program's memory use by adjusting how objects are partitioned into regions, and by setting each region's memory management strategy. A thread can only mutate (allocate, deallocate) objects within one active region---its "window of mutability". Memory management costs are localised to the active region, ensuring overheads can be predicted and controlled. Moving the mutability window between regions is explicit, so code can be executed wherever it is required, yet programs remain in control of memory use. An ownership type system based on reference capabilities enforces region isolation, controlling aliasing within and between regions, yet supporting objects moving between regions and threads. Data accesses never need expensive atomic operations, and are always thread-safe.

Publisher's Version
Mobius: Synthesizing Relational Queries with Recursive and Invented Predicates
Aalok Thakkar, Nathaniel Sands, George Petrou, Rajeev Alur, Mayur Naik, and Mukund Raghothaman
(University of Pennsylvania, USA; University of Southern California, USA)
Synthesizing relational queries from data is challenging in the presence of recursion and invented predicates. We propose a fully automated approach to synthesize such queries. Our approach comprises two steps: it first synthesizes a non-recursive query consistent with the given data, and then identifies recursion schemes in it and thereby generalizes to arbitrary data. This generalization is achieved by an iterative predicate unification procedure which exploits the notion of data provenance to accelerate convergence. In each iteration of the procedure, a constraint solver proposes a candidate query, and a query evaluator checks if the proposed program is consistent with the given data. The data provenance for a failed query allows us to construct additional constraints for the constraint solver and refine the search. We have implemented our approach in a tool named Mobius. On a suite of 21 challenging recursive query synthesis tasks, Mobius outperforms three state-of-the-art baselines, Gensynth, ILASP, and Popper, both in terms of runtime and accuracy. We also demonstrate that the synthesized queries generalize well to unseen data.

Publisher's Version
MemPerf: Profiling Allocator-Induced Performance Slowdowns
Jin Zhou, Sam Silvestro, Steven (Jiaxun) Tang, Hanmei Yang, Hongyu Liu, Guangming Zeng, Bo Wu, Cong Liu, and Tongping Liu
(University of Massachusetts at Amherst, USA; University of Texas at San Antonio, USA; Synopsys, USA; Colorado School of Mines, USA; University of Texas at Dallas, USA)
The memory allocator plays a key role in the performance of applications, but none of the existing profilers can pinpoint performance slowdowns caused by a memory allocator. Consequently, programmers may spend time improving application code incorrectly or unnecessarily, achieving low or no performance improvement. This paper designs the first profiler—MemPerf—to identify allocator-induced performance slowdowns without comparing against another allocator. Based on the key observation that an allocator may impact the whole life-cycle of heap objects, including the accesses (or uses) of these objects, MemPerf proposes a life-cycle based detection to identify slowdowns caused by slow memory management operations and slow accesses separately. For the former, MemPerf proposes thread-aware and type-aware performance modeling to identify slow management operations. For slow memory accesses, MemPerf utilizes a top-down approach to identify all possible reasons for slow memory accesses introduced by the allocator, mainly due to cache and TLB misses, and further proposes a unified method to identify them correctly and efficiently. Based on our extensive evaluation, MemPerf correctly reports 98% of medium and large allocator-induced slowdowns (larger than 5%) without reporting any false positives. MemPerf also pinpoints multiple known and unknown design issues in widely-used allocators.

Publisher's Version
Verifying Indistinguishability of Privacy-Preserving Protocols
Kirby Linvill, Gowtham Kaki, and Eric Wustrow
(University of Colorado Boulder, USA)
Internet users rely on the protocols they use to protect their private information including their identity and the websites they visit. Formal verification of these protocols can detect subtle bugs that compromise these protections at design time, but is a challenging task as it involves probabilistic reasoning about random sampling, cryptographic primitives, and concurrent execution. Existing approaches either reason about symbolic models of the protocols that sacrifice precision for automation, or reason about more precise computational models that are harder to automate and require cryptographic expertise. In this paper we propose a novel approach to verifying privacy-preserving protocols that is more precise than symbolic models yet more accessible than computational models. Our approach permits direct-style proofs of privacy, as opposed to indirect game-based proofs in computational models, by formalizing privacy as indistinguishability of possible network traces induced by a protocol. We ease automation by leveraging insights from the distributed systems verification community to create sound synchronous models of concurrent protocols. Our verification framework is implemented in F* as a library we call Waldo. We describe two large case studies of using Waldo to verify indistinguishability; one on the Encrypted Client Hello (ECH) extension of the TLS protocol and another on a Private Information Retrieval (PIR) protocol. We uncover subtle flaws in the TLS ECH specification that were missed by other models.

Publisher's Version
Quantifying and Mitigating Cache Side Channel Leakage with Differential Set
Cong Ma, Dinghao Wu, Gang Tan, Mahmut Taylan Kandemir, and Danfeng Zhang
(University of Waterloo, Canada; Pennsylvania State University, USA; Duke University, USA)
Cache side-channel attacks leverage secret-dependent footprints in CPU cache to steal confidential information, such as encryption keys. Due to the lack of a proper abstraction for reasoning about cache side channels, existing static program analysis tools that can quantify or mitigate cache side channels are built on very different kinds of abstractions. As a consequence, it is hard to bridge advances in quantification and mitigation research. Moreover, existing abstractions lead to imprecise results. In this paper, we present a novel abstraction, called differential set, for analyzing cache side channels at compile time. A distinguishing feature of differential sets is that they allow compositional and precise reasoning about cache side channels. Moreover, they are the first abstraction that carries sufficient information for both side channel quantification and mitigation. Based on this new abstraction, we develop a static analysis tool DSA that automatically quantifies and mitigates cache side channel leakage at the same time. Experimental evaluation on a set of commonly used benchmarks shows that DSA can produce more precise leakage bounds as well as mitigated code with fewer memory footprints, when compared with state-of-the-art tools that only quantify or mitigate cache side channel leakage.

Publisher's Version Published Artifact Artifacts Available
How Domain Experts Use an Embedded DSL
Lisa Rennels and Sarah E. Chasins
(University of California at Berkeley, USA)
Programming tools are increasingly integral to research and analysis in myriad domains, including specialized areas with no formal relation to computer science. Embedded domain-specific languages (eDSLs) have the potential to serve these programmers while placing relatively light implementation burdens on language designers. However, barriers to eDSL use reduce their practical value and adoption. In this paper, we aim to deepen our understanding of how programmers use eDSLs and identify user needs to inform future eDSL designs. We performed a contextual inquiry (9 participants) with domain experts using Mimi, an eDSL for climate change economics modeling. A thematic analysis identified five key themes, including: the interaction between the eDSL and the host language has significant and sometimes unexpected impacts on eDSL user experience, and users preferentially engage with domain-specific communities and code templates rather than host language resources. The needs uncovered in our study offer design considerations for future eDSLs and suggest directions for future DSL usability research.

Publisher's Version
When Concurrency Matters: Behaviour-Oriented Concurrency
Luke Cheeseman, Matthew J. Parkinson, Sylvan Clebsch, Marios Kogias, Sophia Drossopoulou, David Chisnall, Tobias Wrigstad, and Paul Liétar
(Imperial College London, UK; Microsoft Azure Research, UK; Microsoft Research, UK; Microsoft, UK; Uppsala University, Sweden)
Expressing parallelism and coordination is central for modern concurrent programming. Many mechanisms exist for expressing both parallelism and coordination. However, the design decisions for these two mechanisms are tightly intertwined. We believe that the interdependence of these two mechanisms should be recognised and achieved through a single, powerful primitive. We are not the first to realise this: the prime example is actor model programming, where parallelism arises through fine-grained decomposition of a program’s state into actors that are able to execute independently in parallel. However, actor model programming has a serious pain point: updating multiple actors as a single atomic operation is a challenging task. We address this pain point by introducing a new concurrency paradigm: Behaviour-Oriented Concurrency (BoC). In BoC, we are revisiting the fundamental concept of a behaviour to provide a more transactional concurrency model. BoC enables asynchronously creating atomic and ordered units of work with exclusive access to a collection of independent resources. In this paper, we describe BoC informally in terms of examples, which demonstrate the advantages of exclusive access to several independent resources, as well as the need for ordering. We define it through a formal model. We demonstrate its practicality by implementing a C++ runtime. We argue its applicability through the Savina benchmark suite: benchmarks in this suite can be more compactly represented using BoC in place of Actors, and we observe comparable, if not better, performance.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
TASTyTruffle: Just-in-Time Specialization of Parametric Polymorphism
Matt D'Souza, James You, Ondřej Lhoták, and Aleksandar Prokopec
(University of Waterloo, Canada; Oracle Labs, Switzerland)
Parametric polymorphism enables programmers to express algorithms independently of the types of values that they operate on. The approach used to implement parametric polymorphism can have important performance implications. One popular approach, erasure, uses a uniform representation for generic data, which entails primitive boxing and other indirections that harm performance. Erasure destroys type information that could be used by language implementations to optimize generic code.
We present TASTyTruffle, an implementation for a subset of the Scala programming language. Instead of JVM bytecode, TASTyTruffle interprets Scala's TASTy intermediate representation, a typed representation wherein generic types are not erased. TASTy's precise type information empowers TASTyTruffle to implement generic code more effectively. In particular, it allows TASTyTruffle to reify types as run-time objects that can be passed around. Using reified types, TASTyTruffle supports heterogeneous box-free representations for generic values. TASTyTruffle also uses reified types to specialize generic code, producing monomorphic copies of generic code that can be easily and reliably optimized by its just-in-time (JIT) compiler.
Empirically, TASTyTruffle is competitive with standard JVM implementations on a small set of benchmark programs; when generic code is used with multiple types, TASTyTruffle consistently outperforms the JVM. The precise type information in TASTy enables TASTyTruffle to find additional optimization opportunities that could not be uncovered with erased JVM bytecode.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Validating IoT Devices with Rate-Based Session Types
Grant Iraci, Cheng-En Chuang, Raymond Hu, and Lukasz Ziarek
(University at Buffalo, USA; Queen Mary University of London, UK)
We develop a session-types-based framework for implementing and validating rate-based message passing systems in Internet of Things (IoT) domains. To model the indefinite repetition present in many embedded and IoT systems, we introduce a timed process calculus with a periodic recursion primitive. This allows us to model rate-based computations and communications inherent to these application domains. We introduce a definition of rate-based session types in a binary session types setting and a new compatibility relationship, which we call rate compatibility. Programs which type check enjoy the standard session types guarantees as well as rate error freedom --- meaning processes which exchange messages do so at the same rate. Rate compatibility is defined through a new notion of type expansion, a relation that allows communication between processes of differing periods by synthesizing and checking a common superperiod type. We prove type preservation and rate error freedom for our system, and show a decidable method for type checking based on computing superperiods for a collection of processes. We implement a prototype of our type system including rate compatibility via an embedding into the native type system of Rust. We apply this framework to a range of examples from our target domain such as Android software sensors, wearable devices, and sound processing.
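A small Python sketch of the superperiod intuition follows; the numbers and the check are illustrative, and the paper's notions of type expansion and rate compatibility are defined on session types, not on message counts alone. Two periodic processes are compared over the least common multiple of their periods, where their message counts must agree.

    from math import lcm

    def rate_compatible(period_a, msgs_per_period_a, period_b, msgs_per_period_b):
        """Check that both sides exchange the same number of messages per superperiod."""
        superperiod = lcm(period_a, period_b)
        sent     = (superperiod // period_a) * msgs_per_period_a
        expected = (superperiod // period_b) * msgs_per_period_b
        return superperiod, sent == expected

    # A 10 ms sensor sending 1 reading per period feeding a 30 ms consumer that
    # expects 3 readings per period: compatible over a 30 ms superperiod.
    print(rate_compatible(10, 1, 30, 3))   # (30, True)
    print(rate_compatible(10, 1, 25, 2))   # (50, False)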

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Static Analysis of Memory Models for SMT Encodings
Thomas Haas, René Maseli, Roland Meyer, and Hernán Ponce de León
(TU Braunschweig, Germany; Huawei, Germany)
The goal of this work is to improve the efficiency of bounded model checkers that are modular in the memory model. Our first contribution is a static analysis for the given memory model that is performed as a preprocessing step and helps us significantly reduce the encoding size. Memory models make use of relations to judge whether an execution is consistent. The analysis computes bounds on these relations: which pairs of events may or must be related. What is new is that the bounds are relativized to the execution of events. This makes it possible to derive, for the first time, not only upper but also meaningful lower bounds. Another important feature is that the analysis can import information about the verification instance from external sources to improve its precision. Our second contribution is a set of new optimizations for the SMT encoding. Notably, the lower bounds allow us to simplify the encoding of acyclicity constraints. We implemented our analysis and optimizations within a bounded model checker and evaluated it on challenging benchmarks. The evaluation shows up to a 40% reduction in verification time (including the analysis) over previous encodings. Our optimizations allow us to efficiently check safety, liveness, and data race freedom in Linux kernel code.

Publisher's Version Published Artifact Archive submitted (530 kB) Artifacts Available Artifacts Functional
Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs
Alex Renda, Yi Ding, and Michael Carbin
(Massachusetts Institute of Technology, USA; Purdue University, USA)
Programmers and researchers are increasingly developing surrogates of programs, models of a subset of the observable behavior of a given program, to solve a variety of software development challenges. Programmers train surrogates from measurements of the behavior of a program on a dataset of input examples. A key challenge of surrogate construction is determining what training data to use to train a surrogate of a given program.
We present a methodology for sampling datasets to train neural-network-based surrogates of programs. We first characterize the proportion of data to sample from each region of a program's input space (corresponding to different execution paths of the program) based on the complexity of learning a surrogate of the corresponding execution path. We next provide a program analysis to determine the complexity of different paths in a program. We evaluate these results on a range of real-world programs, demonstrating that complexity-guided sampling results in empirical improvements in accuracy.
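A minimal Python sketch of complexity-guided sampling follows; the path names and complexity scores are invented, whereas Turaco derives such scores from a program analysis. A fixed training budget is split across the program's execution paths in proportion to how hard each path is to learn.

    budget = 10_000                      # total training examples to collect

    # Hypothetical per-path complexity scores a program analysis might produce
    # (e.g., higher for paths computing less smooth functions).
    path_complexity = {
        "path_fast_exit": 1.0,
        "path_linear":    2.5,
        "path_nonlinear": 6.5,
    }

    total = sum(path_complexity.values())
    samples_per_path = {
        path: round(budget * c / total) for path, c in path_complexity.items()
    }
    print(samples_per_path)   # {'path_fast_exit': 1000, 'path_linear': 2500, 'path_nonlinear': 6500}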

Publisher's Version Published Artifact Archive submitted (4.1 MB) Artifacts Available
Stuttering for Free
Minki Cho, Youngju Song, Dongjae Lee, Lennard Gäher, and Derek Dreyer
(Seoul National University, Republic of Korea; MPI-SWS, Germany)
One of the most common tools for proving behavioral refinements between transition systems is the method of simulation proofs, which has been explored extensively over the past several decades. Stuttering simulations are an extension of traditional simulations—used, for example, in CompCert—in which either the source or target of the simulation is permitted to “stutter” (stay in place) while the other side steps forward. In the interest of ensuring soundness, however, existing stuttering simulations restrict proofs to only perform a finite number of stuttering steps before making synchronous progress—a step of reasoning in which both sides of the simulation progress forward together. This restriction guarantees that a terminating program cannot be proven to simulate a non-terminating one.
In this paper, we observe that the requirement to eventually achieve synchronous progress is burdensome and, what’s more, unnecessary: it is possible to ensure soundness of stuttering simulations while only requiring asynchronous progress (progress on both sides of the simulation that may be achieved with only stuttering steps). Building on this observation, we develop a new simulation technique we call FreeSim (short for “freely-stuttering simulations”), mechanized in Coq, and we demonstrate its effectiveness on a range of interesting case studies. These include a simplification of the meta-theory of CompCert, as well as the DTrees library, which enriches the ITrees (Interaction Trees) library with dual non-determinism.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Inference of Resource Management Specifications
Narges Shadab, Pritam Gharat, Shrey Tiwari, Michael D. Ernst, Martin Kellogg, Shuvendu K. Lahiri, Akash Lal, and Manu Sridharan
(University of California at Riverside, USA; Microsoft Research, India; University of Washington, USA; New Jersey Institute of Technology, USA; Microsoft Research, USA)
A resource leak occurs when a program fails to free some finite resource after it is no longer needed. Such leaks are a significant cause of real-world crashes and performance problems. Recent work proposed an approach to prevent resource leaks based on checking resource management specifications. A resource management specification expresses how the program allocates resources, passes them around, and releases them; it also tracks the ownership relationship between objects and resources, and aliasing relationships between objects. While this specify-and-verify approach has several advantages compared to prior techniques, the need to manually write annotations presents a significant barrier to its practical adoption.
This paper presents a novel technique to automatically infer a resource management specification for a program, broadening the applicability of specify-and-check verification for resource leaks. Inference in this domain is challenging because resource management specifications differ significantly in nature from the types that most inference techniques target. Further, for practical effectiveness, we desire a technique that can infer the resource management specification intended by the developer, even in cases when the code does not fully adhere to that specification. We address these challenges through a set of inference rules carefully designed to capture real-world coding patterns, yielding an effective fixed-point-based inference algorithm.
We have implemented our inference algorithm in two different systems, targeting programs written in Java and C#. In an experimental evaluation, our technique inferred 85.5% of the annotations that programmers had written manually for the benchmarks. Further, the verifier issued nearly the same rate of false alarms with the manually-written and automatically-inferred annotations.

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Rapid: Region-Based Pointer Disambiguation
Khushboo Chitre, Piyus Kedia, and Rahul Purandare
(IIIT Delhi, India; University of Nebraska-Lincoln, USA)
Interprocedural alias analyses often sacrifice precision for scalability. Thus, modern compilers such as GCC and LLVM implement more scalable but less precise intraprocedural alias analyses. This compromise makes the compilers miss out on potential optimization opportunities, affecting the performance of the application. Modern compilers implement loop-versioning with dynamic checks for pointer disambiguation to enable the missed optimizations. Polyhedral access range analysis and symbolic range analysis enable O(1) range checks for non-overlapping of memory accesses inside loops. However, these approaches work only for the loops in which the loop bounds are loop invariants. To address this limitation, researchers proposed a technique that requires O(log n) memory accesses for pointer disambiguation. Others improved the performance of dynamic checks to single memory access by constraining the object size and alignment. However, the former approach incurs noticeable overhead due to its dynamic checks, whereas the latter has a noticeable allocator overhead. Thus, scalability remains a challenge.
In this work, we present a tool, Rapid, that further reduces the overheads of the allocator and dynamic checks proposed in the existing approaches. The key idea is to identify objects that need disambiguation checks using a profiler and allocate them in different regions, which are disjoint memory areas. The disambiguation checks simply compare the regions corresponding to the objects. The regions are aligned such that the top 32 bits in the addresses of any two objects allocated in different regions are always different. As a consequence, the dynamic checks do not require any memory access to ensure that the objects belong to different regions, making them efficient.
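The check itself is small enough to sketch in a few lines of Python; the addresses are illustrative and Rapid performs this check in compiled code. Because objects needing disambiguation are placed in regions whose addresses differ in the top 32 bits, non-overlap follows from a single integer comparison with no memory access.

    REGION_SHIFT = 32

    def regions_differ(addr_a: int, addr_b: int) -> bool:
        # True if the two addresses fall in different regions, i.e. their top 32 bits differ.
        return (addr_a >> REGION_SHIFT) != (addr_b >> REGION_SHIFT)

    a = 0x0000_0001_0000_1000    # hypothetical object in region 1
    b = 0x0000_0002_0000_2000    # hypothetical object in region 2
    c = 0x0000_0001_0000_8000    # another hypothetical object in region 1

    print(regions_differ(a, b))  # True  -> the loop may run its optimized version
    print(regions_differ(a, c))  # False -> fall back to the unoptimized version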
Rapid achieved a maximum performance benefit of around 52.94% for Polybench and 1.88% for CPU SPEC 2017 benchmarks. The maximum CPU overhead of our allocator is 0.57% with a geometric mean of -0.2% for CPU SPEC 2017 benchmarks. Due to the low overhead of the allocator and dynamic checks, Rapid could improve the performance of 12 out of 16 CPU SPEC 2017 benchmarks. In contrast, a state-of-the-art approach used in the comparison could improve only five CPU SPEC 2017 benchmarks.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Gradual Typing for Effect Handlers
Max S. New, Eric Giovannini, and Daniel R. Licata
(University of Michigan, USA; Wesleyan University, USA)
We present a gradually typed language, GrEff, with effects and handlers that supports migration from unchecked to checked effect typing. This serves as a simple model of the integration of an effect typing discipline with an existing effectful typed language that does not track fine-grained effect information. Our language supports a simple module system to model the programming model of gradual migration from unchecked to checked effect typing in the style of Typed Racket.
The surface language GrEff is given semantics by elaboration to a core language Core GrEff. We equip Core GrEff with an inequational theory for reasoning about the semantic error ordering and desired program equivalences for programming with effects and handlers. We derive an operational semantics for the language from the equations provable in the theory. We then show that the theory is sound by constructing an operational logical relations model to prove the graduality theorem. This extends prior work on embedding-projection pair models of gradual typing to handle effect typing and subtyping.

Publisher's Version
Synthesizing Specifications
Kanghee Park, Loris D'Antoni, and Thomas Reps
(University of Wisconsin-Madison, USA)
Every program should be accompanied by a specification that describes important aspects of the code's behavior, but writing good specifications is often harder than writing the code itself. This paper addresses the problem of synthesizing specifications automatically, guided by user-supplied inputs of two kinds: i) a query posed about a set of function definitions, and ii) a domain-specific language L in which the extracted property is to be expressed (we call properties in the language L-properties). Each synthesized property is a best L-property for the query: there is no other L-property that is strictly more precise. Furthermore, the set of synthesized L-properties is exhaustive: no more L-properties can be added to it to make the conjunction more precise. We implemented our method in a tool, Spyro. The ability to modify both the query and L provides a Spyro user with ways to customize the kind of specification to be synthesized. We use this ability to show that Spyro can be used in a variety of applications, such as mining program specifications, performing abstract-domain operations, and synthesizing algebraic properties of program modules.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Compositional Verification of Efficient Masking Countermeasures against Side-Channel Attacks
Pengfei Gao, Yedi Zhang, Fu Song, Taolue Chen, and Francois-Xavier Standaert
(ShanghaiTech University, China; Institute of Software at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; University of London, UK; Université Catholique de Louvain, Belgium)
Masking is one of the most effective countermeasures for securely implementing cryptographic algorithms against power side-channel attacks, the design of which however turns out to be intricate and error-prone. While techniques have been proposed to rigorously verify implementations of cryptographic algorithms, currently they are limited in scalability. To address this issue, compositional approaches have been investigated, but so far they fail to prove the security of recent efficient implementations. To fill this gap, we propose a novel compositional verification approach. In particular, we introduce two new language-level security notions based on which we propose composition strategies and verification algorithms. Our approach is able to prove efficient implementations, which cannot be done by prior compositional approaches. We implement our approach as a tool CONVINCE and conduct extensive experiments to confirm its efficacy. We also use CONVINCE to further explore the design space of the AES Sbox with least refreshing by replacing its implementation for finite-field multiplication with more efficient counterparts. We automatically prove leakage-freeness of these new versions. As a result, we can effectively reduce the randomness and XOR operations of the state-of-the-art AES implementation by 1,600 and 3,200, respectively.

Publisher's Version Published Artifact Info Artifacts Available Artifacts Functional
Data Extraction via Semantic Regular Expression Synthesis
Qiaochu Chen, Arko Banerjee, Çağatay Demiralp, Greg Durrett, and Işıl Dillig
(University of Texas at Austin, USA; Massachusetts Institute of Technology, USA)
Many data extraction tasks of practical relevance require not only syntactic pattern matching but also semantic reasoning about the content of the underlying text. While regular expressions are very well suited for tasks that require only syntactic pattern matching, they fall short for data extraction tasks that involve both a syntactic and semantic component. To address this issue, we introduce semantic regexes, a generalization of regular expressions that facilitates combined syntactic and semantic reasoning about textual data. We also propose a novel learning algorithm that can synthesize semantic regexes from a small number of positive and negative examples. Our proposed learning algorithm uses a combination of neural sketch generation and compositional type-directed synthesis for fast and effective generalization from a small number of examples. We have implemented these ideas in a new tool called Smore and evaluated it on representative data extraction tasks involving several textual datasets. Our evaluation shows that semantic regexes can better support complex data extraction tasks than standard regular expressions and that our learning algorithm significantly outperforms existing tools, including state-of-the-art neural networks and program synthesis tools.
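A rough Python sketch of the semantic-regex idea follows; the predicate and pattern are invented stand-ins, not Smore's language or learning algorithm. The syntactic part is an ordinary regular expression, while a semantic predicate filters matches based on what the text means.

    import re

    def is_us_state(text: str) -> bool:
        """Stand-in for a semantic predicate (a real system might query an
        entity recognizer or a language model here)."""
        return text.strip() in {"Texas", "California", "Massachusetts"}

    def match_city_state(line: str):
        """Syntactic shape: '<words>, <words>'; semantic check: the second part
        must actually denote a US state."""
        m = re.fullmatch(r"([A-Za-z ]+),\s*([A-Za-z ]+)", line)
        if m and is_us_state(m.group(2)):
            return m.group(1), m.group(2)
        return None

    print(match_city_state("Austin, Texas"))    # ('Austin', 'Texas')
    print(match_city_state("Austin, Monday"))   # None: syntactically fine, semantically rejected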

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Counterexample Driven Quantifier Instantiations with Applications to Distributed Protocols
Orr Tamir, Marcelo Taube, Kenneth L. McMillan, Sharon Shoham, Jon Howell, Guy Gueta, and Mooly Sagiv
(Tel Aviv University, Israel; University of Texas at Austin, USA; VMware Research, USA; VMware Research, Israel)
Formally verifying infinite-state systems can be a daunting task, especially when it comes to reasoning about quantifiers. In particular, quantifier alternations in conjunction with function symbols can create function cycles that result in infinitely many ground terms, making it difficult for solvers to instantiate quantifiers and causing them to diverge. This can leave users with no useful information on how to proceed. To address this issue, we propose an interactive verification methodology that uses a relational abstraction technique to mitigate solver divergence in the presence of quantifiers. This technique abstracts functions in the verification conditions (VCs) as one-to-one relations, which avoids the creation of function cycles and the resulting proliferation of ground terms. Relational abstraction is sound and guarantees correctness if the solver cannot find counter-models. However, it may also lead to false counterexamples, which can be addressed by refining the abstraction and requiring the existence of corresponding elements. In the domain of distributed protocols, we can refine the abstraction by diagnosing counterexamples and manually instantiating elements in the range of the original function. If the verification conditions are correct, there always exist finitely many refinement steps that eliminate all spurious counter-models, making the approach complete. We applied this approach in Ivy to verify the safety properties of consensus protocols and found that: (1) most verification goals can be automatically verified using relational abstraction, while SMT solvers often diverge when given the original VC, (2) only a few manual instantiations were needed, and the counterexamples provided valuable guidance for the user compared to timeouts produced by the traditional approach, and (3) the technique can be used to derive efficient low-level implementations of tricky algorithms.
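
The relational-abstraction idea can be sketched with an off-the-shelf SMT solver: an uninterpreted function f is replaced by a relation R constrained to be functional and one-to-one, and refinement adds existence facts for specific elements. The encoding below uses Z3's Python API purely for illustration; it is not Ivy's implementation, and the constants are invented.

```python
# Illustrative sketch: abstract a function f as a relation R constrained to
# be functional and injective, avoiding function cycles during instantiation.
from z3 import (Function, IntSort, BoolSort, Int, Ints, ForAll, Implies,
                And, Solver)

R = Function('R', IntSort(), IntSort(), BoolSort())  # abstracts y = f(x)
x, y1, y2, x1, x2, y = Ints('x y1 y2 x1 x2 y')

s = Solver()
# R is functional: each x maps to at most one y.
s.add(ForAll([x, y1, y2], Implies(And(R(x, y1), R(x, y2)), y1 == y2)))
# R is one-to-one: distinct inputs map to distinct outputs.
s.add(ForAll([x1, x2, y], Implies(And(R(x1, y), R(x2, y)), x1 == x2)))
# A refinement step may require that a specific element has an image under R,
# e.g. after diagnosing a spurious counter-model:
s.add(R(0, Int('f0')))

# Check consistency of the abstracted constraints
# (mathematically, R interpreted as the identity relation satisfies them).
print(s.check())
```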

Publisher's Version
Historia: Refuting Callback Reachability with Message-History Logics
Shawn Meier, Sergio Mover, Gowtham Kaki, and Bor-Yuh Evan Chang
(University of Colorado Boulder, USA; École Polytechnique - CNRS - Institut Polytechnique de Paris, France; Amazon, USA)
This paper considers the callback reachability problem --- determining if a callback can be called by an event-driven framework in an unexpected state. Event-driven programming frameworks are pervasive for creating user-interactive applications (apps) on just about every modern platform. Control flow between callbacks is determined by the framework and largely opaque to the programmer. This opacity of the callback control flow not only causes difficulty for the programmer but also makes it hard to develop static analyses. Previous static analysis techniques address this opacity either by assuming an arbitrary framework implementation or by attempting to eagerly specify all possible callback control flow; the former is too coarse to prove properties requiring callback-ordering constraints, while the latter is too burdensome and tricky to get right. Instead, we present a middle way where the callback control flow can be gradually refined in a targeted manner to prove assertions of interest. The key insight enabling this middle way is to reason about the history of method invocations at the boundary between app and framework code --- enabling a decoupling of the specification of callback control flow from the analysis of app code. We call such sequences of boundary-method invocations message histories and develop message-history logics to reason about them. In particular, we define the notion of an application-only transition system with boundary transitions, a message-history program logic for programs with such transitions, and a temporal specification logic for capturing callback control flow in a targeted and compositional manner. Then, to utilize the logics in a goal-directed verifier, we define a way to combine, after the fact, an assertion about message histories with a specification of callback control flow. We implemented a prototype message-history-based verifier called Historia and provide evidence that our approach is uniquely capable of distinguishing between buggy and fixed versions on challenging examples drawn from real-world issues and that our targeted specification approach enables proving the absence of multi-callback bug patterns in real-world open-source Android apps.
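
As a toy illustration of reasoning over boundary-message histories (not Historia's message-history logics), an execution can be viewed as a sequence of callbacks and callins over which ordering properties are checked; the callback and callin names below are hypothetical.

```python
# Toy sketch: represent an execution as the sequence of callbacks/callins
# crossing the app-framework boundary and check a simple ordering property.
def violates_ordering(history, trigger, required_before):
    """True if `trigger` occurs without `required_before` occurring earlier."""
    seen = set()
    for msg in history:
        if msg == trigger and required_before not in seen:
            return True
        seen.add(msg)
    return False

# Property: onClick must never fire before the enabling callin setEnabled(true).
ok_run  = ["onCreate", "setEnabled(true)", "onClick"]
bad_run = ["onCreate", "onClick"]
assert not violates_ordering(ok_run, "onClick", "setEnabled(true)")
assert violates_ordering(bad_run, "onClick", "setEnabled(true)")
```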

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
P4R-Type: A Verified API for P4 Control Plane Programs
Jens Kanstrup Larsen, Roberto Guanciale, Philipp Haller, and Alceste Scalas
(DTU, Denmark; KTH Royal Institute of Technology, Sweden)
Software-Defined Networking (SDN) significantly simplifies programming, reconfiguring, and optimizing network devices, such as switches and routers. The de facto standard for programming SDN devices is the P4 language. However, the flexibility and power of P4, and of SDN more generally, give rise to important risks. As a number of incidents at major cloud providers have shown, errors in SDN programs can compromise the availability of networks, leaving them in a non-functional state. This paper focuses on errors in control-plane programs that interact with P4-enabled network devices via the standardized P4Runtime API. It is easy for clients of the P4Runtime API to make mistakes that may lead to catastrophic failures, despite the use of Google’s Protocol Buffers as an interface definition language.
This paper proposes P4R-Type, a novel verified P4Runtime API for Scala that performs static checks for P4 control plane operations, ruling out mismatches between P4 tables, allowed actions, and action parameters. As a formal foundation of P4R-Type, we present the FP4R calculus and its typing system, which ensure that well-typed programs never get stuck by issuing invalid P4Runtime operations. We evaluate the safety and flexibility of P4R-Type with three case studies. To the best of our knowledge, this is the first work that formalises P4Runtime control-plane applications and provides a typing discipline ensuring the correctness of P4Runtime operations.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Synthesizing Precise Static Analyzers for Automatic Differentiation
Jacob Laurel, Siyuan Brant Qian, Gagandeep Singh, and Sasa Misailovic
(University of Illinois at Urbana-Champaign, USA; Zhejiang University, China; VMware Research, USA)
We present Pasado, a technique for synthesizing precise static analyzers for Automatic Differentiation. Our technique allows one to automatically construct a static analyzer specialized for the Chain Rule, Product Rule, and Quotient Rule computations of Automatic Differentiation in a way that abstracts all of the nonlinear operations of each respective rule simultaneously. By directly synthesizing an abstract transformer for the composite expressions of these three most common rules of AD, we obtain significant precision improvements compared to prior work, which composes standard abstract transformers together suboptimally. We prove our synthesized static analyzers sound and additionally demonstrate the generality of our approach by instantiating these AD static analyzers with different nonlinear functions, different abstract domains (both intervals and zonotopes), and both forward-mode and reverse-mode AD.
We evaluate Pasado on multiple case studies, namely soundly computing bounds on a neural network’s local Lipschitz constant, soundly bounding the sensitivities of financial models, certifying monotonicity, and lastly, bounding sensitivities of the solutions of differential equations from climate science and chemistry for verified ranges of initial conditions and parameters. The local Lipschitz constants computed by Pasado on our largest CNN are up to 2750× more precise compared to the existing state-of-the-art zonotope analysis. The bounds obtained on the sensitivities of the climate, chemical, and financial differential equation solutions are between 1.31× and 2.81× more precise (on average) compared to a state-of-the-art zonotope analysis.
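
To see what an interval-domain AD analysis computes, the sketch below bounds the derivative of sigmoid(g(x)) via the chain rule using interval arithmetic. It illustrates the compositional baseline that Pasado improves on, not Pasado's synthesized transformers; the bounds on g and g' are made up for the example.

```python
# Illustrative interval sketch of forward-mode AD for the chain rule
# d/dx sigmoid(g(x)) = sigmoid'(g(x)) * g'(x), composing two interval factors.
import math

def imul(a, b):
    """Interval multiplication [a] * [b]."""
    prods = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(prods), max(prods))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv_interval(g_lo, g_hi):
    """Sound bounds on sigmoid'(y) = sigmoid(y)*(1 - sigmoid(y)) for y in [g_lo, g_hi].
    sigmoid' is unimodal with maximum 0.25 at y = 0, so extrema lie at the
    endpoints, except that the maximum is 0.25 when 0 is inside the interval."""
    ends = [sigmoid(g_lo) * (1 - sigmoid(g_lo)),
            sigmoid(g_hi) * (1 - sigmoid(g_hi))]
    hi = 0.25 if g_lo <= 0.0 <= g_hi else max(ends)
    return (min(ends), hi)

# Assumed bounds on g(x) and g'(x) over the input range:
g_bounds, gprime_bounds = (-1.0, 2.0), (0.5, 1.5)
# Compositional chain rule: multiply the two interval factors.
print(imul(sigmoid_deriv_interval(*g_bounds), gprime_bounds))
```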

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Exploiting the Sparseness of Control-Flow and Call Graphs for Efficient and On-Demand Algebraic Program Analysis
Giovanna Kobus Conrado, Amir Kafshdar Goharshady, Kerim Kochekov, Yun Chen Tsai, and Ahmed Khaled Zaher
(Hong Kong University of Science and Technology, Hong Kong)
Algebraic Program Analysis (APA) is a ubiquitous framework that has been employed as a unifying model for various problems in data-flow analysis, termination analysis, invariant generation, predicate abstraction, and a wide variety of other standard static analysis tasks. APA models program summaries as elements of a regular algebra A. Suppose that a summary in A is assigned to every transition of the program and that we aim to compute the effect of running the program starting at line s and ending at line t. APA first computes a regular expression capturing all program paths of interest: in the intraprocedural case, this expression models all paths from s to t, whereas in the interprocedural case it models all interprocedurally valid paths, i.e. paths that go back to the right caller function when a callee returns. This regular expression is then interpreted over the algebra A to obtain the desired result. Suppose the program has n lines of code and each evaluation of an operation in the regular algebra takes O(k) time. It is well-known that a single APA query, or a set of queries with the same starting point s, can be answered in O(n · α(n) · k), where α is the inverse Ackermann function. In this work, we consider an on-demand setting for APA: the program is given in the input and can be preprocessed. The analysis then has to answer a large number of online queries, each providing a pair (s, t) of program lines which are the start and end points of the query, respectively. The goal is to avoid the significant cost of running a fresh APA instance for each query. Our main contribution is a series of algorithms that, after a lightweight preprocessing of O(n · lg n · k), answer each query in O(k) time. In other words, our preprocessing has almost the same asymptotic complexity as a single APA query, except for a sub-logarithmic factor, and then every future query is answered instantly, i.e. by a constant number of operations in the algebra. We achieve this remarkable speedup by relying on certain structural sparsity properties of control-flow and call graphs (CFGs and CGs). Specifically, we exploit the fact that control-flow graphs of real-world programs have a tree-like structure, with bounded treewidth and nesting depth, and that their call graphs have small treedepth in comparison to the size of the program. Finally, we provide experimental results demonstrating the effectiveness and efficiency of our approach and showing that it beats the runtime of classical APA by several orders of magnitude.
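
The interpretation step of APA can be pictured with a tiny regular algebra: summaries support sequencing, choice, and iteration, and a path expression is evaluated over them. The toy algebra below (sets of saturating counter increments) is purely illustrative and unrelated to the paper's sparsity-based preprocessing.

```python
# Generic illustration of interpreting a path expression over a regular
# algebra. The summary of a path is the set of possible increments to a
# counter, saturated at 3 once it grows that large.
from itertools import product

def seq(a, b):      # sequencing: add effects, saturating at 3
    return frozenset(min(x + y, 3) for x, y in product(a, b))

def choice(a, b):   # nondeterministic choice: union of effects
    return a | b

def star(a):        # iteration: least fixpoint of x = {0} | seq(x, a)
    x = frozenset({0})
    while True:
        nxt = choice(x, seq(x, a))
        if nxt == x:
            return x
        x = nxt

# Path expression (inc + skip)* ; inc  over per-edge summaries:
inc, skip = frozenset({1}), frozenset({0})
print(seq(star(choice(inc, skip)), inc))   # -> frozenset({1, 2, 3})
```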

Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Saggitarius: A DSL for Specifying Grammatical Domains
Anders Miltner, Devon Loehr, Arnold Mong, Kathleen Fisher, and David Walker
(Simon Fraser University, Canada; Princeton University, USA; Tufts University, USA)
Common data types like dates, addresses, phone numbers, and tables can have multiple textual representations, and many heavily-used languages, such as SQL, come in several dialects. These variations can cause data to be misinterpreted, leading to silent data corruption, failure of data processing systems, or even security vulnerabilities. Saggitarius is a new language and system designed to help programmers reason about the format of data by describing grammatical domains---that is, sets of context-free grammars that describe the many possible representations of a datatype. We describe the design of Saggitarius via example and provide a relational semantics. We show how Saggitarius may be used to analyze a data set: given example data, it uses an algorithm based on semiring parsing and MaxSAT to infer which grammar in a given domain best matches that data. We evaluate the effectiveness of the algorithm on a benchmark suite of 110 example problems and demonstrate that our system typically returns a satisfying grammar within a few seconds from only a small number of examples. We also present a more extensive case study on using Saggitarius for CSV dialect detection. Despite being general-purpose, Saggitarius offers results comparable to hand-tuned, specialized tools; in the case of CSV, it infers grammars for 84% of benchmarks within 60 seconds and has accuracy comparable to custom-built dialect detection tools.
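
As a much simplified stand-in for choosing the best grammar in a domain (the actual system uses semiring parsing and MaxSAT, not brute-force scoring), one can enumerate candidate CSV dialects and score each against example data; the scoring heuristic below is invented for illustration.

```python
# Greatly simplified stand-in for grammar-domain inference: enumerate
# candidate CSV dialects and pick the one that best explains example rows.
import csv, io

EXAMPLES = "a;b;c\n1;2;3\n4;5;6\n"

def score(delimiter: str, data: str) -> int:
    """Reward dialects that yield a consistent, multi-column table."""
    rows = list(csv.reader(io.StringIO(data), delimiter=delimiter))
    widths = {len(r) for r in rows}
    if len(widths) != 1:          # ragged table: poor fit
        return 0
    return widths.pop()           # prefer dialects that actually split fields

candidates = [",", ";", "\t", "|"]
best = max(candidates, key=lambda d: score(d, EXAMPLES))
print(best)   # -> ';'
```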

Publisher's Version
A Deductive Verification Infrastructure for Probabilistic Programs
Philipp Schröer, Kevin Batz, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja
(RWTH Aachen University, Germany; Saarland University, Germany; University College London, UK; DTU, Denmark)
This paper presents a quantitative program verification infrastructure for discrete probabilistic programs. Our infrastructure can be viewed as the probabilistic analogue of Boogie: its central components are an intermediate verification language (IVL) together with a real-valued logic. Our IVL provides a programming-language-style notation for expressing verification conditions whose validity implies the correctness of the program under investigation. As our focus is on verifying quantitative properties such as bounds on expected outcomes, expected run-times, or termination probabilities, off-the-shelf IVLs based on Boolean first-order logic do not suffice. Instead, a paradigm shift from the standard Boolean to a real-valued domain is required.
Our IVL features quantitative generalizations of standard verification constructs such as assume- and assert-statements. Verification conditions are generated by a weakest-precondition-style semantics, based on our real-valued logic. We show that our verification infrastructure supports natural encodings of numerous verification techniques from the literature. With our SMT-based implementation, we automatically verify a variety of benchmarks. To the best of our knowledge, this establishes the first deductive verification infrastructure for expectation-based reasoning about probabilistic programs.
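
The shift from Boolean to real-valued reasoning can be illustrated by a textbook weakest-preexpectation calculation: for a probabilistic choice between c1 (taken with probability p) and c2, the preexpectation of a postexpectation E is the convex combination p · wp(c1, E) + (1 − p) · wp(c2, E). The toy transformer below sketches this standard semantics for a made-up mini-language; it is not the paper's IVL or its SMT encoding.

```python
# Minimal expectation-transformer sketch for a toy probabilistic language.
# Programs are nested tuples; expectations are functions from states to reals.

def wp(prog, post):
    kind = prog[0]
    if kind == "skip":
        return post
    if kind == "assign":                      # ("assign", var, f_state_to_value)
        _, var, f = prog
        return lambda s: post({**s, var: f(s)})
    if kind == "seq":                         # ("seq", c1, c2)
        _, c1, c2 = prog
        return wp(c1, wp(c2, post))
    if kind == "choice":                      # ("choice", p, c1, c2)
        _, p, c1, c2 = prog
        return lambda s: p * wp(c1, post)(s) + (1 - p) * wp(c2, post)(s)
    raise ValueError(kind)

# Expected value of x after: with probability 0.5 do x := x + 1, else skip.
prog = ("choice", 0.5, ("assign", "x", lambda s: s["x"] + 1), ("skip",))
expected_x = wp(prog, lambda s: s["x"])
print(expected_x({"x": 3}))   # -> 3.5
```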

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Greedy Implicit Bounded Quantification
Chen Cui, Shengyi Jiang, and Bruno C. d. S. Oliveira
(University of Hong Kong, China)
Mainstream object-oriented programming languages such as Java, Scala, C#, or TypeScript have polymorphic type systems with subtyping and bounded quantification. Bounded quantification, despite being a pervasive and widely used feature, has attracted little research work on type-inference algorithms to support it. A notable exception is local type inference, which is the basis of most current implementations of type inference for mainstream languages. However, support for bounded quantification in local type inference has important restrictions, and its non-algorithmic specification is complex.
In this paper, we present a variant of kernel F<:, the canonical calculus with bounded quantification, with implicit polymorphism. Our variant, called Fb, comes with a declarative and an algorithmic formulation of the type system. The declarative type system is based on previous work on bidirectional typing for predicative higher-rank polymorphism and a greedy approach to implicit instantiation. This allows for a clear declarative specification where programs require few type annotations and enables implicit polymorphism where applications omit type parameters. As in local type inference, explicit type applications are also available in Fb if desired. This is useful for dealing with impredicative instantiations, which would otherwise not be allowed in Fb. Owing to the support for impredicative instantiations, we obtain a completeness result with respect to kernel F<:, showing that all well-typed kernel F<: programs type-check in Fb. The corresponding algorithmic version of the type system is shown to be sound, complete, and decidable. All the results have been mechanically formalized in the Abella theorem prover.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Solving String Constraints with Lengths by Stabilization
Yu-Fang Chen, David Chocholatý, Vojtěch Havlena, Lukáš Holík, Ondřej Lengál, and Juraj Síč
(Academia Sinica, Taiwan; Brno University of Technology, Czechia)
We present a new algorithm for solving string constraints. The algorithm builds upon a recent method for solving word equations and regular constraints that interprets string variables as languages rather than strings and, consequently, mitigates the combinatorial explosion that plagues other approaches. We extend the approach to handle linear integer arithmetic length constraints by combination with a known principle of equation alignment and splitting, and by extension to other common types of string constraints, yielding a fully-fledged string solver. The ability of the framework to handle unrestricted disequalities even extends one of the largest decidable classes of string constraints, the chain-free fragment. We integrate our algorithm into a DPLL-based SMT solver. The performance of our implementation is competitive and even significantly better than state-of-the-art string solvers on several established benchmarks obtained from applications in verification of string programs.

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Type-Safe Dynamic Placement with First-Class Placed Values
George Zakhour, Pascal Weisenburger, and Guido Salvaneschi
(University of St. Gallen, Switzerland)
Several distributed programming language solutions have been proposed to reason about the placement of data, computations, and peer interactions. Such solutions include, among others, multitier programming, choreographic programming, and various approaches based on behavioral types. These methods statically ensure safety properties thanks to complete knowledge of the placement of data and computation at compile time. In distributed systems, however, dynamic placement of computation and data is crucial to enable performance optimizations, e.g., driven by data locality or by a number of other constraints such as security and compliance regarding data storage location. Unfortunately, in existing programming languages, dynamic placement conflicts with static reasoning about distributed programs: the flexibility required by dynamic placement hinders statically tracking the location of data and computation.
In this paper we present Dyno, a programming language that enables static reasoning about dynamic placement. Dyno features a type system where values are explicitly placed, but in contrast to existing approaches, placed values are also first class, ensuring that they can be passed around and referred to from other locations. Building on top of this mechanism, we provide a novel interpretation of dynamic placement as unions of placement types. We formalize type soundness, placement correctness (as part of type soundness), and architecture conformance. In case studies and benchmarks, our evaluation shows that Dyno enables static reasoning about programs even in the presence of dynamic placement, ensuring type safety and placement correctness at negligible performance cost. We reimplement an Android app of ∼7K LOC in Dyno, find a bug in the existing implementation, and show that the app's approach is representative of a common way to implement dynamic placement found in over 100 apps in a large open-source app store.
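
A rough dynamic analogue of first-class placed values can be written in plain Python, keeping in mind that Dyno enforces these properties statically through its type system; the Placed, on, and migrate names below are invented for illustration.

```python
# Dynamic Python mimicry of "placed values" (Dyno enforces this statically).
# A placed value carries the peer it lives on; using it from another peer
# requires an explicit transfer.
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass(frozen=True)
class Placed(Generic[T]):
    value: T
    peer: str               # e.g. "server" or "client"

def on(peer: str, placed: Placed[T]) -> T:
    """Access a placed value from `peer`; fails if it lives elsewhere."""
    if placed.peer != peer:
        raise RuntimeError(f"value placed on {placed.peer}, accessed from {peer}")
    return placed.value

def migrate(placed: Placed[T], to: str) -> Placed[T]:
    """Explicit transfer, e.g. for data-locality-driven optimization."""
    return Placed(placed.value, to)

cache = Placed({"k": 42}, "server")
print(on("server", cache)["k"])                       # ok
print(on("client", migrate(cache, "client"))["k"])    # ok after migration
```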

Publisher's Version Published Artifact Artifacts Available Artifacts Reusable
Explainable Program Synthesis by Localizing Specifications
Amirmohammad Nazari, Yifei Huang, Roopsha Samanta, Arjun Radhakrishna, and Mukund Raghothaman
(University of Southern California, USA; Purdue University, USA; Microsoft, USA)
The traditional formulation of the program synthesis problem is to find a program that meets a logical correctness specification. When synthesis is successful, there is a guarantee that the implementation satisfies the specification. Unfortunately, synthesis engines are typically monolithic algorithms, and obscure the correspondence between the specification, implementation and user intent. In contrast, humans often include comments in their code to guide future developers towards the purpose and design of different parts of the codebase. In this paper, we introduce subspecifications as a mechanism to augment the synthesized implementation with explanatory notes of this form. In this model, the user may ask for explanations of different parts of the implementation; the subspecification generated in response is a logical formula that describes the constraints induced on that subexpression by the global specification and surrounding implementation. We develop algorithms to construct and verify subspecifications and investigate their theoretical properties. We perform an experimental evaluation of the subspecification generation procedure, and measure its effectiveness and running time. Finally, we conduct a user study to determine whether subspecifications are useful: we find that subspecifications greatly aid in understanding the global specification, in identifying alternative implementations, and in debugging faulty implementations.
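
A toy instance of localizing a specification: if the global specification requires out = 2x + 2 and the surrounding implementation computes out = 2·h, then the constraint induced on the subexpression h is h = x + 1, which an SMT check can confirm. The encoding below is illustrative only (the spec, hole, and formulas are made up), not the paper's subspecification algorithm.

```python
# Toy illustration of a "subspecification":
#   global spec   : out == 2*x + 2
#   implementation: out = 2 * h    with h the subexpression of interest
#   candidate subspecification for h: h == x + 1
from z3 import Ints, Not, And, Solver, unsat

x, h = Ints('x h')
global_spec = (2 * h == 2 * x + 2)   # spec with the implementation plugged in
subspec     = (h == x + 1)           # candidate subspecification for the hole

s = Solver()
s.add(And(subspec, Not(global_spec)))   # look for a counterexample to adequacy
assert s.check() == unsat               # none exists: subspec implies the spec
```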

Publisher's Version Published Artifact Archive submitted (430 kB) Artifacts Available Artifacts Reusable
Perception Contracts for Safety of ML-Enabled Systems
Angello Astorga, Chiao Hsieh, P. Madhusudan, and Sayan Mitra
(University of Illinois at Urbana-Champaign, USA)
We introduce a novel notion of perception contracts to reason about the safety of controllers that interact with an environment using neural perception. Perception contracts capture errors in ground-truth estimations that preserve invariants when systems act upon them. We develop a theory of perception contracts and design symbolic learning algorithms for synthesizing them from a finite set of images. We implement our algorithms and evaluate synthesized perception contracts for two realistic vision-based control systems, a lane tracking system for an electric vehicle and an agricultural robot that follows crop rows. Our evaluation shows that our approach is effective in synthesizing perception contracts and generalizes well when evaluated over test images obtained during runtime monitoring of the systems.

Publisher's Version
Message Chains for Distributed System Verification
Federico Mora, Ankush Desai, Elizabeth Polgreen, and Sanjit A. Seshia
(University of California at Berkeley, USA; Amazon Web Services, USA; University of Edinburgh, UK)
Verification of asynchronous distributed programs is challenging due to the need to reason about numerous control paths resulting from the myriad interleavings of messages and failures. In this paper, we propose an automated bookkeeping method based on message chains. Message chains reveal structure in asynchronous distributed system executions and can help programmers verify their systems at the message-passing level of abstraction. To evaluate our contributions empirically, we build a verification prototype for the P programming language that integrates message chains. We use it to verify 16 benchmarks from related work, one new benchmark that exemplifies the kinds of systems our method focuses on, and two industrial benchmarks. We find that message chains are able to simplify existing proofs and that our prototype performs comparably to existing work in terms of runtime. We extend our work with support for specification mining and find that message chains provide enough structure to allow existing learning and program synthesis tools to automatically infer meaningful specifications using only execution examples.
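
A minimal sketch of the bookkeeping idea (not the P-based prototype): if each sent message records the message that was being handled when it was sent, chains can be recovered by following these causal links; the trace format and message names below are hypothetical.

```python
# Minimal sketch of recovering message chains from an execution trace.
# Each entry is (msg_id, cause_id): `cause_id` is the message being handled
# when `msg_id` was sent (None for externally triggered messages).
def message_chains(trace):
    children, roots = {}, []
    for msg, cause in trace:
        if cause is None:
            roots.append(msg)
        else:
            children.setdefault(cause, []).append(msg)

    def extend(chain):
        last = chain[-1]
        if last not in children:
            return [chain]
        return [c for nxt in children[last] for c in extend(chain + [nxt])]

    return [c for r in roots for c in extend([r])]

# A client request spawns a storage read and a reply; the reply spawns an ack.
trace = [("req", None), ("read", "req"), ("reply", "req"), ("ack", "reply")]
print(message_chains(trace))
# -> [['req', 'read'], ['req', 'reply', 'ack']]
```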

Publisher's Version
