Powered by
40th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2019),
June 22–26, 2019,
Phoenix, AZ, USA
Frontmatter
Message from the Chairs
Welcome to PLDI 2019, the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation! This year the conference takes place in Phoenix, Arizona, USA from June 22 to 26 as part of ACM's Federated Computing Research Conference (FCRC).
As the call for research papers indicated, PLDI is a premier forum in the field of programming languages and programming systems research, covering the areas of design, implementation, theory, applications, and performance, broadly construed. The conference annually publishes outstanding research that advances the state of the art and has the potential for long-lasting impact.
It is inclusive in scope: innovative system designs, thorough empirical work, well-motivated theoretical results, and new application areas are all welcome emphases at PLDI.
Concurrency I
Promising-ARM/RISC-V: A Simpler and Faster Operational Concurrency Model
Christopher Pulte, Jean Pichon-Pharabod,
Jeehoon Kang,
Sung-Hwan Lee, and
Chung-Kil Hur
(Cambridge University, UK; KAIST, South Korea; Seoul National University, South Korea)
For ARMv8 and RISC-V, there are concurrency models in two styles,
extensionally equivalent: axiomatic models, expressing the concurrency
semantics in terms of global properties of complete executions; and
operational models, that compute incrementally. The latter are in an
abstract microarchitectural style: they execute each instruction in
multiple steps, out-of-order and with explicit branch speculation.
This similarity to hardware implementations has been important in
developing the models and in establishing confidence, but involves
complexity that, for programming and model-checking, one would prefer
to avoid.
We present new more abstract operational models for ARMv8 and RISC-V,
and an exploration tool based on them. The models compute the allowed
concurrency behaviours incrementally based on thread-local conditions
and are significantly simpler than the existing operational models:
executing instructions in a single step and (with the exception of
early writes) in program order, and without branch speculation. We
prove the models equivalent to the existing ARMv8 and RISC-V axiomatic
models in Coq. The exploration tool is the first such tool for ARMv8
and RISC-V fast enough for exhaustively checking the concurrency
behaviour of a number of interesting examples. We demonstrate using
the tool for checking several standard concurrent datastructure and
lock implementations, and for interactively stepping through
model-allowed executions for debugging.
@InProceedings{PLDI19p1,
author = {Christopher Pulte and Jean Pichon-Pharabod and Jeehoon Kang and Sung-Hwan Lee and Chung-Kil Hur},
title = {Promising-ARM/RISC-V: A Simpler and Faster Operational Concurrency Model},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1--15},
doi = {10.1145/3314221.3314624},
year = {2019},
}
Publisher's Version
Artifacts Functional
Accelerating Sequential Consistency for Java with Speculative Compilation
Lun Liu,
Todd Millstein, and
Madanlal Musuvathi
(University of California at Los Angeles, USA; Microsoft Research, USA)
A memory consistency model (or simply a memory model) specifies the granularity and the order in which memory accesses by one thread become visible to other threads in the program. We previously proposed the volatile-by-default (VBD) memory model as a natural form of sequential consistency (SC) for Java. VBD is significantly stronger than the Java memory model (JMM) and incurs relatively modest overheads in a modified HotSpot JVM running on Intel x86 hardware. However, the x86 memory model is already quite close to SC. It is expected that the cost of VBD will be much higher on the other widely used hardware platform today, namely ARM, whose memory model is very weak.
In this paper, we quantify this expectation by building and evaluating a baseline volatile-by-default JVM for ARM called VBDA-HotSpot, using the same technique previously used for x86. Through this baseline we report, to the best of our knowledge, the first comprehensive study of the cost of providing language-level SC for a production compiler on ARM. VBDA-HotSpot indeed incurs a considerable performance penalty on ARM, with average overheads on the DaCapo benchmarks on two ARM servers of 57% and 73% respectively.
Motivated by these experimental results, we then present a novel speculative technique to optimize language-level SC. While several prior works have shown how to optimize SC in the context of an offline, whole-program compiler, to our knowledge this is the first optimization approach that is compatible with modern implementation technology, including dynamic class loading and just-in-time (JIT) compilation. The basic idea is to modify the JIT compiler to treat each object as thread-local initially, so accesses to its fields can be compiled without fences. If an object is ever accessed by a second thread, any speculatively compiled code for the object is removed, and future JITed code for the object will include the necessary fences in order to ensure SC. We demonstrate that this technique is effective, reducing the overhead of enforcing VBD by one-third on average, and additional experiments validate the thread-locality hypothesis that underlies the approach.
@InProceedings{PLDI19p16,
author = {Lun Liu and Todd Millstein and Madanlal Musuvathi},
title = {Accelerating Sequential Consistency for Java with Speculative Compilation},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {16--30},
doi = {10.1145/3314221.3314611},
year = {2019},
}
Publisher's Version
Info
Artifacts Functional
Renaissance: Benchmarking Suite for Parallel Applications on the JVM
Aleksandar Prokopec,
Andrea Rosà, David Leopoldseder, Gilles Duboscq,
Petr Tůma, Martin Studener, Lubomír Bulej, Yudi Zheng, Alex Villazón, Doug Simon,
Thomas Würthinger, and
Walter Binder
(Oracle Labs, Switzerland; USI Lugano, Switzerland; JKU Linz, Austria; Charles University in Prague, Czechia; Universidad Privada Boliviana, Bolivia)
Established benchmark suites for the Java Virtual Machine (JVM),
such as DaCapo, ScalaBench, and SPECjvm2008,
lack workloads that take advantage of the parallel programming abstractions
and concurrency primitives offered by the JVM and the Java Class Library.
However, such workloads are fundamental for understanding the way in which
modern applications and data-processing frameworks use the JVM's concurrency features,
and for validating new just-in-time (JIT) compiler optimizations
that enable more efficient execution of such workloads.
We present Renaissance, a new benchmark suite composed of
modern, real-world, concurrent, and object-oriented workloads
that exercise various concurrency primitives of the JVM.
We show that the use of concurrency primitives in these workloads
reveals optimization opportunities that were not visible with the existing workloads.
We use Renaissance to compare performance of two state-of-the-art,
production-quality JIT compilers (HotSpot C2 and Graal),
and show that the performance differences are more significant
than on existing suites such as DaCapo and SPECjvm2008.
We also use Renaissance to expose four new compiler optimizations,
and we analyze the behavior of several existing ones.
We use Renaissance to compare performance of two state-of-the-art,
production-quality JIT compilers (HotSpot C2 and Graal),
and show that the performance differences are more significant
than on existing suites such as DaCapo and SPECjvm2008.
We also use Renaissance to expose four new compiler optimizations,
and we analyze the behavior of several existing ones.
@InProceedings{PLDI19p31,
author = {Aleksandar Prokopec and Andrea Rosà and David Leopoldseder and Gilles Duboscq and Petr Tůma and Martin Studener and Lubomír Bulej and Yudi Zheng and Alex Villazón and Doug Simon and Thomas Würthinger and Walter Binder},
title = {Renaissance: Benchmarking Suite for Parallel Applications on the JVM},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {31--47},
doi = {10.1145/3314221.3314637},
year = {2019},
}
Publisher's Version
Artifacts Functional
Language Design I
LoCal: A Language for Programs Operating on Serialized Data
Michael Vollmer,
Chaitanya Koparkar,
Mike Rainey, Laith Sakka,
Milind Kulkarni, and Ryan R. Newton
(Indiana University, USA; Purdue University, USA)
In a typical data-processing program, the representation of data in memory is distinct from its representation in a serialized form on disk. The former has pointers and arbitrary, sparse layout, facilitating easy manipulation by a program, while the latter is packed contiguously, facilitating easy I/O. We propose a language, LoCal, to unify in-memory and serialized formats. LoCal extends a region calculus into a location calculus, employing a type system that tracks the byte-addressed layout of all heap values. We formalize LoCal and prove type safety, and show how LoCal programs can be inferred from unannotated source terms.
We transform the existing Gibbon compiler to use LoCal as an intermediate language, with the goal of achieving a balance between code speed and data compactness by introducing just enough indirection into heap layouts, preserving the asymptotic complexity of traditional representations, but working with mostly or completely serialized data. We show that our approach yields significant performance improvement over prior approaches to operating on packed data, without abandoning idiomatic programming with recursive functions.
@InProceedings{PLDI19p48,
author = {Michael Vollmer and Chaitanya Koparkar and Mike Rainey and Laith Sakka and Milind Kulkarni and Ryan R. Newton},
title = {LoCal: A Language for Programs Operating on Serialized Data},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {48--62},
doi = {10.1145/3314221.3314631},
year = {2019},
}
Publisher's Version
Artifacts Functional
Scenic: A Language for Scenario Specification and Scene Generation
Daniel J. Fremont, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, and
Sanjit A. Seshia
(University of California at Berkeley, USA)
We propose a new probabilistic programming language for the design and analysis of perception systems, especially those based on machine learning. Specifically, we consider the problems of training a perception system to handle rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs and sampling these to generate specialized training and test sets. More generally, such languages can be used for cyber-physical systems and robotics to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment is a scene, a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.
@InProceedings{PLDI19p63,
author = {Daniel J. Fremont and Tommaso Dreossi and Shromona Ghosh and Xiangyu Yue and Alberto L. Sangiovanni-Vincentelli and Sanjit A. Seshia},
title = {Scenic: A Language for Scenario Specification and Scene Generation},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {63--78},
doi = {10.1145/3314221.3314633},
year = {2019},
}
Publisher's Version
Info
Compiling KB-Sized Machine Learning Models to Tiny IoT Devices
Sridhar Gopinath, Nikhil Ghanathe,
Vivek Seshadri, and
Rahul Sharma
(Microsoft Research, India)
Recent advances in machine learning (ML) have produced KiloByte-size models that can directly run on constrained IoT devices. This approach avoids expensive communication between IoT devices and the cloud, thereby enabling energy-efficient real-time analytics. However, ML models are expressed typically in floating-point, and IoT hardware typically does not support floating-point. Therefore, running these models on IoT devices requires simulating IEEE-754 floating-point using software, which is very inefficient.
We present SeeDot, a domain-specific language to express ML inference algorithms and a compiler that compiles SeeDot programs to fixed-point code that can efficiently run on constrained IoT devices. We propose 1) a novel compilation strategy that reduces the search space for some key parameters used in the fixed-point code, and 2) new efficient implementations of expensive operations. SeeDot compiles state-of-the-art KB-sized models to various microcontrollers and low-end FPGAs. We show that SeeDot outperforms 1) software emulation of floating-point (Arduino), 2) high-bitwidth fixed-point (MATLAB), 3) post-training quantization (TensorFlow-Lite), and 4) floating- and fixed-point FPGA implementations generated using high-level synthesis tools.
@InProceedings{PLDI19p79,
author = {Sridhar Gopinath and Nikhil Ghanathe and Vivek Seshadri and Rahul Sharma},
title = {Compiling KB-Sized Machine Learning Models to Tiny IoT Devices},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {79--95},
doi = {10.1145/3314221.3314597},
year = {2019},
}
Publisher's Version
Concurrency II
Model Checking for Weakly Consistent Libraries
Michalis Kokologiannakis, Azalea Raad, and
Viktor Vafeiadis
(MPI-SWS, Germany)
We present GenMC, a model checking algorithm for concurrent programs
that is parametric in the choice of memory model
and can be used for verifying clients of concurrent libraries.
Subject to a few basic conditions about the memory model,
our algorithm is sound, complete and optimal,
in that it explores each consistent execution of the program according to the model exactly once,
and does not explore inconsistent executions or embark on futile exploration paths.
We implement GenMC as a tool for verifying C programs.
Despite the generality of the algorithm,
its performance is comparable to the state-of-art
specialized model checkers for specific memory models,
and in certain cases exponentially faster thanks to its coarse partitioning of executions.
@InProceedings{PLDI19p96,
author = {Michalis Kokologiannakis and Azalea Raad and Viktor Vafeiadis},
title = {Model Checking for Weakly Consistent Libraries},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {96--110},
doi = {10.1145/3314221.3314609},
year = {2019},
}
Publisher's Version
Artifacts Functional
Towards Certified Separate Compilation for Concurrent Programs
Hanru Jiang, Hongjin Liang, Siyang Xiao, Junpeng Zha, and Xinyu Feng
(University of Science and Technology of China, China; Nanjing University, China)
Certified separate compilation is important for establishing end-to-end guarantees for certified systems consisting of multiple program modules. There has been much work building certified compilers for sequential programs. In this paper, we propose a language-independent framework consisting of the key semantics components and lemmas that bridge the verification gap between the compilers for sequential programs and those for (race-free) concurrent programs, so that the existing verification work for the former can be reused. One of the key contributions of the framework is a novel footprint-preserving compositional simulation as the compilation correctness criterion. The framework also provides a new mechanism to support confined benign races which are usually found in efficient implementations of synchronization primitives.
With our framework, we develop CASCompCert, which extends CompCert for certified separate compilation of race-free concurrent Clight programs. It also allows linking of concurrent Clight modules with x86-TSO implementations of synchronization primitives containing benign races. All our work has been implemented in the Coq proof assistant.
@InProceedings{PLDI19p111,
author = {Hanru Jiang and Hongjin Liang and Siyang Xiao and Junpeng Zha and Xinyu Feng},
title = {Towards Certified Separate Compilation for Concurrent Programs},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {111--125},
doi = {10.1145/3314221.3314595},
year = {2019},
}
Publisher's Version
Artifacts Functional
Robustness against Release/Acquire Semantics
Ori Lahav and
Roy Margalit
(Tel Aviv University, Israel)
We present an algorithm for automatically checking robustness of concurrent programs against C/C++11 release/acquire semantics, namely verifying that all program behaviors under release/acquire are allowed by sequential consistency. Our approach reduces robustness verification to a reachability problem under (instrumented) sequential consistency. We have implemented our algorithm in a prototype tool called Rocker and applied it to several challenging concurrent algorithms. To the best of our knowledge, this is the first precise method for verifying robustness against a high-level programming language weak memory semantics.
@InProceedings{PLDI19p126,
author = {Ori Lahav and Roy Margalit},
title = {Robustness against Release/Acquire Semantics},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {126--141},
doi = {10.1145/3314221.3314604},
year = {2019},
}
Publisher's Version
Artifacts Functional
Language Design II
CHET: An Optimizing Compiler for Fully-Homomorphic Neural-Network Inferencing
Roshan Dathathri,
Olli Saarikivi, Hao Chen, Kim Laine, Kristin Lauter,
Saeed Maleki,
Madanlal Musuvathi, and Todd Mytkowicz
(University of Texas at Austin, USA; Microsoft Research, USA)
Fully Homomorphic Encryption (FHE) refers to a set of encryption schemes
that allow computations on encrypted data without
requiring a secret key. Recent cryptographic advances have pushed FHE
into the realm of practical applications. However, programming these
applications remains a huge challenge, as it requires
cryptographic domain expertise to ensure correctness, security, and
performance.
CHET is a domain-specific optimizing compiler designed to make the task of
programming FHE applications easier. Motivated by the need to perform
neural network inference on encrypted medical and financial data, CHET
supports a domain-specific language for specifying tensor circuits. It automates many of
the laborious and error prone tasks of encoding such circuits
homomorphically, including encryption parameter selection to guarantee
security and accuracy of the computation, determining efficient tensor
layouts, and performing scheme-specific optimizations.
Our evaluation on a collection of popular neural networks shows that
CHET generates homomorphic circuits that outperform expert-tuned
circuits and makes it easy to switch across different encryption
schemes. We demonstrate its scalability by evaluating it on a version of
SqueezeNet, which to the best of our knowledge, is the deepest neural
network to be evaluated homomorphically.
@InProceedings{PLDI19p142,
author = {Roshan Dathathri and Olli Saarikivi and Hao Chen and Kim Laine and Kristin Lauter and Saeed Maleki and Madanlal Musuvathi and Todd Mytkowicz},
title = {CHET: An Optimizing Compiler for Fully-Homomorphic Neural-Network Inferencing},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {142--156},
doi = {10.1145/3314221.3314628},
year = {2019},
}
Publisher's Version
Usuba: High-Throughput and Constant-Time Ciphers, by Construction
Darius Mercadier and Pierre-Évariste Dagand
(Sorbonne University, France; CNRS, France; Inria, France; LIP6, France)
Cryptographic primitives are subject to diverging imperatives. Functional correctness and auditability pushes for the use of a high-level programming language. Performance and the threat of timing attacks push for using no more abstract than an assembler to exploit (or avoid!) the micro-architectural features of a given machine. We believe that a suitable programming language can reconcile both views and actually improve on the state of the art of both. Usuba is an opinionated dataflow programming language in which block ciphers become so simple as to be “obviously correct” and whose types document and enforce valid parallelization strategies at the granularity of individual bits. Its optimizing compiler, Usubac, produces high-throughput, constant-time implementations performing on par with hand-tuned reference implementations. The cornerstone of our approach is a systematization and generalization of bitslicing, an implementation trick frequently used by cryptographers.
@InProceedings{PLDI19p157,
author = {Darius Mercadier and Pierre-Évariste Dagand},
title = {Usuba: High-Throughput and Constant-Time Ciphers, by Construction},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {157--173},
doi = {10.1145/3314221.3314636},
year = {2019},
}
Publisher's Version
Artifacts Functional
FaCT: A DSL for Timing-Sensitive Computation
Sunjay Cauligi, Gary Soeller, Brian Johannesmeyer, Fraser Brown, Riad S. Wahby, John Renner,
Benjamin Grégoire, Gilles Barthe,
Ranjit Jhala, and
Deian Stefan
(University of California at San Diego, USA; Stanford University, USA; Inria, France; MPI for Security and Privacy, Germany; IMDEA Software Institute, Spain)
Real-world cryptographic code is often written in a subset of C intended to execute in constant-time, thereby avoiding timing side channel vulnerabilities. This C subset eschews structured programming as we know it: if-statements, looping constructs, and procedural abstractions can leak timing information when handling sensitive data. The resulting obfuscation has led to subtle bugs, even in widely-used high-profile libraries like OpenSSL.
To address the challenge of writing constant-time cryptographic code, we present FaCT, a crypto DSL that provides high-level but safe language constructs. The FaCT compiler uses a secrecy type system to automatically transform potentially timing-sensitive high-level code into low-level, constant-time LLVM bitcode. We develop the language and type system, formalize the constant-time transformation, and present an empirical evaluation that uses FaCT to implement core crypto routines from several open-source projects including OpenSSL, libsodium, and curve25519-donna. Our evaluation shows that FaCT’s design makes it possible to write readable, high-level cryptographic code, with efficient, constant-time behavior.
@InProceedings{PLDI19p174,
author = {Sunjay Cauligi and Gary Soeller and Brian Johannesmeyer and Fraser Brown and Riad S. Wahby and John Renner and Benjamin Grégoire and Gilles Barthe and Ranjit Jhala and Deian Stefan},
title = {FaCT: A DSL for Timing-Sensitive Computation},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {174--189},
doi = {10.1145/3314221.3314605},
year = {2019},
}
Publisher's Version
Info
Artifacts Functional
Probabilistic Programming
Scalable Verification of Probabilistic Networks
Steffen Smolka, Praveen Kumar, David M. Kahn,
Nate Foster,
Justin Hsu,
Dexter Kozen, and
Alexandra Silva
(Cornell University, USA; Carnegie Mellon University, USA; University of Wisconsin-Madison, USA; University College London, UK)
This paper presents McNetKAT, a scalable tool for verifying
probabilistic network programs. McNetKAT is based on a new semantics
for the guarded and history-free fragment of Probabilistic NetKAT in
terms of finite-state, absorbing Markov chains. This view allows the
semantics of all programs to be computed exactly, enabling
construction of an automatic verification tool. Domain-specific
optimizations and a parallelizing backend enable
McNetKAT to analyze networks with thousands of nodes,
automatically reasoning about general properties such as probabilistic
program equivalence and refinement, as well as networking properties
such as resilience to failures. We evaluate McNetKAT's scalability
using real-world topologies, compare its performance against
state-of-the-art tools, and develop an extended case study on a
recently proposed data center network design.
@InProceedings{PLDI19p190,
author = {Steffen Smolka and Praveen Kumar and David M. Kahn and Nate Foster and Justin Hsu and Dexter Kozen and Alexandra Silva},
title = {Scalable Verification of Probabilistic Networks},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {190--203},
doi = {10.1145/3314221.3314639},
year = {2019},
}
Publisher's Version
Artifacts Functional
Cost Analysis of Nondeterministic Probabilistic Programs
Peixin Wang, Hongfei Fu,
Amir Kafshdar Goharshady,
Krishnendu Chatterjee, Xudong Qin, and Wenjun Shi
(Shanghai Jiao Tong University, China; East China Normal University, China; IST Austria, Austria)
We consider the problem of expected cost analysis over nondeterministic probabilistic programs,
which aims at automated methods for analyzing the resource-usage of such programs.
Previous approaches for this problem could only handle nonnegative bounded costs.
However, in many scenarios, such as queuing networks or analysis of cryptocurrency protocols,
both positive and negative costs are necessary and the costs are unbounded as well.
In this work, we present a sound and efficient approach to obtain polynomial bounds on the
expected accumulated cost of nondeterministic probabilistic programs.
Our approach can handle (a) general positive and negative costs with bounded updates in
variables; and (b) nonnegative costs with general updates to variables.
We show that several natural examples which could not be
handled by previous approaches are captured in our framework.
Moreover, our approach leads to an efficient polynomial-time algorithm, while no
previous approach for cost analysis of probabilistic programs could guarantee polynomial runtime.
Finally, we show the effectiveness of our approach using experimental results on a variety of programs for which we efficiently synthesize tight resource-usage bounds.
@InProceedings{PLDI19p204,
author = {Peixin Wang and Hongfei Fu and Amir Kafshdar Goharshady and Krishnendu Chatterjee and Xudong Qin and Wenjun Shi},
title = {Cost Analysis of Nondeterministic Probabilistic Programs},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {204--220},
doi = {10.1145/3314221.3314581},
year = {2019},
}
Publisher's Version
Info
Artifacts Functional
Gen: A General-Purpose Probabilistic Programming System with Programmable Inference
Marco F. Cusumano-Towner, Feras A. Saad,
Alexander K. Lew, and
Vikash K. Mansinghka
(Massachusetts Institute of Technology, USA)
Although probabilistic programming is widely used for some restricted classes of statistical models, existing systems lack the flexibility and efficiency needed for practical use with more challenging models arising in fields like computer vision and robotics. This paper introduces Gen, a general-purpose probabilistic programming system that achieves modeling flexibility and inference efficiency via several novel language constructs: (i) the generative function interface for encapsulating probabilistic models; (ii) interoperable modeling languages that strike different flexibility/efficiency trade-offs; (iii) combinators that exploit common patterns of conditional independence; and (iv) an inference library that empowers users to implement efficient inference algorithms at a high level of abstraction. We show that Gen outperforms state-of-the-art probabilistic programming systems, sometimes by multiple orders of magnitude, on diverse problems including object tracking, estimating 3D body pose from a depth image, and inferring the structure of a time series.
@InProceedings{PLDI19p221,
author = {Marco F. Cusumano-Towner and Feras A. Saad and Alexander K. Lew and Vikash K. Mansinghka},
title = {Gen: A General-Purpose Probabilistic Programming System with Programmable Inference},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {221--236},
doi = {10.1145/3314221.3314642},
year = {2019},
}
Publisher's Version
Incremental Precision-Preserving Symbolic Inference for Probabilistic Programs
Jieyuan Zhang and
Jingling Xue
(UNSW, Australia)
We present ISymb an incremental symbolic inference framework for probabilistic programs in situations when some loop-manipulated array data, upon which their probabilistic models are conditioned, undergoes small changes. To tackle the path explosion challenge, ISymb is intra-procedurally path-sensitive except that it conducts a “meet-over-all-paths” analysis within an iteration of a loop (conditioned on some observed array data). By recomputing only the probability distributions for the paths affected, ISymb avoids expensive symbolic inference from scratch while also being precision-preserving. Our evaluation with a set of existing benchmarks shows that ISymb can lead to orders of magnitude performance improvements compared to its non-incremental counterpart (under small changes in observed array data).
@InProceedings{PLDI19p237,
author = {Jieyuan Zhang and Jingling Xue},
title = {Incremental Precision-Preserving Symbolic Inference for Probabilistic Programs},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {237--252},
doi = {10.1145/3314221.3314623},
year = {2019},
}
Publisher's Version
Artifacts Functional
Synthesis
Resource-Guided Program Synthesis
Tristan Knoth, Di Wang,
Nadia Polikarpova, and
Jan Hoffmann
(University of California at San Diego, USA; Carnegie Mellon University, USA)
This article presents resource-guided synthesis, a technique
for synthesizing recursive programs that satisfy both a functional specification and a symbolic resource bound. The technique is type-directed and rests upon a novel type system that
combines polymorphic refinement types with potential annotations of automatic amortized resource analysis. The type
system enables efficient constraint-based type checking and
can express precise refinement-based resource bounds. The
proof of type soundness shows that synthesized programs
are correct by construction. By tightly integrating program
exploration and type checking, the synthesizer can leverage
the user-provided resource bound to guide the search, eagerly rejecting incomplete programs that consume too many
resources. An implementation in the resource-guided synthesizer ReSyn is used to evaluate the technique on a range of recursive data structure manipulations. The experiments show
that ReSyn synthesizes programs that are asymptotically
more efficient than those generated by a resource-agnostic
synthesizer. Moreover, synthesis with ReSyn is faster than a
naive combination of synthesis and resource analysis. ReSyn
is also able to generate implementations that have a constant
resource consumption for fixed input sizes, which can be used
to mitigate side-channel attacks.
@InProceedings{PLDI19p253,
author = {Tristan Knoth and Di Wang and Nadia Polikarpova and Jan Hoffmann},
title = {Resource-Guided Program Synthesis},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {253--268},
doi = {10.1145/3314221.3314602},
year = {2019},
}
Publisher's Version
Artifacts Functional
Using Active Learning to Synthesize Models of Applications That Access Databases
Jiasi Shen and
Martin C. Rinard
(Massachusetts Institute of Technology, USA)
We present Konure, a new system that uses active learning to infer models of applications that access relational databases. Konure comprises a domain-specific language (each model is a program in this language) and associated inference algorithm that infers models of applications whose behavior can be expressed in this language. The inference algorithm generates inputs and database configurations, runs the application, then observes the resulting database traffic and outputs to progressively refine its current model hypothesis. Because the technique works with only externally observable inputs, outputs, and database configurations, it can infer the behavior of applications written in arbitrary languages using arbitrary coding styles (as long as the behavior of the application is expressible in the domain-specific language). Konure also implements a regenerator that produces a translated Python implementation of the application that systematically includes relevant security and error checks.
@InProceedings{PLDI19p269,
author = {Jiasi Shen and Martin C. Rinard},
title = {Using Active Learning to Synthesize Models of Applications That Access Databases},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {269--285},
doi = {10.1145/3314221.3314591},
year = {2019},
}
Publisher's Version
Synthesizing Database Programs for Schema Refactoring
Yuepeng Wang, James Dong, Rushi Shah, and
Isil Dillig
(University of Texas at Austin, USA)
Many programs that interact with a database need to undergo schema refactoring several times during their life cycle. Since this process typically requires making significant changes to the program's implementation, schema refactoring is often non-trivial and error-prone. Motivated by this problem, we propose a new technique for automatically synthesizing a new version of a database program given its original version and the source and target schemas. Our method does not require manual user guidance and ensures that the synthesized program is equivalent to the original one. Furthermore, our method is quite efficient and can synthesize new versions of database programs (containing up to 263 functions) that are extracted from real-world web applications with an average synthesis time of 69.4 seconds.
@InProceedings{PLDI19p286,
author = {Yuepeng Wang and James Dong and Rushi Shah and Isil Dillig},
title = {Synthesizing Database Programs for Schema Refactoring},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {286--300},
doi = {10.1145/3314221.3314588},
year = {2019},
}
Publisher's Version
Synthesis and Machine Learning for Heterogeneous Extraction
Arun Iyer, Manohar Jonnalagedda,
Suresh Parthasarathy,
Arjun Radhakrishna, and
Sriram K. Rajamani
(Microsoft Research, India; Inpher, Switzerland; Microsoft, USA)
We present a way to combine techniques from the program synthesis and machine learning communities to extract structured information from heterogeneous data. Such problems arise in several situations such as extracting attributes from web pages, machine-generated emails, or from data obtained from multiple sources. Our goal is to extract a set of structured attributes from such data.
We use machine learning models ("ML models") such as conditional random fields to get an initial labeling of potential attribute values. However, such models are typically not interpretable, and the noise produced by such models is hard to manage or debug. We use (noisy) labels produced by such ML models as inputs to program synthesis, and generate interpretable programs that cover the input space. We also employ type specifications (called "field constraints") to certify well-formedness of extracted values. Using synthesized programs and field constraints, we re-train the ML models with improved confidence on the labels. We then use these improved labels to re-synthesize a better set of programs. We iterate the process of re-synthesizing the programs and re-training the ML models, and find that such an iterative process improves the quality of the extraction process. This iterative approach, called HDEF, is novel, not only the in way it combines the ML models with program synthesis, but also in the way it adapts program synthesis to deal with noise and heterogeneity.
More broadly, our approach points to ways by which machine learning and programming language techniques can be combined to get the best of both worlds --- handling noise, transferring signals from one context to another using ML, producing interpretable programs using PL, and minimizing user intervention.
@InProceedings{PLDI19p301,
author = {Arun Iyer and Manohar Jonnalagedda and Suresh Parthasarathy and Arjun Radhakrishna and Sriram K. Rajamani},
title = {Synthesis and Machine Learning for Heterogeneous Extraction},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {301--315},
doi = {10.1145/3314221.3322485},
year = {2019},
}
Publisher's Version
Memory Management
AutoPersist: An Easy-To-Use Java NVM Framework Based on Reachability
Thomas Shull,
Jian Huang, and
Josep Torrellas
(University of Illinois at Urbana-Champaign, USA)
Byte-addressable, non-volatile memory (NVM) is emerging as a revolutionary memory technology that provides persistency, near-DRAM performance, and scalable capacity. To facilitate its use, many NVM programming models have been proposed. However, most models require programmers to explicitly specify the data structures or objects that should reside in NVM. Such requirement increases the burden on programmers, complicates software development, and introduces opportunities for correctness and performance bugs.
We believe that requiring programmers to identify the data structures that should reside in NVM is untenable. Instead, programmers should only be required to identify durable roots - the entry points to the persistent data structures at recovery time. The NVM programming framework should then automatically ensure that all the data structures reachable from these roots are in NVM, and stores to these data structures are persistently completed in an intuitive order.
To this end, we present a new NVM programming framework, named AutoPersist, that only requires programmers to identify durable roots. AutoPersist then persists all the data structures that can be reached from the durable roots in an automated and transparent manner. We implement AutoPersist as a thread-safe extension to the Java language and perform experiments with a variety of applications running on Intel Optane DC persistent memory. We demonstrate that AutoPersist requires minimal code modifications, and significantly outperforms expert-marked Java NVM applications.
@InProceedings{PLDI19p316,
author = {Thomas Shull and Jian Huang and Josep Torrellas},
title = {AutoPersist: An Easy-To-Use Java NVM Framework Based on Reachability},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {316--332},
doi = {10.1145/3314221.3314608},
year = {2019},
}
Publisher's Version
Mesh: Compacting Memory Management for C/C++ Applications
Bobby Powers, David Tench,
Emery D. Berger, and Andrew McGregor
(University of Massachusetts at Amherst, USA)
Programs written in C/C++ can suffer from serious memory fragmentation, leading to low utilization of memory, degraded performance, and application failure due to memory exhaustion. This paper introduces Mesh, a plug-in replacement for malloc that, for the first time, eliminates fragmentation in unmodified C/C++ applications. Mesh combines novel randomized algorithms with widely-supported virtual memory operations to provably reduce fragmentation, breaking the classical Robson bounds with high probability. Mesh generally matches the runtime performance of state-of-the-art memory allocators while reducing memory consumption; in particular, it reduces the memory of consumption of Firefox by 16% and Redis by 39%.
@InProceedings{PLDI19p333,
author = {Bobby Powers and David Tench and Emery D. Berger and Andrew McGregor},
title = {Mesh: Compacting Memory Management for C/C++ Applications},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {333--346},
doi = {10.1145/3314221.3314582},
year = {2019},
}
Publisher's Version
Artifacts Functional
Panthera: Holistic Memory Management for Big Data Processing over Hybrid Memories
Chenxi Wang,
Huimin Cui, Ting Cao, John Zigman, Haris Volos, Onur Mutlu, Fang Lv,
Xiaobing Feng, and
Guoqing Harry Xu
(University of California at Los Angeles, USA; Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; Microsoft Research, USA; University of Sydney, Australia; Google, USA; ETH Zurich, Switzerland)
Modern data-parallel systems such as Spark rely increasingly on in-memory computing that can significantly improve the efficiency of iterative algorithms. To process real-world datasets, modern data-parallel systems often require extremely large amounts of memory, which are both costly and energy-inefficient. Emerging non-volatile memory (NVM) technologies offers high capacity compared to DRAM and low energy compared to SSDs. Hence, NVMs have the potential to fundamentally change the dichotomy between DRAM and durable storage in Big Data processing. However, most Big Data applications are written in managed languages (e.g., Scala and Java) and executed on top of a managed runtime (e.g., the Java Virtual Machine) that already performs various dimensions of memory management. Supporting hybrid physical memories adds in a new dimension, creating unique challenges in data replacement and migration.
This paper proposes Panthera, a semantics-aware, fully automated memory management technique for Big Data processing over hybrid memories. Panthera analyzes user programs on a Big Data system to infer their coarse-grained access patterns, which are then passed down to the Panthera runtime for efficient data placement and migration. For Big Data applications, the coarse-grained data division is accurate enough to guide GC for data layout, which hardly incurs data monitoring and moving overhead. We have implemented Panthera in OpenJDK and Apache Spark. An extensive evaluation with various datasets and applications demonstrates that Panthera reduces energy by 32 – 52% at only a 1 – 9% execution time overhead.
@InProceedings{PLDI19p347,
author = {Chenxi Wang and Huimin Cui and Ting Cao and John Zigman and Haris Volos and Onur Mutlu and Fang Lv and Xiaobing Feng and Guoqing Harry Xu},
title = {Panthera: Holistic Memory Management for Big Data Processing over Hybrid Memories},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {347--362},
doi = {10.1145/3314221.3314650},
year = {2019},
}
Publisher's Version
Parsing
Lightweight Multi-Language Syntax Transformation with Parser Parser Combinators
Rijnard van Tonder and
Claire Le Goues
(Carnegie Mellon University, USA)
Automatically transforming programs is hard, yet critical for automated program refactoring, rewriting, and repair. Multi-language syntax transformation is especially hard due to heterogeneous representations in syntax, parse trees, and abstract syntax trees (ASTs). Our insight is that the problem can be decomposed such that (1) a common grammar expresses the central context-free language (CFL) properties shared by many contemporary languages and (2) open extension points in the grammar allow customizing syntax (e.g., for balanced delimiters) and hooks in smaller parsers to handle language-specific syntax (e.g., for comments). Our key contribution operationalizes this decomposition using a Parser Parser combinator (PPC), a mechanism that generates parsers for matching syntactic fragments in source code by parsing declarative user-supplied templates. This allows our approach to detach from translating input programs to any particular abstract syntax tree representation, and lifts syntax rewriting to a modularly-defined parsing problem. A notable effect is that we skirt the complexity and burden of defining additional translation layers between concrete user input templates and an underlying abstract syntax representation. We demonstrate that these ideas admit efficient and declarative rewrite templates across 12 languages, and validate effectiveness of our approach by producing correct and desirable lightweight transformations on popular real-world projects (over 50 syntactic changes produced by our approach have been merged into 40+). Our declarative rewrite patterns require an order of magnitude less code compared to analog implementations in existing, language-specific tools.
@InProceedings{PLDI19p363,
author = {Rijnard van Tonder and Claire Le Goues},
title = {Lightweight Multi-Language Syntax Transformation with Parser Parser Combinators},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {363--378},
doi = {10.1145/3314221.3314589},
year = {2019},
}
Publisher's Version
Artifacts Functional
A Typed, Algebraic Approach to Parsing
Neelakantan R. Krishnaswami and
Jeremy Yallop
(University of Cambridge, UK)
In this paper, we recall the definition of the context-free expressions (or µ-regular expressions), an algebraic presentation of the context-free languages. Then, we define a core type system for the context-free expressions which gives a compositional criterion for identifying those context-free expressions which can be parsed unambiguously by predictive algorithms in the style of recursive descent or LL(1).
Next, we show how these typed grammar expressions can be used to derive a parser combinator library which both guarantees linear-time parsing with no backtracking and single-token lookahead, and which respects the natural denotational semantics of context-free expressions. Finally, we show how to exploit the type information to write a staged version of this library, which produces dramatic increases in performance, even outperforming code generated by the standard parser generator tool ocamlyacc.
@InProceedings{PLDI19p379,
author = {Neelakantan R. Krishnaswami and Jeremy Yallop},
title = {A Typed, Algebraic Approach to Parsing},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {379--393},
doi = {10.1145/3314221.3314625},
year = {2019},
}
Publisher's Version
Artifacts Functional
Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands
Giovanni Campagna, Silei Xu, Mehrad Moradshahi, Richard Socher, and Monica S. Lam
(Stanford University, USA; Salesforce, USA)
To understand diverse natural language commands, virtual assistants today are trained with numerous labor-intensive, manually annotated sentences. This paper presents a methodology and the Genie toolkit that can handle new compound commands with significantly less manual effort.
We advocate formalizing the capability of virtual assistants with a Virtual Assistant Programming Language (VAPL) and using a neural semantic parser to translate natural language into VAPL code. Genie needs only a small realistic set of input sentences for validating the neural model. Developers write templates to synthesize data; Genie uses crowdsourced paraphrases and data augmentation, along with the synthesized data, to train a semantic parser.
We also propose design principles that make VAPL languages amenable to natural language translation. We apply these principles to revise ThingTalk, the language used by the Almond virtual assistant. We use Genie to build the first semantic parser that can support compound virtual assistants commands with unquoted free-form parameters. Genie achieves a 62% accuracy on realistic user inputs. We
demonstrate Genie’s generality by showing a 19% and 31% improvement over the previous state of the art on a music skill, aggregate functions, and access control.
@InProceedings{PLDI19p394,
author = {Giovanni Campagna and Silei Xu and Mehrad Moradshahi and Richard Socher and Monica S. Lam},
title = {Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {394--410},
doi = {10.1145/3314221.3314594},
year = {2019},
}
Publisher's Version
Artifacts Functional
Bug Finding and Testing I
Lazy Counterfactual Symbolic Execution
William T. Hallahan, Anton Xue, Maxwell Troy Bland,
Ranjit Jhala, and
Ruzica Piskac
(Yale University, USA; University of California at San Diego, USA)
We present counterfactual symbolic execution, a new approach that produces counterexamples that localize the causes of failure of static verification. First, we develop a notion of symbolic weak head normal form and use it to define lazy symbolic execution reduction rules for non-strict languages like Haskell. Second, we introduce counterfactual branching, a new method to identify places where verification fails due to imprecise specifications (as opposed to incorrect code). Third, we show how to use counterfactual symbolic execution to localize refinement type errors, by translating refinement types into assertions. We implement our approach in a new Haskell symbolic execution engine, G2, and evaluate it on a corpus of 7550 errors gathered from users of the LiquidHaskell refinement type system. We show that for 97.7% of these errors, G2 is able to quickly find counterexamples that show how the code or specifications must be fixed to enable verification.
@InProceedings{PLDI19p411,
author = {William T. Hallahan and Anton Xue and Maxwell Troy Bland and Ranjit Jhala and Ruzica Piskac},
title = {Lazy Counterfactual Symbolic Execution},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {411--424},
doi = {10.1145/3314221.3314618},
year = {2019},
}
Publisher's Version
Artifacts Functional
Sound Regular Expression Semantics for Dynamic Symbolic Execution of JavaScript
Blake Loring, Duncan Mitchell, and Johannes Kinder
(Royal Holloway University of London, UK; Bundeswehr University Munich, Germany)
Support for regular expressions in symbolic execution-based tools for test generation and bug finding is insufficient. Common aspects of mainstream regular expression engines, such as backreferences or greedy matching, are ignored or imprecisely approximated, leading to poor test coverage or missed bugs. In this paper, we present a model for the complete regular expression language of ECMAScript 2015 (ES6), which is sound for dynamic symbolic execution of the test and exec functions. We model regular expression operations using string constraints and classical regular expressions and use a refinement scheme to address the problem of matching precedence and greediness. We implemented our model in ExpoSE, a dynamic symbolic execution engine for JavaScript, and evaluated it on over 1,000 Node.js packages containing regular expressions, demonstrating that the strategy is effective and can significantly increase the number of successful regular expression queries and therefore boost coverage.
@InProceedings{PLDI19p425,
author = {Blake Loring and Duncan Mitchell and Johannes Kinder},
title = {Sound Regular Expression Semantics for Dynamic Symbolic Execution of JavaScript},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {425--438},
doi = {10.1145/3314221.3314645},
year = {2019},
}
Publisher's Version
Artifacts Functional
Effective Floating-Point Analysis via Weak-Distance Minimization
Zhoulai Fu and
Zhendong Su
(IT University of Copenhagen, Denmark; ETH Zurich, Switzerland)
This work studies the connection between the problem of analyzing floating-point code and that of function minimization. It formalizes this connection as a reduction theory, where the semantics of a floating-point program is measured as a generalized metric, called weak distance, which faithfully captures any given analysis objective. It is theoretically guaranteed that minimizing the weak distance (e.g., via mathematical optimization) solves the underlying problem. This reduction theory provides a general framework for analyzing numerical code. Two important separate analyses from the literature, branch-coverage-based testing and quantifier-free floating-point satisfiability, are its instances.
To further demonstrate our reduction theory’s generality and power, we develop three additional applications, including boundary value analysis, path reachability and overflow detection. Critically, these analyses do not rely on the modeling or abstraction of floating-point semantics; rather, they explore a program’s input space guided by runtime computation and minimization of the weak distance. This design, combined with the aforementioned theoretical guarantee, enables the application of the reduction theory to real-world floating-point code. In our experiments, our boundary value analysis is able to find all reachable boundary conditions of the GNU sin function, which is complex with several hundred lines of code, and our floating-point overflow detection detects a range of overflows and inconsistencies in the widely-used numerical library GSL, including two latent bugs that developers have already confirmed.
@InProceedings{PLDI19p439,
author = {Zhoulai Fu and Zhendong Su},
title = {Effective Floating-Point Analysis via Weak-Distance Minimization},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {439--452},
doi = {10.1145/3314221.3314632},
year = {2019},
}
Publisher's Version
Parallelism and Super Computing I
Huron: Hybrid False Sharing Detection and Repair
Tanvir Ahmed Khan, Yifan Zhao, Gilles Pokam, Barzan Mozafari, and
Baris Kasikci
(University of Michigan, USA; Intel, USA)
Writing efficient multithreaded code that can leverage the full parallelism of underlying hardware is difficult. A key impediment is insidious cache contention issues, such as false sharing. False sharing occurs when multiple threads from different cores access disjoint portions of the same cache line, causing it to go back and forth between the caches of different cores and leading to substantial slowdown.
Alas, existing techniques for detecting and repairing false sharing have limitations. On the one hand, in-house (i.e., offline) techniques are limited to situations where falsely-shared data can be determined statically, and are otherwise inaccurate. On the other hand, in-production (i.e., run-time) techniques incur considerable overhead, as they constantly monitor a program to detect false sharing. In-production repair techniques are also limited by the types of modifications they can perform on the fly, and are therefore less effective.
We present Huron, a hybrid in-house/in-production false sharing detection and repair system. Huron detects and repairs as much false sharing as it can in-house, and relies on its lightweight in-production mechanism for remaining cases. The key idea behind Huron's in-house false sharing repair is to group together data that is accessed by the same set of threads, to shift falsely-shared data to different cache lines. Huron's in-house repair technique can generalize to previously-unobserved inputs. Our evaluation shows that Huron can detect more false sharing bugs than all state-of-the-art techniques, and with a lower overhead. Huron improves runtime performance by 3.82× on average (up to 11×), which is 2.11-2.27× better than the state of the art.
@InProceedings{PLDI19p453,
author = {Tanvir Ahmed Khan and Yifan Zhao and Gilles Pokam and Barzan Mozafari and Baris Kasikci},
title = {Huron: Hybrid False Sharing Detection and Repair},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {453--468},
doi = {10.1145/3314221.3314644},
year = {2019},
}
Publisher's Version
Artifacts Functional
Model-Driven Transformations for Multi- and Many-Core CPUs
Martin Kong and Louis-Noël Pouchet
(Brookhaven National Laboratory, USA; Colorado State University, USA)
Modern polyhedral compilers excel at aggressively optimizing codes with static
control parts, but the state-of-practice to find high-performance polyhedral
transformations especially for different hardware targets still largely
involves auto-tuning. In this work we propose a novel customizable polyhedral
scheduling technique, with the aim of delivering high performance for several
hardware targets. We design constraints and objectives that model several
crucial aspects of performance such as stride optimization or the trade-off
between parallelism and reuse, while considering important architectural
features of the target machine. We evaluate our work using the PolyBench/C
benchmark suite and experimentally validate it against large optimization
spaces generated with the Pluto compiler on 3 representative architectures: an
IBM Power9, an Intel Xeon Phi and an Intel Core-i9. Our results show we can
achieve comparable or superior performance to Pluto on the majority of
benchmarks, without implementing tiling in the source code nor using
experimental autotuning.
@InProceedings{PLDI19p469,
author = {Martin Kong and Louis-Noël Pouchet},
title = {Model-Driven Transformations for Multi- and Many-Core CPUs},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {469--484},
doi = {10.1145/3314221.3314653},
year = {2019},
}
Publisher's Version
Parallelism-Centric What-If and Differential Analyses
Adarsh Yoga and
Santosh Nagarakatte
(Rutgers University, USA)
This paper proposes TaskProf2, a parallelism profiler and an adviser for task parallel programs. As a parallelism profiler, TaskProf2 pinpoints regions with serialization bottlenecks, scheduling overheads, and secondary effects of execution. As an adviser, TaskProf2 identifies regions that matter in improving parallelism. To accomplish these objectives, it uses a performance model that captures series-parallel relationships between various dynamic execution fragments of tasks and includes fine-grained measurement of computation in those fragments. Using this performance model, TaskProf2’s what-if analyses identify regions that improve the parallelism of the program while considering tasking overheads. Its differential analyses perform fine-grained differencing of an oracle and the observed performance model to identify static regions experiencing secondary effects. We have used TaskProf2 to identify regions with serialization bottlenecks and secondary effects in many applications.
@InProceedings{PLDI19p485,
author = {Adarsh Yoga and Santosh Nagarakatte},
title = {Parallelism-Centric What-If and Differential Analyses},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {485--501},
doi = {10.1145/3314221.3314621},
year = {2019},
}
Publisher's Version
Artifacts Functional
Type Systems I
Verifying Message-Passing Programs with Dependent Behavioural Types
Alceste Scalas,
Nobuko Yoshida, and Elias Benussi
(Imperial College London, UK; Aston University, UK; Faculty Science, UK)
Concurrent and distributed programming is notoriously hard. Modern languages and toolkits ease this difficulty by offering message-passing abstractions, such as actors (e.g., Erlang, Akka, Orleans) or processes (e.g., Go): they allow for simpler reasoning w.r.t. shared-memory concurrency, but do not ensure that a program implements a given specification.
To address this challenge, it would be desirable to specify and verify the intended behaviour of message-passing applications using types, and ensure that, if a program type-checks and compiles, then it will run and communicate as desired.
We develop this idea in theory and practice. We formalise a concurrent functional language λ≤π, with a new blend of behavioural types (from π-calculus theory), and dependent function types (from the Dotty programming language, a.k.a. the future Scala 3). Our theory yields four main payoffs: (1) it verifies safety and liveness properties of programs via type-level model checking; (2) unlike previous work, it accurately verifies channel-passing (covering a typical pattern of actor programs) and higher-order interaction (i.e., sending/receiving mobile code); (3) it is directly embedded in Dotty, as a toolkit called Effpi, offering a simplified actor-based API; (4) it enables an efficient runtime system for Effpi, for highly concurrent programs with millions of processes/actors.
@InProceedings{PLDI19p502,
author = {Alceste Scalas and Nobuko Yoshida and Elias Benussi},
title = {Verifying Message-Passing Programs with Dependent Behavioural Types},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {502--516},
doi = {10.1145/3314221.3322484},
year = {2019},
}
Publisher's Version
Artifacts Functional
Toward Efficient Gradual Typing for Structural Types via Coercions
Andre Kuhlenschmidt, Deyaaeldeen Almahallawi, and
Jeremy G. Siek
(Indiana University, USA)
Gradual typing combines static and dynamic typing in the same program. Siek et al. (2015) describe five criteria for gradually typed languages, including type soundness and the gradual guarantee. A significant number of languages have been developed in academia and industry that support some of these criteria (TypeScript, Typed Racket, Safe TypeScript, Transient Reticulated Python, Thorn, etc.) but relatively few support all the criteria (Nom, Gradualtalk, Guarded Reticulated Python). Of those that do, only Nom does so efficiently. The Nom experiment shows that one can achieve efficient gradual typing in languages with only nominal types, but many languages have structural types: function types, tuples, record and object types, generics, etc.
In this paper we present a compiler, named Grift, that addresses the difficult challenge of efficient gradual typing for structural types. The input language includes a selection of difficult features: first-class functions, mutable arrays, and recursive types. We show that a close-to-the-metal implementation of run-time casts inspired by Henglein's coercions eliminates all of the catastrophic slowdowns without introducing significant average-case overhead. As a result, Grift exhibits lower overheads than those of Typed Racket.
@InProceedings{PLDI19p517,
author = {Andre Kuhlenschmidt and Deyaaeldeen Almahallawi and Jeremy G. Siek},
title = {Toward Efficient Gradual Typing for Structural Types via Coercions},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {517--532},
doi = {10.1145/3314221.3314627},
year = {2019},
}
Publisher's Version
Artifacts Functional
Bidirectional Type Checking for Relational Properties
Ezgi Çiçek, Weihao Qu, Gilles Barthe, Marco Gaboardi, and
Deepak Garg
(Facebook, USA; SUNY Buffalo, USA; MPI for Security and Privacy, Germany; IMDEA Software Institute, Spain; MPI-SWS, Germany)
Relational type systems have been designed for several applications
including information flow, differential privacy, and cost analysis.
In order to achieve the best results, these systems often use
relational refinements and relational effects to
maximally exploit the similarity in the structure of the two programs
being compared. Relational type systems are appealing for relational
properties because they deliver simpler and more precise verification
than what could be derived from typing the two programs
separately. However, relational type systems do not yet achieve the
practical appeal of their non-relational counterpart, in part because
of the lack of a general foundation for implementing them.
In this paper, we take a step in this direction by developing
bidirectional relational type checking for systems with
relational refinements and effects. Our approach achieves the benefits
of bidirectional type checking, in a relational setting. In
particular, it significantly reduces the need for typing annotations
through the combination of type checking and type inference. In order
to highlight the foundational nature of our approach, we develop
bidirectional versions of several relational type systems which
incrementally combine many different components needed for expressive
relational analysis.
@InProceedings{PLDI19p533,
author = {Ezgi Çiçek and Weihao Qu and Gilles Barthe and Marco Gaboardi and Deepak Garg},
title = {Bidirectional Type Checking for Relational Properties},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {533--547},
doi = {10.1145/3314221.3314603},
year = {2019},
}
Publisher's Version
Bug Finding and Testing II
Parser-Directed Fuzzing
Björn Mathis, Rahul Gopinath, Michaël Mera, Alexander Kampmann, Matthias Höschele, and Andreas Zeller
(CISPA, Germany)
To be effective, software test generation needs to well cover the space of possible inputs. Traditional fuzzing generates large numbers of random inputs, which however are unlikely to contain keywords and other specific inputs of non-trivial input languages. Constraint-based test generation solves conditions of paths leading to uncovered code, but fails on programs with complex input conditions because of path explosion. In this paper, we present a test generation technique specifically directed at input parsers. We systematically produce inputs for the parser and track comparisons made; after every rejection, we satisfy the comparisons leading to rejection. This approach effectively covers the input space: Evaluated on five subjects, from CSV files to JavaScript, our pFuzzer prototype covers more tokens than both random-based and constraint-based approaches, while requiring no symbolic analysis and far fewer tests than random fuzzers.
@InProceedings{PLDI19p548,
author = {Björn Mathis and Rahul Gopinath and Michaël Mera and Alexander Kampmann and Matthias Höschele and Andreas Zeller},
title = {Parser-Directed Fuzzing},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {548--560},
doi = {10.1145/3314221.3314651},
year = {2019},
}
Publisher's Version
Continuously Reasoning about Programs using Differential Bayesian Inference
Kihong Heo, Mukund Raghothaman, Xujie Si, and
Mayur Naik
(University of Pennsylvania, USA)
Programs often evolve by continuously integrating changes from multiple programmers. The effective adoption of program analysis tools in this continuous integration setting is hindered by the need to only report alarms relevant to a particular program change. We present a probabilistic framework, Drake, to apply program analyses to continuously evolving programs. Drake is applicable to a broad range of analyses that are based on deductive reasoning. The key insight underlying Drake is to compute a graph that concisely and precisely captures differences between the derivations of alarms produced by the given analysis on the program before and after the change. Performing Bayesian inference on the graph thereby enables to rank alarms by likelihood of relevance to the change. We evaluate Drake using Sparrow—a static analyzer that targets buffer-overrun, format-string, and integer-overflow errors—on a suite of ten widely-used C programs each comprising 13k–112k lines of code. Drake enables to discover all true bugs by inspecting only 30 alarms per benchmark on average, compared to 85 (3× more) alarms by the same ranking approach in batch mode, and 118 (4× more) alarms by a differential approach based on syntactic masking of alarms which also misses 4 of the 26 bugs overall.
@InProceedings{PLDI19p561,
author = {Kihong Heo and Mukund Raghothaman and Xujie Si and Mayur Naik},
title = {Continuously Reasoning about Programs using Differential Bayesian Inference},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {561--575},
doi = {10.1145/3314221.3314616},
year = {2019},
}
Publisher's Version
Artifacts Functional
Sparse Record and Replay with Controlled Scheduling
Christopher Lidbury and
Alastair F. Donaldson
(Imperial College London, UK)
Modern applications include many sources of nondeterminism, e.g. due to concurrency, signals, and system calls that interact with the external environment. Finding and reproducing bugs in the presence of this nondeterminism has been the subject of much prior work in three main areas: (1) controlled concurrency-testing, where a custom scheduler replaces the OS scheduler to find subtle bugs; (2) record and replay, where sources of nondeterminism are captured and logged so that a failing execution can be replayed for debugging purposes; and (3) dynamic analysis for the detection of data races. We present a dynamic analysis tool for C++ applications, tsan11rec, which brings these strands of work together by integrating controlled concurrency testing and record and replay into the tsan11 framework for C++11 data race detection. Our novel twist on record and replay is a sparse approach, where the sources of nondeterminism to record can be configured per application. We show that our approach is effective at finding subtle concurrency bugs in small applications; is competitive in terms of performance with the state-of-the-art record and replay tool rr on larger applications; succeeds (due to our sparse approach) in replaying the I/O-intensive Zandronum and QuakeSpasm video games, which are out of scope for rr; but (due to limitations of our sparse approach) cannot faithfully replay applications where memory layout nondeterminism significantly affects application behaviour.
@InProceedings{PLDI19p576,
author = {Christopher Lidbury and Alastair F. Donaldson},
title = {Sparse Record and Replay with Controlled Scheduling},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {576--593},
doi = {10.1145/3314221.3314635},
year = {2019},
}
Publisher's Version
Parallelism and Super Computing II
Sparse Computation Data Dependence Simplification for Efficient Compiler-Generated Inspectors
Mahdi Soltan Mohammadi, Tomofumi Yuki, Kazem Cheshmi, Eddie C. Davis,
Mary Hall,
Maryam Mehri Dehnavi, Payal Nandy,
Catherine Olschanowsky,
Anand Venkat, and
Michelle Mills Strout
(University of Arizona, USA; Inria, France; University of Rennes, France; CNRS, France; IRISA, France; University of Toronto, Canada; Boise State University, USA; University of Utah, USA; Intel, USA)
This paper presents a combined compile-time and runtime loop-carried
dependence analysis of sparse matrix codes and evaluates its
performance in the context of wavefront parallellism.
Sparse computations incorporate indirect memory accesses such as x[col[j]]
whose memory locations cannot be determined until runtime.
The key contributions of this paper are two compile-time techniques for
significantly reducing the overhead of runtime dependence testing:
(1) identifying new equality constraints that result in more efficient runtime inspectors, and
(2) identifying subset relations between dependence constraints such
that one dependence test subsumes another one that is therefore eliminated.
New equality constraints discovery is
enabled by taking advantage of domain-specific knowledge about index arrays, such as col[j].
These simplifications lead to automatically-generated
inspectors that make it practical to parallelize such computations.
We analyze our simplification methods for a collection of seven sparse computations.
The evaluation shows our methods reduce the complexity of the runtime
inspectors significantly.
Experimental results for a collection of five large matrices
show parallel speedups ranging from 2x to more than 8x running on a 8-core CPU.
@InProceedings{PLDI19p594,
author = {Mahdi Soltan Mohammadi and Tomofumi Yuki and Kazem Cheshmi and Eddie C. Davis and Mary Hall and Maryam Mehri Dehnavi and Payal Nandy and Catherine Olschanowsky and Anand Venkat and Michelle Mills Strout},
title = {Sparse Computation Data Dependence Simplification for Efficient Compiler-Generated Inspectors},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {594--609},
doi = {10.1145/3314221.3314646},
year = {2019},
}
Publisher's Version
Artifacts Functional
Modular Divide-and-Conquer Parallelization of Nested Loops
Azadeh Farzan and
Victor Nicolet
(University of Toronto, Canada)
We propose a methodology for automatic generation of divide-and-conquer parallel implementations of sequential nested loops. We focus on a class of loops that traverse read-only multidimensional collections (lists or arrays) and compute a function over these collections. Our approach is modular, in that, the inner loop nest is abstracted away to produce a simpler loop nest for parallelization. The summarized version of the loop nest is then parallelized. The main challenge addressed by this paper is that to perform the code transformations necessary in each step, the loop nest may have to be augmented (automatically) with extra computation to make possible the abstraction and/or the parallelization tasks. We present theoretical results to justify the correctness of our modular approach, and algorithmic solutions for automation. Experimental results demonstrate that our approach can parallelize highly non-trivial loop nests efficiently.
@InProceedings{PLDI19p610,
author = {Azadeh Farzan and Victor Nicolet},
title = {Modular Divide-and-Conquer Parallelization of Nested Loops},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {610--624},
doi = {10.1145/3314221.3314612},
year = {2019},
}
Publisher's Version
Generating Piecewise-Regular Code from Irregular Structures
Travis Augustine, Janarthanan Sarma, Louis-Noël Pouchet, and
Gabriel Rodríguez
(Colorado State University, USA; Universidade da Coruña, Spain)
Irregular data structures, as exemplified with sparse matrices, have proved to be essential in modern computing. Numerous sparse formats have been investigated to improve the overall performance of Sparse Matrix-Vector multiply (SpMV). But in this work we propose instead to take a fundamentally different approach: to automatically build sets of regular sub-computations by mining for regular sub-regions in the irregular data structure.
Our approach leads to code that is specialized to the sparsity structure of the input matrix, but which does not need anymore any indirection array, thereby improving SIMD vectorizability. We particularly focus on small sparse structures (below 10M nonzeros), and demonstrate substantial performance improvements and compaction capabilities compared to a classical CSR implementation and Intel MKL IE's SpMV implementation, evaluating on 200+ different matrices from the SuiteSparse repository.
@InProceedings{PLDI19p625,
author = {Travis Augustine and Janarthanan Sarma and Louis-Noël Pouchet and Gabriel Rodríguez},
title = {Generating Piecewise-Regular Code from Irregular Structures},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {625--639},
doi = {10.1145/3314221.3314615},
year = {2019},
}
Publisher's Version
Artifacts Functional
Type Systems II
ILC: A Calculus for Composable, Computational Cryptography
Kevin Liao, Matthew A. Hammer, and Andrew Miller
(University of Illinois at Urbana-Champaign, USA; DFINITY, USA)
The universal composability (UC) framework is the established standard for analyzing cryptographic protocols in a modular way, such that security is preserved under concurrent composition with arbitrary other protocols. However, although UC is widely used for on-paper proofs, prior attempts at systemizing it have fallen short, either by using a symbolic model (thereby ruling out computational reduction proofs), or by limiting its expressiveness.
In this paper, we lay the groundwork for building a concrete, executable implementation of the UC framework. Our main contribution is a process calculus, dubbed the Interactive Lambda Calculus (ILC). ILC faithfully captures the computational model underlying UC—interactive Turing machines (ITMs)—by adapting ITMs to a subset of the π-calculus through an affine typing discipline. In other words, well-typed ILC programs are expressible as ITMs. In turn, ILC’s strong confluence property enables reasoning about cryptographic security reductions. We use ILC to develop a simplified implementation of UC called SaUCy.
@InProceedings{PLDI19p640,
author = {Kevin Liao and Matthew A. Hammer and Andrew Miller},
title = {ILC: A Calculus for Composable, Computational Cryptography},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {640--654},
doi = {10.1145/3314221.3314607},
year = {2019},
}
Publisher's Version
Proving Differential Privacy with Shadow Execution
Yuxin Wang, Zeyu Ding, Guanhong Wang, Daniel Kifer, and Danfeng Zhang
(Pennsylvania State University, USA)
Recent work on formal verification of differential privacy shows a trend toward usability and expressiveness -- generating a correctness proof of sophisticated algorithm while minimizing the annotation burden on programmers. Sometimes, combining those two requires substantial changes to program logics: one recent paper is able to verify Report Noisy Max automatically, but it involves a complex verification system using customized program logics and verifiers.
In this paper, we propose a new proof technique, called shadow execution, and embed it into a language called ShadowDP. ShadowDP uses shadow execution to generate proofs of differential privacy with very few programmer annotations and without relying on customized logics and verifiers. In addition to verifying Report Noisy Max, we show that it can verify a new variant of Sparse Vector that reports the gap between some noisy query answers and the noisy threshold. Moreover, ShadowDP reduces the complexity of verification: for all of the algorithms we have evaluated, type checking and verification in total takes at most 3 seconds, while prior work takes minutes on the same algorithms.
@InProceedings{PLDI19p655,
author = {Yuxin Wang and Zeyu Ding and Guanhong Wang and Daniel Kifer and Danfeng Zhang},
title = {Proving Differential Privacy with Shadow Execution},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {655--669},
doi = {10.1145/3314221.3314619},
year = {2019},
}
Publisher's Version
Data-Trace Types for Distributed Stream Processing Systems
Konstantinos Mamouras,
Caleb Stanford,
Rajeev Alur, Zachary G. Ives, and Val Tannen
(Rice University, USA; University of Pennsylvania, USA)
Distributed architectures for efficient processing of streaming data are increasingly critical to modern information processing systems. The goal of this paper is to develop type-based programming abstractions that facilitate correct and efficient deployment of a logical specification of the desired computation on such architectures. In the proposed model, each communication link has an associated type specifying tagged data items along with a dependency relation over tags that captures the logical partial ordering constraints over data items. The semantics of a (distributed) stream processing system is then a function from input data traces to output data traces, where a data trace is an equivalence class of sequences of data items induced by the dependency relation. This data-trace transduction model generalizes both acyclic synchronous data-flow and relational query processors, and can specify computations over data streams with a rich variety of partial ordering and synchronization characteristics. We then describe a set of programming templates for data-trace transductions: abstractions corresponding to common stream processing tasks. Our system automatically maps these high-level programs to a given topology on the distributed implementation platform Apache Storm while preserving the semantics. Our experimental evaluation shows that (1) while automatic parallelization deployed by existing systems may not preserve semantics, particularly when the computation is sensitive to the ordering of data items, our programming abstractions allow a natural specification of the query that contains a mix of ordering constraints while guaranteeing correct deployment, and (2) the throughput of the automatically compiled distributed code is comparable to that of hand-crafted distributed implementations.
@InProceedings{PLDI19p670,
author = {Konstantinos Mamouras and Caleb Stanford and Rajeev Alur and Zachary G. Ives and Val Tannen},
title = {Data-Trace Types for Distributed Stream Processing Systems},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {670--685},
doi = {10.1145/3314221.3314580},
year = {2019},
}
Publisher's Version
ML
An Inductive Synthesis Framework for Verifiable Reinforcement Learning
He Zhu, Zikang Xiong, Stephen Magill, and
Suresh Jagannathan
(Galois, USA; Purdue University, USA)
Despite the tremendous advances that have been made in the last decade on developing useful machine-learning applications, their wider adoption has been hindered by the lack of strong assurance guarantees that can be made about their behavior. In this paper, we consider how formal verification techniques developed for traditional software systems can be repurposed for verification of reinforcement learning-enabled ones, a particularly important class of machine learning systems. Rather than enforcing safety by examining and altering the structure of a complex neural network implementation, our technique uses blackbox methods to synthesizes deterministic programs, simpler, more interpretable, approximations of the network that can nonetheless guarantee desired safety properties are preserved, even when the network is deployed in unanticipated or previously unobserved environments. Our methodology frames the problem of neural network verification in terms of a counterexample and syntax-guided inductive synthesis procedure over these programs. The synthesis procedure searches for both a deterministic program and an inductive invariant over an infinite state transition system that represents a specification of an application's control logic. Additional specifications defining environment-based constraints can also be provided to further refine the search space. Synthesized programs deployed in conjunction with a neural network implementation dynamically enforce safety conditions by monitoring and preventing potentially unsafe actions proposed by neural policies. Experimental results over a wide range of cyber-physical applications demonstrate that software-inspired formal verification techniques can be used to realize trustworthy reinforcement learning systems with low overhead.
@InProceedings{PLDI19p686,
author = {He Zhu and Zikang Xiong and Stephen Magill and Suresh Jagannathan},
title = {An Inductive Synthesis Framework for Verifiable Reinforcement Learning},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {686--701},
doi = {10.1145/3314221.3314638},
year = {2019},
}
Publisher's Version
Artifacts Functional
Programming Support for Autonomizing Software
Wen-Chuan Lee, Peng Liu, Yingqi Liu, Shiqing Ma, and
Xiangyu Zhang
(Purdue University, USA; IBM, USA)
Most traditional software systems are not built with the artificial intelligence support (AI) in mind. Among them, some may require human interventions to operate, e.g., the manual specification of the parameters in the data processing programs, or otherwise, would behave poorly. We propose a novel framework called Autonomizer to autonomize these systems by installing the AI into the traditional programs. Autonomizeris general so it can be applied to many real-world applications. We provide the primitives and the run-time support, where the primitives abstract common tasks of autonomization and the runtime support realizes them transparently. With the support of Autonomizer, the users can gain the AI support with little engineering efforts. Like many other AI applications, the challenge lies in the feature selection, which we address by proposing multiple automated strategies based on the program analysis. Our experiment results on nine real-world applications show that the autonomization only requires adding a few lines to the source code.Besides, for the data-processing programs, Autonomizer improves the output quality by 161% on average over the default settings. For the interactive programs such as game/driving,Autonomizer achieves higher success rate with lower training time than existing autonomized programs.
@InProceedings{PLDI19p702,
author = {Wen-Chuan Lee and Peng Liu and Yingqi Liu and Shiqing Ma and Xiangyu Zhang},
title = {Programming Support for Autonomizing Software},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {702--716},
doi = {10.1145/3314221.3314593},
year = {2019},
}
Publisher's Version
Wootz: A Compiler-Based Framework for Fast CNN Pruning via Composability
Hui Guan,
Xipeng Shen, and Seung-Hwan Lim
(North Carolina State University, USA; Oak Ridge National Laboratory, USA)
Convolutional Neural Networks (CNN) are widely used for Deep Learning tasks. CNN pruning is an important method to adapt a large CNN model trained on general datasets to fit a more specialized task or a smaller device. The key challenge is on deciding which filters to remove in order to maximize the quality of the pruned networks while satisfying the constraints. It is time-consuming due to the enormous configuration space and the slowness of CNN training.
The problem has drawn many efforts from the machine learning field, which try to reduce the set of network configurations to explore. This work tackles the problem distinctively from a programming systems perspective, trying to speed up the evaluations of the remaining configurations through computation reuse via a compiler-based framework. We empirically uncover the existence of composability in the training of a collection of pruned CNN models, and point out the opportunities for computation reuse. We then propose composability-based CNN pruning, and design a compression-based algorithm to efficiently identify the set of CNN layers to pre-train for maximizing their reuse benefits in CNN pruning. We further develop a compiler-based framework named Wootz, which, for an arbitrary CNN, automatically generates code that builds a Teacher-Student scheme to materialize composability-based pruning. Experiments show that network pruning enabled by Wootz shortens the state-of-art pruning process by up to 186X while producing significantly better pruning results.
@InProceedings{PLDI19p717,
author = {Hui Guan and Xipeng Shen and Seung-Hwan Lim},
title = {Wootz: A Compiler-Based Framework for Fast CNN Pruning via Composability},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {717--730},
doi = {10.1145/3314221.3314652},
year = {2019},
}
Publisher's Version
Optimization and Abstraction: A Synergistic Approach for Analyzing Neural Network Robustness
Greg Anderson,
Shankara Pailoor,
Isil Dillig, and Swarat Chaudhuri
(University of Texas at Austin, USA; Rice University, USA)
In recent years, the notion of local robustness (or robustness for short) has emerged as a desirable property of deep neural networks. Intuitively, robustness means that small perturbations to an input do not cause the network to perform misclassifications. In this paper, we present a novel algorithm for verifying robustness properties of neural networks. Our method synergistically combines gradient-based optimization methods for counterexample search with abstraction-based proof search to obtain a sound and (δ -)complete decision procedure. Our method also employs a data-driven approach to learn a verification policy that guides abstract interpretation during proof search. We have implemented the proposed approach in a tool called Charon and experimentally evaluated it on hundreds of benchmarks. Our experiments show that the proposed approach significantly outperforms three state-of-the-art tools, namely AI^2, Reluplex, and Reluval.
@InProceedings{PLDI19p731,
author = {Greg Anderson and Shankara Pailoor and Isil Dillig and Swarat Chaudhuri},
title = {Optimization and Abstraction: A Synergistic Approach for Analyzing Neural Network Robustness},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {731--744},
doi = {10.1145/3314221.3314614},
year = {2019},
}
Publisher's Version
Specification
Unsupervised Learning of API Aliasing Specifications
Jan Eberhardt, Samuel Steffen, Veselin Raychev, and
Martin Vechev
(DeepCode, Switzerland; ETH Zurich, Switzerland)
Real world applications make heavy use of powerful libraries and frameworks, posing a significant challenge for static analysis as the library implementation may be very complex or unavailable. Thus, obtaining specifications that summarize the behaviors of the library is important as it enables static analyzers to precisely track the effects of APIs on the client program, without requiring the actual API implementation.
In this work, we propose a novel method for discovering aliasing specifications of APIs by learning from a large dataset of programs. Unlike prior work, our method does not require manual annotation, access to the library's source code or ability to run its APIs. Instead, it learns specifications in a fully unsupervised manner, by statically observing usages of APIs in the dataset. The core idea is to learn a probabilistic model of interactions between API methods and aliasing objects, enabling identification of additional likely aliasing relations, and to then infer aliasing specifications of APIs that explain these relations. The learned specifications are then used to augment an API-aware points-to analysis.
We implemented our approach in a tool called USpec and used it to automatically learn aliasing specifications from millions of source code files. USpec learned over 2000 specifications of various Java and Python APIs, in the process improving the results of the points-to analysis and its clients.
@InProceedings{PLDI19p745,
author = {Jan Eberhardt and Samuel Steffen and Veselin Raychev and Martin Vechev},
title = {Unsupervised Learning of API Aliasing Specifications},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {745--759},
doi = {10.1145/3314221.3314640},
year = {2019},
}
Publisher's Version
Scalable Taint Specification Inference with Big Code
Victor Chibotaru,
Benjamin Bichsel, Veselin Raychev, and
Martin Vechev
(DeepCode, Switzerland; ETH Zurich, Switzerland)
We present a new scalable, semi-supervised method for inferring taint analysis specifications by learning from a large dataset of programs. Taint specifications capture the role of library APIs (source, sink, sanitizer) and are a critical ingredient of any taint analyzer that aims to detect security violations based on information flow.
The core idea of our method is to formulate the taint specification learning problem as a linear optimization task over a large set of information flow constraints. The resulting constraint system can then be efficiently solved with state-of-the-art solvers. Thanks to its scalability, our method can infer many new and interesting taint specifications by simultaneously learning from a large dataset of programs (e.g., as found on GitHub), while requiring few manual annotations.
We implemented our method in an end-to-end system, called Seldon, targeting Python, a language where static specification inference is particularly hard due to lack of typing information. We show that Seldon is practically effective: it learned almost 7,000 API roles from over 210,000 candidate APIs with very little supervision (less than 300 annotations) and with high estimated precision (67%). Further, using the learned specifications, our taint analyzer flagged more than 20,000 violations in open source projects, 97% of which were undetectable without the inferred specifications.
@InProceedings{PLDI19p760,
author = {Victor Chibotaru and Benjamin Bichsel and Veselin Raychev and Martin Vechev},
title = {Scalable Taint Specification Inference with Big Code},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {760--774},
doi = {10.1145/3314221.3314648},
year = {2019},
}
Publisher's Version
Learning Stateful Preconditions Modulo a Test Generator
Angello Astorga,
P. Madhusudan, Shambwaditya Saha, Shiyu Wang, and Tao Xie
(University of Illinois at Urbana-Champaign, USA)
In this paper, we present a novel learning framework for inferring stateful preconditions (i.e., preconditions constraining not only primitive-type inputs but also non-primitive-type object states) modulo a test generator, where the quality of the preconditions is based on their safety and maximality with respect to the test generator. We instantiate the learning framework with a specific learner and test generator to realize a precondition synthesis tool for C#. We use an extensive evaluation to show that the tool is highly effective in synthesizing preconditions for avoiding exceptions as well as synthesizing conditions under which methods commute.
@InProceedings{PLDI19p775,
author = {Angello Astorga and P. Madhusudan and Shambwaditya Saha and Shiyu Wang and Tao Xie},
title = {Learning Stateful Preconditions Modulo a Test Generator},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {775--787},
doi = {10.1145/3314221.3314641},
year = {2019},
}
Publisher's Version
SLING: Using Dynamic Analysis to Infer Program Invariants in Separation Logic
Ton Chanh Le, Guolong Zheng, and
ThanhVu Nguyen
(Stevens Institute of Technology, USA; University of Nebraska-Lincoln, USA)
We introduce a new dynamic analysis technique to discover invariants in separation logic for heap-manipulating programs. First, we use a debugger to obtain rich program execution traces at locations of interest on sample inputs. These traces consist of heap and stack information of variables that point to dynamically allocated data structures. Next, we iteratively analyze separate memory regions related to each pointer variable and search for a formula over predefined heap predicates in separation logic to model these regions. Finally, we combine the computed formulae into an invariant that describes the shape of explored memory regions.
We present SLING, a tool that implements these ideas to automatically generate invariants in separation logic at arbitrary locations in C programs, e.g., program pre and postconditions and loop invariants.
Preliminary results on existing benchmarks show that SLING can efficiently generate correct and useful invariants for programs that manipulate a wide variety of complex data structures.
@InProceedings{PLDI19p788,
author = {Ton Chanh Le and Guolong Zheng and ThanhVu Nguyen},
title = {SLING: Using Dynamic Analysis to Infer Program Invariants in Separation Logic},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {788--801},
doi = {10.1145/3314221.3314634},
year = {2019},
}
Publisher's Version
Artifacts Functional
Static Analysis
Abstract Interpretation under Speculative Execution
Meng Wu and
Chao Wang
(Virginia Tech, USA; University of Southern California, USA)
Analyzing the behavior of a program running on a processor
that supports speculative execution is crucial for applications
such as execution time estimation and side channel detection.
Unfortunately, existing static analysis techniques based on
abstract interpretation do not model speculative execution
since they focus on functional properties of a program while
speculative execution does not change the functionality. To
fill the gap, we propose a method to make abstract interpretation
sound under speculative execution. There are two
contributions. First, we introduce the notion of virtual control
flow to augment instructions that may be speculatively
executed and thus affect subsequent instructions. Second,
to make the analysis efficient, we propose optimizations to
handle merges and loops and to safely bound the speculative
execution depth. We have implemented and evaluated
the proposed method in a static cache analysis for execution
time estimation and side channel detection. Our experiments
show that the new method, while guaranteed to be sound
under speculative execution, outperforms state-of-the-art
abstract interpretation techniques that may be unsound.
@InProceedings{PLDI19p802,
author = {Meng Wu and Chao Wang},
title = {Abstract Interpretation under Speculative Execution},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {802--815},
doi = {10.1145/3314221.3314647},
year = {2019},
}
Publisher's Version
A Fast Analytical Model of Fully Associative Caches
Tobias Gysi, Tobias Grosser, Laurin Brandner, and
Torsten Hoefler
(ETH Zurich, Switzerland)
While the cost of computation is an easy to understand local property, the cost of data movement on cached architectures depends on global state, does not compose, and is hard to predict. As a result, programmers often fail to consider the cost of data movement. Existing cache models and simulators provide the missing information but are computationally expensive. We present a lightweight cache model for fully associative caches with least recently used (LRU) replacement policy that gives fast and accurate results. We count the cache misses without explicit enumeration of all memory accesses by using symbolic counting techniques twice: 1) to derive the stack distance for each memory access and 2) to count the memory accesses with stack distance larger than the cache size. While this technique seems infeasible in theory, due to non-linearities after the first round of counting, we show that the counting problems are sufficiently linear in practice. Our cache model often computes the results within seconds and contrary to simulation the execution time is mostly problem size independent. Our evaluation measures modeling errors below 0.6% on real hardware. By providing accurate data placement information we enable memory hierarchy aware software development.
@InProceedings{PLDI19p816,
author = {Tobias Gysi and Tobias Grosser and Laurin Brandner and Torsten Hoefler},
title = {A Fast Analytical Model of Fully Associative Caches},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {816--829},
doi = {10.1145/3314221.3314606},
year = {2019},
}
Publisher's Version
Info
Artifacts Functional
Sound, Fine-Grained Traversal Fusion for Heterogeneous Trees
Laith Sakka,
Kirshanthan Sundararajah, Ryan R. Newton, and
Milind Kulkarni
(Purdue University, USA; Indiana University, USA)
Applications in many domains are based on a series of traversals of tree structures, and fusing these traversals together to reduce the total number of passes over the tree is a common, important optimization technique. In applications such as compilers and render trees, these trees are heterogeneous: different nodes of the tree have different types. Unfortunately, prior work for fusing traversals falls short in different ways: they do not handle heterogeneity; they require using domain-specific languages to express an application; they rely on the programmer to aver that fusing traversals is safe, without any soundness guarantee; or they can only perform coarse-grain fusion, leading to missed fusion opportunities. This paper addresses these shortcomings to build a framework for fusing traversals of heterogeneous trees that is automatic, sound, and fine-grained. We show across several case studies that our approach is able to allow programmers to write simple, intuitive traversals, and then automatically fuse them to substantially improve performance.
@InProceedings{PLDI19p830,
author = {Laith Sakka and Kirshanthan Sundararajah and Ryan R. Newton and Milind Kulkarni},
title = {Sound, Fine-Grained Traversal Fusion for Heterogeneous Trees},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {830--844},
doi = {10.1145/3314221.3314626},
year = {2019},
}
Publisher's Version
Artifacts Functional
Size-Change Termination as a Contract: Dynamically and Statically Enforcing Termination for Higher-Order Programs
Phúc C. Nguyễn, Thomas Gilray, Sam Tobin-Hochstadt, and
David Van Horn
(University of Maryland at College Park, USA; University of Alabama at Birmingham, USA; Indiana University, USA)
Termination is an important but undecidable program property, which has led to a large body of work on static methods for conservatively predicting or enforcing termination. One such method is the size-change termination approach of Lee, Jones, and Ben-Amram, which operates in two phases: (1) abstract programs into “size-change graphs,” and (2) check these graphs for the size-change property: the existence of paths that lead to infinite decreasing sequences.
We transpose these two phases with an operational semantics that accounts for the run-time enforcement of the size-change property, postponing (or entirely avoiding) program abstraction. This choice has two key consequences: (1) size-change termination can be checked at run-time and (2) termination can be rephrased as a safety property analyzed using existing methods for systematic abstraction.
We formulate run-time size-change checks as contracts in the style of Findler and Felleisen. The result compliments existing contracts that enforce partial correctness specifications to obtain contracts for total correctness. Our approach combines the robustness of the size-change principle for termination with the precise information available at run-time. It has tunable overhead and can check for nontermination without the conservativeness necessary in static checking. To obtain a sound and computable termination analysis, we apply existing abstract interpretation techniques directly to the operational semantics, avoiding the need for custom abstractions for termination. The resulting analyzer is competitive with with existing, purpose-built analyzers.
@InProceedings{PLDI19p845,
author = {Phúc C. Nguyễn and Thomas Gilray and Sam Tobin-Hochstadt and David Van Horn},
title = {Size-Change Termination as a Contract: Dynamically and Statically Enforcing Termination for Higher-Order Programs},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {845--859},
doi = {10.1145/3314221.3314643},
year = {2019},
}
Publisher's Version
Artifacts Functional
Dynamics: Analysis and Compilation
SemCluster: Clustering of Imperative Programming Assignments Based on Quantitative Semantic Features
David M. Perry, Dohyeong Kim,
Roopsha Samanta, and
Xiangyu Zhang
(Purdue University, USA)
A fundamental challenge in automated reasoning about programming assignments at scale is clustering student submissions based on their underlying algorithms. State-of-the-art clustering techniques are sensitive to control structure variations, cannot cluster buggy solutions with similar correct solutions, and either require expensive pair-wise program analyses or training efforts. We propose a novel technique that can cluster small imperative programs based on their algorithmic essence: (A) how the input space is partitioned into equivalence classes and (B) how the problem is uniquely addressed within individual equivalence classes. We capture these algorithmic aspects as two quantitative semantic program features that are merged into a program's vector representation. Programs are then clustered using their vector representations. The computation of our first semantic feature leverages model counting to identify the number of inputs belonging to an input equivalence class. The computation of our second semantic feature abstracts the program's data flow by tracking the number of occurrences of a unique pair of consecutive values of a variable during its lifetime. The comprehensive evaluation of our tool SemCluster on benchmarks drawn from solutions to small programming assignments shows that SemCluster (1) generates far fewer clusters than other clustering techniques, (2) precisely identifies distinct solution strategies, and (3) boosts the performance of clustering-based program repair, all within a reasonable amount of time.
@InProceedings{PLDI19p860,
author = {David M. Perry and Dohyeong Kim and Roopsha Samanta and Xiangyu Zhang},
title = {SemCluster: Clustering of Imperative Programming Assignments Based on Quantitative Semantic Features},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {860--873},
doi = {10.1145/3314221.3314629},
year = {2019},
}
Publisher's Version
Computing Summaries of String Loops in C for Better Testing and Refactoring
Timotej Kapus, Oren Ish-Shalom,
Shachar Itzhaky, Noam Rinetzky, and
Cristian Cadar
(Imperial College London, UK; Tel Aviv University, Israel; Technion, Israel)
Analysing and comprehending C programs that use strings
is hard: using standard library functions for manipulating
strings is not enforced and programs often use complex loops
for the same purpose. We introduce the notion of memoryless
loops that capture some of these string loops and present a
counterexample-guided synthesis approach to summarise
memoryless loops using C standard library functions, which
has applications to testing, optimisation and refactoring.
We prove our summarisation is correct for arbitrary input
strings and evaluate it on a database of loops we gathered
from thirteen open-source programs. Our approach can summarise
over two thirds of memoryless loops in less than
five minutes of computation time per loop. We then show
that these summaries can be used to (1) improve symbolic
execution (2) optimise native code, and (3) refactor code.
@InProceedings{PLDI19p874,
author = {Timotej Kapus and Oren Ish-Shalom and Shachar Itzhaky and Noam Rinetzky and Cristian Cadar},
title = {Computing Summaries of String Loops in C for Better Testing and Refactoring},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {874--888},
doi = {10.1145/3314221.3314610},
year = {2019},
}
Publisher's Version
Info
Artifacts Functional
Reusable Inline Caching for JavaScript Performance
Jiho Choi, Thomas Shull, and
Josep Torrellas
(University of Illinois at Urbana-Champaign, USA)
JavaScript performance is paramount to a user’s browsing experience. Browser vendors have gone to great lengths to improve JavaScript’s steady-state performance. This has led to sophisticated web applications. However, as users increasingly expect instantaneous page load times, another important goal for JavaScript engines is to attain minimal startup times.
In this paper, we reduce the startup time of JavaScript programs by enhancing the reuse of compilation and optimization information across different executions. Specifically, we propose a new scheme to increase the startup performance of Inline Caching (IC), a key optimization for dynamic type systems. The idea is to represent a substantial portion of the IC information in an execution in a context-independent way, and reuse it in subsequent executions. We call our enhanced IC design Reusable Inline Caching (RIC). We integrate RIC into the state-of-the-art Google V8 JavaScript engine and measure its impact on the initialization time of popular JavaScript libraries. By recycling IC information collected from a previous execution, RIC reduces the average initialization time per library by 17%.
@InProceedings{PLDI19p889,
author = {Jiho Choi and Thomas Shull and Josep Torrellas},
title = {Reusable Inline Caching for JavaScript Performance},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {889--901},
doi = {10.1145/3314221.3314587},
year = {2019},
}
Publisher's Version
Performance
Composable, Sound Transformations of Nested Recursion and Loops
Kirshanthan Sundararajah and
Milind Kulkarni
(Purdue University, USA)
Scheduling transformations reorder a program’s operations to improve locality and/or parallelism. The polyhedral model is a general framework for composing and applying instance-wise scheduling transformations for loop-based programs, but there is no analogous framework for recursive programs. This paper presents an approach for composing and applying scheduling transformations—like inlining, interchange, and code motion—to nested recursive programs. This paper describes the phases of the approach—representing dynamic instances, composing and applying transformations, reasoning about correctness—and shows that these techniques can verify the soundness of composed transformations.
@InProceedings{PLDI19p902,
author = {Kirshanthan Sundararajah and Milind Kulkarni},
title = {Composable, Sound Transformations of Nested Recursion and Loops},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {902--917},
doi = {10.1145/3314221.3314592},
year = {2019},
}
Publisher's Version
Low-Latency Graph Streaming using Compressed Purely-Functional Trees
Laxman Dhulipala,
Guy E. Blelloch, and Julian Shun
(Carnegie Mellon University, USA; Massachusetts Institute of Technology, USA)
There has been a growing interest in the graph-streaming setting where a continuous stream of graph updates is mixed with graph queries. In principle, purely-functional trees are an ideal fit for this setting as they enable safe parallelism, lightweight snapshots, and strict serializability for queries. However, directly using them for graph processing leads to significant space overhead and poor cache locality.
This paper presents C-trees, a compressed purely-functional search tree data structure that significantly improves on the space usage and locality of purely-functional trees. We design theoretically-efficient and practical algorithms for performing batch updates to C-trees, and also show that we can store massive dynamic real-world graphs using only a few bytes per edge, thereby achieving space usage close to that of the best static graph processing frameworks.
To study the efficiency and applicability of our data structure, we designed Aspen, a graph-streaming framework that extends the interface of Ligra with operations for updating graphs. We show that Aspen is faster than two state-of-the-art graph-streaming systems, Stinger and LLAMA, while requiring less memory, and is competitive in performance with the state-of-the-art static graph frameworks, Galois, GAP, and Ligra+. With Aspen, we are able to efficiently process the largest publicly-available graph with over two hundred billion edges in the graph-streaming setting using a single commodity multicore server with 1TB of memory.
@InProceedings{PLDI19p918,
author = {Laxman Dhulipala and Guy E. Blelloch and Julian Shun},
title = {Low-Latency Graph Streaming using Compressed Purely-Functional Trees},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {918--934},
doi = {10.1145/3314221.3314598},
year = {2019},
}
Publisher's Version
Artifacts Functional
Co-optimizing Memory-Level Parallelism and Cache-Level Parallelism
Xulong Tang, Mahmut Taylan Kandemir,
Mustafa Karakoy, and Meenakshi Arunachalam
(Pennsylvania State University, USA; TOBB University of Economics and Technology, Turkey; Intel, USA)
Minimizing cache misses has been the traditional goal in optimizing cache performance using compiler based techniques. However, continuously increasing dataset sizes combined with large numbers of cache banks and memory banks connected using on-chip networks in emerging manycores/accelerators makes cache hit–miss latency optimization as important as cache miss rate minimization. In this paper, we propose compiler support that optimizes both the latencies of last-level cache (LLC) hits and the latencies of LLC misses. Our approach tries to achieve this goal by improving the parallelism exhibited by LLC hits and LLC misses. More specifically, it tries to maximize both cache-level parallelism (CLP) and memory-level parallelism (MLP). This paper presents different incarnations of our approach, and evaluates them using a set of 12 multithreaded applications. Our results indicate that (i) optimizing MLP first and CLP later brings, on average, 11.31% performance improvement over an approach that already minimizes the number of LLC misses, and (ii) optimizing CLP first and MLP later brings 9.43% performance improvement. In comparison, balancing MLP and CLP brings 17.32% performance improvement on average.
@InProceedings{PLDI19p935,
author = {Xulong Tang and Mahmut Taylan Kandemir and Mustafa Karakoy and Meenakshi Arunachalam},
title = {Co-optimizing Memory-Level Parallelism and Cache-Level Parallelism},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {935--949},
doi = {10.1145/3314221.3314599},
year = {2019},
}
Publisher's Version
Type Systems III
Characterising Renaming within OCaml’s Module System: Theory and Implementation
Reuben N. S. Rowe, Hugo Férée,
Simon J. Thompson, and
Scott Owens
(University of Kent, UK)
We present an abstract, set-theoretic denotational semantics for a significant subset of OCaml and its module system, allowing to reason about the correctness of renaming value bindings. Our semantics captures information about the binding structure of programs, as well as about which declarations are related by the use of different language constructs (e.g. functors, module types and module constraints). Correct renamings are precisely those that preserve this structure. We show that our abstract semantics is sound with respect to a (domain-theoretic) denotational model of the operational behaviour of programs, and that it allows us to prove various high-level, intuitive properties of renamings. This formal framework has been implemented in a prototype refactoring tool for OCaml that performs renaming.
@InProceedings{PLDI19p950,
author = {Reuben N. S. Rowe and Hugo Férée and Simon J. Thompson and Scott Owens},
title = {Characterising Renaming within OCaml’s Module System: Theory and Implementation},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {950--965},
doi = {10.1145/3314221.3314600},
year = {2019},
}
Publisher's Version
Artifacts Functional
Type-Level Computations for Ruby Libraries
Milod Kazerounian,
Sankha Narayan Guria, Niki Vazou, Jeffrey S. Foster, and
David Van Horn
(University of Maryland at College Park, USA; IMDEA Software Institute, Spain; Tufts University, USA)
Many researchers have explored ways to bring static typing to dynamic languages. However, to date, such systems are not precise enough when types depend on values, which often arises when using certain Ruby libraries. For example, the type safety of a database query in Ruby on Rails depends on the table and column names used in the query. To address this issue, we introduce CompRDL, a type system for Ruby that allows library method type signatures to include type-level computations (or comp types for short). Combined with singleton types for table and column names, comp types let us give database query methods type signatures that compute a table’s schema to yield very precise type information. Comp types for hash, array, and string libraries can also increase precision and thereby reduce the need for type casts. We formalize CompRDL and prove its type system sound. Rather than type check the bodies of library methods with comp types—those methods may include native code or be complex—CompRDL inserts run-time checks to ensure library methods abide by their computed types. We evaluated CompRDL by writing annotations with type-level computations for several Ruby core libraries and database query APIs. We then used those annotations to type check two popular Ruby libraries and four Ruby on Rails web apps. We found the annotations were relatively compact and could successfully type check 132 methods across our subject programs. Moreover, the use of type-level computations allowed us to check more expressive properties, with fewer manually inserted casts, than was possible without type-level computations. In the process, we found two type errors and a documentation error that were confirmed by the developers. Thus, we believe CompRDL is an important step forward in bringing precise static type checking to dynamic languages.
@InProceedings{PLDI19p966,
author = {Milod Kazerounian and Sankha Narayan Guria and Niki Vazou and Jeffrey S. Foster and David Van Horn},
title = {Type-Level Computations for Ruby Libraries},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {966--979},
doi = {10.1145/3314221.3314630},
year = {2019},
}
Publisher's Version
Artifacts Functional
Systems I
Replication-Aware Linearizability
Chao Wang, Constantin Enea, Suha Orhun Mutluergil, and Gustavo Petri
(IRIF, France; University Paris Diderot, France; CNRS, France; ARM, UK)
Distributed systems often replicate data at multiple locations to achieve availability despite network partitions. These systems accept updates at any replica and propagate them asynchronously to every other replica. Conflict-Free Replicated Data Types (CRDTs) provide a principled approach to the problem of ensuring that replicas are eventually consistent despite the asynchronous delivery of updates.
We address the problem of specifying and verifying CRDTs, introducing a new correctness criterion called Replication-Aware Linearizability. This criterion is inspired by linearizability, the de-facto correctness criterion for (shared-memory) concurrent data structures. We argue that this criterion is both simple to understand, and it fits most known implementations of CRDTs. We provide a proof methodology to show that a CRDT satisfies replication-aware linearizability that we apply on a wide range of implementations. Finally, we show that our criterion can be leveraged to reason modularly about the composition of CRDTs.
@InProceedings{PLDI19p980,
author = {Chao Wang and Constantin Enea and Suha Orhun Mutluergil and Gustavo Petri},
title = {Replication-Aware Linearizability},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {980--993},
doi = {10.1145/3314221.3314617},
year = {2019},
}
Publisher's Version
DFix: Automatically Fixing Timing Bugs in Distributed Systems
Guangpu Li, Haopeng Liu, Xianglan Chen,
Haryadi S. Gunawi, and
Shan Lu
(University of Chicago, USA; University of Science and Technology of China, China)
Distributed systems nowadays are the backbone of computing society, and are expected to have high availability. Unfortunately, distributed timing bugs, a type of bugs triggered by non-deterministic timing of messages and node crashes, widely exist. They lead to many production-run failures, and are difficult to reason about and patch. Although recently proposed techniques can automatically detect these bugs, how to automatically and correctly fix them still remains as an open problem. This paper presents DFix, a tool that automatically processes distributed timing bug reports, statically analyzes the buggy system, and produces patches. Our evaluation shows that DFix is effective in fixing real-world distributed timing bugs.
@InProceedings{PLDI19p994,
author = {Guangpu Li and Haopeng Liu and Xianglan Chen and Haryadi S. Gunawi and Shan Lu},
title = {DFix: Automatically Fixing Timing Bugs in Distributed Systems},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {994--1009},
doi = {10.1145/3314221.3314620},
year = {2019},
}
Publisher's Version
Ignis: Scaling Distribution-Oblivious Systems with Light-Touch Distribution
Nikos Vasilakis, Ben Karel, Yash Palkhiwala, John Sonchack,
André DeHon, and Jonathan M. Smith
(University of Pennsylvania, USA)
Distributed systems offer notable benefits over their centralized counterparts. Reaping these benefits, however, requires burdensome developer effort to identify and rewrite bottlenecked components. Light-touch distribution is a new approach that converts a legacy system into a distributed one using automated transformations. Transformations operate at the boundaries of bottlenecked modules and are parametrizable by light distribution recipes that guide the intended semantics of the resulting distribution. Transformations and recipes operate at runtime, adapting to load by scaling out only saturated components. Our Ignis prototype shows substantial speedups, attractive elasticity characteristics, and memory gains over full replication, achieved by small and backward-compatible code changes.
@InProceedings{PLDI19p1010,
author = {Nikos Vasilakis and Ben Karel and Yash Palkhiwala and John Sonchack and André DeHon and Jonathan M. Smith},
title = {Ignis: Scaling Distribution-Oblivious Systems with Light-Touch Distribution},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1010--1026},
doi = {10.1145/3314221.3314586},
year = {2019},
}
Publisher's Version
Verification I
Semantic Program Alignment for Equivalence Checking
Berkeley Churchill, Oded Padon,
Rahul Sharma, and
Alex Aiken
(Stanford University, USA; Microsoft Research, India)
We introduce a robust semantics-driven technique for program
equivalence checking. Given two functions we find a trace
alignment over a set of concrete executions of both programs and
construct a product program particularly amenable to checking
equivalence.
We demonstrate that our algorithm is applicable to challenging
equivalence problems beyond the scope of existing techniques. For
example, we verify the correctness of the hand-optimized vector
implementation of strlen that ships as part of the GNU C
Library, as well as the correctness of vectorization optimizations
for 56 benchmarks derived from the Test Suite for
Vectorizing Compilers.
@InProceedings{PLDI19p1027,
author = {Berkeley Churchill and Oded Padon and Rahul Sharma and Alex Aiken},
title = {Semantic Program Alignment for Equivalence Checking},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1027--1040},
doi = {10.1145/3314221.3314596},
year = {2019},
}
Publisher's Version
Info
Artifacts Functional
Verified Compilation on a Verified Processor
Andreas Lööw, Ramana Kumar, Yong Kiam Tan,
Magnus O. Myreen,
Michael Norrish, Oskar Abrahamsson, and
Anthony Fox
(Chalmers University of Technology, Sweden; DeepMind, UK; Carnegie Mellon University, USA; Data61 at CSIRO, Australia; Australian National University, Australia; ARM, UK)
Developing technology for building verified stacks, i.e., computer systems with comprehensive proofs of correctness, is one way the science of programming languages furthers the computing discipline. While there have been successful projects verifying complex, realistic system components, including compilers (software) and processors (hardware), to date these verification efforts have not been compatible to the point of enabling a single end-to-end correctness theorem about running a verified compiler on a verified processor.
In this paper we show how to extend the trustworthy development methodology of the CakeML project, including its verified compiler, with a connection to verified hardware. Our hardware target is Silver, a verified proof-of-concept processor that we introduce here. The result is an approach to producing verified stacks that scales to proving correctness, at the hardware level, of the execution of realistic software including compilers and proof checkers. Alongside our hardware-level theorems, we demonstrate feasibility by hosting and running our verified artefacts on an FPGA board.
@InProceedings{PLDI19p1041,
author = {Andreas Lööw and Ramana Kumar and Yong Kiam Tan and Magnus O. Myreen and Michael Norrish and Oskar Abrahamsson and Anthony Fox},
title = {Verified Compilation on a Verified Processor},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1041--1053},
doi = {10.1145/3314221.3314622},
year = {2019},
}
Publisher's Version
Artifacts Functional
Argosy: Verifying Layered Storage Systems with Recovery Refinement
Tej Chajed, Joseph Tassarotti, M. Frans Kaashoek, and Nickolai Zeldovich
(Massachusetts Institute of Technology, USA)
Storage systems make persistence guarantees even if the system crashes at any time, which they achieve using recovery procedures that run after a crash. We present Argosy, a framework for machine-checked proofs of storage systems that supports layered recovery implementations with modular proofs. Reasoning about layered recovery procedures is especially challenging because the system can crash in the middle of a more abstract layer’s recovery procedure and must start over with the lowest-level recovery procedure.
This paper introduces recovery refinement, a set of conditions that ensure proper implementation of an interface with a recovery procedure. Argosy includes a proof that recovery refinements compose, using Kleene algebra for concise definitions and metatheory. We implemented Crash Hoare Logic, the program logic used by FSCQ, to prove recovery refinement, and demonstrated the whole system by verifying an example of layered recovery featuring a write-ahead log running on top of a disk replication system. The metatheory of the framework, the soundness of the program logic, and these examples are all verified in the Coq proof assistant.
@InProceedings{PLDI19p1054,
author = {Tej Chajed and Joseph Tassarotti and M. Frans Kaashoek and Nickolai Zeldovich},
title = {Argosy: Verifying Layered Storage Systems with Recovery Refinement},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1054--1068},
doi = {10.1145/3314221.3314585},
year = {2019},
}
Publisher's Version
Artifacts Functional
Systems II
Simple and Precise Static Analysis of Untrusted Linux Kernel Extensions
Elazar Gershuni,
Nadav Amit,
Arie Gurfinkel, Nina Narodytska, Jorge A. Navas,
Noam Rinetzky, Leonid Ryzhyk, and Mooly Sagiv
(VMware, USA; Tel Aviv University, Israel; University of Waterloo, Canada; SRI International, USA)
Extended Berkeley Packet Filter (eBPF) is a Linux subsystem that allows
safely executing untrusted user-defined extensions inside the kernel. It
relies on static analysis to protect the kernel against buggy and malicious
extensions. As the eBPF ecosystem evolves to support more complex and
diverse extensions, the limitations of its current verifier, including high
rate of false positives, poor scalability, and lack of support for loops,
have become a major barrier for developers.
We design a static analyzer for eBPF within the framework of abstract
interpretation. Our choice of abstraction is based on common patterns found
in many eBPF programs. We observed that eBPF programs manipulate memory in a
rather disciplined way which permits analyzing them successfully with a
scalable mixture of very-precise abstraction of certain bounded regions with
coarser abstractions of other parts of the memory.
We use the Zone domain, a simple domain that tracks differences between
pairs of registers and offsets, to achieve precise and scalable analysis. We
demonstrate that this abstraction is as precise in practice as more costly
abstract domains like Octagon and Polyhedra.
Furthermore, our evaluation, based on hundreds of real-world eBPF programs,
shows that the new tool generates no more false alarms than the existing
Linux verifier, while it supports a wider class of programs (including
programs with loops) and has better asymptotic complexity.
@InProceedings{PLDI19p1069,
author = {Elazar Gershuni and Nadav Amit and Arie Gurfinkel and Nina Narodytska and Jorge A. Navas and Noam Rinetzky and Leonid Ryzhyk and Mooly Sagiv},
title = {Simple and Precise Static Analysis of Untrusted Linux Kernel Extensions},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1069--1084},
doi = {10.1145/3314221.3314590},
year = {2019},
}
Publisher's Version
Artifacts Functional
Transactional Concurrency Control for Intermittent, Energy-Harvesting Computing Systems
Emily Ruppel and
Brandon Lucia
(Carnegie Mellon University, USA)
Batteryless energy-harvesting devices are computing platforms that operate in
environments where batteries are not viable for energy storage.
Energy-harvesting devices operate intermittently, only as energy is available.
Prior work developed software execution models robust to intermittent power
failures but no existing intermittent execution model allows interrupts to
update global persistent state without allowing incorrect behavior or requiring
complex programming.
We present Coati, a system that supports event-driven concurrency via interrupts
in an intermittent software execution model. Coati exposes a task-based
interface for synchronous computations and an event interface for asynchronous
interrupts. Coati supports synchronizing tasks and events using transactions,
which allow for multi-task atomic regions that extend across multiple power
failures. This work explores two different models for serializing events and
tasks that both safely provide intuitive semantics for event-driven intermittent
programs.
We implement a prototype of Coati as C language extensions and a runtime
library. Using energy-harvesting hardware, we evaluate Coati on benchmarks
adapted from prior work. We show that Coati prevents failures when interrupts
are introduced, while the
baseline fails in just seconds. Moreover, Coati operates with a reasonable run
time overhead that is often comparable to an idealized baseline.
@InProceedings{PLDI19p1085,
author = {Emily Ruppel and Brandon Lucia},
title = {Transactional Concurrency Control for Intermittent, Energy-Harvesting Computing Systems},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1085--1100},
doi = {10.1145/3314221.3314583},
year = {2019},
}
Publisher's Version
Supporting Peripherals in Intermittent Systems with Just-In-Time Checkpoints
Kiwan Maeng and
Brandon Lucia
(Carnegie Mellon University, USA)
Batteryless energy-harvesting devices have the potential to be the foundation of applications for which batteries are infeasible. Just-In-Time checkpointing supports intermittent execution on energy-harvesting devices by checkpointing processor state right before a power failure. While effective for software execution, Just-In-Time checkpointing remains vulnerable to unrecoverable failures involving peripherals(e.g., sensors and accelerators) because checkpointing during a peripheral operation may lead to inconsistency between peripheral and program state. Additionally, a peripheral operation that uses more energy than a device can buffer never completes, causing non-termination.
This paper presents Samoyed, a Just-In-Time checkpointing system that safely supports peripherals. Samoyed correctly runs user-annotated peripheral functions by selectively disabling checkpoints and undo-logging. Samoyed guarantees progress by energy profiling, dynamic peripheral workload scaling, and a user-provided software fallback routine. Our evaluation shows that Samoyed correctly executes peripheral operations that fail with existing systems, achieving up to 122.9x speedup by using accelerators. Samoyed preserves the performance benefit of Just-In-Time checkpointing, showing 4.11x mean speedup compared to a recent possible alternative. Moreover, Samoyed’s unique ability to profile energy and to dynamically scale large peripheral operations simplifies programming.
@InProceedings{PLDI19p1101,
author = {Kiwan Maeng and Brandon Lucia},
title = {Supporting Peripherals in Intermittent Systems with Just-In-Time Checkpoints},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1101--1116},
doi = {10.1145/3314221.3314613},
year = {2019},
}
Publisher's Version
Verification II
Verification of Programs under the Release-Acquire Semantics
Parosh Aziz Abdulla, Jatin Arora,
Mohamed Faouzi Atig, and
Shankaranarayanan Krishna
(Uppsala University, Sweden; IIT Bombay, India)
We address the verification of concurrent programs running under the release-acquire (RA) semantics.
We show that the reachability problem is undecidable even in the case where the input program is finite-state.
Given this undecidability, we follow the spirit of the work on context-bounded analysis for detecting bugs in
programs under the classical SC model, and propose an under-approximate reachability analysis for the case of RA. To this end, we propose a novel notion, called view-switching, and provide a code-to-code translation from an input program under RA to a program under SC. This leads to a reduction, in polynomial time, of the bounded view-switching reachability problem under RA to the bounded context-switching problem under SC. We have implemented a prototype tool VBMC and tested it on a set of benchmarks, demonstrating that many bugs in programs can be found using a small number of view switches.
@InProceedings{PLDI19p1117,
author = {Parosh Aziz Abdulla and Jatin Arora and Mohamed Faouzi Atig and Shankaranarayanan Krishna},
title = {Verification of Programs under the Release-Acquire Semantics},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1117--1132},
doi = {10.1145/3314221.3314649},
year = {2019},
}
Publisher's Version
Artifacts Functional
A Complete Formal Semantics of x86-64 User-Level Instruction Set Architecture
Sandeep Dasgupta, Daejun Park, Theodoros Kasampalis,
Vikram S. Adve, and
Grigore Roşu
(University of Illinois at Urbana-Champaign, USA; Runtime Verification, USA)
We present the most complete and thoroughly tested formal semantics of x86-64 to date. Our semantics faithfully formalizes all the non-deprecated, sequential user-level instructions of the x86-64 Haswell instruction set architecture. This totals 3155 instruction variants, corresponding to 774 mnemonics. The semantics is fully executable and has been tested against more than 7,000 instruction-level test cases and the GCC torture test suite. This extensive testing paid off, revealing bugs in both the x86-64 reference manual and other existing semantics. We also illustrate potential applications of our semantics in different formal analyses, and discuss how it can be useful for processor verification.
@InProceedings{PLDI19p1133,
author = {Sandeep Dasgupta and Daejun Park and Theodoros Kasampalis and Vikram S. Adve and Grigore Roşu},
title = {A Complete Formal Semantics of x86-64 User-Level Instruction Set Architecture},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1133--1148},
doi = {10.1145/3314221.3314601},
year = {2019},
}
Publisher's Version
Artifacts Functional
An Applied Quantum Hoare Logic
Li Zhou,
Nengkun Yu, and
Mingsheng Ying
(Tsinghua University, China; University of Technology Sydney, Australia; Institute of Software at Chinese Academy of Sciences, China)
We derive a variant of quantum Hoare logic (QHL), called applied quantum Hoare logic (aQHL for short), by: 1. restricting QHL to a special class of preconditions and postconditions, namely projections, which can significantly simplify verification of quantum programs and are much more convenient when used in debugging and testing; and 2. adding several rules for reasoning about robustness of quantum programs, i.e. error bounds of outputs. The effectiveness of aQHL is shown by its applications to verify two sophisticated quantum algorithms: HHL (Harrow-Hassidim-Lloyd) for solving systems of linear equations and qPCA (quantum Principal Component Analysis).
@InProceedings{PLDI19p1149,
author = {Li Zhou and Nengkun Yu and Mingsheng Ying},
title = {An Applied Quantum Hoare Logic},
booktitle = {Proc.\ PLDI},
publisher = {ACM},
pages = {1149--1162},
doi = {10.1145/3314221.3314584},
year = {2019},
}
Publisher's Version
proc time: 22.49