MAPL 2020 – Proceedings

Welcome from the Chairs
Welcome to the fourth ACM SIGPLAN workshop on Machine Learning and Programming Languages (MAPL), co-located with PLDI 2020. The focus of MAPL is to advance interdisciplinary research across the fields of machine learning (ML) and programming languages (PL). The workshop was born of the belief that although significant advances have been made in both ML and PL, there are many open research problems that are likely to be best solved using a combination of the two. These include the development of machine learning to support novel programming systems that are better able to capture programmer’s intentions, invent new algorithms, and potentially help with the autonomous generation and optimization of software programs. It also includes the use of programming systems technologies to improve machine learning infrastructure and to support learning with stronger guarantees.

Papers

Generating Correctness Proofs with Neural Networks
Alex Sanchez-Stern, Yousef Alhessi, Lawrence Saul, and Sorin Lerner

(University of California at San Diego, USA)
Foundational verification allows programmers to build software which has been empirically shown to have high levels of assurance in a variety of important domains. However, the cost of producing foundationally verified software remains prohibitively high for most projects, as it requires significant manual effort by highly trained experts. In this paper we present Proverbot9001, a proof search system using machine learning techniques to produce proofs of software correctness in interactive theorem provers. We demonstrate Proverbot9001 on the proof obligations from a large practical proof project, the CompCert verified C compiler, and show that it can effectively automate what were previously manual proofs, automatically producing proofs for 28% of theorem statements in our test dataset, when combined with solver-based tooling. Without any additional solvers, we exhibit a proof completion rate that is a 4X improvement over prior state-of-the-art machine learning models for generating proofs in Coq.

Publisher's Version

Semi-static Type, Shape, and Symbolic Shape Inference for Dynamic Computation Graphs
Momoko Hattori, Shimpei Sawada, Shinichiro Hamaji, Masahiro Sakai, and Shunsuke Shimizu
(University of Tokyo, Japan; Preferred Networks, Japan)
The growing interest in deep learning has created a demand to compile computation graphs to accelerate execution and to deploy applications on various devices. Modern deep learning frameworks construct computation graphs dynamically. This gives rise to the problem of inferring types of dynamic computation graphs. Two approaches are known. One of them is a dynamic approach that constructs the computation graph from the execution trace of an actual input. While this approach is straightforward, the results of the shape inference will consist of only concrete values and is often difficult for users to interpret. The other one performs static analysis over the source program. This method can produce symbolic shape inference results but suffers from the dynamic nature of the host programming language Python.
In this paper, we propose a novel approach for type, shape, and symbolic shape inference of dynamic computation graphs as a mixture of the above two methods. We present results of applying our prototype inference engine for networks written with PyTorch and demonstrate its effectiveness for nontrivial networks.

Publisher's Version

On the Challenges in Programming Mixed-Precision Deep Neural Networks
Ruizhe Zhao, Wayne Luk, Chao Xiong, Xinyu Niu, and Kuen Hung Tsoi
(Imperial College London, UK; Corerain Technologies, China)
Deep Neural Networks (DNNs) are resilient to reduced data precision, which motivates exploiting low-precision data formats for more efficient computation, especially on custom hardware accelerators. Multiple low-precision types can be mixed to fit the dynamic range of different DNN layers. However, these formats are not often supported on popular microprocessors and Deep Learning (DL) frameworks, hence we have to manually implement and optimize such novel data types and integrate them with multiple DL framework components, which is tedious and error-prone. This paper first reviews three major challenges in programming mixed-precision DNNs, including generating high-performance arithmetic and typecast functions, reducing the recompilation time and bloated binary size caused by excessive template specialization, and optimizing mixed-precision DNN computational graphs. We present our approach, Lowgen, a framework that addresses these challenges. For each challenge, we present our solution implemented and tested on our in-house, TensorFlow-like DL framework. Empirical evaluation shows that Lowgen can automatically generate efficient data type implementations that enable significant speed-up, which greatly lowers the development effort and enhances research productivity on mixed-precision DNN.

Publisher's Version

Learning Quantitative Representation Synthesis
Mayur Patil, Farzin Houshmand

, and Mohsen Lesani

(University of California at Riverside, USA)
Software systems often use specialized combinations of data structures to store and retrieve data. Designing and maintaining custom data structures particularly concurrent ones is time-consuming and error-prone.We let the user declare the required data as a high-level specification of a relation and method interface, and automatically synthesize correct and efficient concurrent data representations. We present provably sound syntactic derivations to synthesize structures that efficiently support the interface.We then synthesize synchronization to support concurrent execution on the structures. Multiple candidate representations may satisfy the same specification and we aim at quantitative selection of the most efficient candidate. Previous works have either used dynamic auto-tuners to execute and measure the performance of the candidates or used static cost functions to estimate their performance. However, repeating the execution for many candidates is time-consuming and a single performance model cannot be an effective predictor of all workloads across all platforms. We present a novel approach to quantitative synthesis that learns the performance model. We developed a synthesis tool called Leqsy that trains an artificial neural network to statically predict the performance of candidate representations. Experimental evaluations demonstrate that Leqsy can synthesize near-optimum representations.

Publisher's Version

Learned Garbage Collection
Lujing Cen, Ryan Marcus, Hongzi Mao, Justin Gottschlich

, Mohammad Alizadeh, and Tim Kraska
(Massachusetts Institute of Technology, USA; Intel Labs, USA)
Several programming languages use garbage collectors (GCs) to automatically manage memory for the programmer. Such collectors must decide when to look for unreachable objects to free, which can have a large performance impact on some applications. In this preliminary work, we propose a design for a learned garbage collector that autonomously learns over time when to perform collections. By using reinforcement learning, our design can incorporate user-defined reward functions, allowing an autonomous garbage collector to learn to optimize the exact metric the user desires (e.g., request latency or queries per second). We conduct an initial experimental study on a prototype, demonstrating that an approach based on tabular Q learning may be promising.

Publisher's Version

MAPL 2020 – Proceedings

4th ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL 2020)

Frontmatter

Papers