SPLASH Companion 2021 – Companion Proceedings

Message from the Chairs
Welcome to the 2021 edition of the ACM SIGPLAN conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2021). Last year, I wrote to you about the very first virtual SPLASH conference. 2021 edition is also another first! This is the first time that the SPLASH conference has been organized as a hybrid conference and we are excited to welcome some of you in person in Chicago and others virtually. In fact, SPLASH will be the first major hybrid conference in programming languages. As of this writing, over 200 of attendees are registered to be in Chicago and over 400 are signed up to attend virtually. This 2021 edition will begin at 7:40am Central Standard Time on Sunday, October 17 and will finish at 23:30 CST on October 22, 2021. SPLASH is a premier conference on the applications of programming language. SPLASH uniquely sits at the intersection of programming languages and software engineering research. SPLASH and its predecessor conference OOPSLA have served as an effective bridge between these two intellectually rich communities since 1986.

Keynotes

Integrated Scientific Modeling and Lab Automation (Keynote)
Luca Cardelli
(Microsoft Research, UK; University of Oxford, UK)
The cycle of observation, hypothesis formulation, experimentation, and falsification that has driven scientific and technical progress is lately becoming automated in all its separate components. However, integration between these automated components is lacking. Theories are not placed in the same formal context as the (coded) protocols that are supposed to test them: neither description knows about the other, although they both aim to describe the same process.
We develop integrated descriptions from which we can extract both the model of a phenomenon (for possibly automated mathematical analysis), and the steps carried out to test it (for automated execution by lab equipment). This is essential if we want to carry out automated model synthesis, falsification, and inference, by taking into account uncertainties in both the model structure and in the equipment tolerances that may jointly affect the results of experiments.

Publisher's Version

Designing Safe Programmed Molecular Systems (Keynote)
Robyn Lutz
(Iowa State University, USA)
Molecular programming uses the computational power of DNA and other biomolecules to create nanoscale systems. Many of these will be safety-critical, such as bio-compatible diagnostic sensors and targeted drug-delivery devices. Design of these programmed molecular systems needs to assure safe outcomes from very many, very small, fault-prone components operating simultaneously in a dynamic environment. Some of this can be achieved by adapting existing software engineering methods, but molecular programming also presents new challenges that will require new methods. This talk discusses an example of such a challenge, describes how we design safety-critical building blocks, and presents work in progress to ascertain how robust a molecular program is. Similar approaches also will help design safe non-molecular systems with highly distributed, autonomous, fault-prone components operating in dynamic environments.

Publisher's Version

Student Research Competition

Graduate Papers

Can Reactive Synthesis and Syntax-Guided Synthesis Be Friends?
Wonhyuk Choi

(Columbia University, USA)
While reactive synthesis and syntax-guided synthesis (SyGuS) have seen enormous progress in recent years, combining the two approaches has remained a challenge. In this work, we present the synthesis of reactive programs from Temporal Stream Logic modulo theories (TSL-MT), a framework that unites the two approaches to synthesize a single program. In our approach, reactive synthesis and SyGuS collaborate in the synthesis process, and generate executable code that implements both reactive and data-level properties. We also present a tool, temos, that combines state-of-the-art methods in reactive synthesis and SyGuS to synthesize programs from TSL-MT specifications. We demonstrate the applicability of our approach over a set of benchmarks, and present a deep case study on synthesizing a music keyboard synthesizer.

Publisher's Version

Run-Time Data Analysis in Dynamic Runtimes
Lukas Makor

(JKU Linz, Austria)
Databases are typically faster in processing huge amounts of data than applications with hand-coded data access. Even though modern dynamic runtimes optimize applications intensively, they cannot perform certain optimizations that are traditionally used by database systems as they lack the required information. Thus, we propose to extend the capabilities of dynamic runtimes to allow them to collect fine-grained information of the processed data at run time and use it to perform database-like optimizations. By doing so, we want to enable dynamic runtimes to significantly boost the performance of data-processing workloads. Ideally, applications should be as fast as databases in data-processing workloads. To show the feasibility of our approach, we are implementing it in a polyglot dynamic runtime.

Publisher's Version

Run-Time Data Analysis to Drive Compiler Optimizations
Sebastian Kloibhofer

(JKU Linz, Austria)
Dynamic compilers collect a variety of information to optimize programs and achieve peak performance. Nevertheless, particularly in data-heavy applications, analysis of the processed data - data structures, metrics, relations - could enable additional optimizations in terms of access patterns and data locality. Query planning in database systems is one source of inspiration, but due to the required overhead to collect such information, it is infeasible in dynamic compilers. With this project, we propose integrating data analysis into a dynamic runtime to speed up big data applications. The goal is to use the detailed run-time information for speculative compiler optimizations based on the shape and complexion of the data to improve performance.

Publisher's Version

A Study of Call Graph Effectiveness for Framework-Based Web Applications
Madhurima Chakraborty

(University of California at Riverside, USA)
Modern web applications are continuously evolving and becoming increasingly reliant on web frameworks to support their ever-changing needs. This necessitates the realization of efficient static analysis methodologies for the purpose of bug finding and security auditing of such applications. Moreover, the majority of these frameworks are written in JavaScript, which is difficult to analyze due to its extremely dynamic nature. The primary goal of this work is to study the effectiveness of the present state-of-the-art call graph approaches for JavaScript and propose techniques to enhance them such that they discover more of the crucial functions and call edges in modern, framework-based JavaScript applications. Ideally, these new techniques must enhance function and call edge discovery without much impact on precision and scalability.

Publisher's Version

Avoiding Monomorphization Bottlenecks with Phase-Based Splitting
Sophie Kaleba
(University of Kent, UK)
State-of-the-art managed runtimes apply aggressive optimizations speculating that programs have low variability. However, literature shows that program behavior may evolve at run time and experience different execution phases. This variable behavior may hide optimization opportunities. Taking phases into account may thus improve performance when applied to phase-sensitive optimizations such as lookup caches, which can be monomorphization bottlenecks when they contain phase-specific entries.
In this project, we introduce phase-based splitting, an optimization that utilizes phases to guide monomorphization based on splitting. Preliminary results on a first prototype and micro-benchmarks show promising speedups ranging from 10 to 20% on average, peaking at 47.6% per phase. In our next steps, we will evaluate our approach on a richer set of benchmarks and real-world applications, and define heuristics to better guide phase-based splitting.

Publisher's Version

Undergraduate Papers

Programming-by-Example by Programming-by-Example: Synthesis of Looping Programs
Shmuel Berman
(Columbia University, USA)
Program synthesis has seen many new applications in recent years, in large part thanks to the introduction of SyGuS. However, no existing SyGuS solvers have support for synthesizing recursive functions. We introduce an multi-phase algorithm for the synthesis of recursive "looplike" programs in SyGuS for programming-by-example. We solve constraints individually and treat them as "unrolled" examples of how a recursive program would behave, and solve for the generalized recursive solution. Our approach is modular and supports any SyGuS Solver.

Publisher's Version

Edgeworth: Authoring Diagrammatic Math Problems using Program Mutation
Hwei-Shin Harriman

(Olin College of Engineering, USA; Carnegie Mellon University, USA)
Building connections between mathematical expressions and their visual representations increases conceptual understanding and flexibility. However, students rarely practice visualizing abstract mathematical relationships because developing diagrammatic problems is challenging, especially at scale. To address this issue, we introduce Edgeworth, a system that automatically generates correct and incorrect diagrams for a given question prompt. It does so by mutating declarative mathematical statements with visual semantics. We evaluated the system by recreating diagrammatic problems in a widely used geometry textbook.

Publisher's Version

Towards Decidable and Expressive DOT
Sophia Roshal
(Cornell University, USA; Carnegie Mellon University, USA)
While decidability of type systems may be a reasonable sacrifice to make in exchange for greatly increased expressivity in some cases, having a type checker that is guaranteed to terminate while maintaining a high degree of expressivity can be beneficial. Over the last few years there have been several papers and theses written describing decidable variants of DOT, however, these variants all present similar challenges in limitations on expressivity. We aim to increase the expressivity of these variants without sacrificing decidability.

Publisher's Version

Implementation of an End-to-End Gradual Verification System
Hemant Gouni and Conrad Zimmerman
(University of Minnesota at Twin Cities, USA; Brown University, USA)
Static verification is used to ensure the correctness of programs. While useful in critical applications, the high overhead associated with writing specifications limits its general applicability. Similarly, the run-time costs introduced by dynamic verification limit its practicality. Gradual verification validates partially specified code statically where possible and dynamically where necessary. As a result, software developers gain granular control over the trade-offs between static and dynamic verification. This paper contains an end-to-end presentation of gradual verification in action, with a focus on applying it to C₀ (a safe subset of C) and implementing the required dynamic verification.

Publisher's Version

Source Code Authorship Attribution using File Embeddings
Alina Bogdanova
(Innopolis University, Russia)
The problem of source code authorship attribution is crucial for a few reasons. Security and legal issues are the most popular ones. However, this domain could also help to understand the nature of the personal code style. This type of information could be used, for instance, by IDEs to improve the developer's experience of writing the code. The goal of this study is to construct an interpretable model for source code embeddings generation. Such embeddings should represent the correspondence between the source code and its author.

Publisher's Version

SPLASH Companion 2021 – Companion Proceedings

Frontmatter

Keynotes

Student Research Competition

Graduate Papers

Undergraduate Papers