2023 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2023)
Powered by
Conference Publishing Consulting

2023 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2023), October 22–27, 2023, Cascais, Portugal

SPLASH 2023 – Preliminary Table of Contents

Contents - Abstracts - Authors


Title Page

Message from the Chairs

Doctoral Symposium

Remote Just-in-Time Compilation for Dynamic Languages
Andrej Pečimúth ORCID logo
(Oracle Labs, Czechia)
Cloud platforms allow applications to meet fluctuating levels of demand through automatic horizontal scaling. These deployment models are characterized by short-lived applications running in resource-constrained environments. This amplifies the overhead of dynamic languages with just-in-time (JIT) compilation. Dynamic-language runtimes suffer from a warmup phase and resource-usage peaks caused by JIT compilation. Offloading compilation jobs to a dedicated server is a possible mitigation for these problems. We propose leveraging remote JIT compilation as a means to enable coordination between the independent instances. By sharing compilation results, aggregating profiles, and adapting the compiler and compilation policy, we strive to improve the peak performance and further reduce the warmup time of these applications. Additionally, an implementation on top of the Truffle framework enables us to bring these benefits to many popular languages.

Article Search
Scaling up Program Synthesis to Efficient Algorithms
Ruyi Ji ORCID logo
(Peking University, China)
The automatic synthesis of algorithms can effectively reduce the difficulty of algorithm design. However, multiple challenges exist for synthesizing algorithms. Among them, scalability of the synthesizer is the most prominent one because of the significant complexity of efficient algorithms. To address this scalability challenge, we propose several approaches from two aspects, improving the efficiency of existing program synthesizers and reducing the difficulty of algorithm synthesis by properly using algorithmic knowledge, respectively.

Article Search
Large Language Models for Automated Program Repair
Francisco Ribeiro ORCID logo
(INESC Tec, Portugal; Universidade do Minho, Portugal)
This paper introduces two methods for automated program repair (APR) utilizing pre-trained language models. The first method demonstrates program repair as a code completion task and is validated on a dataset of Java programs. The second method, Mentat, leverages OCaml’s parser and type system as fault localization techniques to generate prompts for GPT-3, producing candidate patches. Evaluation results show promising repair rates, with 27% and 39.2% effectiveness, respectively. For OCaml, a comparative study employing an automated validation strategy is presented in which the technique outperforms other tools. Language models are effective at APR, enhancing bug fixing and freeing developers to focus on other critical aspects of software engineering.

Article Search
Transforming Ideas into Code: Visual Sketching for ML Development
Luís Gomes ORCID logo
(Carnegie Mellon University, USA; University of Porto, Portugal)
We propose a novel code assistant and generation paradigm aimed at closing the gap between visual sketching and code creation for Machine Learning (ML) development. This approach empowers developers and ML practitioners to translate hand-drawn sketches into functional code with enhanced accuracy and usability. Developers are recruited to assess the tool's performance. This research contributes to the future of low-code approaches, facilitating ML application development, and promoting an intuitive and accessible programming environment.

Article Search
Semantic Versioning for Python Programs
Luís Carvalho ORCID logo
(NOVA University Lisbon, Portugal; NOVA-LINCS, Portugal)
We propose a language-based approach to software versioning. Unlike the traditional approach of mainstream version control systems, where each evolution step is represented by a textual diff, we treat versions as programming elements. Each evolution step, merge operation, and version relationship, is represented explicitly in the program. This provides compile time guarantees for safety code reuse from previous versions, as well as forward and backwards compatibility between versions, allowing clients to use newly introduced code without needing to refactor their program. By lifting the versioning to the language level, we pave the way for tools that interact with software repositories to have more insight regarding the evolution of the software semantics.

Article Search
Reusing Single-Language Analyses for Static Analysis of Multi-language Programs
Tobias Roth ORCID logo
(TU Darmstadt, Germany)
State-of-the-art software is crafted in multiple programming languages. Such multi-language software challenges static analyses: Since many static analyses are focused on analyzing single-language programs, they are inherently unsound or imprecise in analyzing cross-language interaction. Existing approaches that perform multi-language analysis are not analysis- or language independent and thus lack in extensibility for new analyses and languages. We will develop an extensible, language-, framework-, and analysis-independent architecture to reuse existing static analyses for analyzing multi-language software. Our hypotheses are that, our architecture allows reusing existing single-language analyses and improves precision and soundness compared to the state of the art. We will evaluate our architecture with a points-to and data flow analysis for Java, JavaScript, and C/C++ code and compare it against the state of the art.

Article Search
A Pragmatic Approach to Syntax Repair
Breandan Considine
(McGill University, Canada)
Programming languages share a social and formal heritage. These families were historically divided, but share deep roots, and we argue their destined matrimony heralds important consequences for language design and generative language modeling. In our work, we develop a sociotechnical framework for understanding the dynamics of programming and argue it captures many of the social and formal properties of language acquisition and evolution.

Article Search


Partial Gradual Dependent Type Theory
Zhan Shi ORCID logo
(Kyoto University, Japan)
Gradual typing supports imprecise types in the type system, allowing incremental migration from untyped code to typed in the same language. Through the gradual typing approach, our ongoing work proposes a new theory based on the Martin-Löf type theory called Partial Gradual Dependent Type Theory. PGTT supports a gradual step from non-dependently typed code to dependently typed, enhancing the functionality of code reasoning while preserving the usability of widely used non-dependent type systems. PGTT restricts entirely unknown types and only permits dynamic terms on the type indices, making it naturally maintain better properties than existing gradual dependent type theories. It allows us to simplify runtime type checks into type parameter checks and elaborate the surface language into a static, dependently typed language, thereby reducing the performance overhead associated with gradual typing.

Article Search
Synthesizing Recursive Programs through Dataflow Constraints
Marta Davila Mateu ORCID logo
(University of Southern California, USA)
Despite great progress in algorithms for synthesizing recursive programs, state of the art approaches continue to have several limitations. Prinicipal among these is their inability to “invent” auxiliary functions. This makes them sensitive to the available set of primitive components. In this paper, we present an alternative approach to recover a recursive program from a nonrecursive implementation. We develop a system of constraints that characterizes patterns of data flow in the unrollings of a recursive program. Combined with a generator of seed nonrecursive circuits and an appropriate constraint solver, these constraints naturally form the basis of a general algorithm to synthesize recursive circuits.

Article Search
Modular Educational Languages
Jesse Hoobergs ORCID logo
(KU Leuven, Belgium)
Teaching novices to program is an unsolved problem. One part of the problem lies in the fact that industrial languages are used for teaching novices, while they were not made for this purpose. I am designing a Programming Education Runtime System to easily create modular languages for education. This system utilizes object algebras, trampolining and algebraic effects and handlers. It has been used to implement an interpreter for the Hedy programming language. This implementation has several important advantages over the existing one, such as better debugging support and better integration with the Hedy platform.

Article Search
Historiographer: Strongly-Consistent Distributed Reactive Programming with Minimal Locking
Julia Freeman ORCID logo and Timothy Zhou ORCID logo
(Francis Marion University, USA; University of Illinois at Urbana-Champaign, USA)
We propose a novel distributed reactive propagation algorithm that provides strong consistency guarantees with minimal locking. This is achieved by decoupling reactive propagation from transaction execution, utilizing reactive histories to avoid propagating in-progress or inconsistent results. We formally define soundness properties in terms of histories, and sketch how our algorithm upholds them. We implement Historiographer, a runtime incorporating our methods, and conduct a preliminary evaluation demonstrating performance improvements of up to 38

Article Search
Clearing the Trail: Motivations for Maintenance Work in Open Source
Katrina Wilson ORCID logo
(Bucknell University, USA)
Introducing new maintainers to established projects is critical to the long-term sustainability of open-source projects. While there is significant research about why people create new open-source projects, we have less of an understanding of the motivations behind joining and maintaining already established projects. Open-source grey literature contrasts the thrill of creating a new project with burnout from triaging reported issues. Previous research on volunteering motivations emphasizes that individuals are motivated by a unique set of factors to volunteer in a certain area. This leads us to suspect that the motivations behind open-source contributions also depend on the nature of the work. We aim to determine correlations between types of open-source contributions and their specific motivators through surveys of open-source contributors. This can improve the effectiveness of community and research efforts to attract, support, and sustain developer engagement in open source.

Article Search
Historiographer: Strongly-Consistent Distributed Reactive Programming with Minimal Locking
Julia Freeman ORCID logo and Timothy Zhou ORCID logo
(Francis Marion University, USA; University of Illinois at Urbana-Champaign, USA)
We propose a novel distributed reactive propagation algorithm that provides strong consistency guarantees with minimal locking. This is achieved by decoupling reactive propagation from transaction execution, utilizing reactive histories to avoid propagating in-progress or inconsistent results. We formally define soundness properties in terms of histories, and sketch how our algorithm upholds them. We implement Historiographer, a runtime incorporating our methods, and conduct a preliminary evaluation demonstrating performance improvements of up to 38

Article Search
Design and Implementation of Facets of Dynamic Policies
Antonio Zegarelli ORCID logo
(IMDEA Software Institute, Spain)
In Information Flow Control (IFC) expressiveness, i.e. being able to model multiple scenarios, is a crucial aspect, especially in the context of dynamically evolving security requirements. Those dynamic scenarios introduce complexities due to varying security interpretations. Broberg et al. identified "facets of dynamic policies", patterns of information flow that may be considered secure or insecure depending on the context. Typically, most existing definitions of security conditions do not delve into the analysis of these facets, making it difficult to model a wide range of scenarios. Therefore, our research aims to establish a robust framework that facilitates the design and implementation of the different interpretations of facets within a single, modular enforcement mechanism. We propose, in Haskell, an abstract definition of a monadic IFC mechanism, that, based on the instantiation, can account for the different interpretations of facets. Our ongoing work involves the formalization of the respective security conditions, the study of their combination and a LiquidHaskell proof mechanization. This with the aspiration to improve the reasoning about dynamic policies and their associated facets, especially in the case of future extensions.

Article Search
Towards the Formal Verification of Wigderson’s Algorithm
Siraphob PhipathananunthORCID logo
(Vanderbilt University, USA)
We present progress towards the formal verification of Wigderson’s graph coloring algorithm in Coq. We have created a library of formalized graph theory that aims to bridge the literature gap between introductory material on Coq and large-scale formal developments, while providing a motivating case study. Our library contains over 180 proven theorems. The development is available at https://github.com/siraben/coq-wigderson.

Article Search Info
An Optimal Structure-Aware Code Difference Framework with MaxSAT-Solver
Haolin Ye ORCID logo
(McGill University, Canada)
The Abstract Syntax Tree (AST) serves as a pivotal representation of program codes, offering a structured and hierarchical view of the program’s syntax. When developers modify code, the underlying AST also evolves to reflect these changes. Tree-diff algorithms, such as truediff and Gumtreediff, are developed to compare different versions of the AST and identify the modifications made between them. However, these heuristics are based on certain vertex matching methods that do not ensure optimality and preciseness. In this study, I propose a novel tree-diff approach that utilizes a MaxSAT (Maximum satisfiability) solver to address this issue. By encoding potential vertex matches and edges with associated costs as a tree-diff SAT problem, the MaxSAT solver effectively minimizes the edit distance and reveals the optimal vertex matching plan.

Article Search
Rose: Extensible Autodiff on the Web
Raven Rothkopf ORCID logo
(Barnard College, USA)
Automatic differentiation (AD) has become the backbone for a new wave of optimization-driven domains such as computer graphics and machine learning over the past decade. However, existing AD systems face limitations, either lacking support for in-browser development or failing to harness more recent, compiler-based approaches to achieve both expressiveness and size-preserving differentiation. This work introduces Rose, a portable, extensible AD library that runs on the web. Through Rose, we aim to increase accessibility to AD and empower end-user programming in optimization-driven domains. We plan to evaluate Rose by replacing the AD engines of real-world, client-side optimization systems and assess the improvements on the computation power and expressiveness of such systems.

Article Search


Involving Users in Design of a Widely Used Language: A Case of ECMAScript (JavaScript) Standardization
Mikhail BarashORCID logo, Yulia Startsev ORCID logo, and Rolf Martin Glomsrud ORCID logo
(University of Bergen, Norway; Ecma TC39, Geneva, Switzerland; Mozilla, Germany)
We present an overview of primary feedback mechanisms used by Ecma International Technical Committee 39 (TC39), the standardizing body of JavaScript programming language.

Article Search
Allocation-Driven Program Slicing
Octave Larose ORCID logo
(University of Kent, UK)
Software is composed of different parts with different goals, each with different needs. Security-wise, this means not all necessarily need the same permissions: it can be beneficial to isolate some code such that it has limited control over the process, following the principle of least privilege. Without any sort of compartmentalization, a vulnerability found in a sensitive part of an application means the entire process can get corrupted, as its entire memory is now open to be read and modified by the attacker.
One notable example of dangerous code is the use of external libraries: using a library implies that it is trusted, as it is run within the program's context and can therefore read and modify any program state. This means malicious or vulnerable code can be introduced to the system this way, which can compromise sensitive information that is present in the process' memory.
This begs the question: if parts of the software cannot necessarily be trusted, what can we do to limit their capabilities over the process?
There exist many in-process software isolation mechanisms. One such example is Google's Sandboxed API, which creates a sandbox for a C/C++ library. It establishes a process boundary between the main process and the untrusted library: through this, it relies on operating system primitives directly to safeguard the memory of the main process, preventing the library from accessing data belonging to a different process than its own. CodeJail also relies on processes, with a focus on isolating dynamically linked libraries. However, this is at the cost of much engineering effort: to isolate an untrusted library, a new, trusted library must be implemented based on it, with wrappers for each of the original library's exported functions. This is a non-trivial amount of effort, while ideally, code isolation would be somewhat automatic.
PtrSplit is less limited in this way, being designed to enable automatic partitioning of arbitrary C/C++ applications based on developer source code annotations. CALI (Compiler Assisted Library Isolation) also focuses on being fully automatic. At build-time, after compiling source code files to LLVM bitcode, static analysis is performed to identify what memory needs to be shared between the main program and the chosen library. The main program is then automatically instrumented based on this information. While valuable approaches, the systems mentioned thus far have the shortcoming of not being able to automatically handle dynamic loading: instead, libraries need to be handled during the linking phase. Programs which load additional code at run time cannot do so safely.
We propose a new approach to automatic in-process software isolation to properly handle dynamic loading. For this, we rely on liballocs, which provides us with run-time type information and a run-time memory model of all allocations in a given process. This allows us to track pointers and their usages, thus the ability to draw conclusions on what memory must be shared between the parent process and the sandboxed one. liballocs grants knowledge over the memory of the entire process: therefore, we can potentially isolate arbitrary code, and not necessarily just libraries. Instrumentation would be performed at run time: when dynamically loading a new library, we would compartmentalize it in another process. Calls to its functions within existing code would be adapted to instead send messages to the sandboxed process, and read/write data to a segment of shared memory; this is akin to several existing software isolation approaches, but being performed ad-hoc instead of during the compilation or linking phase.
Working at the run-time level would also mean we have more information about the context of the current program, which would allow us to be more fine-grained in what becomes compartmentalized. For example, instead of creating a compartment only for the C library, we can e.g. compartmentalize only calls to memcpy that the program invokes during its execution. Generally speaking, a library providing various utility functions would not need to be isolated as one single entity, and could instead have only parts of itself be sandboxed.
liballocs works at the level of Unix processes, which means we work at the binary level instead of the source level, and that we do not rely on e.g. LLVM. We expect the benefits of our approach to prove to be more flexible and portable than existing ones, but coming at the cost of some performance because of the instrumentation taking place during runtime.
To evaluate our approach, we will test it with existing programs which rely on dynamically loaded libraries, and ensure their behavior is not affected by our instrumentation. We will modify libraries to attempt to access private information in the program, to ensure the process boundary renders it impossible. Finally, we will test any finer-grained control granted by our system through isolating select calls to a given library function (e.g. memcpy) in a program and ensure that it relies only on proper compartments, with the rest of the library operating in the main process context.

Article Search
Sui Move: Modern Blockchain Programming with Objects
Adam Welc ORCID logo and Sam Blackshear ORCID logo
(Mysten Labs, USA)
This paper presents Sui Move, a new smart contract language for programming blockchains using objects as an abstraction.

Article Search
JaMaBuild: Mass Building of Java Projects
Matúš SulírORCID logo and Milan Nosáľ ORCID logo
(Technical University of Košice, Slovakia; ValeSoft, Slovakia)
Many large-scale Java empirical studies require not only source code but also resulting binaries such as JAR files. Pre-compiled datasets quickly become obsolete, and the creation of a custom corpus for every study is tedious. We present a prototype of JaMaBuild, a tool and a framework for mass building of Java projects from source. Given a list of projects and optional settings, it downloads the projects, filters them by user-definable criteria, builds them using Maven or Gradle, and collects outputs such as JAR files and build logs. Our tool can also be used for local build breakage studies.

Preprint Info
Extensible Testing for Infrastructure as Code
David SpielmannORCID logo, Daniel SokolowskiORCID logo, and Guido SalvaneschiORCID logo
(University of St. Gallen, Switzerland)
Developers automate deployments with Programming Languages Infrastructure as Code (PL-IaC) by implementing IaC programs in popular languages like TypeScript and Python. Yet, systematic testing---well established for high-velocity software development---is rarely applied to IaC programs because IaC testing techniques are either slow or require extensive development effort. To solve this dilemma, we develop ProTI, a novel IaC unit testing approach, and implement it for Pulumi TypeScript. Our preliminary experiments with simple type-based test case generators and oracles show that ProTI can find bugs reliably in a short time, often without writing any additional testing code. ProTI's extensible plugin architecture allows combining, adopting, and experimenting with new approaches, opening the discussion about novel generators and oracles for efficient IaC testing.

Article Search
A Functional Reactive Programming Language for Wirelessly Connected Shape-Changeable Chiplet-Based Computers
Hidetsugu Irie ORCID logo, Junichiro Kadomoto ORCID logo, Shuichi Sakai ORCID logo, and Yusuke Izawa ORCID logo
(University of Tokyo, Japan; Tokyo Institute of Technology, Japan)
This work presents the concept of MorphLang, a functional reactive programming language tailored for shape-changeable computers, which are built using wirelessly interconnected chiplets. MorphLang simplifies the programming process for these systems by concentrating on the basic behaviors of individual nodes and their asynchronous communication. The language allows for compilation into binary or Arduino formats, and programs can be transmitted to each node either wirelessly or through physical connections.

Article Search
ReactCOP: Modular and Scalable Web Development with Context-Oriented Programming
David H. LorenzORCID logo and Ofir Shmuel ORCID logo
(Open University of Israel, Israel; Technion, Israel)
We present a library named ReactCOP that extends React's capabilities with support for Context-Oriented Programming. The library lets developers manage behavioral variations in React applications through layers, and adapt the application's behavior dynamically based on different contexts.

Article Search Info
Safe Combination of Data-Centric and Operation-Centric Consistency
Mirko Köhler ORCID logo and Guido SalvaneschiORCID logo
(TU Darmstadt, Germany; University of St. Gallen, Switzerland)
Programming distributed systems requires maintaining consistency among data replicas. In recent years, various frameworks have proposed language-level abstractions for this, falling into two fundamental approaches: data-centric and operation-centric solutions. The former allow developers to explicitly assign consistency levels to data, the latter enable attaching consistency constraints to operations. In practice, developers may benefit from both in the same application: data-centric consistency harmonizes well with object-oriented programming, yet one may need the flexibility to access the same data with a different consistency level depending on the operation. Currently, there is no solution that integrates both: it is a conceptual challenge to unify these two models and design a type system capable of ensuring their correct interaction. We present ConOpY, a programming language that integrates both data-centric and operation-centric consistency into the same design. The ConOpY type system guarantees the proper usage of consistency levels, preventing consistency violations resulting from an improper mix of consistency models. ConOpY is implemented as a Java extension based on annotations.

Article Search
Towards Reusable GUI Structures
Knut Anders Stokke ORCID logo, Mikhail BarashORCID logo, and Jaakko JärviORCID logo
(University of Bergen, Norway; University of Turku, Finland)
Graphical user interfaces present data as structures (lists, trees, grids). Convenient features to manipulate these structures are tedious to implement. We are working towards a GUI programming approach, where concise specifications of structures give rise to full-fledged GUIs with a complete set of structure manipulation features.

Article Search
Generating Domain-Specific Programs for Diagram Authoring with Large Language Models
Rijul Jain ORCID logo, Wode Ni ORCID logo, and Joshua SunshineORCID logo
(Williams College, USA; Carnegie Mellon University, USA)
Large language models (LLMs) can generate programs in general-purpose languages from prose descriptions, but are not trained on many domain-specific languages (DSLs). Diagram authoring with Penrose, a diagramming system using three DSLs, exemplifies the utility of DSL program generation with LLMs, which enables diagram creation from prose. We provide methods to conceptualize and evaluate the structures of one-shot LLM prompts to generate error-free DSL programs and implement Penrose diagram creation from prose using LLMs. We will evaluate our LLM prompt structures by testing prompt variations across different diagramming domains and plan to run a user study to assess the ease of LLM-augmented Penrose diagramming over other tools.

Article Search

proc time: 3.91