Hybrid Structured Editing: Structures for Tools, Text for Users
Tom Beckmann, Christoph Thiede, Jens Lincke, and Robert Hirschfeld (Hasso Plattner Institute, Germany; University of Potsdam, Germany)
In programming, better tools often yield better results, and modern programming environments therefore offer mechanisms for extending them with new tools. The closer those tools are to the code, the easier it is for programmers to map the information a tool provides onto the code that information is about. However, existing extension mechanisms do not facilitate close integration of tools with textual source code: to appear at the right positions, tools must track program structures across edits, yet the parsing step that text requires complicates such tracking. We propose hybrid structured editing, an approach that supports tool builders with structural guarantees while offering tool users a familiar and consistent text editing interface. Hybrid structured editing allows tool builders to declare constraints that a program's structure must conform to and ensures their observance. We present an implementation and several case studies of tools based on hybrid structured editing to demonstrate its effectiveness. Hybrid structured editing supports the safe extension of programming environments with tools that work on a structured representation of code and provide a consistent and reliable user experience.
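The tracking problem the abstract alludes to can be illustrated with a small, hypothetical Python sketch (not the authors' system): an annotation anchored to a structural element survives a text edit when it is re-resolved through the parsed structure, whereas a character-offset anchor would point at the wrong position.

```python
import ast

def find_function(source: str, name: str) -> ast.FunctionDef:
    """Locate a function node by structural identity (its name),
    rather than by a character offset that edits would invalidate."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return node
    raise LookupError(name)

source = "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)\n"

# A tool annotates the `fib` function (e.g., with profiling results).
node = find_function(source, "fib")
assert node.lineno == 1

# The user edits the text above the function; a purely offset-based
# anchor would now be stale.
edited = "import functools\n\n" + source

# Re-resolving through the structure keeps the annotation attached.
node = find_function(edited, "fib")
assert node.lineno == 3
```

Re-parsing and re-matching on every edit is, of course, exactly the fragile step that hybrid structured editing aims to replace with structural guarantees.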
Pitfalls in VM Implementation on CHERI: Lessons from Porting CRuby
Hanhaotian Liu, Tetsuro Yamazaki, and Tomoharu Ugawa (University of Tokyo, Japan)
CHERI (Capability Hardware Enhanced RISC Instructions) is a novel hardware
architecture designed to address memory safety issues. By replacing traditional
pointers with hardware capabilities, it enhances the security of modern software
systems. A virtual machine (VM) is one such system that can benefit from
CHERI's protection, as it may contain latent memory vulnerabilities.
However, developing and porting VMs to CHERI is a non-trivial task. Many subtle
pitfalls stem from assumptions about undefined behaviors of the C language that
hold on conventional architectures but conflict with CHERI's stricter memory
safety model, causing unexpected failures.
Although several prior works have discussed porting VMs to CHERI, they focus on
the overall porting process rather than on pitfalls specific to VM implementation.
A guide for programming on CHERI exists, but it addresses general programming
and does not cover VM-specific issues.
We have ported CRuby to CHERI as a case study and surveyed previous works on porting VMs to CHERI.
We categorized and discussed the issues found based on their causes.
In this paper, we illustrate the VM-specific pitfalls for each category.
Most of the pitfalls arise from the undefined behaviors in the C language; in particular, implementation techniques and idioms of VMs often assume behaviors of traditional architectures that are invalid on CHERI.
We also discuss workarounds for them and the impacts of those workarounds.
We verified the validity of the workarounds by applying them to our CRuby port and by surveying the codebases of prior case studies.
This work contributes to the body of knowledge on developing and porting VMs to CHERI and will help guide efforts toward constructing safer VMs.
Efficient Selection of Type Annotations for Performance Improvement in Gradual Typing
Senxi Li, Feng Dai, Tetsuro Yamazaki, and Shigeru Chiba (University of Tokyo, Japan)
Gradual typing has gained popularity as a design choice for
integrating static and dynamic typing within a single language.
Several practical languages have adopted gradual typing to offer
programmers the flexibility to annotate their programs as
needed.
However, a key challenge is unexpected performance degradation in
partially typed programs: execution speed may decrease significantly
when more type annotations are simply added.
Prior studies have investigated strategies for selectively adding
type annotations to improve performance. However, they incur
substantial compilation time, which impedes their practical use.
This paper presents a new technique to select a subset of type
annotations derived by type inference for improving the
execution performance of gradually typed programs.
The advantage of the proposal is a shorter compilation time, achieved
through a lightweight, amortized approach.
It selects type annotations along the data flows, which
is expected to avoid expensive runtime casts caused by a value
repeatedly crossing the boundaries between untyped and typed
code.
We demonstrate the applicability of our proposal and conduct
experiments to validate its effectiveness in improving execution
time on Reticulated Python.
Our implementation supports a Python subset and selects type
annotations derived by an external type inference engine that we
implemented.
Experimental results show that our proposal outperforms a naive
strategy of using all type annotations derived by type inference
across the benchmark programs.
Compared with an existing approach, the proposal achieves comparable
execution speed while maintaining a more stable compilation time for
deriving and selecting type annotations.
Our results empirically indicate that the proposed technique is
practical within Reticulated Python for mitigating the
performance bottleneck of gradually typed programs.
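The boundary-crossing cost the abstract describes can be sketched with a deliberately simplified, hypothetical Python simulation of transient-style runtime casts (all names are illustrative; this is not Reticulated Python's actual machinery):

```python
# Simulate the runtime cast a gradual-typing compiler inserts when a
# value flows from untyped into typed code.

def cast(value, expected_type):
    """A transient-style runtime check at the typed-code boundary."""
    if not isinstance(value, expected_type):
        raise TypeError(f"expected {expected_type.__name__}")
    return value

CASTS = {"count": 0}  # count how often the boundary check runs

def typed_inc(x):          # annotated: (int) -> int
    CASTS["count"] += 1
    x = cast(x, int)       # inserted cast at the boundary into typed code
    return x + 1

def untyped_loop(n):       # unannotated caller
    total = 0
    for _ in range(n):
        total = typed_inc(total)   # value crosses the boundary each call
    return total

untyped_loop(1000)
print(CASTS["count"])  # 1000: one cast per boundary crossing
```

Selecting annotations along the data flow, so that a hot value stays on one side of the typed/untyped boundary instead of crossing it on every iteration, removes such casts from the hot path; this is the intuition behind the selection strategy the abstract describes.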
JoinActors: A Modular Library for Actors with Join Patterns
Ayman Hussein, Philipp Haller, Ioannis Karras, Hernán Melgratti, Alceste Scalas, and Emilio Tuosto (Technical University of Denmark, Denmark; KTH Royal Institute of Technology, Sweden; University of Buenos Aires, Argentina; CONICET, Argentina; Gran Sasso Science Institute, Italy)
*Join patterns* are a high-level programming construct for message-passing
applications. They offer an intuitive and declarative approach for specifying
how concurrent and distributed components coordinate, possibly depending on
complex conditions over combinations of messages. Join patterns have inspired
many implementations — but most of them are not available as libraries: rather,
they are domain-specific languages that can be hard to integrate into
pre-existing ecosystems. Moreover, all implementations ship with a predefined
matching algorithm, which may not be optimal depending on the application
requirements. These limitations are addressed by `JoinActors`, a recently
published library that integrates join patterns into the off-the-shelf Scala 3
programming language and is designed to be modular with respect to the matching
algorithm in use.
In this work we address the problem of designing, developing, and
evaluating a modular join pattern matching toolkit that (1) can be used as a
regular library with a developer-friendly syntax within a pre-existing
programming language, and (2) has an extensible design that supports the use and
comparison of different matching algorithms.
We analyse how `JoinActors` achieves goals (1) and (2) above. The
paper that introduced `JoinActors` only briefly outlined its design and
implementation (as its main goal was formalising its novel *fair matching
semantics*). In this work we present and discuss in detail an improved version
of `JoinActors`, focusing on its use of metaprogramming (which enables an
intuitive API resembling standard pattern matching) and on its modular design.
We show how this enables the integration of multiple matching algorithms with
different optimisations and we evaluate their performance via benchmarks
covering different workloads.
We illustrate a sophisticated use of Scala 3's metaprogramming
for the integration of an advanced concurrent programming construct within a
pre-existing language. In addition, we discuss the insights and "lessons
learned" in optimising join pattern matching, and how they are facilitated by
`JoinActors`'s modularity — which allows for the systematic comparison of multiple
matching algorithm implementations.
We adopt the *fair join pattern matching* semantics and the
benchmark suite from the paper that originally introduced `JoinActors`. Through
extensive testing we ensure that our new optimised matching algorithms produce
exactly the same matches as the original `JoinActors` library, while achieving
significantly better performance. The improved version of `JoinActors` is the
companion artifact of this paper.
This work showcases the expressiveness, effectiveness, and
usability of join patterns for implementing complex coordination patterns in
distributed message-passing systems, within a pre-existing language. It also
demonstrates promising performance results, with significant improvements over
previous work. Besides the practical promise, `JoinActors`'s modular design offers
a research playground for exploring and comparing new join pattern matching
algorithms, possibly based on entirely different semantics.
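For readers unfamiliar with the construct, the core idea of join pattern matching can be conveyed with a deliberately naive, hypothetical Python sketch (this is not the `JoinActors` Scala API): a pattern fires only once a combination of buffered messages satisfies a guard, and the matched messages are then consumed atomically.

```python
from dataclasses import dataclass

@dataclass
class Deposit:
    amount: int

@dataclass
class Withdraw:
    amount: int

def try_match(mailbox):
    """Fire when a Deposit and a Withdraw can be paired such that the
    withdrawal does not exceed the deposit (the guard condition)."""
    for i, d in enumerate(mailbox):
        if not isinstance(d, Deposit):
            continue
        for j, w in enumerate(mailbox):
            if isinstance(w, Withdraw) and w.amount <= d.amount:
                # Consume both messages atomically, then run the handler.
                for k in sorted((i, j), reverse=True):
                    del mailbox[k]
                return d.amount - w.amount  # the handler's result
    return None  # no combination matches yet; keep buffering messages

mailbox = [Withdraw(80), Deposit(50)]
print(try_match(mailbox))   # None: the only deposit is too small
mailbox.append(Deposit(100))
print(try_match(mailbox))   # 20: pairs Deposit(100) with Withdraw(80)
```

The brute-force search over message combinations shown here grows quickly with mailbox size; the cost of this matching step is precisely what `JoinActors`'s modular design allows swapping out and comparing across algorithms.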
Evaluating LLMs in the Context of a Functional Programming Course: A Comprehensive Study
Yihan Zhang, Brigitte Pientka, and Xujie Si (McGill University, Canada; University of Toronto, Canada)
Large Language Models (LLMs) are changing the way learners acquire knowledge outside the classroom. Previous studies have shown that LLMs are effective at generating solutions to short and simple questions in introductory CS courses that use high-resource programming languages such as Java or Python. In this paper, we evaluate the effectiveness of LLMs in the context of a low-resource programming language, OCaml, in an educational setting. In particular, we built three benchmarks to comprehensively evaluate 9 state-of-the-art LLMs: 1) a benchmark containing natural-language homework programming problems; 2) a benchmark containing programs with syntax, type, and logical errors drawn from actual student submissions; and 3) a benchmark containing natural-language questions about theoretical programming concepts. We grade each LLM's responses for correctness using the OCaml compiler and an autograder. Our evaluation also goes beyond common evaluation methodology by using manual grading to assess the quality of the responses. Our study shows that the top three LLMs are effective on all tasks within a typical functional programming course, although they solve far fewer homework problems in the low-resource setting than they do for introductory programming problems in Python and Java. The strength of LLMs lies in correcting syntax and type errors as well as in answering basic conceptual questions. While LLMs may not yet match dedicated language-specific tools in some areas, their convenience as a one-stop tool for multiple programming languages can outweigh the benefits of more specialized systems.
We hope our benchmarks can serve multiple purposes: to assess the evolving capabilities of LLMs, to help instructors raise awareness among students about the limitations of LLM-generated solutions, and to inform programming language researchers about opportunities to integrate domain-specific reasoning into LLMs and develop more powerful code synthesis and repair tools for low-resource languages.