PROGJB – Journal Issue

The Design Principles of the Elixir Type System
Giuseppe Castagna

, Guillaume Duboc

, and José Valim

(IRIF - Université Paris Cité - CNRS, France; Remote Technology, France; Dashbit, Poland)
Elixir is a dynamically-typed functional language running on the Erlang Virtual Machine, designed for building scalable and maintainable applications. Its characteristics have earned it a surging adoption by hundreds of industrial actors and tens of thousands of developers. Static typing seems nowadays to be the most important request coming from the Elixir community. We present a gradual type system we plan to include in the Elixir compiler, outline its characteristics and design principles, and show by some short examples how to use it in practice.
Developing a static type system suitable for Erlang's family of languages has been an open research problem for almost two decades. Our system transposes to this family of languages a polymorphic type system with set-theoretic types and semantic subtyping. To do that, we had to improve and extend both semantic subtyping and the typing techniques thereof, to account for several characteristics of these languages---and of Elixir in particular---such as the arity of functions, the use of guards, a uniform treatment of records and dictionaries, the need for a new sound gradual typing discipline that does not rely on the insertion at compile time of specific run-time type-tests but, rather, takes into account both the type tests performed by the virtual machine and those explicitly added by the programmer.
The system presented here is ``gradually'' being implemented and integrated in Elixir, but a prototype implementation is already available.
The aim of this work is to serve as a longstanding reference that will be used to introduce types to Elixir programmers, as well as to hint at some future directions and possible evolutions of the Elixir language.

Publisher's Version

Info

Artifact Available v2.0

ae programmingjournal supports claims v2.0

Live Objects All The Way Down: Removing the Barriers between Applications and Virtual Machines
Javier E. Pimás

, Stefan Marr

, and Diego Garbervetsky

(University of Buenos Aires, Argentina; University of Kent, UK)
Object-oriented languages often use virtual machines (VMs) that provide mechanisms such as just-in-time (JIT) compilation and garbage collection (GC). These VM components are typically implemented in a separate layer, isolating them from the application.
While this approach brings the software engineering benefits of clear separation and decoupling, it introduces barriers for both understanding VM behavior and evolving the VM implementation. For example, the GC and JIT compiler are typically fixed at VM build time, limiting arbitrary adaptation at run time. Furthermore, because of this separation, the implementation of the VM cannot typically be inspected and debugged in the same way as application code, enshrining a distinction in easy-to-work-with application and hard-to-work-with VM code. These characteristics pose a barrier for application developers to understand the engine on top of which their own code runs, and fosters a knowledge gap that prevents application developers to change the VM.
We propose Live Metacircular Runtimes (LMRs) to overcome this problem. LMRs are language runtime systems that seamlessly integrate the VM into the application in live programming environments. Unlike classic metacircular approaches, we propose to completely remove the separation between application and VM. By systematically applying object-oriented design to VM components, we can build live runtime systems that are small and flexible enough, where VM engineers can benefit of live programming features such as short feedback loops, and application developers with fewer VM expertise can benefit of the stronger causal connections between their programs and the VM implementation.
To evaluate our proposal, we implemented Bee/LMR, a live VM for a Smalltalk-derivative environment in 22057 lines of code. We analyze case studies on tuning the garbage collector, avoiding recompilations by the just-in-time compiler, and adding support to optimize code with vector instructions to demonstrate the trade-offs of extending exploratory programming to VM development in the context of an industrial application used in production. Based on the case studies, we illustrate how our approach facilitates the daily development work of a small team of application developers.
Our approach enables VM developers to gain access to live programming tools traditionally reserved for application developers, while application developers can interact with the VM and modify it using the high-level tools they use every day. Both application and VM developers can seamlessly inspect, debug, understand, and modify the different parts of the VM with shorter feedback loops and higher-level tools.

Publisher's Version

Provably Fair Cooperative Scheduling
Reiner Hähnle

and Ludovic Henrio

(TU Darmstadt, Germany; Univ Lyon - EnsL - UCBL - CNRS - Inria - LIP, France)
The context of this work is cooperative scheduling, a concurrency paradigm, where task execution is not arbitrarily preempted. Instead, language constructs exist that let a task voluntarily yield the right to execute to another task.
The inquiry is the design of provably fair schedulers and suitable notions of fairness for cooperative scheduling languages. To the best of our knowledge, this problem has not been addressed so far.
Our approach is to study fairness independently from syntactic constructs or environments, purely from the point of view of the semantics of programming languages, i.e., we consider fairness criteria using the formal definition of a program execution. We develop our concepts for classic structural operational semantics (SOS) as well as for the recent locally abstract, globally concrete (LAGC) semantics. The latter is a highly modular approach to semantics ensuring the separation of concerns between local statement evaluation and scheduling decisions.
The new knowledge contributed by our work is threefold: first, we show that a new fairness notion, called quiescent fairness, is needed to characterize fairness adequately in the context of cooperative scheduling; second, we define a provably fair scheduler for cooperative scheduling languages; third, a qualitative comparison between the SOS and LAGC versions yields that the latter, while taking higher initial effort, is more amenable to proving fairness and scales better under language extensions than SOS.
The grounding of our work is a detailed formal proof of quiescent fairness for the scheduler defined in LAGC semantics.
The importance of our work is that it provides a formal foundation for the implementation of fair schedulers for cooperative scheduling languages, an increasingly popular paradigm (for example: akka/Scala, JavaScript, async Rust). Being based solely on semantics, our ideas are widely applicable. Further, our work makes clear that the standard notion of fairness in concurrent languages needs to be adapted for cooperative scheduling and, more generally, for any language that combines atomic execution sequences with some form of preemption.

Publisher's Version

Conceptual Mutation Testing for Student Programming Misconceptions
Siddhartha Prasad

, Ben Greenman

, Tim Nelson

, and Shriram Krishnamurthi

(Brown University, USA)
Context
Students often misunderstand programming problem descriptions. This can lead them to solve the wrong problem, which creates frustration, obstructs learning, and imperils grades. Researchers have found that students can be made to better understand the problem by writing examples before they start programming. These examples are checked against correct and wrong implementations—analogous to mutation testing—provided by course staff. Doing so results in better student understanding of the problem as well as better test suites to accompany the program, both of which are desirable educational outcomes.
Inquiry Producing mutant implementations requires care. If there are too many, or they are too obscure, students will end up spending a lot of time on an unproductive task and also become frustrated. Instead, we want a small number of mutants that each correspond to common problem misconceptions. This paper presents a workflow with partial automation to produce mutants of this form which, notably, are not those produced by mutation-testing tools.
Approach
We comb through student tests that fail a correct implementation. The student misconceptions are embedded in these failures. We then use methods to semantically cluster these failures. These clusters are then translated into conceptual mutants. These can then be run against student data to determine whether we they are better than prior methods. Some of these processes also enjoy automation.
Knowledge We find that student misconceptions illustrated by failing tests can be operationalized by the above process. The resulting mutants do much better at identifying student misconceptions.
Grounding
Our findings are grounded in a manual analysis of student examples and a quantitative evaluation of both our clustering techniques and our process for making conceptual mutants. The clustering evaluation compares against a ground truth using standard cluster-correspondence measures, while the mutant evaluation examines how conceptual mutants perform against student data.
Importance
Our work contributes a workflow, with some automation, to reduce the cost and increase the effectiveness of generating conceptually interesting mutants. Such mutants can both improve learning outcomes and reduce student frustration, leading to better educational outcomes. In the process, we also identify a variation of mutation testing not commonly discussed in the software literature.

Publisher's Version

ae programmingjournal supports claims v2.0

Real-World Choreographic Programming: Full-Duplex Asynchrony and Interoperability
Lovro Lugović

and Fabrizio Montesi

(University of Southern Denmark, Denmark)
In the paradigm of choreographic programming, the overall behaviour of a distributed system is coded as a choreography from a global viewpoint. The choreography can then be automatically projected (compiled) to a correct implementation for each participant. This paradigm is interesting because it relieves the programmer from manually writing the separate send and receive actions performed by participants, which simplifies development and avoids communication mismatches.
However, the applicability of choreographic programming in the real world remains largely unexplored. The reason is twofold. First, while there have been several proposals of choreographic programming languages, none of these languages have been used to implement a realistic, widely-used protocol. Thus there is a lack of experience on how realistic choreographic programs are structured and on the relevance of the different features explored in theoretical models. Second, applications of choreographic programming shown so far are intrusive, in the sense that each participant must use exactly the code projected from the choreography. This prevents using the code generated from choreographies with existing third-party implementations of some participants, something that is very beneficial for testing or might even come as a requirement.
This paper addresses both problems. In particular, we carry out the first development in choreographic programming of a widespread real-world protocol: the Internet Relay Chat (IRC) client--server protocol. The development is based on Choral, an object-oriented higher-order choreographic programming language (choreographies can be parametric on choreographies and carry state).
We find that two of Choral's features are key to our implementation: higher-order choreographies are used for modelling the complex interaction patterns that arise due to IRC's asynchronous nature, while user-definable communication semantics are relevant for achieving interoperability with third-party implementations. In the process we also discover a missing piece: the capability of statically detecting that choices on alternative distributed behaviours are appropriately communicated by means of message types, for which we extend the Choral compiler with an elegant solution based on subtyping.
Our Choral implementation of IRC arguably represents a milestone for choreographic programming, since it is the first empirical evidence that the paradigm can be used to faithfully codify protocols found `in the wild'. We observe that the choreographic approach reduces the interaction complexity of our program, compared to the traditional approach of writing separate send and receive actions. To check that our implementation is indeed interoperable with third-party software, we test it against publicly available conformance tests for IRC and some of the most popular IRC client and server software. We also evaluate the performance and scalability of our implementation by performing performance tests.
Our experience shows that even if choreographic programming is still in its early life, it can already be applied to a realistic setting.

Publisher's Version

Artifact Available v2.0

ae programmingjournal supports claims v2.0

PROGJB – Journal Issue

Frontmatter

Papers