Monk: Opportunistic Scheduling to Delay Horizontal Scaling
Marina Shimchenko, Erik Österlund, and Tobias Wrigstad (University of Uppsala, Sweden; Oracle, Sweden)
In modern server computing, efficient CPU resource usage is often traded for latency. Garbage collection is a key aspect of memory management in programming languages like Java, but it often competes with application threads for CPU time, leading to delays in processing requests and consequent increases in latency. This work explores whether opportunistic scheduling in ZGC, a fully concurrent garbage collector (GC), can reduce application latency at mid-range CPU utilization, a common deployment scenario, and thereby potentially delay horizontal scaling. We implemented an opportunistic scheduling mechanism that runs GC threads during periods when CPU resources would otherwise sit idle. This method prioritizes application threads over GC workers when it matters most, allowing the system to handle higher workloads without increasing latency. Our findings show that this technique can significantly improve performance in server applications. For example, in tests using the SPECjbb2015 benchmark, we observed up to a 15% increase in the number of requests processed within the 25 ms latency target. Additionally, applications like Hazelcast showed a mean latency reduction of up to 40% compared to ZGC without opportunistic scheduling. The feasibility and effectiveness of this approach were validated through empirical testing on two widely used benchmarks, showing that the method consistently improves performance under various workloads. This work is significant because it addresses a common bottleneck in server performance: how to manage GC without degrading application responsiveness. By improving how GC threads are scheduled, this research offers a pathway to more efficient resource usage, enabling higher performance and better scalability in server applications.
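To illustrate the general idea only, the following Java sketch shows an opportunistic GC worker that performs collection work while measured system load stays below a threshold and otherwise yields to application threads. The work-packet queue, the load threshold, and the load probe are assumptions for illustration; ZGC's actual scheduling mechanism is implemented inside the JVM and differs from this sketch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of opportunistic GC scheduling: GC work is only performed
// while the CPU would otherwise be (partly) idle, so application threads keep
// priority when load is high. Not ZGC's real implementation.
public class OpportunisticGcWorker implements Runnable {
    private final Queue<Runnable> gcWorkPackets = new ConcurrentLinkedQueue<>();
    private final OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    private final double maxLoadPerCore = 0.75;   // assumed utilization threshold

    public void submit(Runnable workPacket) {     // e.g. "mark one chunk of the heap"
        gcWorkPackets.add(workPacket);
    }

    @Override
    public void run() {
        int cores = Runtime.getRuntime().availableProcessors();
        while (!Thread.currentThread().isInterrupted()) {
            double load = os.getSystemLoadAverage() / cores;  // negative if unavailable
            Runnable packet;
            if (load >= 0 && load < maxLoadPerCore && (packet = gcWorkPackets.poll()) != null) {
                packet.run();                      // CPU is underused: do GC work now
            } else {
                Thread.yield();                    // application threads need the CPU
            }
        }
    }
}
```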
Conversational Concurrency with Dataspaces and Facets
Sam Caldwell, Tony Garnock-Jones, and Matthias Felleisen (Northeastern University, USA; Maastricht University, Netherlands)
Context
Developers have come to appreciate the simplicity of message-passing actors for
concurrent programming tasks. The actor model of computation is easy to grasp;
it is just a conversation among actors with a common goal. Importantly, it
eliminates some basic pitfalls of the dominant shared-memory model, most
critically data races.
Inquiry
A close look at real-world conversations suggests, however, that they are not
mere exchanges of messages. Participants must keep in mind conversational
context, and participants joining late can and often must acquire some of this
context. In addition, some settings call for engaging in several conversations
in parallel; in others, participants conduct temporarily limited
sub-conversations to clarify a point. Existing actor code exhibits complex
design patterns that get around the underlying limitations of the pure
message-passing model.
Approach
These patterns suggest a number of elements involved in programming
conversational computations. Translated into terms of language design, they call
for two kinds of facilities: (1) one for sharing conversational context and (2)
another one for organizing individual actors around ongoing conversations and their
contexts.
Knowledge
This paper presents Syndicate, a language designed to directly support the
programming of conversing actors. Beyond message passing, it supplies
(1) a dataspace, which allows actors to make public assertions, to withdraw
them, and to query what other actors have asserted; and (2) the facet notation,
which enables programmers to express individual actors as a reflection of the
ongoing conversations.
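Syndicate's own notation is a language design in its own right, so the following Java-flavored sketch is only a rough approximation of the first facility, the dataspace: actors make assertions public, withdraw them, and query what others have asserted. The class and method names are hypothetical and the facet notation (which ties behavior to the lifetime of a conversation) is not modeled here.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical approximation of a dataspace shared by conversing actors.
class Dataspace {
    private final Set<Object> assertions = new CopyOnWriteArraySet<>();

    void assertFact(Object fact)  { assertions.add(fact); }     // make an assertion public
    void retractFact(Object fact) { assertions.remove(fact); }  // withdraw it again
    List<Object> query(Predicate<Object> pattern) {             // ask what others asserted
        return assertions.stream().filter(pattern).collect(Collectors.toList());
    }
}

class DataspaceExample {
    public static void main(String[] args) {
        Dataspace ds = new Dataspace();
        ds.assertFact("present:alice");          // Alice joins the conversation
        ds.assertFact("present:bob");
        System.out.println(ds.query(f -> f.toString().startsWith("present:")));
        ds.retractFact("present:bob");           // Bob leaves; the shared context updates
    }
}
```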
Grounding
A worked example introduces these concepts and illustrates conversational
programming in Syndicate. A comparison with other research and industrial
concurrent languages demonstrates the unique support Syndicate provides.
Importance
Syndicate advances concurrent actor programming with enhancements that address
some observed limitations of the underlying model. While message-passing
simplifies concurrent programming, it falls short in handling the complexities
of actual computational conversations. By introducing a dataspace actor for
sharing conversational context and the facet notation for organizing actors
around ongoing conversations, Syndicate enables developers to naturally express
and manage the nuanced interactions often required in concurrent systems. These
innovations reduce the need for complex design patterns and provide unique
support for building robust, context-aware concurrent applications.
Automated Profile-Guided Replacement of Data Structures to Reduce Memory Allocation
Lukas Makor, Sebastian Kloibhofer, Peter Hofer, David Leopoldseder, and Hanspeter Mössenböck (JKU Linz, Austria; Oracle Labs, Austria)
Data structures are a cornerstone of most modern programming languages. Whether they are provided via separate libraries, built into the language specification, or included in the language's standard library, data structures such as lists, maps, sets, and arrays provide programmers with a large repertoire of tools for dealing with data.
Moreover, each kind of data structure typically comes with a variety of implementations that focus on scalability, memory efficiency, performance, thread-safety, or similar aspects.
Choosing the *right* data structure for a particular use case can be difficult or even impossible if the data structure is part of a framework over which the user has no control. It typically requires in-depth knowledge about the program and, in particular, about the usage of the data structure in question.
However, it is usually not feasible for developers to obtain such information about programs in advance.
Hence, it makes sense to look for a more automated way to optimize data structures.
We present an approach to automatically replace data structures in Java applications.
We use profiling to determine allocation-site-specific metrics about data structures and their usages, and then automatically replace their allocations with customized versions, focusing on memory efficiency.
Our approach is integrated into GraalVM Native Image, an Ahead-of-Time compiler for Java applications.
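As a rough illustration of profile-guided replacement (a minimal sketch assuming a hypothetical profile format; the paper's actual allocation-site metrics, profiling infrastructure, and replacement collections may differ), consider an allocation site whose profile shows that its lists stay very small, so a leaner, pre-sized list can be substituted:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: choose a memory-leaner list variant for allocation sites
// whose collections were observed to stay small during profiling.
class AllocationSiteProfile {
    final String siteId;        // e.g. "OrderService.java:42" (hypothetical)
    final int observedMaxSize;  // largest size seen during profiling
    AllocationSiteProfile(String siteId, int observedMaxSize) {
        this.siteId = siteId;
        this.observedMaxSize = observedMaxSize;
    }
}

class ListFactory {
    static <T> List<T> newListFor(AllocationSiteProfile profile) {
        if (profile.observedMaxSize <= 8) {
            // Pre-sizing avoids the default capacity and later growth copies.
            return new ArrayList<>(profile.observedMaxSize);
        }
        return new ArrayList<>();   // fall back to the general-purpose default
    }

    public static void main(String[] args) {
        AllocationSiteProfile p = new AllocationSiteProfile("OrderService.java:42", 3);
        List<String> items = newListFor(p);
        items.addAll(Arrays.asList("a", "b", "c"));
        System.out.println(items);
    }
}
```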
By analyzing the generated data structure profiles, we show how standard benchmarks and microservice-based applications use data structures and demonstrate the impact of customized data structures on the memory usage of applications.
We conducted an evaluation on standard and microservice-based benchmarks, in which memory usage was reduced by up to 13.85 % in benchmarks that make heavy use of data structures. While other benchmarks are only slightly affected, we could still reduce the average memory usage by 1.63 % in standard benchmarks and by 2.94 % in microservice-based benchmarks.
We argue that our work demonstrates that choosing appropriate data structures can reduce the memory usage of applications. While we acknowledge that our approach does not provide benefits for all kinds of workloads, our work nevertheless shows how automated profiling and replacement can be used to optimize data structures in general.
Hence, we argue that our work could pave the way for future optimizations of data structures.
Context
Distributed Stream Processing Frameworks (DSPFs) are popular tools for expressing real-time Big Data applications that have to handle enormous volumes of data. These frameworks distribute their applications over a cluster in order to scale horizontally with the amount of incoming data.
Inquiry
Crucial for the performance of such applications is the distribution strategy used to partition data and computations over the cluster nodes. In some DSPFs, like Apache Spark or Flink, the distribution strategy is hardwired into the framework, which can lead to inefficient applications. At the other end of the spectrum, Apache Storm offers a low-level model wherein programmers can implement their own distribution strategies on a per-application basis to improve efficiency. However, this model conflates distribution and data processing logic, making it difficult to modify either. As a consequence, today’s cluster application developers either have to accept the built-in distribution strategies of a high-level framework or accept the complexity of expressing a distribution strategy in Storm’s low-level model.
Approach
We propose a novel programming model wherein data processing operations and their distribution strategies are decoupled from one another and where new strategies can be created in a modular fashion.
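Skitter itself is a domain-specific language, so the following Java-flavored sketch is only a rough analogue of the decoupling idea: the data processing logic of an operation and its distribution strategy are expressed as separate, independently replaceable parts. All names below are hypothetical, not Skitter's actual API.

```java
import java.util.function.Function;

// Hypothetical illustration: the same operation can be deployed with different
// distribution strategies without touching its processing logic.
interface DistributionStrategy {
    int workerFor(Object record, int workerCount);   // which worker receives this record
}

class HashPartitioned implements DistributionStrategy {
    public int workerFor(Object record, int workerCount) {
        return Math.floorMod(record.hashCode(), workerCount);
    }
}

class RoundRobin implements DistributionStrategy {
    private int next = 0;
    public int workerFor(Object record, int workerCount) {
        return (next++) % workerCount;
    }
}

class Operation<I, O> {
    final Function<I, O> process;            // pure data processing logic
    final DistributionStrategy strategy;     // pluggable distribution logic
    Operation(Function<I, O> process, DistributionStrategy strategy) {
        this.process = process;
        this.strategy = strategy;
    }
}

class StreamDemo {
    public static void main(String[] args) {
        Operation<String, Integer> wordLength =
            new Operation<>(String::length, new HashPartitioned());
        System.out.println(wordLength.strategy.workerFor("stream", 4));
        System.out.println(wordLength.process.apply("stream"));
    }
}
```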
Knowledge
The introduced language abstractions cleanly separate the data processing and distribution logic of a stream processing application. This enables the expression of stream processing applications in a high-level framework while still retaining the flexibility offered by Storm’s low-level model.
Grounding
We implement our programming model as a domain-specific language, called Skitter, and use it to evaluate our approach. Our evaluation shows that Skitter enables the modular implementation of existing distribution strategies from the state of the art. Our performance evaluation shows that the strategies implemented in Skitter exhibit the expected performance characteristics and that applications written in Skitter obtain throughput rates of the same order of magnitude as Storm.
Importance
Our work enables developers to select the most performant distribution strategy for each operation in their application, while still retaining the programming model offered by high-level frameworks.
Probing the Design Space: Parallel Versions for Exploratory Programming
Tom Beckmann, Joana Bergsiek, Eva Krebs, Toni Mattis, Stefan Ramson, Martin C. Rinard, and Robert Hirschfeld (Hasso Plattner Institute, Germany; University of Potsdam, Germany; Massachusetts Institute of Technology, USA)
Exploratory programming involves open-ended tasks. To evaluate their progress on these, programmers require frequent feedback and means to tell whether the feedback they observe is taking them in the right direction. Collecting, comparing, and sharing feedback is typically done through ad-hoc means: relying on memory to compare outputs, code comments, or manual screenshots. To address this issue, we designed Exploriants, an extension to example-based live programming. Exploriants allows programmers to place variation points. It collects outputs captured in probes and presents them in a comparison view that programmers can customize to suit their program domain. We find that the addition of variation points and the comparison view encourages a structured approach to exploring variations of a program. We demonstrate Exploriants' capabilities and applicability in three case studies on image processing, data processing, and game development. With Exploriants, exploratory programmers have a straightforward means to evaluate their progress and do not have to rely on ad-hoc methods that may introduce errors.
An Attempt to Catch Up with JIT Compilers: The False Lead of Optimizing Inline Caches
Aurore Poirier, Erven Rohou, and Manuel Serrano (University of Rennes - Inria - CNRS - IRISA, France; Inria - University of Côte d’Azur, France)
Context: Just-in-Time (JIT) compilers are able to specialize the code they generate according to a continuous profiling of the running programs. This gives them an advantage over Ahead-of-Time (AoT) compilers, which must choose the code to generate once and for all.
Inquiry: Is it possible to improve the performance of AoT compilers by adding Dynamic Binary Modification (DBM) to the executions?
Approach: We added to the Hopc AoT JavaScript compiler a new DBM-based optimization of inline caches (ICs), a classical optimization that dynamic languages use to implement object property accesses efficiently.
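For readers unfamiliar with inline caches, here is a schematic Java sketch of the idea behind a monomorphic IC for property reads: remember the last observed object shape and its field offset so that later accesses with the same shape skip the dictionary lookup. Real ICs, including Hopc's and its DBM-based variant, patch generated machine code; this object-level model only conveys the caching idea.

```java
import java.util.HashMap;
import java.util.Map;

// Schematic illustration of a monomorphic inline cache for property reads
// in a dynamic-language runtime.
class Shape {                                    // "hidden class": property name -> slot index
    final Map<String, Integer> offsets = new HashMap<>();
}

class DynObject {
    final Shape shape;
    final Object[] slots;
    DynObject(Shape shape, Object[] slots) { this.shape = shape; this.slots = slots; }
}

class PropertyReadIC {
    private final String property;
    private Shape cachedShape;                   // last shape seen at this access site
    private int cachedOffset;

    PropertyReadIC(String property) { this.property = property; }

    Object read(DynObject obj) {
        if (obj.shape == cachedShape) {          // cache hit: one comparison and one load
            return obj.slots[cachedOffset];
        }
        Integer offset = obj.shape.offsets.get(property);   // slow path: dictionary lookup
        if (offset == null) throw new IllegalStateException("no such property: " + property);
        cachedShape = obj.shape;                 // remember for the next access
        cachedOffset = offset;
        return obj.slots[offset];
    }
}
```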
Knowledge: Reducing the number of memory accesses, as the new optimization does, does not shorten execution times on contemporary architectures.
Grounding: The DBM optimization we have implemented is fully operational on x86_64 architectures. We have conducted several experiments to evaluate its impact on performance and to study the reasons for the lack of acceleration.
Importance: The (negative) result we present in this paper sheds new light on the best strategy for implementing dynamic languages. It tells us that the days when removing instructions or removing memory reads always yielded a speedup are over. Nowadays, implementing sophisticated compiler optimizations is only worth the effort if the processor cannot accelerate the code by itself. This result applies to AoT compilers as well as JIT compilers.
The Formal Semantics and Implementation of a Domain-Specific Language for Mixed-Initiative Dialogs
Zachary S. Rowland and Saverio Perugini (University of Dayton, USA; Ave Maria University, USA)
Human-computer dialog plays a prominent role in interactions conducted at kiosks (e.g., withdrawing money from an ATM or filling your car with gas), on smartphones (e.g., installing and configuring apps), and on the web (e.g., booking a flight). Some human-computer dialogs involve an exchange of system-initiated and user-initiated actions. These dialogs are called mixed-initiative dialogs and sometimes also involve the pursuit of multiple interleaved sub-dialogs, which are woven together in a manner akin to coroutines. However, existing dialog-authoring languages have difficulty expressing these dialogs concisely. In this work, we improve the expressiveness of a dialog-authoring language we call dialog specification language (DSL), which is based on the programming concepts of functional application, partial function application, currying, and partial evaluation, by augmenting it with additional abstractions to support concise specification of task-based, mixed-initiative dialogs that resemble concurrently executing coroutines. We also formalize the semantics of DSL, that is, the process of simplifying and staging such dialogs specified in the language. We demonstrate that dialog specifications are compressed to a higher degree when written in DSL using the new abstractions. We also operationalize the formal semantics of DSL in a Haskell functional programming implementation. The Haskell implementation of the simplification/staging rules provides a proof of concept that the formal semantics are sufficient to implement a dialog system specified with the language. We evaluate DSL from practical (i.e., case study), conceptual (i.e., comparisons to similar systems such as VoiceXML and State Chart XML), and theoretical perspectives. The practical applicability of the new language abstractions introduced in this work is demonstrated in a case study by using them to model portions of an online food ordering system that can be concurrently staged. Our results indicate that DSL enables concise representation of dialogs composed of multiple concurrent sub-dialogs and improves the compression of dialog expressions reported in prior research. We anticipate that the extension of our language and the formalization of the semantics can facilitate concise specification and smooth implementation of task-based, mixed-initiative, human-computer dialog systems across various domains such as ATMs and interactive voice-response systems.
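As a rough illustration of the programming concepts the abstract builds on (not the DSL's actual notation, and the names below are hypothetical), modelling a dialog as a curried function means that supplying one answer through partial application leaves a smaller, residual dialog that still awaits the remaining answers:

```java
import java.util.function.Function;

// Hypothetical illustration: a dialog as a curried function, staged by
// partial application as answers arrive.
class DialogSketch {
    // A pizza-ordering dialog that needs a size and a topping.
    static Function<String, Function<String, String>> orderDialog =
        size -> topping -> "Order: " + size + " pizza with " + topping;

    public static void main(String[] args) {
        // Supplying the size yields the residual dialog awaiting a topping...
        Function<String, String> awaitingTopping = orderDialog.apply("large");
        // ...and supplying the topping completes the dialog.
        System.out.println(awaitingTopping.apply("mushrooms"));
    }
}
```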
Dynamic Program Slices Change How Developers Diagnose Gradual Run-Time Type Errors
Felipe Bañados Schwerter, Ronald Garcia, Reid Holmes, and Karim Ali (University of Alberta, Canada; University of British Columbia, Canada; NYU Abu Dhabi, United Arab Emirates)
A gradual type system allows developers to declare certain types to be enforced by the compiler (i.e., statically typed), while leaving other types to be enforced via runtime checks (i.e., dynamically typed). When runtime checks fail, debugging gradually typed programs becomes cumbersome, because these failures may arise far from the original point where an inconsistent type assumption is made. To ease this burden on developers, some gradually typed languages produce a blame report for a given type inconsistency. However, these reports are sometimes misleading, because they might point to program points that do not need to be changed to stop the error.
To overcome the limitations of blame reports, we propose using dynamic program slicing as an alternative approach to help programmers debug run-time type errors. We describe a proof-of-concept for TypeSlicer, a tool that would present dynamic program slices to developers when a runtime check fails. We performed a Wizard-of-Oz user study to investigate how developers respond to dynamic program slices through a set of simulated interactions with TypeScript programs. This formative study shows that developers can understand and apply dynamic slice information to provide change recommendations when debugging runtime type errors.
Meta-compilation of Baseline JIT Compilers with Druid
Nahuel Palumbo, Guillermo Polito, Stéphane Ducasse, and Pablo Tesone (University of Lille - Inria - CNRS - Centrale Lille - UMR 9189 CRIStAL, France)
Virtual Machines (VMs) combine interpreters and just-in-time (JIT) compiled code to achieve good performance. However, implementing different execution engines increases the cost of developing and maintaining such solutions. JIT compilers based on meta-compilation cope with these issues by automatically generating optimizing JIT compilers. This leaves open the question of how meta-compilation applies to baseline JIT compilers, which improve warmup times by trading off optimizations. In this paper, we present Druid, an ahead-of-time automatic approach to generate baseline JIT compiler frontends from interpreters. Language developers guide meta-compilation by annotating interpreter code and using Druid’s intrinsics. Druid targets the meta-compilation to an existing JIT compiler infrastructure to achieve good warm-up performance. We applied Druid in the context of the Pharo programming language and evaluated it by comparing an autogenerated JIT compiler frontend against the one in production for more than 10 years. Our generated JIT compiler frontend is 2x faster on average than the interpreter and achieves on average 0.7x the performance of the handwritten JIT compiler. Our experiment only required changes in 60 call sites in the interpreter, showing that our solution makes language VMs easier to maintain and evolve in the long run.
Study of the Use of Property Probes in an Educational Setting
Anton Risberg Alaküla, Niklas Fors, and Emma Söderberg (Lund University, Sweden)
Context
Developing compilers and static analysis tools (“language tools”) is a difficult and time-consuming task. We have previously presented property probes, a technique to help the language tool developer build understanding of their tool. A probe presents a live view into the internals of the compiler, enabling the developer to see all the intermediate steps of a compilation or analysis rather than just the final output. This technique has been realized in a tool called CodeProber.
Inquiry
CodeProber has been in active use in both research and education for over two years, but its practical use has not been well studied. CodeProber combines liveness, AST exploration, and presenting program analysis results on top of source code. While there are other tools that specifically target language tool developers, we are not aware of any that has the same design as CodeProber, much less any such tool with an extensive user study. We therefore claim there is a lack of knowledge about how property probes (and by extension CodeProber) are used in practice.
Approach
We present the results from a mixed-method study of the use of CodeProber in an educational setting, with the goal of discovering whether and how property probes help, and how they compare to more traditional techniques such as test cases and print debugging. In the study, we analyzed data from 11 in-person interviews with students using CodeProber as part of a course on program analysis. We also analyzed CodeProber event logs from 24 students in the same course, and 51 anonymized survey responses across two courses where CodeProber was used.
Knowledge
Our findings show that the students find CodeProber to be useful, and they make continuous use of it during the course labs. We further find that the students in our study seem to partially or fully use CodeProber instead of other development tools and techniques, e.g. breakpoint/step-debugging, test cases, and print debugging.
Grounding
Our claims are supported by three different data sources: 11 in-person interviews, log analysis from 24 students, and surveys with 51 responses.
Importance
We hope our findings inspire others to consider live exploration to help language tool developers build understanding of their tool.
Context:
Many systems must receive data from multiple information sources, a generalization of the various kinds of devices and connections in the Internet of Things (IoT), which act as distributed network devices that asynchronously send their latest data at their own pace. These systems often perform computations both reactively and retroactively on information received from the sources, for monitoring and analytical purposes, respectively.
Inquiry:
It is challenging to design a programming language that can describe such systems at a high level of abstraction for two reasons: (1) reactive and retroactive computations in these systems are performed alongside the execution of other application logic; and (2) information sources may be distributed, and data from these sources may arrive late or be lost entirely. Addressing these difficulties is our fundamental problem.
Approach:
We propose a programming language that supports the following features. First, our language incorporates reactive time-varying values (also known as signals) embedded within an imperative language. Second, it supports multiple information sources that are distributed and represented as signals, meaning they can be declaratively composed to form other time-varying values. Finally, it allows computation over past values collected from information sources and recovery from inconsistency caused by packet loss. To address the aforementioned difficulties, we develop a core calculus for this proposed language.
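A minimal Java sketch of the signal idea follows (a hypothetical API, not the proposed language's syntax or its calculus): time-varying values fed by distributed sources are composed declaratively, while ordinary imperative code reads their current values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch of reactive time-varying values ("signals") embedded in
// imperative code: source signals are updated as packets arrive, and derived
// signals recompute from their inputs.
class Signal<T> {
    private T current;
    private final List<Runnable> observers = new ArrayList<>();

    Signal(T initial) { current = initial; }

    T now() { return current; }                          // imperative read of the latest value

    void update(T value) {                               // called when a source sends new data
        current = value;
        observers.forEach(Runnable::run);
    }

    static <A, B, C> Signal<C> combine(Signal<A> a, Signal<B> b, BiFunction<A, B, C> f) {
        Signal<C> out = new Signal<>(f.apply(a.now(), b.now()));
        Runnable recompute = () -> out.update(f.apply(a.now(), b.now()));
        a.observers.add(recompute);                      // declarative composition:
        b.observers.add(recompute);                      // out always reflects a and b
        return out;
    }

    public static void main(String[] args) {
        Signal<Double> tempA = new Signal<>(20.0);       // two distributed temperature sources
        Signal<Double> tempB = new Signal<>(22.0);
        Signal<Double> mean = combine(tempA, tempB, (x, y) -> (x + y) / 2);
        tempA.update(26.0);                              // a late packet arrives
        System.out.println(mean.now());                  // prints 24.0
    }
}
```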
Knowledge:
This calculus is a hybrid of reactive/retroactive computations and imperative ones. Because of this hybrid nature, the calculus is inherently complex; however, we have simplified it as much as possible. First, its semantics are modeled as a simple, single-threaded abstraction based on a typeless object calculus. Meanwhile, reactive computations that execute in parallel are modeled using a simple process calculus and are integrated with the object calculus, ensuring that the computation results are always serialized. Specifically, we show that time consistency is guaranteed in the calculus; in other words, consistency can be recovered at any checkpoint.
Grounding:
This work is supported by formally stating and proving theorems regarding time consistency. We also conducted a microbenchmarking experiment to demonstrate that the implemented recovery process is feasible in our assumed application scenarios.
Importance:
The ensured time consistency provides a rigorous foundation for performing analytics on computation results obtained from distributed information sources, even when these sources experience delays or packet loss.
Context:
Multi-schema-version data management (MSVDM) is a database technology that simultaneously supports multiple schema versions of one database. With this technology, multiple versions of one software system can co-exist and exchange data even as the system’s data structure evolves across versions.
Inquiry:
While MSVDM theories and implementations have been developed for relational databases, they are not directly applicable to persistent objects. Since persistent objects are commonly implemented by means of object-relational mapping (OR-mapping), we need the right level of abstraction in order to describe the evolution of data structures and translate data accesses between different versions.
Approach:
We propose a new evolution language consisting of a set of evolution operations, each denoting a modification of the source code and implicitly defining the corresponding modification to the database schema. Given the existence of multiple mapping mechanisms from persistent objects to databases, we designed the evolution language at two levels. At the abstract level, it handles scenarios such as refactoring and adding classes and fields. At the concrete level, we provide definitions for different mapping mechanisms separately, leveraging the existing database evolution language that supports MSVDM.
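A rough Java sketch of the two-level idea follows (all names, and the SQL, are assumptions for illustration, not the paper's actual operations): an abstract evolution operation such as adding a field is stated once against the source code, and each concrete mapping mechanism supplies the schema change it implies.

```java
// Hypothetical two-level evolution operation: the abstract level describes a
// change to the classes, the concrete level its translation per mapping mechanism.
interface EvolutionOperation {
    String describeSourceChange();                   // abstract level
    String toSchemaChange(MappingMechanism mapping); // concrete level
}

enum MappingMechanism { JPA_LIKE, SIGNAL_CLASSES }

class AddField implements EvolutionOperation {
    final String className, fieldName, sqlType;
    AddField(String className, String fieldName, String sqlType) {
        this.className = className; this.fieldName = fieldName; this.sqlType = sqlType;
    }
    public String describeSourceChange() {
        return "add field " + fieldName + " to class " + className;
    }
    public String toSchemaChange(MappingMechanism mapping) {
        if (mapping == MappingMechanism.JPA_LIKE) {
            return "ALTER TABLE " + className + " ADD COLUMN " + fieldName + " " + sqlType;
        }
        return "-- mapping-specific translation for " + mapping;
    }
}

class EvolutionDemo {
    public static void main(String[] args) {
        EvolutionOperation op = new AddField("Customer", "loyaltyPoints", "INTEGER");
        System.out.println(op.describeSourceChange());
        System.out.println(op.toSchemaChange(MappingMechanism.JPA_LIKE));
    }
}
```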
Knowledge:
Our evolution language is designed to support existing evolution operations proposed in prior work. Additionally, it introduces support for operations related to class hierarchy changes, which are not covered by previous approaches. Using our proposal, two concrete mapping mechanisms, namely, a JPA-like mapping and signal classes, can be provided separately. Furthermore, our evolution language preserves program behavior and covers common evolution operations in practice.
Grounding:
This work is supported by the formal definition of both the target abstract core language and the proposed evolution language, the formulation of several theorems demonstrating the soundness of our proposals, and the proofs of these theorems. Additionally, an empirical study was conducted to investigate the evolution histories of three open-source projects.
Importance:
To the best of our knowledge, our proposal is the first evolution language for persistent objects that supports MSVDM. Moreover, it is the first evolution language defined at an abstract level. By defining mappings separately, we can apply it to a wide range of persistent object mechanisms built on top of SQL.
PolyDebug: A Framework for Polyglot Debugging
Philémon Houdaille, Djamel Eddine Khelladi, Benoit Combemale, Gunter Mussbacher, and Tijs van der Storm (University of Rennes, France; CNRS, France; Inria, France; McGill University, Canada; CWI, Netherlands; University of Groningen, Netherlands)
As software grows increasingly complex, the quantity and diversity of concerns to be addressed also rise. To address this diversity of concerns, developers may end up using multiple programming languages in a single software project, a practice known as polyglot programming. This practice has gained momentum with the rise of execution platforms capable of supporting polyglot systems.
However, despite this momentum, there is a notable lack of development tooling for developers working on polyglot programs, such as debugging facilities. Not all polyglot execution platforms provide debugging capabilities, and for those that do, implementing support for new languages can be costly.
This paper addresses this gap by introducing a novel debugger framework that is language-agnostic yet leverages existing language-specific debuggers. The proposed framework is dynamically extensible to accommodate the evolving combination of languages used in polyglot software development. It utilizes the Debug Adapter Protocol (DAP) to integrate and coordinate existing debuggers within a debugging session.
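A simplified Java sketch of the coordination idea follows (hypothetical interfaces and method names; the real framework exchanges actual DAP messages such as setBreakpoints and stackTrace with existing debuggers): a language-agnostic facade routes each debug request to the language-specific adapter responsible for the code in question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of coordinating per-language debug adapters behind a
// single language-agnostic facade.
interface DebugAdapter {
    void setBreakpoint(String file, int line);
    String stackTrace();
}

class PolyglotDebuggerFacade {
    private final Map<String, DebugAdapter> adaptersByExtension = new HashMap<>();

    void register(String fileExtension, DebugAdapter adapter) {
        adaptersByExtension.put(fileExtension, adapter);
    }

    void setBreakpoint(String file, int line) {
        adapterFor(file).setBreakpoint(file, line);      // route to the right debugger
    }

    String stackTrace(String currentFile) {
        return adapterFor(currentFile).stackTrace();     // frames could be stitched across languages
    }

    private DebugAdapter adapterFor(String file) {
        String ext = file.substring(file.lastIndexOf('.') + 1);
        DebugAdapter adapter = adaptersByExtension.get(ext);
        if (adapter == null) throw new IllegalArgumentException("no adapter for ." + ext);
        return adapter;
    }
}
```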
We found that, using our approach, we were able to implement polyglot debugging support for three different languages with little development effort. We also found that our debugger did not introduce enough overhead to hinder debugging tasks in many scenarios; however, performance did deteriorate with the number of polyglot calls, meaning the approach is not suitable for every polyglot program structure.
The effectiveness of this approach is demonstrated through the development of a prototype, PolyDebug, and its application to use cases involving C, JavaScript, and Python. We evaluated PolyDebug on a dataset of traditional benchmark programs, modified to fit our criteria of polyglot programs. We also assessed the development effort by measuring the source lines of code (SLOC) for the prototype as a whole as well as its components.
Debugging is a fundamental part of developing and maintaining software. A lack of debugging tools can make it difficult to locate software bugs and can slow down the development process. We believe this work is relevant to providing developers with proper debugging support regardless of the runtime environment.
Two Approaches for Programming Education in the Domain of Graphics: An Experiment
Luca Chiodini, Juha Sorva, Arto Hellas, Otto Seppälä, and Matthias Hauswirth (USI Lugano, Switzerland; Aalto University, Finland)
Context. Graphics is a popular domain for teaching introductory programming in a motivating way, even in text-based programming languages. Over the last few decades, a large number of libraries using different approaches have been developed for this purpose.
Inquiry. Prior work in introductory programming that uses graphics as input and output has shown positive results in terms of engagement, but research is scarce on whether learners are able to use programming concepts learned through graphics for programming in other domains, transferring what they have learned.
Approach. We conducted a randomized, controlled experiment with 145 students as participants, divided into two groups. Both groups programmed using graphics in Python but used different approaches: one group used a compositional graphics library named PyTamaro; the other used the Turtle graphics library from Python’s standard library. Student engagement was assessed with surveys, and programming knowledge with a post-test on general programming concepts and programming tasks in the domain of graphics.
Knowledge. We find few differences between the two groups on the post-test, despite the PyTamaro group having practiced on problems isomorphic to those in the post-test. The participants traced a compositional graphics program more accurately than a ‘comparable’ turtle graphics program. Both groups report high engagement and perceived learning; both perform well on simple program-writing tasks to create graphics.
Grounding. Our findings are based on a controlled experiment with 145 participants, which exceeds the sample size indicated by power analysis to detect a medium effect size. The complete instrument and teaching materials used in the study are available as appendixes.
Importance. This study adds further evidence that graphics is an engaging domain for introductory programming; moreover, it shows that the compositional graphics approach adopted by PyTamaro yields engagement levels comparable to the venerable turtle approach. Compositional graphics code appears to be easier to trace than turtle graphics code. As for conceptual knowledge, our results indicate that practicing on programming tasks isomorphic to those of the test may still not be enough to achieve better transfer. This challenges programming educators and researchers to investigate further which graphics-based approaches work best and how to facilitate transfer.
On the State of Coherence in the Land of Type Classes
Dimi Racordon, Eugene Flesselle, and Cao Nguyen Pham (EPFL, Switzerland)
Type classes are a popular tool for implementing generic algorithms and data structures without loss of efficiency, bridging the gap between parametric and ad-hoc polymorphism. Since their initial development in Haskell, they have come to feature prominently in numerous other industry-ready programming languages, notably including Swift, Rust, and Scala. The success of type classes hinges in large part on the compilers’ ability to infer arguments to implicit parameters by means of a type-directed resolution. This technique, sometimes dubbed implicit programming, lets users elide information that the language implementation can deduce from the context, such as the implementation of a particular type class. One drawback of implicit programming is that a type-directed resolution may yield ambiguous results, thereby threatening coherence, the property that valid programs have exactly one meaning. This issue has divided the community on the right approach to address it. One side advocates for flexibility, where implicit resolution is context-sensitive and often relies on dependent typing features to uphold soundness. The other holds that context should not stand in the way of equational reasoning and typically requires that type class instances be unique across the entire program to fend off ambiguities. Although there exists a large body of work on type classes and implicit programming, most of the scholarly literature focuses on a few select languages and offers little insight into other mainstream projects. Meanwhile, the latter have evolved similar features and/or restrictions under different names, making it difficult for language users and designers to get a sense of the full design space. To alleviate this issue, we set out to examine Swift, Rust, and Scala, three popular languages that feature type classes heavily, and relate their approaches to coherence to Haskell’s. It turns out that, beyond superficial syntactic differences, Swift, Rust, and Haskell are actually strikingly similar in that the three languages offer comparable strategies to work around the limitations of the uniqueness of type class instances.