Powered by
23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 2024),
October 21–22, 2024,
Pasadena, CA, USA
23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 2024)
Frontmatter
Papers
Hot Call-Chain Inlining for the Glasgow Haskell Compiler
Celeste Hollenbeck and
Michael F. P. O’Boyle
(University of Edinburgh, United Kingdom)
Inlining is a well-studied compiler transformation that can reduce call overhead and enable further optimizations. As functional programming languages such as Haskell have functions as first-class objects, they potentially have the most to gain from inlining. However, despite being acknowledged as one of the most important optimizations, there has been little recent work to significantly improve Haskell’s inlining heuristic for code performance. This paper proposes a profile-directed technique to direct inlining decisions in the Glasgow Haskell Compiler. We show that simply inlining “hot” functions, as revealed by profiling summaries, does not lead to significant improvement. However, inlining along the hot dynamic call graph is frequently beneficial. Due to the higher-order nature of Haskell, determining this call graph is non-trivial. We develop a technique to extract call chains of hot functions and leverage the Glasgow Haskell Compiler’s existing functionality to safely influence inlining decisions through pragma placement along these chains. We then show that hot call chain inlining yields a geometric mean speedup in run time of 9% over GHC’s default inlining heuristics across 17 real-world Haskell packages. This method can be used in the presence of pre-existing developer pragmas to produce a mean speedup of 10% across those same packages.
Article Search
Type-Safe Code Generation with Algebraic Effects and Handlers
Kanaru Isoda,
Ayato Yokoyama, and
Yukiyoshi Kameyama
(University of Tsukuba, Japan)
Staged computation is a means to achieve maintainability and high performance simultaneously,
by allowing a programmer to express domain-specific optimizations in a high-level programming
language. Multi-stage programming languages such as MetaOCaml provide a static safety guarantee
for generated programs by sophisticated type systems provided that program generators have
computational effects. Despite several studies, it remains a challenging problem to design
a type-safe multi-stage programming language with advanced features for computational effects.
This paper introduces a two-stage programming language with algebraic effects and handlers.
Based on two novel principles 'handlers as future-stage binders' and 'handlers are universal',
we design a type system and prove its soundness. We also show that our language is sufficiently
expressive to write various effectful staged computations including multi-level let-insertion,
which is a key technique to avoid code duplication in staged computation.
Article Search
Applying a Self-Extension Mechanism to DSLs for Establishing Model Libraries
Arkadii Gerasimov,
Nico Jansen,
Judith Michael, and
Bernhard Rumpe
(RWTH Aachen University, Germany)
When applying model-driven engineering in an agile environment, new requirements continuously expand the domain scope and trigger an extension of the concepts covered by the Domain-Specific Language (DSL). While programming languages streamline code extension and reuse through libraries, similar approaches for DSLs are more complex or target a specific language. We present and discuss an approach for designing DSLs with a self-extension mechanism to enable model library creation and seamless reuse. We use the self-extension mechanism to introduce concepts in models, gather reusable models into a library, and provide an infrastructure for its usage. We explain our language-specific realization of the self-extension mechanism using a DSL and discuss model libraries of the DSL with a use case from practice. The approach provides more flexibility for agile model-driven engineering. It enables application modelers to introduce and reuse concepts via models without changing the DSL, reducing the communication overhead within the development team.
Article Search
On the Soundness of Auto-completion Services for Dynamically Typed Languages
Damian Frölich and
L. Thomas van Binsbergen
(University of Amsterdam, Netherlands)
Giving auto-completion candidates for dynamically typed languages requires complex analysis of the source code, especially when the goal is to ensure that the completion candidates do not introduce bugs. In this paper, we introduce an approach that builds upon abstract interpretation and the scope graph framework to obtain an over-approximation of the name binding seen at run-time. The over-approximation contains enough information to implement auto-completion services such that the given suggestions do not introduce name binding errors. To demonstrate our approach, we compare the suggestions given by our approach with the state of the art completion services on a subset of the Python programming language.
Article Search
An Extensible Feature-Oriented Approach for Fine-Grained Code Quality Analysis
Karim Ghallab,
Tewfik Ziadi, and
Zaak Chalal
(RedFabriQ, France; LIP6, France; Sorbonne Université, France)
Assessing code quality is crucial for effective software maintenance and evolution. Traditional tools like SonarQube offer valuable insights at the application level but lack the granularity needed for detailed, feature-specific analysis. This paper emphasizes the importance of feature-oriented code quality analysis, often overlooked by mainstream tools due to the challenge of correlating high-level feature descriptions with low-level code implementations. To tackle this issue, we leverage existing feature location techniques to introduce a novel approach enabling granular analysis tailored to specific application features. We discuss the motivations for this approach, highlighting its potential to improve precision in enhancement and maintenance strategies. Additionally, this paper introduces a tool-based approach known as InsightMapper. We also present a study demonstrating the benefits of this method through the analysis of two case studies, featuring a recognized benchmark in the feature location domain.
Article Search
Specializing Data Access in a Distributed File System (Generative Pearl)
Pratyush Das,
Anxhelo Xhebraj, and
Tiark Rompf
(Purdue University, USA)
We propose DDLoader, a system that embeds information about data
partitioning and data distribution in distributed file systems using a metaprogramming framework. We
demonstrate that this technique has practical benefits for building
applications that interact with distributed file system applications.
By using metaprogramming, we bring the traditional benefits of
staging, such as partial evaluation to the domain of
distributed file system access.
Furthermore, this approach also fuses the
data access code with the computation code, allowing end-to-end optimization.
We test our framework on a real-world use case and show that this approach
allows users to generate code that performs faster than traditional data
loading by up to 12.56x on a single thread and even more when parallelized.
Article Search
Separate Compilation and Partial Linking: Modules for Datalog IR
David Klopp,
André Pacak, and
Sebastian Erdweg
(JGU Mainz, Germany)
In recent years, Datalog has sparked renewed interest in academia and industry, leading to the development of numerous new Datalog systems.
To unify these systems, recent approaches treat Datalog as an intermediate representation (IR) in a compiler framework:
Compiler frontends can lower different Datalog dialects to the same IR, which is then optimized before a compiler backend targets one of many existing Datalog engines.
However, a key feature is missing in these compiler frameworks: an expressive module system.
In this paper, we present the first module system for a Datalog IR.
Our modules are statically typed, can be separately compiled, and partially linked to form ``bundles''.
Since IR modules are generated by a compiler frontend, we rely on explicit declarations of required and provided relations to maximize the decoupling between modules.
This also allows modules to abstract over required relations to offer reusable functionality (e.g., computing a transitive closure) that can be instantiated for different relations in a single Datalog program.
We formalize the module system, its type system, and the linking algorithm.
We then describe how different usage patterns that occur in Datalog dialects (e.g., inheritance, cyclic imports) can be expressed in our IR module system.
Finally, we integrate our module system into an existing Datalog compiler framework, develop a Soufflé compiler frontend that translates Soufflé components to IR modules, and demonstrate its applicability to a large Doop analysis.
Article Search
Automated Generation of Code Contracts: Generative AI to the Rescue?
Sandra Greiner,
Noah Bühlmann,
Manuel Ohrndorf,
Christos Tsigkanos,
Oscar Nierstrasz, and
Timo Kehrer
(University of Southern Denmark, Denmark; University of Bern, Switzerland; University of Athens, Greece; Feenk, Switzerland)
Design by Contract represents an established, lightweight paradigm for engineering reliable and robust software systems by specifying verifiable expectations and obligations between software components. Due to its laborious nature, developers hardly adopt Design by Contract in practice.
A plethora of research on (semi-)-automated inference to reduce the manual burden has not improved the adoption of so-called code contracts in practice.
This paper examines the potential of Generative AI to automatically generate code contracts in terms of pre- and postconditions for any Java project without requiring any additional auxiliary artifact.
To fine-tune two state-of-the-art Large Language Models, CodeT5 and CodeT5+, we derive a dataset of more than 14k Java methods comprising contracts in form of Java Modeling Language (JML) annotations, and train the models on the task of generating contracts.
We examine the syntactic and semantic validity of the contracts generated for software projects not used in the fine-tuning and find that more than 95% of the generated contracts are syntactically correct and exhibit remarkably high completeness and semantic correctness.
To this end, our fully automated method sets the stage for future research and eventual broader adoption of Design by Contract in software development practice.
Article Search
Restaging Domain-Specific Languages: A Flexible Design Pattern for Rapid Development of Optimizing Compilers
Amir Shaikhha
(University of Edinburgh, United Kingdom)
This paper addresses the complexity of developing optimizing compilers by proposing a novel design pattern named Restage. The Restage interface reduces boilerplate code in transformation passes by simplifying the extraction and reconstruction of language constructs, thereby enabling more efficient and maintainable compiler development. This approach automatically generates the necessary interface instances using metaprogramming, minimizing manual effort. Additionally, the proposed solution integrates seamlessly with existing methodologies for building optimizing compilers, including strategy-based transformations and the visitor design pattern. The Restage design pattern significantly reduces the complexity and maintenance cost of compilers for generic-purpose and domain-specific languages.
Article Search
proc time: 3.08