CGO 2024 – Proceedings

Welcome from the General Chairs
Welcome to the 22nd ACM/IEEE International Symposium on Code Generation and Optimization (CGO ’24), where we invite you to medieval Edinburgh.
CGO provides a premier venue to bring together researchers and practitioners working at the interface of hardware and software on a wide range of optimization and code generation techniques and related issues. The conference spans the spectrum from purely static to fully dynamic approaches, reaches from pure software-based methods to specific architectural features, and also covers support for code generation and optimization.

Welcome from the Program Chairs
On behalf of the Program Committee of the International Symposium on Code Generation and Optimization (CGO), we are delighted to present the papers featured in the 2024 edition of the conference. This year marks a significant milestone in the history of CGO, as the symposium embraced a dual-round submission process for the first time. The initial round had a deadline set on May 19th, 2023, while the second round followed the traditional CGO deadline on September 1st, 2023. Although the program committee remained consistent throughout both rounds, each round had distinct program co-chairs overseeing coordination. The first-round coordinators were Jingling Xue and Michel Steuwer, while the second-round coordinators were Fernando Pereira and Guilherme Ottoni.

Compilers for Machine Learning

A Tensor Algebra Compiler for Sparse Differentiation
Amir Shaikhha, Mathieu Huot, and Shideh Hashemian
(University of Edinburgh, United Kingdom; University of Oxford, United Kingdom)
Sparse tensors are prevalent in many data-intensive applications. However, existing automatic differentiation (AD) frameworks are tailored towards dense tensors, which makes it a challenge to efficiently compute gradients through sparse tensor operations. This is due to irregular sparsity patterns that can result in substantial memory and computational overheads. We propose a novel framework that enables the efficient AD of sparse tensors. The key aspects of our work include a compilation pipeline leveraging two intermediate DSLs with AD-agnostic domain-specific optimizations followed by efficient C++ code generation. We showcase the effectiveness of our framework in terms of performance and scalability through extensive experimentation, outperforming state-of-the-art alternatives across a variety of synthetic and real-world datasets.
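To illustrate why sparsity-aware differentiation matters, the following minimal sketch differentiates a quadratic form over a matrix stored as coordinate (COO) triples. It is not the authors' framework or its DSLs: the function names, the COO layout, and the hand-derived gradient are assumptions chosen for illustration. The point is only that both the value and the gradient can be computed by visiting the stored nonzeros, without materializing a dense matrix.

```python
# Illustrative sketch only (hypothetical names, not the paper's compiler):
# f(x) = sum over stored entries (i, j, v) of v * x[i] * x[j]

def quad_form(coo, x):
    """Evaluate f(x) by visiting only the stored nonzeros."""
    return sum(v * x[i] * x[j] for i, j, v in coo)

def quad_form_grad(coo, x):
    """Gradient of f with respect to x, again touching only nonzeros.

    Each entry v * x[i] * x[j] contributes v * x[j] to grad[i]
    and v * x[i] to grad[j] (both apply when i == j).
    """
    grad = [0.0] * len(x)
    for i, j, v in coo:
        grad[i] += v * x[j]
        grad[j] += v * x[i]
    return grad

# A 4x4 matrix with three nonzeros, stored as (row, col, value) triples.
A = [(0, 0, 2.0), (0, 2, 1.0), (3, 1, -1.0)]
x = [1.0, 2.0, 3.0, 4.0]

value = quad_form(A, x)
grad = quad_form_grad(A, x)
```

The work done here scales with the number of stored entries rather than with the square of the dimension, which is the kind of saving that dense-oriented AD frameworks do not obtain automatically.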

Energy-Aware Tile Size Selection for Affine Programs on GPUs
Malith Jayaweera, Martin Kong, Yanzhi Wang, and David Kaeli
(Northeastern University, USA; Ohio State University, USA)
Loop tiling is a high-order transformation used to increase data locality and performance. While previous work has considered its application to several domains and architectures, its potential impact on energy efficiency has been largely ignored. In this work, we present an Energy-Aware Tile Size Selection Scheme (EATSS) for affine programs targeting GPUs. We automatically derive non-linear integer formulations for affine programs and use the Z3 solver to find effective tile sizes that meet architectural resource constraints, while maximizing performance and minimizing energy consumption. Our approach builds on the insight that reducing the liveness of in-cache data, together with exploiting automatic power scaling, can lead to substantial gains in performance and energy efficiency. We evaluate EATSS on NVIDIA Xavier and GA100 GPUs, and report median performance-per-Watt improvement relative to PPCG on several affine kernels. On Polybench kernels, we achieve 1.5× and 1.2× improvement and obtain up to 6.3× improvement on non-Polybench high-dimensional affine kernels.
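The idea of choosing tile sizes subject to architectural resource constraints can be sketched as a small constrained search. The example below is an assumption-laden stand-in, not EATSS itself: it replaces the Z3-based non-linear integer formulation with exhaustive enumeration, it models only a shared-memory capacity limit for a tiled matrix multiplication, and it scores tiles with a hypothetical arithmetic-intensity proxy rather than any energy model from the paper. All names and parameters are illustrative.

```python
from itertools import product

def select_tile(candidates, smem_bytes, elem_bytes=4):
    """Pick (ti, tj, tk) for a tiled matmul C[i,j] += A[i,k] * B[k,j].

    Constraint (assumed): the A, B, and C tiles must fit together in
    shared memory. Objective (assumed proxy): floating-point operations
    per byte of tile footprint.
    Returns (score, (ti, tj, tk), footprint_bytes), or None if nothing fits.
    """
    best = None
    for ti, tj, tk in product(candidates, repeat=3):
        footprint = (ti * tk + tk * tj + ti * tj) * elem_bytes
        if footprint > smem_bytes:
            continue  # violates the capacity constraint
        score = (2 * ti * tj * tk) / footprint
        if best is None or score > best[0]:
            best = (score, (ti, tj, tk), footprint)
    return best

candidates = [8, 16, 32, 64]
large = select_tile(candidates, 48 * 1024)   # 48 KiB of shared memory
small = select_tile(candidates, 16 * 1024)   # 16 KiB of shared memory
```

Under these assumptions the search returns a 64×64×64 tile for the 48 KiB budget and a 32×32×32 tile for the 16 KiB budget. A solver-based formulation, as described in the abstract, would express the same kind of constraint symbolically and could add further objectives such as energy.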

Frontmatter

Compilers for Machine Learning

Machine-Learning Guided Optimizations

Compilers for GPUs

Custom Processors

Compiler Construction

Custom Environments

Static/Dynamic Analyses

Supporting Tools

Practice and Experience

Acceleration Techniques