CGO 2026
2026 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)

2026 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), January 31 – February 4, 2026, Sydney, Australia

CGO 2026 – Preliminary Table of Contents

Frontmatter

Title Page
Welcome from the General Chair
Welcome from the Program Chairs
CGO 2026 Organization
Sponsors

Papers

TRACE4J: A Lightweight, Flexible, and Insightful Performance Tracing Tool for Java
Haide He and Pengfei Su
(University of California at Merced, USA)
Published Artifact Info Artifacts Available Artifacts Functional Results Reproduced
Enabling Spill-Free Compilation via Affine-Based Live Range Reduction Optimization
Prasanth Chatarasi, Alex Gatea, Wei Wang, Chris Bowler, Shubham Jain, Masoud Ataei Jaliseh, Nicole Khoun, Alberto Mannari, Bardia Mahjour, Viji Srinivasan, and Swagath Venkataramani
(IBM Research, USA; IBM, USA; IBM, Canada)
Archive submitted (72 kB)
Binary Diffing via Library Signatures
Andrei Rimsa, Anderson Faustino da Silva, Camilo Santana Melgaço, and Fernando Magno Quintão Pereira
(CEFET-MG, Brazil; State University of Maringá, Brazil; Federal University of Minas Gerais, Brazil)
Published Artifact Info Artifacts Available Artifacts Functional Results Reproduced
PIP: Making Andersen’s Points-to Analysis Sound and Practical for Incomplete C Programs
Håvard Rognebakke Krogstie, Helge Bahmann, Magnus Själander, and Nico Reissmann
(NTNU, Norway; Independent Researcher, Norway)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
GRANII: Selection and Ordering of Primitives in GRAph Neural Networks using Input Inspection
Damitha Lenadora, Vimarsh Sathia, Gerasimos Gerogiannis, Serif Yesil, Josep Torrellas, and Charith Mendis
(University of Illinois at Urbana-Champaign, USA; NVIDIA, USA)
Archive submitted (270 kB) Artifacts Functional
Dependence-Driven, Scalable Quantum Circuit Mapping with Affine Abstractions
Marouane Benbetka, Merwan Bekkar, Riyadh Baghdadi, and Martin Kong
(École Nationale Supérieure d’Informatique, Algeria; NYU Abu Dhabi, United Arab Emirates; Ohio State University, USA)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
QIGen: A Kernel Generator for Inference on Nonuniformly Quantized Large Language Models
Tommaso Pegolotti, Dan Alistarh, and Markus Püschel
(ETH Zurich, Switzerland; IST Austria, Austria)
Published Artifact Artifacts Available Artifacts Functional
DyPARS: Dynamic-Shape DNN Optimization via Pareto-Aware MCTS for Graph Variants
Hao Qian, Guangli Li, Qiuchu Yu, Xueying Wang, and Jingling Xue
(UNSW, Australia; Institute of Computing Technology at Chinese Academy of Sciences, China; Beijing University of Posts and Telecommunications, China)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Synthesizing Specialized Sparse Tensor Accelerators for FPGAs via High-Level Functional Abstractions
Hamza Javed and Christophe Dubach
(McGill University, Canada)
Flow-Graph-Aware Tiling and Rescheduling for Memory-Efficient On-Device Inference
Yeonoh Jeong, Taehyeong Park, and Yongjun Park
(Yonsei University, Republic of Korea)
VFlatten: Selective Value-Object Flattening using Hybrid Static and Dynamic Analysis
Arjun H. Kumar, Bhavya Hirani, Hang Shao, Tobi Ajila, Vijay Sundaresan, Daryl Maier, and Manas Thakur
(IIT Mandi, India; Sardar Vallabhbhai National Institute of Technology, Surat, India; IBM, Canada; IIT Bombay, India)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Hexcute: A Compiler Framework for Automating Layout Synthesis in GPU Programs
Xiao Zhang, Yaoyao Ding, Bolin Sun, Yang Hu, Tatiana Shpeisman, and Gennady Pekhimenko
(University of Toronto, Canada; NVIDIA, Canada; Vector Institute, Canada)
Published Artifact Archive submitted (3.6 MB) Artifacts Available Artifacts Reusable Results Reproduced
BIT: Empowering Binary Analysis through the LLVM Toolchain
Puzhuo Liu, Peng Di, Jingling Xue, and Yu Jiang
(Ant Group, China; Tsinghua University, China; UNSW, Australia)
Multidirectional Propagation of Sparsity Information across Tensor Slices
Kaio Andrade, Danila Seliayeu, J. Nelson Amaral, and Fernando Magno Quintão Pereira
(Federal University of Minas Gerais, Brazil; University of Alberta, Canada)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
On the Precision of Dynamic Program Fingerprints Based on Performance Counters
Anderson Faustino da Silva, Sérgio Queiroz de Medeiros, Marcelo Borges Nogueira, Jeronimo Castrillon, and Fernando Magno Quintão Pereira
(State University of Maringá, Brazil; Federal University of Rio Grande do Norte, Brazil; TU Dresden, Germany; Federal University of Minas Gerais, Brazil)
Published Artifact Artifacts Available
Automatic Data Enumeration for Fast Collections
Tommy McMichen and Simone Campanoni
(Northwestern University, USA; Google, USA)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Compiler-Runtime Co-operative Chain of Verification for LLM-Based Code Optimization
Hyunho Kwon, Sanggyu Shin, Ju Min Lee, Hoyun Youm, Seungbin Song, Seongho Kim, Hanwoong Jung, Seungwon Lee, and Hanjun Kim
(Yonsei University, Republic of Korea; SAIT, Republic of Korea)
Eliminating Redundancy: Ultra-compact Code Generation for Programmable Dataflow Accelerators
Prasanth Chatarasi, Alex Gatea, Bardia Mahjour, Jintao Zhang, Alberto Mannari, Chris Bowler, Shubham Jain, Masoud Ataei Jaliseh, Nicole Khoun, Kamlesh Kumar, Viji Srinivasan, and Swagath Venkataramani
(IBM Research, USA; IBM, USA; IBM, Canada; Unaffiliated, USA)
Dr.avx: A Dynamic Compilation System for Seamlessly Executing Hardware-Unsupported Vectorization Instructions
Yue Tang, Mianzhi Wu, Yufeng Li, Haoyu Liao, Jianmei Guo, and Bo Huang
(East China Normal University, China)
Published Artifact Artifacts Available Artifacts Functional
PriTran: Privacy-Preserving Inference for Transformer-Based Language Models under Fully Homomorphic Encryption
Yuechen Mu, Guangli Li, Shiping Chen, and Jingling Xue
(UNSW, Australia; Institute of Computing Technology at Chinese Academy of Sciences, China; CSIRO’s Data61, Australia)
Partial-Evaluation Templates: Accelerating Partial Evaluation with Pre-compiled Templates
Florian Huemer, Aleksandar Prokopec, David Leopoldseder, Raphael Mosaner, and Hanspeter Mössenböck
(JKU Linz, Austria; Oracle Labs, Zurich, Switzerland; Oracle Labs, Vienna, Austria; Oracle Labs, Linz, Austria)
FHEFusion: Enabling Operator Fusion in FHE Compilers for Depth-Efficient DNN Inference
Tianxiang Sui, Jianxin Lai, Long Li, Peng Yuan, Yan Liu, Qing Zhu, Xiaojing Zhang, Linjie Xiao, Mingzhe Zhang, and Jingling Xue
(Ant Group, China; UNSW, Australia)
Published Artifact Archive submitted (150 kB) Artifacts Available Artifacts Reusable Results Reproduced
Proton: Towards Multi-level, Adaptive Profiling for Triton
Keren Zhou, Tianle Zhong, Hao Wu, Jihyeong Lee, Yue Guan, Yufei Ding, Corbin Robeck, Yuanwei Fang, Jeff Niu, and Philippe Tillet
(George Mason University, USA; OpenAI, USA; University of Virginia, USA; University of California at San Diego, USA; Meta, USA)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Unlocking Python Multithreading Capabilities using OpenMP-Based Programming with OMP4Py
César Piñeiro and Juan C. Pichel
(University of Santiago de Compostela, Spain)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Towards Path-Aware Coverage-Guided Fuzzing
Giacomo Priamo, Daniele Cono D'Elia, Mathias Payer, and Leonardo Querzoni
(Sapienza University of Rome, Italy; EPFL, Switzerland)
Published Artifact Archive submitted (140 kB) Artifacts Available Artifacts Reusable
A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler
Mohammed Tirichine, Nassim Ameur, Nazim Bendib, Iheb Nassim Aouadj, Djad Bouchama, Rafik Bouloudene, and Riyadh Baghdadi
(NYU Abu Dhabi, United Arab Emirates; École Nationale Supérieure d’Informatique, Algeria; University of Science and Technology Houari Boumediene, Algeria)
Published Artifact Archive submitted (130 kB) Artifacts Available Artifacts Reusable Results Reproduced
The Parallel-Semantics Program Dependence Graph for Parallel Optimization
Yian Su, Brian Homerding, Haocheng Gao, Federico Sossai, Yebin Chon, David I. August, and Simone Campanoni
(Northwestern University, USA; Princeton University, USA)
Published Artifact Artifacts Available Artifacts Functional Results Reproduced
PASTA: A Modular Program Analysis Tool Framework for Accelerators
Mao Lin, Hyeran Jeon, and Keren Zhou
(University of California at Merced, USA; George Mason University, USA)
Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Enabling Automatic Compiler-Driven Vectorization of Transformers
Shreya Alladi, Alberto Ros, and Alexandra Jimborean
(University of Murcia, Spain; Unaffiliated, Spain)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Towards Threading the Needle of Debuggable Optimized Binaries
Cristian Assaiante, Simone Di Biasio, Snehasish Kumar, Giuseppe Antonio Di Luna, Daniele Cono D'Elia, and Leonardo Querzoni
(Sapienza University of Rome, Italy; Google, USA)
Published Artifact Archive submitted (220 kB) Artifacts Available Artifacts Reusable
Selene: Cross-Level Barrier-Free Pipelining for Irregular Nested Loops in High-Level Synthesis
Sungwoo Yun, Seonyoung Cheon, Dongkwan Kim, Heelim Choi, Kunmo Jeong, Chan Lee, Yongwoo Lee, and Hanjun Kim
(Yonsei University, Republic of Korea; DGIST, Republic of Korea)
From Threads to Tiles: T2T, a Compiler for CUDA-to-NPU Translation via 2D Vectorization
Shuaijiang Li, Jiacheng Zhao, Ying Liu, Shuoming Zhang, Lei Chen, Yijin Li, Yangyu Zhang, Zhicheng Li, Runyu Zhou, Xiyu Shi, Chunwei Xia, Yuan Wen, Xiaobing Feng, and Huimin Cui
(Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; University of Leeds, UK; University of Aberdeen, UK)
Ember: A Compiler for Embedding Operations on Decoupled Access-Execute Architectures
Marco Siracusa, Olivia Hsu, Víctor Soria-Pardos, Joshua Randall, Arnaud Grasset, Eric Biscondi, Douglas J. Joseph, Randy Allen, Fredrik Kjolstad, Miquel Moreto, and Adrià Armejach
(Barcelona Supercomputing Center, Spain; Universitat Politècnica de Catalunya, Spain; Stanford University, USA; Carnegie Mellon University, USA; Arm, USA)
Published Artifact Archive submitted (100 kB) Artifacts Available
Compiler-Assisted Instruction Fusion
Ravikiran Ravindranath Reddy, Sawan Singh, Arthur Perais, Alberto Ros, and Alexandra Jimborean
(University of Murcia, Spain; CNRS, France; Unaffiliated, Spain)
Archive submitted (37 kB)
Tawa: Automatic Warp Specialization for Modern GPUs with Asynchronous References
Hongzheng Chen, Bin Fan, Alexander Collins, Bastian Hagedorn, Evghenii Gaburov, Masahiro Masuda, Matthew Brookhart, Chris Sullivan, Jason Knight, Zhiru Zhang, and Vinod Grover
(Cornell University, USA; NVIDIA, USA; NVIDIA, UK; NVIDIA, Germany)
FORTE: Online DataFrame Query Optimizer
Yoonho Choi, Kyoungtae Lee, Minji Kim, Hyungsoo Jung, and Hyojin Sung
(POSTECH, Republic of Korea; Seoul National University, Republic of Korea; Ewha Womans University, Republic of Korea)
LLM-VeriOpt: Verification-Guided Reinforcement Learning for LLM-Based Compiler Optimization
Xiangxin Fang, Jiaqin Kang, Rodrigo C. O. Rocha, Sam Ainsworth, and Lev Mukhanov
(Queen Mary University of London, UK; University of Edinburgh, UK)
Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Thinking Fast and Correct: Automated Rewriting of Numerical Code through Compiler Augmentation
Siyuan Brant Qian, Vimarsh Sathia, Ivan R. Ivanov, Jan Hückelheim, Paul Hovland, and William S. Moses
(University of Illinois at Urbana-Champaign, USA; Institute of Science Tokyo, Japan; RIKEN RCCS, Japan; Argonne National Laboratory, USA)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
SecSwift, a Compiler-Based Framework for Software Countermeasures in Cybersecurity
François de Ferrière, Yves Janin, and Sirine Mechmech
(STMicroelectronics, France; Grenoble INP, France)
PolyUFC: Polyhedral Compilation Meets Roofline Analysis for Uncore Frequency Capping
Nilesh Rajendra Shah, M V V S Manoj Kumar, Dhairya Baxi, and Ramakrishna Upadrasta
(IIT Hyderabad, India)
Progressive Low-Precision Approximation of Tensor Operators on GPUs: Enabling Greater Trade-Offs between Performance and Accuracy
Fan Luo, Guangli Li, Zhaoyang Hao, Xueying Wang, Xiaobing Feng, Huimin Cui, and Jingling Xue
(Institute of Computing Technology at Chinese Academy of Sciences, China; Beijing University of Posts and Telecommunications, China; UNSW, Australia)
Pyls: Enabling Python Hardware Synthesis with Dynamic Polymorphism via LCRS Encoding
Bolei Tong, Yongyan Fang, Chaorui Wang, Qingan Li, Jingling Xue, and Yuan Mengting
(Wuhan University, China; UNSW, Australia)
TPDE: A Fast Adaptable Compiler Back-End Framework
Tobias Schwarz, Tobias Kamm, and Alexis Engelke
(TU Munich, Germany)
Published Artifact Archive submitted (70 kB) Info Artifacts Available Artifacts Reusable Results Reproduced
Space-Time Optimisations for Early Fault-Tolerant Quantum Computation
Sanaa Sharma and Prakash Murali
(University of Cambridge, UK)
Published Artifact Artifacts Available
Tensor Program Superoptimization through Cost-Guided Symbolic Program Synthesis
Alexander Brauckmann, Aarsh Chaube, José Wesley de Souza Magalhães, Elizabeth Polgreen, and Michael F. P. O’Boyle
(University of Edinburgh, UK)
Published Artifact Archive submitted (45 kB) Artifacts Available Artifacts Reusable Results Reproduced
Synthesizing Instruction Selection Back-Ends from ISA Specifications Made Practical
Florian Drescher and Alexis Engelke
(TU Munich, Germany)
Accelerating App Recompilation across Android System Updates by Code Reusing
Hongtao Wu, Yu Chen, Mengfei Xie, Futeng Yang, Jun Yan, Jiang Ma, Jianming Fu, Chun Jason Xue, and Qingan Li
(Wuhan University, China; Guangdong OPPO Mobile Telecommunications, China; Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates)
Compilation of Generalized Matrix Chains with Symbolic Sizes
Francisco López, Lars Karlsson, and Paolo Bientinesi
(Umeå University, Sweden)
Published Artifact Archive submitted (350 kB) Artifacts Available Artifacts Reusable Results Reproduced
SkeleShare: Algorithmic Skeletons and Equality Saturation for Hardware Resource Sharing
Jonathan Van der Cruysse, Tzung-Han Juang, Shakiba Bolbolian Khah, and Christophe Dubach
(McGill University, Canada)
Published Artifact Archive submitted (190 kB) Artifacts Available Artifacts Reusable Results Reproduced
FRUGAL: Pushing GPU Applications beyond Memory Limits
Lingqi Zhang, Tengfei Wang, Jiajun Huang, Chen Zhuang, Ivan R. Ivanov, Peng Chen, Toshio Endo, and Mohamed Wahib
(RIKEN RCCS, Japan; Google Cloud, Japan; University of South Florida, USA; Institute of Science Tokyo, Japan)
Archive submitted (510 kB)
SparseX: Synergizing GPU Libraries for Sparse Matrix Multiplication on Heterogeneous Processors
Ruifeng Zhang, Xiangwei Wang, Ang Li, and Xipeng Shen
(North Carolina State University, USA; Pacific Northwest National Laboratory, USA; University of Washington, USA)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Fast Autoscheduling for Sparse ML Frameworks
Bobby Yan, Alexander J Root, Trevor Gale, David Broman, and Fredrik Kjolstad
(Stanford University, USA; KTH Royal Institute of Technology, Sweden)
Archive submitted (240 kB)
LEGO: A Layout Expression Language for Code Generation of Hierarchical Mapping
Amir Mohammad Tavakkoli, Cosmin E. Oancea, and Mary Hall
(University of Utah, USA; University of Copenhagen, Denmark)
Published Artifact Info Artifacts Available Artifacts Reusable Results Reproduced
OpenQudit: Extensible and Accelerated Numerical Quantum Compilation via a JIT-Compiled DSL
Ed Younis
(Lawrence Berkeley National Laboratory, USA)
Published Artifact Artifacts Available Artifacts Reusable Results Reproduced
Practical: Are Abstract-Interpreter Baseline JITs Worth It? An Empirical Evaluation through Metacompilation
Nahuel Palumbo, Guillermo Polito, Stéphane Ducasse, and Pablo Tesone
(Univ. Lille - Inria - CNRS - Centrale Lille - UMR 9189 CRIStAL, France)
Archive submitted (77 kB)
Pushing Tensor Accelerators beyond MatMul in a User-Schedulable Language
Yihong Zhang, Derek Gerstmann, Andrew Adams, and Maaz Ahmad
(University of Washington, USA; Adobe, USA)
Published Artifact Artifacts Available
