PLDI 2026 Co-Located Events


27th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2026), June 15–16, 2026, Boulder, CO, USA

LCTES 2026 – Proceedings



Frontmatter

Title Page
Article: pldiws26lctesforeword-fm000-p doi:

Welcome from the Chairs
Article: pldiws26lctesforeword-fm001-p doi:

LCTES 2026 Organization
Article: pldiws26lctesforeword-fm002-p doi:

Advanced Compiler Tuning

RAPO: Retrieval-Augmented Phase Ordering
Jinwook Yang, Junghyun Lee, Yeonsun Hong, and Hyojin Sung
(Seoul National University, Republic of Korea)

Publisher's Version Article: pldiws26lctesmain-p50-p doi:10.1145/3814943.3816172

CausalTuner: Feature-Aware Causal Guidance for Compiler Auto-tuning
Jiaqing Zhong, Juan Chen, Yichang Zhou, and Kuan Li
(National University of Defense Technology, China; Dongguan University of Technology, China)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p93-p doi:10.1145/3814943.3816181

Empirical Observations about Profile-Guided Optimizations for Mainstream C/C++ Compilers
Soma Pal and Prasad Anil Kulkarni
(University of Kansas, USA)

Publisher's Version Article: pldiws26lctesmain-p88-p doi:10.1145/3814943.3816180

Binary Optimization and System Security

DeduBB: Binary Code Size Reduction via Post-Link Basic Block Deduplication
Chaitanya Mamatha Ananda, Mahbod Afarin, Rajiv Gupta, Sriraman Tallam, Han Shen, and Xinliang David Li
(University of California at Riverside, USA; Google, USA)
Binary sizes of upgraded versions of software applications tend to grow, primarily due to feature bloat. This poses various challenges, particularly for mobile applications: it lowers upgrade rates, which directly impacts revenue; it increases the maintenance cost of supporting multiple versions; and it prevents some users from getting critical security fixes. Code bloat is also a problem for large warehouse-scale applications, which experience performance degradation when their code size exceeds what smaller and more efficient code models can handle.
In this paper, we introduce a post-link optimization technique called DeduBB, which deduplicates basic blocks of an application across procedure boundaries. Because prior techniques rely on function outlining to deduplicate identical code sequences, they miss many opportunities, such as duplicate code patterns that manipulate the program stack. In addition, previous techniques were either limited to the scope of a module or lacked the scalable implementations required to handle large warehouse-scale applications. DeduBB exploits inter-module opportunities and deduplicates more code patterns than prior techniques because it uses a novel save-and-jump code sequence to execute deduplicated code blocks. DeduBB is designed to work on scalable post-link optimizers and can be applied even to large warehouse-scale data center applications. Finally, DeduBB is profile-guided and can be applied selectively to infrequently executed cold basic blocks so as not to affect application performance. In several cases, the performance of the smaller application binary even improves slightly due to reductions in its hot working set size.
We have designed our technique for the state-of-the-art post-link optimizers BOLT and Propeller. Experiments show that DeduBB reduces the code size of several benchmarks by 1.55% to 18.63% on both Arm and x86 platforms, even on binaries that have already been heavily optimized for size using existing code size reduction features. For warehouse-scale binaries, DeduBB reduces code size by up to 25.8%. Aided by profiles, our technique retains over 82% of the maximal code size savings without affecting performance.

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p31-p doi:10.1145/3814943.3816169

SymFlow: Event-Chain-Aware Symbolic Execution for Serverless Sensitive Data Flow Detection
Yuanpeng Wang, Zhineng Zhong, Zhenkai Liang, Ding Li, Yao Guo, and Xiangqun Chen
(Peking University, China; National University of Singapore, Singapore)

Publisher's Version Article: pldiws26lctesmain-p21-p doi:10.1145/3814943.3816168

CVS: A Metric for Security-Aware Compilation against Side-Channel Attacks in Edge SoCs (WIP)
Yi Han, Puhong Lei, Yang Shi, Zhe Li, Xing Mou, Jianjun Chen, and Yaohua Wang
(National University of Defense Technology, Changsha, China; Key Laboratory of Advanced Microprocessor Chips and Systems, Changsha, China; Hunan Greatwall Galaxy Science and Technology, Changsha, China)

Publisher's Version Article: pldiws26lctesmain-p71-p doi:10.1145/3814943.3816178

A Programming Model for Efficient Inter-Kernel Control-Flow on Memory-Mapped Near-Data Processing Architecture (WIP)
Seungheon Lee, Wonhyuk Yang, Seonyeong Heo, and Gwangsun Kim
(POSTECH, Republic of Korea; Kyung Hee University, Republic of Korea)

Publisher's Version Article: pldiws26lctesmain-p112-p doi:10.1145/3814943.3816184

FLUX: Frequency Scaling with Layer-wise Utilization for Energy-Efficient NPU Execution (WIP)
Inho Lee, Ky Yeop Lim, Hyejun Kim, Beomseok Kim, Dongsuk Jeon, Hunjun Lee, and Yongjun Park
(Hanyang University, Republic of Korea; Samsung Electronics, Republic of Korea; Yonsei University, Republic of Korea; Seoul National University, Republic of Korea)

Publisher's Version Article: pldiws26lctesmain-p69-p doi:10.1145/3814943.3816177

Formal Methods and Systems Reliability

Towards Verifiable System Code using a DSL Compiled to Efficient and Readable C Code
Clément Chavanon, Henrik Karlsson, Frédéric Besson, Sandrine Blazy, and Roberto Guanciale
(Inria - Univ Rennes - CNRS - IRISA, France; KTH Royal Institute of Technology, Sweden; Univ Rennes - Inria - CNRS - IRISA, France)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p40-p doi:10.1145/3814943.3816170

A Pointer-Ownership Model for C Inspired by Rust
David Svoboda, William Klieber, Lori Flynn, Ruben Martins, and Jeffrey Hoskinson
(Carnegie Mellon University, USA)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p95-p doi:10.1145/3814943.3816182

Hikami: A Lightweight Hypervisor for Emulating RISC-V Extension Semantics with Sail-Driven Auto-generation
Norimasa Takana and Yoshihiro Oyama
(University of Tsukuba, Japan)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p45-p doi:10.1145/3814943.3816171

Scheduled Partial-Credit RL for Reliable Code Generation with Small Language Models (WIP)
Suryansh Singh Sijwali and Suman Saha
(Pennsylvania State University, USA)

Publisher's Version Article: pldiws26lctesmain-p11-p doi:10.1145/3814943.3816167

Specialized Hardware and Accelerator Design

Can Fine-Grain Multi-threading Subsume VLIW?
Scott Pomerville, Soner Önder, Gang-Ryung Uh, and David Whalley
(Northern Michigan University, USA; Michigan Technological University, USA; Florida State University, USA)

Publisher's Version Article: pldiws26lctesmain-p73-p doi:10.1145/3814943.3816179

Sirop: A Small IR for HLS with Parallel Patterns
Louis Hildebrand and Christophe Dubach
(McGill University, Canada; MILA, Canada)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p61-p doi:10.1145/3814943.3816175

A Functional Approach to Synthesizing Routable Programmable Accelerators for Neural Networks
Tzung-Han Juang, Paul Teng, and Christophe Dubach
(McGill University, Canada; MILA, Canada)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p99-p doi:10.1145/3814943.3816183

LoopHint: A Compiler-Assisted Loop Branch Predictor for Embedded DSPs
Yuanyang Xiang, Chen Xu, Ruozhou Xiao, and Zhiwei Zhang
(Institute of Automation at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China)

Publisher's Version Article: pldiws26lctesmain-p1-p doi:10.1145/3814943.3816166

Memory Efficiency and Control-Flow Analysis

MemSpec: Memory-Aware Runtime for Adaptive Draft Scheduling in Speculative Decoding on Edge Devices
Eunjeong Kim, Yeong Jun Jeon, and Myeonggyun Han
(Kyungpook National University, Republic of Korea)

Publisher's Version Article: pldiws26lctesmain-p56-p doi:10.1145/3814943.3816174

Bridging the Memory Hotness Gap in Edge Systems with Hotness-Segregated Object Allocation
Ruizhe Huang, Jiahua Wang, Qihang Xu, Peng Jiang, Zhida An, Ding Li, Yao Guo, Xiangqun Chen, Yuxin Ren, and Ning Jia
(Peking University, China; Southeast University, China; Huawei Technologies, China)

Publisher's Version Article: pldiws26lctesmain-p64-p doi:10.1145/3814943.3816176

On the Origins of Indirect Jumps in Embedded Software
Ariane Nicolas, Ronan Lashermes, Isabelle Puaut, and Erven Rohou
(Univ Rennes - Inria - CNRS - IRISA, France; Rambus, Netherlands)

Published Artifact; Artifacts Available; Artifacts Reusable; Results Reproduced

Publisher's Version Article: pldiws26lctesmain-p55-p doi:10.1145/3814943.3816173
