PLDI 2026 Co-Located Events

12th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming (ARRAY 2026), June 15–19, 2026, Boulder, CO, USA

ARRAY 2026 – Preliminary Table of Contents


12th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming (ARRAY 2026)

Frontmatter

Title Page


Welcome from the Chairs
ARRAY 2026, the ACM SIGPLAN Workshop on Libraries, Languages and Compilers for Array Programming, was co-located with PLDI 2026 and continued its mission of advancing the theory and practice of array-oriented programming. Array programming combines high-level mathematical abstractions with language constructs that expose regular control flow, structured data dependencies, reshaping operations, sparsity, and other properties that support sophisticated program analysis and optimization. The workshop explores a broad spectrum of topics including languages, semantics, type systems, libraries, compilation techniques, automated synthesis, performance portability, and data-layout optimizations, while fostering interaction between academia and industry. ARRAY 2026 received eleven submissions and, following a single-blind review process, accepted three full papers, seven extended abstracts, and one research preview, with each submission receiving at least two reviews. The program featured keynote talks by Mary Hall (University of Utah) on compiler-assisted optimization of data movement through aggregate data abstractions, and Jared Roesch (NVIDIA) on emerging directions in large-scale parallel programming and the growing accessibility of high-performance computing to programmers who employ higher-level declarative languages. By bringing together researchers and practitioners from programming languages, scientific computing, machine learning, compiler technology, and library development, ARRAY serves as a forum for exchanging ideas on both the foundational principles and practical tools of array programming.

ARRAY 2026 Organization


Papers

Portable Anomaly Detection for Distributed PGAS Programs Based on Array Mapping Abstractions
Raneem Abuyosef, Thomas Huddleston, Kirshanthan Sundararajah, and Martin Kong
(Ohio State University, USA; Virginia Tech, USA)
The Partitioned Global Address Space (PGAS) execution model is a promising alternative to MPI. PGAS implementations offer several programmability advantages over their MPI counterparts while preserving the appeal of high-performance communication libraries. Chapel and Unified Parallel C (UPC) are two of the leading PGAS implementations. Unfortunately, despite multiple advances, anomaly detection support for PGAS programs, at both the language level and the library level, has been largely ignored.
In this work, we propose new static analyses to detect various distributed, multi-locale anomalies in Chapel and UPC programs. We leverage constraint formulae to model Chapel/UPC semantics together with anomaly-specific constraints that encapsulate the triggering conditions of each anomaly. The generated models are then checked with the Z3 SMT solver to decide whether anomalies exist. While our implementation targets Chapel, the underlying approach generalizes to other PGAS languages, as we illustrate through a construct mapping to UPC. We demonstrate the effectiveness of our analyses on well-known scientific operators and communication patterns, comparing against native Chapel analyses and MPI debugging tools targeting lowered Chapel code. We further validate our approach with an evaluation on UPC programs.

Leveraging AI Ecosystem for Portable and Sustainable GPU Kernels in HPC
Yanbo Zhao, Zhaonan Meng, Sai Krishna Teja Varma Manthena, Xu Liu, Ajay Panyala, and Jiajia Li
(North Carolina State University, USA; Pacific Northwest National Laboratory, USA)
High-Performance Computing (HPC) applications increasingly depend on GPUs, yet developing optimized kernels across evolving GPU architectures remains a major productivity bottleneck. Triton, a Python-based domain-specific language from the AI ecosystem with a tile-based programming model, presents a compelling opportunity to simplify high-performance GPU kernel development for HPC. However, its tight coupling with Python creates significant integration barriers. In this paper, we investigate the feasibility of leveraging Triton for traditional HPC development. We present a compilation framework that transforms Triton kernels into standalone shared objects with C-compatible interfaces, eliminating Python dependencies and enabling seamless integration into HPC codebases while preserving optimization and portability benefits. We validate the approach by replacing kernels in representative HPC workloads with simpler Triton implementations that deploy across NVIDIA and AMD GPUs without modification. Triton achieves near-parity performance with native implementations on tile-friendly workloads, while irregular kernels reveal current limitations of its tile-based programming model. These results suggest that bridging the AI and HPC ecosystems via Triton offers a practical path toward more productive, portable, and sustainable GPU kernel development for HPC.

Refined Remora: Constraining Array Shapes
Vadym Matviichuk and Olin Shivers
(Northeastern University, USA)
Remora is a higher-order functional array programming language based on the rank-polymorphic computational model originally developed by Iverson for APL. Unlike APL, Remora has a dependent type system that captures array shapes for use by the parallelizing compiler. Because the type system is decidably checkable and inferrable, its expressiveness is necessarily limited. In this paper, we extend Remora's type system by allowing programmers to refine array shapes with extra constraints, which are then checked by an SMT solver. We develop the shape-refinement type extensions for Remora; evaluate their utility for refining the types of standard computational kernels, such as convolution, as well as complete applications, such as the YOLO vision application; and prove type soundness for the extended system.

