12th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming (ARRAY 2026), June 15–19, 2026,
Boulder, CO, USA
Frontmatter
Papers
Portable Anomaly Detection for Distributed PGAS Programs Based on Array Mapping Abstractions
Raneem Abuyosef,
Thomas Huddleston,
Kirshanthan Sundararajah, and
Martin Kong
(Ohio State University, USA; Virginia Tech, USA)
The Partitioned Global Address Space (PGAS) execution model is a promising alternative to MPI. PGAS implementations offer several programmability advantages over their MPI counterparts while preserving the appeal of high-performance communication libraries. Among existing PGAS instances, Chapel and Unified Parallel C (UPC) are some of the leading implementations. Unfortunately, despite multiple advances, anomaly detection support for PGAS programs, both at the language level and library level, has been largely ignored.
In this work, we propose new static analyses to detect various distributed, multi-locale anomalies in Chapel and UPC programs. We leverage constraint formulae to model Chapel/UPC semantics together with anomaly-specific constraints that encapsulate triggering conditions of the anomaly. Generated models are then checked with the Z3 SMT solver to decide the existence of anomalies. While our implementation targets Chapel, the underlying approach generalizes to other PGAS languages, as we illustrate through a construct mapping to UPC. We demonstrate the effectiveness of our analyses on well-known scientific operators and communication patterns, comparing against native Chapel analyses and MPI debugging tools targeting lowered Chapel code. We further validate our approach with an evaluation on UPC programs.
Article: pldiws26arraymain-p2-p
Leveraging AI Ecosystem for Portable and Sustainable GPU Kernels in HPC
Yanbo Zhao,
Zhaonan Meng,
Sai Krishna Teja Varma Manthena,
Xu Liu,
Ajay Panyala, and
Jiajia Li
(North Carolina State University, USA; Pacific Northwest National Laboratory, USA)
High-Performance Computing (HPC) applications increasingly depend on GPUs, yet developing optimized kernels across evolving GPU architectures remains a major productivity bottleneck. With a tile-based programming model, Triton, a Python-based domain-specific language from the AI ecosystem, presents a compelling opportunity to simplify high-performance GPU kernel development for HPC. However, its tight coupling with Python creates significant integration barriers. In this paper, we investigate the feasibility of leveraging Triton for traditional HPC development. We present a compilation framework that transforms Triton kernels into standalone shared objects with C-compatible interfaces, eliminating Python dependencies and enabling seamless integration into HPC codebases while preserving optimization and portability benefits. We validate the approach by replacing kernels in representative HPC workloads with simpler Triton implementations that deploy across NVIDIA and AMD GPUs without modification. Triton achieves near-parity performance with native implementations on tile-friendly workloads, while irregular kernels reveal current limitations of its tile-based programming model. These results suggest that bridging the AI and HPC ecosystems via Triton offers a practical path toward more productive, portable, and sustainable GPU kernel development for HPC.
Article: pldiws26arraymain-p3-p
Refined Remora: Constraining Array Shapes
Vadym Matviichuk and
Olin Shivers
(Northeastern University, USA)
Remora is a higher-order functional array programming language based on the rank-polymorphic computational model originally developed by Iverson for APL. Unlike APL, Remora has a dependent type system that captures array shapes for use by the parallelizing compiler. Given that the type system is decidably checkable and inferrable, it inevitably must have limits on its expressiveness. In this paper, we extend Remora's type system by allowing programmers to refine array shapes with extra constraints, which are then checked by an SMT solver. We develop the shape-refinement type extensions for Remora; evaluate its utility for refining the types of standard computational kernels, such as convolution, as well as total applications, such as the YOLO vision application; and prove its type soundness.
Article: pldiws26arraymain-p9-p