ASPLOS 2023
28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (ASPLOS 2023)

28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (ASPLOS 2023), March 25–29, 2023, Vancouver, BC, Canada

ASPLOS 2023 – Proceedings


Frontmatter

Title Page
ASPLOS 2023 Message from the General Chair
ASPLOS 2023 Volume III Program Chairs’ Message
Committees
Sponsors

Keynotes

Direct Mind-Machine Teaming (Keynote)
Abhishek Bhattacharjee
(Yale University, USA)
Publisher's Version
Language Models: The Most Important Compute Challenge of Our Time (Keynote)
Bryan Catanzaro
(NVIDIA, USA)
Publisher's Version

Papers

ABNDP: Co-optimizing Data Access and Load Balance in Near-Data Processing
Boyu Tian, Qihang Chen, and Mingyu Gao
(Tsinghua University, China; Shanghai Qi Zhi Institute, Shanghai, China)
Publisher's Version
Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling
Toluwanimi O. Odemuyiwa, Hadi Asghari-Moghaddam, Michael Pellauer, Kartik Hegde, Po-An Tsai, Neal C. Crago, Aamer Jaleel, John D. Owens, Edgar Solomonik, Joel S. Emer, and Christopher W. Fletcher
(University of California at Davis, Davis, USA; University of Illinois at Urbana-Champaign, USA; NVIDIA, USA; Massachusetts Institute of Technology, USA)
Publisher's Version
APEX: A Framework for Automated Processing Element Design Space Exploration using Frequent Subgraph Analysis
Jackson Melchert, Kathleen Feng, Caleb Donovick, Ross Daly, Ritvik Sharma, Clark Barrett, Mark A. Horowitz, Pat Hanrahan, and Priyanka Raina
(Stanford University, USA)
Publisher's Version
Beyond Static Parallel Loops: Supporting Dynamic Task Parallelism on Manycore Architectures with Software-Managed Scratchpad Memories
Lin Cheng, Max Ruttenberg, Dai Cheol Jung, Dustin Richmond, Michael Taylor, Mark Oskin, and Christopher Batten
(Cornell University, USA; University of Washington, USA; University of California at Santa Cruz, Santa Cruz, USA)
Publisher's Version
CaQR: A Compiler-Assisted Approach for Qubit Reuse through Dynamic Circuit
Fei Hua, Yuwei Jin, Yanhao Chen, Suhas Vittal, Kevin Krsulich, Lev S. Bishop, John Lapeyre, Ali Javadi-Abhari, and Eddy Z. Zhang
(Rutgers University, USA; Georgia Institute of Technology, USA; IBM, USA)
Publisher's Version
CaT: A Solver-Aided Compiler for Packet-Processing Pipelines
Xiangyu Gao, Divya Raghunathan, Ruijie Fang, Tao Wang, Xiaotong Zhu, Anirudh Sivaraman, Srinivas Narayana, and Aarti Gupta
(New York University, USA; Princeton University, USA; Rutgers University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Characterizing and Optimizing End-to-End Systems for Private Inference
Karthik Garimella, Zahra Ghodsi, Nandan Kumar Jha, Siddharth Garg, and Brandon Reagen
(New York University, USA; Purdue University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Cohort: Software-Oriented Acceleration for Heterogeneous SoCs
Tianrui Wei, Nazerke Turtayeva, Marcelo Orenes-Vera, Omkar Lonkar, and Jonathan Balkind
(University of California at Berkeley, Berkeley, USA; University of California at Santa Barbara, Santa Barbara, USA; Princeton University, Princeton, USA)
Publisher's Version
Coyote: A Compiler for Vectorizing Encrypted Arithmetic Circuits
Raghav Malik, Kabir Sheth, and Milind Kulkarni
(Purdue University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
DefT: Boosting Scalability of Deformable Convolution Operations on GPUs
Edward Hanson, Mark Horton, Hai (Helen) Li, and Yiran Chen
(Duke University, Durham, USA)
Publisher's Version
Disaggregated RAID Storage in Modern Datacenters
Junyi Shu, Ruidong Zhu, Yun Ma, Gang Huang, Hong Mei, Xuanzhe Liu, and Xin Jin
(Peking University, China)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
DrGPUM: Guiding Memory Optimization for GPU-Accelerated Applications
Mao Lin, Keren Zhou, and Pengfei Su
(University of California at Merced, Merced, USA; OpenAI, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Efficient Compactions between Storage Tiers with PrismDB
Ashwini Raina, Jianan Lu, Asaf Cidon, and Michael J. Freedman
(Princeton University, USA; Columbia University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Efficient Scheduler Live Update for Linux Kernel with Modularization
Teng Ma, Shanpei Chen, Yihao Wu, Erwei Deng, Zhuo Song, Quan Chen, and Minyi Guo
(Alibaba Group, China; Shanghai Jiao Tong University, China)
Publisher's Version Published Artifact Artifacts Available
eHDL: Turning eBPF/XDP Programs into Hardware Designs for the NIC
Alessandro Rivitti, Roberto Bifulco, Angelo Tulumello, Marco Bonola, and Salvatore Pontarelli
(Axbryd, Italy; University of Rome Tor Vergata, Italy; NEC Laboratories Europe, Germany; CNIT, Italy; Sapienza University of Rome, Italy)
Publisher's Version
Exit-Less, Isolated, and Shared Access for Virtual Machines
Kenichi Yasukata, Hajime Tazaki, and Pierre-Louis Aublin
(IIJ Research Laboratory, Japan)
Publisher's Version
Finding Unstable Code via Compiler-Driven Differential Testing
Shaohua Li and Zhendong Su
(ETH Zurich, Switzerland)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Flexagon: A Multi-dataflow Sparse-Sparse Matrix Multiplication Accelerator for Efficient DNN Processing
Francisco Muñoz-Martínez, Raveesh Garg, Michael Pellauer, José L. Abellán, Manuel E. Acacio, and Tushar Krishna
(Universidad de Murcia, Spain; Georgia Institute of Technology, USA; NVIDIA, USA)
Publisher's Version
Going beyond the Limits of SFI: Flexible and Secure Hardware-Assisted In-Process Isolation with HFI
Shravan Narayan, Tal Garfinkel, Mohammadkazem Taram, Joey Rudek, Daniel Moghimi, Evan Johnson, Chris Fallin, Anjo Vahldiek-Oberwagner, Michael LeMay, Ravi Sahita, Dean Tullsen, and Deian Stefan
(University of California at San Diego, San Diego, USA; University of Texas at Austin, Austin, USA; Purdue University, USA; Fastly, USA; Intel Labs, Germany; Intel Labs, USA; Rivos, USA)
Publisher's Version
GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference
Haojie Ye, Sanketh Vedula, Yuhan Chen, Yichen Yang, Alex Bronstein, Ronald Dreslinski, Trevor Mudge, and Nishil Talati
(University of Michigan, USA; Technion, Israel)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Graphene: An IR for Optimized Tensor Computations on GPUs
Bastian Hagedorn, Bin Fan, Hanfeng Chen, Cris Cecka, Michael Garland, and Vinod Grover
(NVIDIA, Germany; NVIDIA, USA)
Publisher's Version
Heron: Automatically Constrained High-Performance Library Generation for Deep Learning Accelerators
Jun Bi, Qi Guo, Xiaqing Li, Yongwei Zhao, Yuanbo Wen, Yuxuan Guo, Enshuai Zhou, Xing Hu, Zidong Du, Ling Li, Huaping Chen, and Tianshi Chen
(University of Science and Technology of China, China; Institute of Computing Technology at Chinese Academy of Sciences, China; Cambricon Technologies, China; Institute of Software at Chinese Academy of Sciences, China)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional
Homunculus: Auto-Generating Efficient Data-Plane ML Pipelines for Datacenter Networks
Tushar Swamy, Annus Zulfiqar, Luigi Nardi, Muhammad Shahbaz, and Kunle Olukotun
(Stanford University, USA; Purdue University, USA; Lund University, Sweden)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Hyperscale Hardware Optimized Neural Architecture Search
Sheng Li, Garrett Andersen, Tao Chen, Liqun Cheng, Julian Grady, Da Huang, Quoc V. Le, Andrew Li, Xin Li, Yang Li, Chen Liang, Yifeng Lu, Yun Ni, Ruoming Pang, Mingxing Tan, Martin Wicke, Gang Wu, Shengqi Zhu, Parthasarathy Ranganathan, and Norman P. Jouppi
(Google, USA; Apple, USA; Waymo, USA)
Publisher's Version
Infinity Stream: Portable and Programmer-Friendly In-/Near-Memory Fusion
Zhengrong Wang, Christopher Liu, Aman Arora, Lizy John, and Tony Nowatzki
(University of California at Los Angeles, Los Angeles, USA; University of Texas at Austin, Austin, USA)
Publisher's Version
In-Network Aggregation with Transport Transparency for Distributed Training
Shuo Liu, Qiaoling Wang, Junyi Zhang, Wenfei Wu, Qinliang Lin, Yao Liu, Meng Xu, Marco Canini, Ray C. C. Cheung, and Jianfei He
(Huawei Technologies, China; Peking University, China; Sun Yat-sen University, China; King Abdullah University of Science and Technology, Saudi Arabia; City University of Hong Kong, China)
Publisher's Version
Kodan: Addressing the Computational Bottleneck in Space
Bradley Denby, Krishna Chintalapudi, Ranveer Chandra, Brandon Lucia, and Shadi Noghabi
(Carnegie Mellon University, USA; Microsoft Research, USA)
Publisher's Version
LEGO: Empowering Chip-Level Functionality Plug-and-Play for Next-Generation IoT Devices
Chong Zhang, Songfan Li, Yihang Song, Qianhe Meng, Minghua Chen, YanXu Bai, Li Lu, and Hongzi Zhu
(University of Electronic Science and Technology of China, China; Shanghai Jiao Tong University, China)
Publisher's Version
Mapping Very Large Scale Spiking Neuron Network to Neuromorphic Hardware
Ouwen Jin, Qinghui Xing, Ying Li, Shuiguang Deng, Shuibing He, and Gang Pan
(Zhejiang University, China)
Publisher's Version
Mosaic Pages: Big TLB Reach with Small Pages
Krishnan Gosakan, Jaehyun Han, William Kuszmaul, Ibrahim N. Mubarek, Nirjhar Mukherjee, Karthik Sriram, Guido Tagliavini, Evan West, Michael A. Bender, Abhishek Bhattacharjee, Alex Conway, Martin Farach-Colton, Jayneel Gandhi, Rob Johnson, Sudarsun Kannan, and Donald E. Porter
(Rutgers University, USA; University of North Carolina at Chapel Hill, USA; Massachusetts Institute of Technology, USA; Carnegie Mellon University, USA; Yale University, USA; Stony Brook University, USA; VMware Research, USA; Meta, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
MP-Rec: Hardware-Software Co-design to Enable Multi-path Recommendation
Samuel Hsia, Udit Gupta, Bilge Acun, Newsha Ardalani, Pan Zhong, Gu-Yeon Wei, David Brooks, and Carole-Jean Wu
(Harvard University, USA; Meta AI, USA)
Publisher's Version
NosWalker: A Decoupled Architecture for Out-of-Core Random Walk Processing
Shuke Wang, Mingxing Zhang, Ke Yang, Kang Chen, Shaonan Ma, Jinlei Jiang, and Yongwei Wu
(Tsinghua University, China; Beijing HaiZhi XingTu Technology, China)
Publisher's Version
Occamy: Elastically Sharing a SIMD Co-processor across Multiple CPU Cores
Zhongcheng Zhang, Yan Ou, Ying Liu, Chenxi Wang, Yongbin Zhou, Xiaoyu Wang, Yuyang Zhang, Yucheng Ouyang, Jiahao Shan, Ying Wang, Jingling Xue, Huimin Cui, and Xiaobing Feng
(Institute of Computing Technology at Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; HiSilicon Technologies, China; UNSW, Sydney, Australia)
Publisher's Version
Persistent Memory Disaggregation for Cloud-Native Relational Databases
Chaoyi Ruan, Yingqiang Zhang, Chao Bi, Xiaosong Ma, Hao Chen, Feifei Li, Xinjun Yang, Cheng Li, Ashraf Aboulnaga, and Yinlong Xu
(University of Science and Technology of China, China; Alibaba Group, China; Qatar Computing Research Institute, Qatar; Hamad Bin Khalifa University, Qatar)
Publisher's Version
PipeSynth: Automated Synthesis of Microarchitectural Axioms for Memory Consistency
Chase Norman, Adwait Godbole, and Yatin A. Manerkar
(University of California at Berkeley, Berkeley, USA; University of Michigan, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Protect the System Call, Protect (Most of) the World with BASTION
Christopher Jelesnianski, Mohannad Ismail, Yeongjin Jang, Dan Williams, and Changwoo Min
(Virginia Tech, USA; Oregon State University, USA)
Publisher's Version
Re-architecting I/O Caches for Emerging Fast Storage Devices
Mohammadamin Ajdari, Pouria Peykani Sani, Amirhossein Moradi, Masoud Khanalizadeh Imani, Amir Hossein Bazkhanei, and Hossein Asadi
(HPDS Research, Iran; Sharif University of Technology, Iran)
Publisher's Version
Reconfigurable Virtual Memory for FPGA-Driven I/O
Joshua Landgraf, Matthew Giordano, Esther Yoon, and Christopher J. Rossbach
(University of Texas at Austin, USA; Katana Graph, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
RepCut: Superlinear Parallel RTL Simulation with Replication-Aided Partitioning
Haoyuan Wang and Scott Beamer
(University of California at Santa Cruz, Santa Cruz, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Rosebud: Making FPGA-Accelerated Middlebox Development More Pleasant
Moein Khazraee, Alex Forencich, George C. Papen, Alex C. Snoeren, and Aaron Schulman
(Massachusetts Institute of Technology, USA; University of California at San Diego, San Diego, USA)
Publisher's Version Published Artifact Info Artifacts Available Artifacts Functional
Simulator Independent Coverage for RTL Hardware Languages
Kevin Laeufer, Vighnesh Iyer, David Biancolin, Jonathan Bachrach, Borivoje Nikolić, and Koushik Sen
(University of California at Berkeley, Berkeley, USA; SiFive, USA; JITX, USA)
Publisher's Version Published Artifact Info Artifacts Available Artifacts Functional Results Reproduced
Skybox: Open-Source Graphic Rendering on Programmable RISC-V GPUs
Blaise Tine, Varun Saxena, Santosh Srivatsan, Joshua R. Simpson, Fadi Alzammar, Liam Cooper, and Hyesoon Kim
(Georgia Institute of Technology, USA; California Polytechnic State University, USA)
Publisher's Version
Snape: Reliable and Low-Cost Computing with Mixture of Spot and On-Demand VMs
Fangkai Yang, Lu Wang, Zhenyu Xu, Jue Zhang, Liqun Li, Bo Qiao, Camille Couturier, Chetan Bansal, Soumya Ram, Si Qin, Zhen Ma, Íñigo Goiri, Eli Cortez, Terry Yang, Victor Rühle, Saravan Rajmohan, Qingwei Lin, and Dongmei Zhang
(Microsoft Research, China; Microsoft 365, France; Microsoft 365, USA; Microsoft Azure, USA; Microsoft 365, China; Microsoft 365, UK)
Publisher's Version
Space-Efficient TREC for Enabling Deep Learning on Microcontrollers
Jiesong Liu, Feng Zhang, Jiawei Guan, Hsin-Hsuan Sung, Xiaoguang Guo, Xiaoyong Du, and Xipeng Shen
(Renmin University of China, China; North Carolina State University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning
Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, and Luis Ceze
(University of Washington, USA; Carnegie Mellon University, USA; OctoML, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development
Zujun Tan, Yebin Chon, Michael Kruse, Johannes Doerfert, Ziyang Xu, Brian Homerding, Simone Campanoni, and David I. August
(Princeton University, USA; Argonne National Laboratory, USA; Northwestern University, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
TeraHeap: Reducing Memory Pressure in Managed Big Data Frameworks
Iacovos G. Kolokasis, Giannos Evdorou, Shoaib Akram, Christos Kozanitis, Anastasios Papagiannis, Foivos S. Zakkak, Polyvios Pratikakis, and Angelos Bilas
(University of Crete, Greece; ICS-FORTH, Greece; Australian National University, Australia; Isovalent, USA; Red Hat, UK)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional
The Sparse Abstract Machine
Olivia Hsu, Maxwell Strange, Ritvik Sharma, Jaeyeon Won, Kunle Olukotun, Joel S. Emer, Mark A. Horowitz, and Fredrik Kjølstad
(Stanford University, USA; Massachusetts Institute of Technology, USA; NVIDIA, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Towards an Adaptable Systems Architecture for Memory Tiering at Warehouse-Scale
Padmapriya Duraisamy, Wei Xu, Scott Hare, Ravi Rajwar, David Culler, Zhiyi Xu, Jianing Fan, Christopher Kennelly, Bill McCloskey, Danijela Mijailovic, Brian Morris, Chiranjit Mukherjee, Jingliang Ren, Greg Thelen, Paul Turner, Carlos Villavieja, Parthasarathy Ranganathan, and Amin Vahdat
(Google, USA; University of California at Berkeley, Berkeley, USA)
Publisher's Version
TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory
Hasan Al Maruf, Hao Wang, Abhishek Dhanotia, Johannes Weiner, Niket Agarwal, Pallab Bhattacharya, Chris Petersen, Mosharaf Chowdhury, Shobhit Kanaujia, and Prakash Chauhan
(University of Michigan, USA; NVIDIA, USA; Meta, USA)
Publisher's Version
Transparent Runtime Change Handling for Android Apps
Zizhan Chen and Zili Shao
(Chinese University of Hong Kong, China)
Publisher's Version Published Artifact Info Artifacts Available Artifacts Functional Results Reproduced
Untangle: A Principled Framework to Design Low-Leakage, High-Performance Dynamic Partitioning Schemes
Zirui Neil Zhao, Adam Morrison, Christopher W. Fletcher, and Josep Torrellas
(University of Illinois at Urbana-Champaign, USA; Tel Aviv University, Israel)
Publisher's Version
Verification of Nondeterministic Quantum Programs
Yuan Feng and Yingte Xu
(University of Technology Sydney, Australia)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
Vidi: Record Replay for Reconfigurable Hardware
Gefei Zuo, Jiacheng Ma, Andrew Quinn, and Baris Kasikci
(University of Michigan, USA; University of California at Santa Cruz, Santa Cruz, USA; Google, USA)
Publisher's Version Published Artifact Artifacts Available Artifacts Functional Results Reproduced
