PPoPP 2015 – Proceedings

20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2015), February 7–11, 2015, San Francisco, CA, USA

Frontmatter

Concurrency
Mon, Feb 9, 10:20 - 12:00

The SprayList: A Scalable Relaxed Priority Queue
Dan Alistarh, Justin Kopinsky, Jerry Li, and Nir Shavit
(Microsoft Research, UK; Massachusetts Institute of Technology, USA; Tel Aviv University, Israel)

Predicate RCU: An RCU for Scalable Concurrent Updates
Maya Arbel and Adam Morrison
(Technion, Israel)

Automatic Scalable Atomicity via Semantic Locking
Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, and Eran Yahav
(Yahoo Labs, Israel; Microsoft Research, India; Tel Aviv University, Israel; Technion, Israel)

Code Generation
Mon, Feb 9, 13:30 - 14:45

A Framework for Practical Parallel Fast Matrix Multiplication
Austin R. Benson and Grey Ballard
(Stanford University, USA; Sandia National Laboratories, USA)

PLUTO+: Near-Complete Modeling of Affine Transformations for Parallelism and Locality
Aravind Acharya and Uday Bondhugula
(Indian Institute of Science, India)

Distributed Memory Code Generation for Mixed Irregular/Regular Computations
Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan
(Ohio State University, USA; Louisiana State University, USA)

Transactional Memory
Mon, Feb 9, 15:10 - 16:25

Software Partitioning of Hardware Transactions
Lingxiang Xiang and Michael L. Scott
(University of Rochester, USA)

Performance Implications of Dynamic Memory Allocators on Transactional Memory Systems
Alexandro Baldassin, Edson Borin, and Guido Araujo
(UNESP, Brazil; UNICAMP, Brazil)

Low-Overhead Software Transactional Memory with Progress Guarantees and Strong Semantics
Minjia Zhang, Jipeng Huang, Man Cao, and Michael D. Bond
(Ohio State University, USA)

Large Scale Parallelism
Tue, Feb 10, 08:25 - 09:40

Barrier Elision for Production Parallel Programs
Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushik Sen, John Mellor-Crummey, and Costin Iancu
(Rice University, USA; Lawrence Berkeley National Laboratory, USA; University of California at Berkeley, USA)

Scalable and Efficient Implementation of 3D Unstructured Meshes Computation: A Case Study on Matrix Assembly
Loïc Thébault, Eric Petit, and Quang Dinh
(University of Versailles, France; Dassault Aviation, France)

Diagnosing the Causes and Severity of One-Sided Message Contention
Nathan R. Tallent, Abhinav Vishnu, Hubertus Van Dam, Jeff Daily, Darren J. Kerbyson, and Adolfy Hoisie
(Pacific Northwest National Laboratory, USA)

Verification and Accelerators
Tue, Feb 10, 10:05 - 11:45

A Parallel Algorithm for Global States Enumeration in Concurrent Systems
Yen-Jung Chang and Vijay K. Garg
(University of Texas at Austin, USA)

Dynamic Deadlock Verification for General Barrier Synchronisation
Tiago Cogumbreiro, Raymond Hu, Francisco Martins, and Nobuko Yoshida
(Imperial College London, UK; University of Lisbon, Portugal)

VirtCL: A Framework for OpenCL Device Abstraction and Management
Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, and Yen-Ting Chao
(National Chiao Tung University, Taiwan)

On Optimizing Machine Learning Workloads via Kernel Fusion
Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, and P. Sadayappan
(Ohio State University, USA; IBM, USA; IBM, Canada)

Algorithms
Tue, Feb 10, 14:45 - 16:00

NUMA-Aware Graph-Structured Analytics
Kaiyuan Zhang, Rong Chen, and Haibo Chen
(Shanghai Jiao Tong University, China)

SYNC or ASYNC: Time to Fuse for Distributed Graph-Parallel Computation
Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, and Haibo Chen
(Shanghai Jiao Tong University, China)

Cache-Oblivious Wavefront: Improving Parallelism of Recursive Dynamic Programming Algorithms without Losing Cache-Efficiency
Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, and Rezaul A. Chowdhury
(Fudan University, China; Stony Brook University, USA)

Locking and Locality
Wed, Feb 11, 09:40 - 10:55

High Performance Locks for Multi-level NUMA Systems
Milind Chabbi, Michael Fagan, and John Mellor-Crummey
(Rice University, USA)

A Library for Portable and Composable Data Locality Optimizations for NUMA Systems
Zoltan Majo and Thomas R. Gross
(ETH Zurich, Switzerland)

MPI+Threads: Runtime Contention and Remedies
Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji, and Satoshi Matsuoka
(Tokyo Institute of Technology, Japan; Argonne National Laboratory, USA; Shenzhen Institute of Advanced Technologies at Chinese Academy of Sciences, China)

Poster Abstracts
Sun, Feb 8, 18:15 - 20:00

Fence Placement for Legacy Data-Race-Free Programs via Synchronization Read Detection
Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, and Marcelo Cintra
(University of Edinburgh, UK; University of St. Andrews, UK; Intel, Germany)

JAWS: A JavaScript Framework for Adaptive CPU-GPU Work Sharing
Xianglan Piao, Channoh Kim, Younghwan Oh, Huiying Li, Jincheon Kim, Hanjun Kim, and Jae W. Lee
(Sungkyunkwan University, South Korea; Company 100, South Korea; POSTECH, South Korea)

GStream: A Graph Streaming Processing Method for Large-Scale Graphs on GPUs
Hyunseok Seo, Jinwook Kim, and Min-Soo Kim
(DGIST, South Korea)

SemCache++: Semantics-Aware Caching for Efficient Multi-GPU Offloading
Nabeel Al-Saber and Milind Kulkarni
(Purdue University, USA)

An OpenACC-Based Unified Programming Model for Multi-accelerator Systems
Jungwon Kim, Seyong Lee, and Jeffrey S. Vetter
(Oak Ridge National Laboratory, USA; Georgia Tech, USA)

Towards Batched Linear Solvers on Accelerated Hardware Platforms
Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, and Jack Dongarra
(University of Tennessee, USA; Oak Ridge National Laboratory, USA; University of Manchester, UK)

A Collection-Oriented Programming Model for Performance Portability
Saurav Muralidharan, Michael Garland, Bryan Catanzaro, Albert Sidelnik, and Mary Hall
(University of Utah, USA; NVIDIA, USA; Baidu, USA)

Gunrock: A High-Performance Graph Processing Library on the GPU
Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens
(University of California at Davis, USA)

Decoupled Load Balancing
Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, and Nancy M. Amato
(Texas A&M University, USA; Lawrence Livermore National Laboratory, USA)

Combining Phase Identification and Statistic Modeling for Automated Parallel Benchmark Generation
Ye Jin, Mingliang Liu, Xiaosong Ma, Qing Liu, Jeremy Logan, Norbert Podhorszki, Jong Youl Choi, and Scott Klasky
(North Carolina State University, USA; Qatar Computing Research Institute, Qatar; Oak Ridge National Laboratory, USA)

Optimization of Asynchronous Graph Processing on GPU with Hybrid Coloring Model
Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He, Hai Jin, Lu Lu, Zhixiang Wang, Xuan Luo, and Jianlong Zhong
(Huazhong University of Science and Technology, China; Argonne National Laboratory, USA; Nanyang Technological University, Singapore)

Efficient and Reasonable Object-Oriented Concurrency
Scott West, Sebastian Nanz, and Bertrand Meyer
(ETH Zurich, Switzerland)

A Programming Model and Runtime System for Significance-Aware Energy-Efficient Computing
Vassilis Vassiliadis, Konstantinos Parasyris, Charalambos Chalios, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas, Hans Vandierendonck, and Dimitrios S. Nikolopoulos
(University of Thessaly, Greece; Centre for Research and Technology Hellas, Greece; Queen's University of Belfast, UK)

The Lock-Free k-LSM Relaxed Priority Queue
Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, and Philippas Tsigas
(TU Vienna, Austria; Chalmers University of Technology, Sweden)

Static/Dynamic Validation of MPI Collective Communications in Multi-threaded Context
Emmanuelle Saillard, Patrick Carribault, and Denis Barthou
(CEA, France; Bordeaux Institute of Technology, France; LaBRI, France; INRIA, France)

CASTLE: Fast Concurrent Internal Binary Search Tree using Edge-Based Locking
Arunmoezhi Ramachandran and Neeraj Mittal
(University of Texas at Dallas, USA)

Section Based Program Analysis to Reduce Overhead of Detecting Unsynchronized Thread Communication
Madan Das, Gabriel Southern, and Jose Renau
(University of California at Santa Cruz, USA)

A Hierarchical Approach to Reducing Communication in Parallel Graph Algorithms
Harshvardhan, Nancy M. Amato, and Lawrence Rauchwerger
(Texas A&M University, USA)

Tiles: A New Language Mechanism for Heterogeneous Parallelism
Yifeng Chen, Xiang Cui, and Hong Mei
(Peking University, China)

Are Web Applications Ready for Parallelism?
Cosmin Radoi, Stephan Herhut, Jaswanth Sreeram, and Danny Dig
(University of Illinois at Urbana-Champaign, USA; Intel, USA; Oregon State University, USA)
