8th Workshop on General Purpose Processing using GPUs (GPGPU 8), February 7, 2015, San Francisco, CA, USA

GPGPU 2015 – Proceedings

Preface

Title Page
Message from the Workshop Organizers

HPC

A Comparative Investigation of Device-Specific Mechanisms for Exploiting HPC Accelerators
Ayman Tarakji, Lukas Börger, and Rainer Leupers
(RWTH Aachen University, Germany)

Cache and Shared Memory

GPU-SM: Shared Memory Multi-GPU Programming
Javier Cabezas, Marc Jordà, Isaac Gelado, Nacho Navarro, and Wen-mei Hwu
(Barcelona Supercomputing Center, Spain; NVIDIA, USA; Universitat Politècnica de Catalunya, Spain; University of Illinois at Urbana-Champaign, USA)
Adaptive GPU Cache Bypassing
Yingying Tian, Sooraj Puthoor, Joseph L. Greathouse, Bradford M. Beckmann, and Daniel A. Jiménez
(Texas A&M University, USA; AMD Research, USA)
Efficient Utilization of GPGPU Cache Hierarchy
Mahmoud Khairy, Mohamed Zahran, and Amr G. Wassal
(Cairo University, Egypt; New York University, USA)

Optimization

Effects of Source-Code Optimizations on GPU Performance and Energy Consumption
Jared Coplin and Martin Burtscher
(Texas State University, USA)
Optimization for Performance and Energy for Batched Matrix Computations on GPUs
Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, and Jack Dongarra
(University of Tennessee, USA; Oak Ridge National Laboratory, USA; University of Manchester, UK)
Helium: A Transparent Inter-kernel Optimizer for OpenCL
Thibaut Lutz, Christian Fensch, and Murray Cole
(University of Edinburgh, UK; Heriot-Watt University, UK)

Applications

Stochastic Gradient Descent on GPUs
Rashid Kaleem, Sreepathi Pai, and Keshav Pingali
(University of Texas at Austin, USA)
High Performance Computing of Fiber Scattering Simulation
Leiming Yu, Yan Zhang, Xiang Gong, Nilay Roy, Lee Makowski, and David Kaeli
(Northeastern University, USA)
Rethinking the Parallelization of Random-Restart Hill Climbing: A Case Study in Optimizing a 2-Opt TSP Solver for GPU Execution
Molly A. O'Neil and Martin Burtscher
(Texas State University, USA)
Forma: A DSL for Image Processing Applications to Target GPUs and Multi-core CPUs
Mahesh Ravishankar, Justin Holewinski, and Vinod Grover
(NVIDIA, USA)