ICFP Workshops 2019
24th ACM SIGPLAN International Conference on Functional Programming (ICFP 2019)
Powered by
Conference Publishing Consulting

8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC 2019), August 18, 2019, Berlin, Germany

FHPNC 2019 – Preliminary Table of Contents

Contents - Abstracts - Authors

8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC 2019)

Frontmatter

Title Page


Message from the Chair


Papers

Generating Efficient FFT GPU Code with Lift
Bastian Köpcke, Michel Steuwer, and Sergei Gorlatch
(University of Münster, Germany; University of Glasgow, UK)
The Fast Fourier Transform is a well-known algorithm used in many high-performance applications, ranging from signal processing to convolutional neural networks.
In this paper, we encode FFTs by building high-level abstractions based on a set of functional parallel patterns in the Lift language. Abstractions are derived from and closely resemble mathematical definitions for FFTs. We leverage the Lift performance-portable code generator to generate high performing GPU code for FFTs. No FFT-specific patterns are required for this, showing the expressive power of the generic parallel patterns used in Lift.
Our experimental results show that our approach achieves performance better than AMD's OpenCL implementation clFFT on an Nvidia GPU. Nvidia's highly optimized cuFFT implementation still performs better on their GPUs.

Article Search
Position-Dependent Arrays and Their Application for High Performance Code Generation
Federico Pizzuti, Michel Steuwer, and Christophe Dubach
(University of Edinburgh, UK; University of Glasgow, UK)
Modern parallel hardware promises unprecedented performance, for the gifted few experts who can program it correctly. Code generators from high-level languages provide an attractive alternative, promising to deliver high performance automatically.
Existing projects such as Accelerate, Futhark, Halide, or Lift show that this approach is feasible. Unfortunately, existing efforts focus on computations over tensors: regularly shaped higher dimensional arrays. This limits the expressiveness of these approaches and excludes many interesting data structures that are commonly encoded manually in memory, such as trees or triangular matrices.
This paper presents an extended array type that lifts this restriction. For multidimensional arrays, the size of a nested array might depend on its position in the surrounding arrays, enabling the expression of computations over less regularly shaped data structures. However, position-dependent arrays bring new challenges for high-performance code generation, as indexing elements in memory becomes more challenging.
This paper shows how these challenges are addressed by extending the existing Lift type system and compiler. The experimental results show that this approach enables the efficient code generation of triangular matrix-vector multiplication, with performance improvements over cuBLAS on an Nvidia GPU by up to 2×. Furthermore, we show a use case for a low-level optimization for avoiding unnecessary out-of-bound checks in stencils, leading to up to 3× improvements over already optimized generated stencil codes.

Article Search
Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis
Justus Sagemüller and Olivier Verdier
(University of Cologne, Germany; Western Norway University of Applied Sciences, Norway; KTH, Sweden)
Vectors in numerical computation, i.e., arrays of numbers, often represent continuous functions. We would like to reflect this with types. One apparent obstacle is that spaces of functions are typically infinite-dimensional, while the code must run in finite time and memory.
We argue that this can be overcome: even in an infinite-dimensional space, the vectors can in practice be stored in finite memory. However, dual vectors (corresponding essentially to distributions) require infinite data structure. The distinction is usually lost in the finite dimensional case, since dual vectors are often simply represented as vectors (by implicitly choosing a scalar product establishing the correspondence). However, we shall see that an explicit type-level distinction between functions and distributions makes sense and allows directly expressing useful concepts such as the Dirac distribution, which are problematic in the standard finite-resolution picture.
The example implementation uses a very simple local basis that corresponds to a Haar Wavelet transform.

Article Search
Safety at Speed: In-Place Array Algorithms from Pure Functional Programs by Safely Re-using Storage
Markus Aronsson, Koen Claessen, Mary Sheeran, and Nicholas Smallbone
(Chalmers University of Technology, Sweden)
We present a purely functional array programming language that offers safe, purely functional and crash-free in-place array transformations. The language supports high-level abstractions for pure and efficient array computations that fully support equational reasoning. We show how to execute selected parts of these computations safely in-place, with the compiler guaranteeing that in-place execution does not change the computation’s result. Correctness is ensured by using an off-the-shelf-theorem prover to discharge safety conditions. Our main contribution is the idea of virtual copies for expressing re-use of arrays, and techniques for verifying their safety, which allow a pure language to include in-place transformations without weakening its transparency or reasoning power.

Article Search
Compositional Deep Learning in Futhark
Duc Minh Tran, Troels Henriksen, and Martin Elsman
(University of Copenhagen, Denmark)
We present a design pattern for composing deep learning networks in a typed, higher-order fashion. The exposed library functions are generically typed and the composition structure allows for networks to be trained (using back-propagation) and for trained networks to be used for predicting new results (using forward-propagation). Individual layers in a network can take different forms ranging over dense sigmoid layers to convolutional layers. The paper discusses different typing techniques aimed at enforcing proper use and composition of networks. The approach is implemented in Futhark, a data-parallel functional language and compiler targeting GPU architectures, and we demonstrate that Futhark's elimination of higher-order functions and modules leads to efficient generated code.

Article Search

proc time: 1.56