ICFP Workshops 2019
24th ACM SIGPLAN International Conference on Functional Programming (ICFP 2019)
Powered by
Conference Publishing Consulting

8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC 2019), August 18, 2019, Berlin, Germany

FHPNC 2019 – Proceedings

Contents - Abstracts - Authors

8th ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC 2019)

Frontmatter

Title Page


Message from the Chair
Welcome to the 2019 edition of the ACM SIGPLAN International Workshop on Functional High-Performance and Numerical Computing (FHPNC), co-located with ICFP in Berlin, Germany on August 18, 2019.
The workshop aims to bring together researchers and practitioners exploring or employing the use of functional or declarative programming languages or techniques in scientific computing, and specifically in the domains of high-performance computing and numerical programming.
This event is a merge of two workshops that took place during previous editions of ICFP : FHPC (Functional High-Performance Computing) and NPFL (Numerical Programming in Functional Languages), and as such it aims to foster the exchange of ideas between the two communities. We thank Kei Davis (LANL, USA) and Fritz Henglein (DIKU, Denmark) for suggesting and promoting the workshop merger, as well as all Authors and Program Committee members of the 2018 workshops for making those events a success.
The invited talk to FHPNC 2019 will be by Dr Micaela Mayero (LIPN Laboratory, University Paris 13, France); her main research interests are formal methods for critical problems, including formal verification of numerical programs. The keynote speaker will be Dr Trevor L. McDonell (Utrecht University, the Netherlands); his work focuses on the design and implementation of functional programming languages for parallel computing, in particular the use of graphics processors and other compute accelerators for high-performance computing applications.
FHPNC 2019 followed a single-cycle blind anonymous review process, and invited the submission of both extended abstracts and full-length papers. A total of nine articles were submitted and accepted, of which five are full papers. Submissions by Program Committee members were allowed but have been assigned to non-conflicting reviewers.
We thank all Authors and members of the Program Committee for providing and evaluating the scientific content of this workshop.

Papers

Generating Efficient FFT GPU Code with Lift
Bastian Köpcke, Michel Steuwer, and Sergei Gorlatch
(University of Münster, Germany; University of Glasgow, UK)
The Fast Fourier Transform is a well-known algorithm used in many high-performance applications, ranging from signal processing to convolutional neural networks.
In this paper, we encode FFTs by building high-level abstractions based on a set of functional parallel patterns in the Lift language. Abstractions are derived from and closely resemble mathematical definitions for FFTs. We leverage the Lift performance-portable code generator to generate high performing GPU code for FFTs. No FFT-specific patterns are required for this, showing the expressive power of the generic parallel patterns used in Lift.
Our experimental results show that our approach achieves performance better than AMD's OpenCL implementation clFFT on an Nvidia GPU. Nvidia's highly optimized cuFFT implementation still performs better on their GPUs.

Publisher's Version Article Search
Position-Dependent Arrays and Their Application for High Performance Code Generation
Federico Pizzuti, Michel Steuwer, and Christophe Dubach
(University of Edinburgh, UK; University of Glasgow, UK)
Modern parallel hardware promises unprecedented performance, for the gifted few experts who can program it correctly. Code generators from high-level languages provide an attractive alternative, promising to deliver high performance automatically.
Existing projects such as Accelerate, Futhark, Halide, or Lift show that this approach is feasible. Unfortunately, existing efforts focus on computations over tensors: regularly shaped higher dimensional arrays. This limits the expressiveness of these approaches and excludes many interesting data structures that are commonly encoded manually in memory, such as trees or triangular matrices.
This paper presents an extended array type that lifts this restriction. For multidimensional arrays, the size of a nested array might depend on its position in the surrounding arrays, enabling the expression of computations over less regularly shaped data structures. However, position-dependent arrays bring new challenges for high-performance code generation, as indexing elements in memory becomes more challenging.
This paper shows how these challenges are addressed by extending the existing Lift type system and compiler. The experimental results show that this approach enables the efficient code generation of triangular matrix-vector multiplication, with performance improvements over cuBLAS on an Nvidia GPU by up to 2×. Furthermore, we show a use case for a low-level optimization for avoiding unnecessary out-of-bound checks in stencils, leading to up to 3× improvements over already optimized generated stencil codes.

Publisher's Version Article Search
Lazy Evaluation in Infinite-Dimensional Function Spaces with Wavelet Basis
Justus Sagemüller and Olivier Verdier
(Western Norway University of Applied Sciences, Norway; KTH, Sweden)
Vectors in numerical computation, i.e., arrays of numbers, often represent continuous functions. We would like to reflect this with types. One apparent obstacle is that spaces of functions are typically infinite-dimensional, while the code must run in finite time and memory.
We argue that this can be overcome: even in an infinite-dimensional space, the vectors can in practice be stored in finite memory. However, dual vectors (corresponding essentially to distributions) require infinite data structure. The distinction is usually lost in the finite dimensional case, since dual vectors are often simply represented as vectors (by implicitly choosing a scalar product establishing the correspondence). However, we shall see that an explicit type-level distinction between functions and distributions makes sense and allows directly expressing useful concepts such as the Dirac distribution, which are problematic in the standard finite-resolution picture.
The example implementation uses a very simple local basis that corresponds to a Haar Wavelet transform.

Publisher's Version Article Search
Safety at Speed: In-Place Array Algorithms from Pure Functional Programs by Safely Re-using Storage
Markus Aronsson, Koen Claessen, Mary Sheeran, and Nicholas Smallbone
(Chalmers University of Technology, Sweden)
We present a purely functional array programming language that offers safe, purely functional and crash-free in-place array transformations. The language supports high-level abstractions for pure and efficient array computations that fully support equational reasoning. We show how to execute selected parts of these computations safely in-place, with the compiler guaranteeing that in-place execution does not change the computation’s result. Correctness is ensured by using an off-the-shelf-theorem prover to discharge safety conditions. Our main contribution is the idea of virtual copies for expressing re-use of arrays, and techniques for verifying their safety, which allow a pure language to include in-place transformations without weakening its transparency or reasoning power.

Publisher's Version Article Search
Compositional Deep Learning in Futhark
Duc Minh Tran, Troels Henriksen, and Martin Elsman
(University of Copenhagen, Denmark)
We present a design pattern for composing deep learning networks in a typed, higher-order fashion. The exposed library functions are generically typed and the composition structure allows for networks to be trained (using back-propagation) and for trained networks to be used for predicting new results (using forward-propagation). Individual layers in a network can take different forms ranging over dense sigmoid layers to convolutional layers. The paper discusses different typing techniques aimed at enforcing proper use and composition of networks. The approach is implemented in Futhark, a data-parallel functional language and compiler targeting GPU architectures, and we demonstrate that Futhark's elimination of higher-order functions and modules leads to efficient generated code.

Publisher's Version Article Search

proc time: 2.49