Workshop ARRAY 2021 – Author Index |
Contents -
Abstracts -
Authors
|
Abusdal, Ole |
ARRAY '21: "Padding in the Mathematics ..."
Padding in the Mathematics of Arrays
Benjamin Chetioui, Ole Abusdal, Magne Haveraaen, Jaakko Järvi, and Lenore Mullin (University of Bergen, Norway; Western Norway University of Applied Sciences, Norway; University of Turku, Finland; SUNY Albany, USA) Multi-dimensional array manipulation constitutes a core component of numerous numerical methods, e.g. finite difference solvers of Partial Differential Equations (PDEs). The efficiency of such computations is tightly connected to traversing array data in a hardware-friendly way. The Mathematics of Arrays (MoA) allows reasoning about array computations at a high level and enables systematic transformations of array-based programs. We have previously shown that stencil computations reduce to MoA's Denotational Normal Form (DNF). Here we bring to light MoA's Operational Normal Forms (ONFs) that allow for adapting array computations to hardware characteristics. ONF transformations start from the DNF. Alongside the ONF transformations, we extend MoA with rewriting rules for padding. These new rules allow both a simplification of array indexing and a systematic approach to introducing halos to PDE solvers. Experiments on various architectures confirm the flexibility of the approach. @InProceedings{ARRAY21p15, author = {Benjamin Chetioui and Ole Abusdal and Magne Haveraaen and Jaakko Järvi and Lenore Mullin}, title = {Padding in the Mathematics of Arrays}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {15--26}, doi = {10.1145/3460944.3464311}, year = {2021}, } Publisher's Version |
|
Chetioui, Benjamin |
ARRAY '21: "Padding in the Mathematics ..."
Padding in the Mathematics of Arrays
Benjamin Chetioui, Ole Abusdal, Magne Haveraaen, Jaakko Järvi, and Lenore Mullin (University of Bergen, Norway; Western Norway University of Applied Sciences, Norway; University of Turku, Finland; SUNY Albany, USA) Multi-dimensional array manipulation constitutes a core component of numerous numerical methods, e.g. finite difference solvers of Partial Differential Equations (PDEs). The efficiency of such computations is tightly connected to traversing array data in a hardware-friendly way. The Mathematics of Arrays (MoA) allows reasoning about array computations at a high level and enables systematic transformations of array-based programs. We have previously shown that stencil computations reduce to MoA's Denotational Normal Form (DNF). Here we bring to light MoA's Operational Normal Forms (ONFs) that allow for adapting array computations to hardware characteristics. ONF transformations start from the DNF. Alongside the ONF transformations, we extend MoA with rewriting rules for padding. These new rules allow both a simplification of array indexing and a systematic approach to introducing halos to PDE solvers. Experiments on various architectures confirm the flexibility of the approach. @InProceedings{ARRAY21p15, author = {Benjamin Chetioui and Ole Abusdal and Magne Haveraaen and Jaakko Järvi and Lenore Mullin}, title = {Padding in the Mathematics of Arrays}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {15--26}, doi = {10.1145/3460944.3464311}, year = {2021}, } Publisher's Version |
|
Elsman, Martin |
ARRAY '21: "Towards Size-Dependent Types ..."
Towards Size-Dependent Types for Array Programming
Troels Henriksen and Martin Elsman (University of Copenhagen, Denmark) We present a type system for expressing size constraints on array types in an ML-style type system. The goal is to detect shape mismatches at compile-time, while being simpler than full dependent types. The main restrictions is that the only terms that can occur in types are array sizes, and syntactically they must be variables or constants. For those programs where this is not sufficient, we support a form of existential types, with the type system automatically managing the requisite book-keeping. We formalise a large subset of the type system in a small core language, which we prove sound. We also present an integration of the type system in the high-performance parallel functional language Futhark, and show on a collection of 44 representative programs that the restrictions in the type system are not too problematic in practice. @InProceedings{ARRAY21p1, author = {Troels Henriksen and Martin Elsman}, title = {Towards Size-Dependent Types for Array Programming}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {1--14}, doi = {10.1145/3460944.3464310}, year = {2021}, } Publisher's Version |
|
Haveraaen, Magne |
ARRAY '21: "Padding in the Mathematics ..."
Padding in the Mathematics of Arrays
Benjamin Chetioui, Ole Abusdal, Magne Haveraaen, Jaakko Järvi, and Lenore Mullin (University of Bergen, Norway; Western Norway University of Applied Sciences, Norway; University of Turku, Finland; SUNY Albany, USA) Multi-dimensional array manipulation constitutes a core component of numerous numerical methods, e.g. finite difference solvers of Partial Differential Equations (PDEs). The efficiency of such computations is tightly connected to traversing array data in a hardware-friendly way. The Mathematics of Arrays (MoA) allows reasoning about array computations at a high level and enables systematic transformations of array-based programs. We have previously shown that stencil computations reduce to MoA's Denotational Normal Form (DNF). Here we bring to light MoA's Operational Normal Forms (ONFs) that allow for adapting array computations to hardware characteristics. ONF transformations start from the DNF. Alongside the ONF transformations, we extend MoA with rewriting rules for padding. These new rules allow both a simplification of array indexing and a systematic approach to introducing halos to PDE solvers. Experiments on various architectures confirm the flexibility of the approach. @InProceedings{ARRAY21p15, author = {Benjamin Chetioui and Ole Abusdal and Magne Haveraaen and Jaakko Järvi and Lenore Mullin}, title = {Padding in the Mathematics of Arrays}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {15--26}, doi = {10.1145/3460944.3464311}, year = {2021}, } Publisher's Version |
|
Henriksen, Troels |
ARRAY '21: "Towards Size-Dependent Types ..."
Towards Size-Dependent Types for Array Programming
Troels Henriksen and Martin Elsman (University of Copenhagen, Denmark) We present a type system for expressing size constraints on array types in an ML-style type system. The goal is to detect shape mismatches at compile-time, while being simpler than full dependent types. The main restrictions is that the only terms that can occur in types are array sizes, and syntactically they must be variables or constants. For those programs where this is not sufficient, we support a form of existential types, with the type system automatically managing the requisite book-keeping. We formalise a large subset of the type system in a small core language, which we prove sound. We also present an integration of the type system in the high-performance parallel functional language Futhark, and show on a collection of 44 representative programs that the restrictions in the type system are not too problematic in practice. @InProceedings{ARRAY21p1, author = {Troels Henriksen and Martin Elsman}, title = {Towards Size-Dependent Types for Array Programming}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {1--14}, doi = {10.1145/3460944.3464310}, year = {2021}, } Publisher's Version |
|
Hlava, Marek |
ARRAY '21: "Acceleration of Lattice Models ..."
Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives
Wojciech Michal Pawlak, Marek Hlava, Martin Metaksov, and Cosmin Eugen Oancea (SimCorp, Denmark; University of Copenhagen, Denmark) This paper reports on the acceleration of a standard, lattice-based numerical algorithm that is widely used in finance for pricing a class of fixed-income vanilla derivatives. We start with a high-level algorithmic specification, exhibiting irregular nested parallelism, which is challenging to map efficiently to GPU hardware. From it we systematically derive and optimize two CUDA implementations, which utilize only the outer or all levels of parallelism, respectively. A detailed evaluation demonstrates (i) the high impact of the proposed optimizations, (ii) the complementary strength and weaknesses of the two GPU versions, and that (iii) they are on average 2.4× faster than our well-tuned CPU-parallel implementation (OpenMP+AVX2) running on 104 hardware threads, and by 3-to-4 order of magnitude faster than an OpenMP-parallel implementation using the popular QuantLib library. @InProceedings{ARRAY21p27, author = {Wojciech Michal Pawlak and Marek Hlava and Martin Metaksov and Cosmin Eugen Oancea}, title = {Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {27--38}, doi = {10.1145/3460944.3464309}, year = {2021}, } Publisher's Version |
|
Järvi, Jaakko |
ARRAY '21: "Padding in the Mathematics ..."
Padding in the Mathematics of Arrays
Benjamin Chetioui, Ole Abusdal, Magne Haveraaen, Jaakko Järvi, and Lenore Mullin (University of Bergen, Norway; Western Norway University of Applied Sciences, Norway; University of Turku, Finland; SUNY Albany, USA) Multi-dimensional array manipulation constitutes a core component of numerous numerical methods, e.g. finite difference solvers of Partial Differential Equations (PDEs). The efficiency of such computations is tightly connected to traversing array data in a hardware-friendly way. The Mathematics of Arrays (MoA) allows reasoning about array computations at a high level and enables systematic transformations of array-based programs. We have previously shown that stencil computations reduce to MoA's Denotational Normal Form (DNF). Here we bring to light MoA's Operational Normal Forms (ONFs) that allow for adapting array computations to hardware characteristics. ONF transformations start from the DNF. Alongside the ONF transformations, we extend MoA with rewriting rules for padding. These new rules allow both a simplification of array indexing and a systematic approach to introducing halos to PDE solvers. Experiments on various architectures confirm the flexibility of the approach. @InProceedings{ARRAY21p15, author = {Benjamin Chetioui and Ole Abusdal and Magne Haveraaen and Jaakko Järvi and Lenore Mullin}, title = {Padding in the Mathematics of Arrays}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {15--26}, doi = {10.1145/3460944.3464311}, year = {2021}, } Publisher's Version |
|
Metaksov, Martin |
ARRAY '21: "Acceleration of Lattice Models ..."
Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives
Wojciech Michal Pawlak, Marek Hlava, Martin Metaksov, and Cosmin Eugen Oancea (SimCorp, Denmark; University of Copenhagen, Denmark) This paper reports on the acceleration of a standard, lattice-based numerical algorithm that is widely used in finance for pricing a class of fixed-income vanilla derivatives. We start with a high-level algorithmic specification, exhibiting irregular nested parallelism, which is challenging to map efficiently to GPU hardware. From it we systematically derive and optimize two CUDA implementations, which utilize only the outer or all levels of parallelism, respectively. A detailed evaluation demonstrates (i) the high impact of the proposed optimizations, (ii) the complementary strength and weaknesses of the two GPU versions, and that (iii) they are on average 2.4× faster than our well-tuned CPU-parallel implementation (OpenMP+AVX2) running on 104 hardware threads, and by 3-to-4 order of magnitude faster than an OpenMP-parallel implementation using the popular QuantLib library. @InProceedings{ARRAY21p27, author = {Wojciech Michal Pawlak and Marek Hlava and Martin Metaksov and Cosmin Eugen Oancea}, title = {Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {27--38}, doi = {10.1145/3460944.3464309}, year = {2021}, } Publisher's Version |
|
Mullin, Lenore |
ARRAY '21: "Padding in the Mathematics ..."
Padding in the Mathematics of Arrays
Benjamin Chetioui, Ole Abusdal, Magne Haveraaen, Jaakko Järvi, and Lenore Mullin (University of Bergen, Norway; Western Norway University of Applied Sciences, Norway; University of Turku, Finland; SUNY Albany, USA) Multi-dimensional array manipulation constitutes a core component of numerous numerical methods, e.g. finite difference solvers of Partial Differential Equations (PDEs). The efficiency of such computations is tightly connected to traversing array data in a hardware-friendly way. The Mathematics of Arrays (MoA) allows reasoning about array computations at a high level and enables systematic transformations of array-based programs. We have previously shown that stencil computations reduce to MoA's Denotational Normal Form (DNF). Here we bring to light MoA's Operational Normal Forms (ONFs) that allow for adapting array computations to hardware characteristics. ONF transformations start from the DNF. Alongside the ONF transformations, we extend MoA with rewriting rules for padding. These new rules allow both a simplification of array indexing and a systematic approach to introducing halos to PDE solvers. Experiments on various architectures confirm the flexibility of the approach. @InProceedings{ARRAY21p15, author = {Benjamin Chetioui and Ole Abusdal and Magne Haveraaen and Jaakko Järvi and Lenore Mullin}, title = {Padding in the Mathematics of Arrays}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {15--26}, doi = {10.1145/3460944.3464311}, year = {2021}, } Publisher's Version |
|
Oancea, Cosmin Eugen |
ARRAY '21: "Acceleration of Lattice Models ..."
Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives
Wojciech Michal Pawlak, Marek Hlava, Martin Metaksov, and Cosmin Eugen Oancea (SimCorp, Denmark; University of Copenhagen, Denmark) This paper reports on the acceleration of a standard, lattice-based numerical algorithm that is widely used in finance for pricing a class of fixed-income vanilla derivatives. We start with a high-level algorithmic specification, exhibiting irregular nested parallelism, which is challenging to map efficiently to GPU hardware. From it we systematically derive and optimize two CUDA implementations, which utilize only the outer or all levels of parallelism, respectively. A detailed evaluation demonstrates (i) the high impact of the proposed optimizations, (ii) the complementary strength and weaknesses of the two GPU versions, and that (iii) they are on average 2.4× faster than our well-tuned CPU-parallel implementation (OpenMP+AVX2) running on 104 hardware threads, and by 3-to-4 order of magnitude faster than an OpenMP-parallel implementation using the popular QuantLib library. @InProceedings{ARRAY21p27, author = {Wojciech Michal Pawlak and Marek Hlava and Martin Metaksov and Cosmin Eugen Oancea}, title = {Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {27--38}, doi = {10.1145/3460944.3464309}, year = {2021}, } Publisher's Version |
|
Pawlak, Wojciech Michal |
ARRAY '21: "Acceleration of Lattice Models ..."
Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives
Wojciech Michal Pawlak, Marek Hlava, Martin Metaksov, and Cosmin Eugen Oancea (SimCorp, Denmark; University of Copenhagen, Denmark) This paper reports on the acceleration of a standard, lattice-based numerical algorithm that is widely used in finance for pricing a class of fixed-income vanilla derivatives. We start with a high-level algorithmic specification, exhibiting irregular nested parallelism, which is challenging to map efficiently to GPU hardware. From it we systematically derive and optimize two CUDA implementations, which utilize only the outer or all levels of parallelism, respectively. A detailed evaluation demonstrates (i) the high impact of the proposed optimizations, (ii) the complementary strength and weaknesses of the two GPU versions, and that (iii) they are on average 2.4× faster than our well-tuned CPU-parallel implementation (OpenMP+AVX2) running on 104 hardware threads, and by 3-to-4 order of magnitude faster than an OpenMP-parallel implementation using the popular QuantLib library. @InProceedings{ARRAY21p27, author = {Wojciech Michal Pawlak and Marek Hlava and Martin Metaksov and Cosmin Eugen Oancea}, title = {Acceleration of Lattice Models for Pricing Portfolios of Fixed-Income Derivatives}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {27--38}, doi = {10.1145/3460944.3464309}, year = {2021}, } Publisher's Version |
|
Scholz, Sven-Bodo |
ARRAY '21: "Array Languages Make Neural ..."
Array Languages Make Neural Networks Fast
Artjoms Šinkarovs, Hans-Nikolai Vießmann, and Sven-Bodo Scholz (Heriot-Watt University, UK; Radboud University Nijmegen, Netherlands) Most implementations of machine learning algorithms are based on special-purpose frameworks such as TensorFlow or PyTorch. While these frameworks are convenient to use, they introduce multi-million lines of code dependency that one has to trust, understand and potentially modify. As an alternative, this paper investigates a direct implementation of a state of the art Convolutional Neural Network (CNN) in an array language. While our implementation requires 150 lines of code to define the special-purpose operators needed for CNNs, which are readily provided through frameworks such as TensorFlow and PyTorch, our implementation outperforms these frameworks by factors 2 and 3 on a fixed set of hardware — a 64-core GPU-accelerated machine; for a simple example network. The resulting specification is written in a rank-polymorphic data-parallel style, and it can be immediately leveraged by optimising compilers. Indeed, array languages make neural networks fast. @InProceedings{ARRAY21p39, author = {Artjoms Šinkarovs and Hans-Nikolai Vießmann and Sven-Bodo Scholz}, title = {Array Languages Make Neural Networks Fast}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {39--50}, doi = {10.1145/3460944.3464312}, year = {2021}, } Publisher's Version |
|
Šinkarovs, Artjoms |
ARRAY '21: "Array Languages Make Neural ..."
Array Languages Make Neural Networks Fast
Artjoms Šinkarovs, Hans-Nikolai Vießmann, and Sven-Bodo Scholz (Heriot-Watt University, UK; Radboud University Nijmegen, Netherlands) Most implementations of machine learning algorithms are based on special-purpose frameworks such as TensorFlow or PyTorch. While these frameworks are convenient to use, they introduce multi-million lines of code dependency that one has to trust, understand and potentially modify. As an alternative, this paper investigates a direct implementation of a state of the art Convolutional Neural Network (CNN) in an array language. While our implementation requires 150 lines of code to define the special-purpose operators needed for CNNs, which are readily provided through frameworks such as TensorFlow and PyTorch, our implementation outperforms these frameworks by factors 2 and 3 on a fixed set of hardware — a 64-core GPU-accelerated machine; for a simple example network. The resulting specification is written in a rank-polymorphic data-parallel style, and it can be immediately leveraged by optimising compilers. Indeed, array languages make neural networks fast. @InProceedings{ARRAY21p39, author = {Artjoms Šinkarovs and Hans-Nikolai Vießmann and Sven-Bodo Scholz}, title = {Array Languages Make Neural Networks Fast}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {39--50}, doi = {10.1145/3460944.3464312}, year = {2021}, } Publisher's Version |
|
Vießmann, Hans-Nikolai |
ARRAY '21: "Array Languages Make Neural ..."
Array Languages Make Neural Networks Fast
Artjoms Šinkarovs, Hans-Nikolai Vießmann, and Sven-Bodo Scholz (Heriot-Watt University, UK; Radboud University Nijmegen, Netherlands) Most implementations of machine learning algorithms are based on special-purpose frameworks such as TensorFlow or PyTorch. While these frameworks are convenient to use, they introduce multi-million lines of code dependency that one has to trust, understand and potentially modify. As an alternative, this paper investigates a direct implementation of a state of the art Convolutional Neural Network (CNN) in an array language. While our implementation requires 150 lines of code to define the special-purpose operators needed for CNNs, which are readily provided through frameworks such as TensorFlow and PyTorch, our implementation outperforms these frameworks by factors 2 and 3 on a fixed set of hardware — a 64-core GPU-accelerated machine; for a simple example network. The resulting specification is written in a rank-polymorphic data-parallel style, and it can be immediately leveraged by optimising compilers. Indeed, array languages make neural networks fast. @InProceedings{ARRAY21p39, author = {Artjoms Šinkarovs and Hans-Nikolai Vießmann and Sven-Bodo Scholz}, title = {Array Languages Make Neural Networks Fast}, booktitle = {Proc.\ ARRAY}, publisher = {ACM}, pages = {39--50}, doi = {10.1145/3460944.3464312}, year = {2021}, } Publisher's Version |
14 authors
proc time: 3.85