23
Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff [email protected] +41 44 520 01 17 +41 79 430 03 61

Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Embed Size (px)

Citation preview

Page 1: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Dynamic Cuda with F#

HPC GPU & F# Meetup

March 19

San Jose, California

Dr. Daniel Egloff

[email protected]

+41 44 520 01 17

+41 79 430 03 61

Page 2: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Software development and consulting company

!   Based in Zurich

!   Core competence !   Quantitative finance and risk management

!   Derivative pricing and modeling

!   Numerical computing

!   High performance computing (clusters, grid, GPUs)

!   Software engineering (C++, F#, Scala, …)

!   Early adopter of GPUs !   First project with GPUs in finance back in 2007

About Us

Page 3: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Situation

Compute intensive problems GPGPU

Big data and information rich

applications

Cloud and distributed computing

Page 4: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Recurring challenges in our HPC projects

Problems

Algorithms change often

C++ adds unwanted level of

complexity Interoperation and wrapper code

Hard to test correctness of

algorithms

Slow progress because of tools and complexity

Critical algorithms are not designed with GPUs in mind

Implications ? Efficient GPU

code is difficult to develop

Availability of GPU hardware

Page 5: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Negative impact on the usability and acceptance of new technology

Implications

Moving project targets

Optimized code hard to maintain

and extend Maintenance of wrapper code

Maintenance of large body of test

code

Unpredictable project duration

and costs

Rewrite of critical numerical code

Implications Static and

inflexible solutions Delays because of missing hardware

Page 6: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

What would be needed and how can we improve?

Needs

Better and more flexible tools to make CUDA more accessible

CUDA programming model

Support iterative development and rapid prototyping

Using modern programming languages and

concepts

Building on top of modern technologies like .NET or the JVM

Generated CUDA code at par with compiled

CUDA C/C++

Solid framework compatible with CUDA

programming model

This is where comes into play…  

Page 7: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

What is  

?  

Page 8: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Complete solution to develop CUDA accelerated GPU applications in .NET !   Relies on F# and in particular code

quotation

!   Based on LLVM and CUDA 5 technology

!   Fully F# based, no additional changes or extensions to the F# language, no <<<…>>>

!   No wrappers, no post build process to transform IL code

Alea.cuBase

Dynamic code generation

GPU algorithm scripting

Industry grade performance

Rapid development

Solid framework for reusability

Advanced CUDA programming

Page 9: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Generate GPU code programmatically at run-time

!   Use .NET generics and F# code quotation splicing for flexible kernels

!   Foundation to develop GPU aware domain specific languages

Benefits

Dynamic code generation

Page 10: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Easy and quick setup of development environment, no need to install NVIDIA nvcc compiler tools

!   Rapid prototyping in F# interactive

!   Iteratively improve CUDA kernel algorithms without time consuming build cycles

Benefits

Rapid development

Page 11: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Execute F# scripts with GPU algorithms on command line or in F# interactive

!   GPU scripting in Excel

!   Integrate Alea.cuBase directly with Python

Benefits

GPU algorithm scripting

Page 12: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Framework for type-safe definition of GPU resources

!   CUDA monad to specify GPU resources together with launch logic in unified manner

!   Reuse GPU kernel code and compose them to modular GPU kernel libraries

Benefits

Solid framework for reusability

cuda !{ kernel_B launch logic}

cuda !{ kernel_A launch logic}

Page 13: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Generating performance optimized code which is on par with compiled CUDA C/C++ code

!   Low level device functions and special math functions

!   Built in occupancy calculator to identify optimal thread block layout

Benefits

Industry grade performance

0.00%  

100.00%  

200.00%  

300.00%  

400.00%  

500.00%  

600.00%  

700.00%  

800.00%  

2097152   8388608   16777216   33554432  

int32  float32  float64  

Segmented Scan by Key Alea.cuExtension against CUDA Thrust

Page 14: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Support for texture, constant and shared memory

!   Pointer operations to partition array data

!   Special pointer types such as volatile pointers

!   Runtime compilation control e.g. fast math

!   Multiple streams

!   Thread safe use of multiple GPUs

!   Inline PTX assembly instructions

Benefits

Advanced CUDA programming

Page 15: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Alea Ecosystem

Alea.cuBaseCUDA Runtime

API

ThrustCUDPP Alea.cuExtension

User Applications

CUDA Driver API

Alea EcosystemCUDA C Ecosystem

Page 16: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Advantages over CUDA C/C++

!   Faster development !   More productive tools, type inference, Intellisense support

!   Cleaner more expressive code results in fewer bugs, less testing and less debugging

!   Removing unnecessary complexity !   No template meta programming or preprocessor tricks needed

!   More flexibility !   Dynamic code generation and scripting is entirely missing in CUDA C/C++

!   Easier to reuse and compose GPU algorithm

!   More transparency !   GPU resources are defined where needed

!   Launch logic defines thread layout and data requirement more transparent

!   Direct access to other valuable F# and .NET technologies !   Seamless integration into .NET, without any interoperation layer

!   Monads aka computational expressions

! Async workflows, agents, parallel programming, type providers

!   Uniform .NET type system

Page 17: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

How does  

work ?  

Page 18: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

NVVM IR module

PTX module

CUDA cubinmodule

CUDA module

Launch function

PModuleCUDA device

CUDA context

Device workerComilation

process

PTemplate

cuda !{ constant array texture kernel ... launch logic}

!   Four steps to a CUDA kernel with F# and Alea.cuBase

Development Process

2

3

4

1

Page 19: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

How to program with  

?  

Page 20: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   Basic kernel programming

!   Excel GPU scripting with Alea.cuBase and Alea.cuExtension, in Tsunami IDE and FCell !   Excel based Monte Carlo simulation

!   PDE solver for 2d heat equation in GPU

Live Coding

Page 21: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

!   F# is used in multiple ways !   Internally to build our CUDA compiler with quotations

!   As hosting language to define an internal DSL for CUDA, i.e. the CUDA monad

!   Use extensibility of F# to build more DSL such as the pcalc monad

!   Benefit of using F# !   Rich ecosystem first class language in .NET

!   Functional fewer and more descriptive code

!   Monads ideal to create DSLs, seq, async, F# linq

!   Type providers easier to integrate into information rich programming

!   Quotations lot of flexibility for DSL or compiler implementation

!   Extensibility pure solution for CUDA programming

How did F# pay off?

Page 22: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

More Information on F#

!   F# is a functional-first programming language

!   Strongly typed

!   First class .NET language

!   Open source

!   Designed to solve complex computing problems with simple, maintainable robust code

!   Runs on Windows, Linux, Mac OS, HTML5 and also on GPUs

!   http://www.fsharp.org

!   http://www.tryfsharp.org

Page 23: Dynamic Cuda with F#files.meetup.com/1646359/2013-03-21_HPC_GPU_FSharp_Meetup_GT… · Dynamic Cuda with F# HPC GPU & F# Meetup March 19 San Jose, California Dr. Daniel Egloff

Thank you