Using GPU Virtualization with

TensorFlow

Carlos Reaño

Universitat Politècnica de València, Spain

http://mural.uv.es/caregon

HPC Advisory Council Swiss Conference 2018

April 9-12, 2018, Lugano, Switzerland

Outline

What is rCUDA?

Installing and using rCUDA

rCUDA over HPC networks

InfiniBand

How to benefit from rCUDA

Sample scenarios

Questions & Answers

What is rCUDA?

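CUDA: an application running in Node 1 uses the GPU installed in Node 1.

rCUDA (remote CUDA): an application running in Node 2 uses the GPU of Node 1 over the network.

With rCUDA, Node 2 can use the GPU of Node 1!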

Installing and using rCUDA

Where to obtain rCUDA?

◦ www.rCUDA.net: Software Request Form

Package contents. Important folders:

doc: rCUDA user’s guide & quick start guide

bin: rCUDA server daemon

lib: rCUDA library

Installing rCUDA

◦ Just untar the tarball on both the server and the client nodes
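
As a quick sanity check after unpacking, the folder layout described above can be verified with a few lines of shell. This is only a sketch: the function name `check_rcuda_layout` is made up for illustration, the tarball name in the comment is a placeholder, and the install location `$HOME/rCUDA` is taken from the commands used later in these slides.

```shell
#!/bin/sh
# Hypothetical helper (not part of rCUDA): report whether an unpacked
# rCUDA directory contains the three folders named above (doc, bin, lib).
check_rcuda_layout() {
    root="$1"
    for d in doc bin lib; do
        if [ ! -d "$root/$d" ]; then
            echo "missing: $root/$d"
            return 1
        fi
    done
    echo "ok: $root"
}

# Example (the tarball name is a placeholder, not the real file name):
#   tar -xzf rCUDA.tgz -C "$HOME"
#   check_rcuda_layout "$HOME/rCUDA"
```

Running the same check on every server and client node is a cheap way to catch an incomplete copy before starting the daemon.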

Starting rCUDA server:

◦ Set env. vars as if you were going to run a CUDA program:

export PATH=$PATH:/usr/local/cuda/bin

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64

◦ Start rCUDA server:

cd $HOME/rCUDA/bin

./rCUDAd

Notes on the commands above:

◦ PATH: path to the CUDA binaries

◦ LD_LIBRARY_PATH: path to the CUDA libraries

◦ cd $HOME/rCUDA/bin: path to the rCUDA server

◦ ./rCUDAd: starts the rCUDA server in the background

Running a CUDA program with rCUDA:

◦ Set env. vars as follows:

export PATH=$PATH:/usr/local/cuda/bin

export LD_LIBRARY_PATH=$HOME/rCUDA/lib:$LD_LIBRARY_PATH

export RCUDA_DEVICE_COUNT=1

export RCUDA_DEVICE_0=<server_name_or_ip_address>:0

◦ Compile CUDA program using dynamic libraries:

cd $HOME/NVIDIA_CUDA_Samples/1_Utilities/deviceQuery

make EXTRA_NVCCFLAGS=--cudart=shared

◦ Run the CUDA program as usual:

./deviceQuery

...

Notes on the commands above:

◦ PATH: path to the CUDA binaries

◦ LD_LIBRARY_PATH: path to the rCUDA library

◦ RCUDA_DEVICE_COUNT: number of remote GPUs (1, 2, 3...)

◦ RCUDA_DEVICE_0: name or IP address of the rCUDA server, then, after the colon, the GPU of the remote server to use

◦ Compiling with dynamic libraries (--cudart=shared) is very important!
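
The slides stress building with `--cudart=shared`. A plausible reading is that the rCUDA library is picked up through LD_LIBRARY_PATH in place of the regular CUDA runtime, which can only happen if the runtime is linked dynamically. The sketch below, with a made-up function name, checks `ldd` output for a shared `libcudart`; it does not verify which library actually provides it.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and succeeds only if a
# shared libcudart appears among the dependencies.
links_shared_cudart() {
    grep -q 'libcudart\.so'
}

# Example, run on the client after setting LD_LIBRARY_PATH:
#   ldd ./deviceQuery | links_shared_cudart && echo "shared CUDA runtime"
```

If the check fails, the program was most likely built with the static runtime and should be rebuilt with the `make` line shown earlier.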

Live demonstration:

◦ deviceQuery

◦ bandwidthTest

Problem: bandwidth with rCUDA is too low!!

◦ Why? We are using TCP

Solution: HPC networks

◦ InfiniBand (IB)

rCUDA over HPC networks: InfiniBand

Starting rCUDA server using IB:

export RCUDA_NETWORK=IB

cd $HOME/rCUDA/bin

./rCUDAd

Run CUDA program using rCUDA over IB:

export RCUDA_NETWORK=IB

cd $HOME/NVIDIA_CUDA_Samples/1_Utilities/bandwidthTest

./bandwidthTest

Notes on the commands above:

◦ RCUDA_NETWORK=IB tells rCUDA that we want to use IB

◦ It must be set in the client as well as in the server!
Live demonstration:

◦ bandwidthTest using IB

◦ Bandwidth is no longer a problem!
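
Because RCUDA_NETWORK has to be set on the client too, one option is to set it per command rather than exporting it for the whole session. The wrapper below is a small sketch with a made-up name, `run_over_ib`; it only uses the RCUDA_NETWORK variable shown in these slides and the standard `env` utility.

```shell
#!/bin/sh
# Hypothetical wrapper: run one command with RCUDA_NETWORK=IB set only
# for that command, leaving the calling shell's environment unchanged.
run_over_ib() {
    env RCUDA_NETWORK=IB "$@"
}

# Example on the client:
#   run_over_ib ./bandwidthTest
```

The server still needs RCUDA_NETWORK=IB exported before rCUDAd is started, as shown in the commands for this section.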

How to benefit from rCUDA

Sample scenarios:

◦ Typical behavior of CUDA applications: moving data to the GPU and performing a lot of computation there to compensate for the overhead of having moved the data

This benefits rCUDA: the more computation, the lower the rCUDA overhead

◦ Scalable applications: more GPUs, less execution time

rCUDA can use all the GPUs in the cluster, while CUDA can only use the ones directly connected to one node: for some applications, rCUDA can achieve better results than CUDA

◦ Heterogeneous clusters: access to GPU servers from Atom, ARM…

rCUDA can be used to access GPU servers in x86 or Power8 machines from different systems and architectures (Atom, ARM, Intel-D...)

Three main types of applications:

◦ Bandwidth-bound: more transfers, more rCUDA overhead

◦ Compute-bound: more computation, less rCUDA overhead

◦ Intermediate
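
The three categories can be related through a simple cost model. This is an illustrative sketch, not a model taken from the talk. It assumes an application that moves D bytes and computes for T_c seconds, a local GPU reached at bandwidth B_local, a remote GPU reached at an effective bandwidth B_remote no higher than B_local, and it ignores per-call latency. All symbols are introduced here for illustration only.

```latex
\begin{align*}
T_{\text{local}}  &= T_c + \frac{D}{B_{\text{local}}}, \qquad
T_{\text{remote}} = T_c + \frac{D}{B_{\text{remote}}}, \\[4pt]
\text{overhead} &= \frac{T_{\text{remote}} - T_{\text{local}}}{T_{\text{local}}}
 = \frac{D\left(\dfrac{1}{B_{\text{remote}}} - \dfrac{1}{B_{\text{local}}}\right)}
        {T_c + \dfrac{D}{B_{\text{local}}}}.
\end{align*}
```

Under these assumptions the overhead tends to 0 as T_c grows with D fixed (the compute-bound case), and tends to B_local/B_remote - 1 as D grows with T_c fixed (the bandwidth-bound case). Intermediate applications fall between the two limits.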

TensorFlow

GPU vs. remote GPU

What is the overhead of remote GPUs?

Live demonstration:

◦ TensorFlow with CUDA

◦ TensorFlow with rCUDA

CPU vs. remote GPU

What is better: a local CPU or a remote GPU?

Live demonstration:

◦ TensorFlow on CPU (without CUDA)

Multi-GPU scenario

CUDA: a multi-GPU application running in Node 1 can use all the GPUs in that node.

rCUDA (remote CUDA): a multi-GPU application running in Node 0 can use, over the network, all the GPUs in the cluster (Node 1, Node 2, Node 3, ... Node n).

Multi-GPU Configuration

Configure rCUDA for Multi-GPU:

export PATH=$PATH:/usr/local/cuda/bin

export LD_LIBRARY_PATH=$HOME/rCUDA/framework/rCUDAl:$LD_LIBRARY_PATH

export RCUDA_DEVICE_COUNT=5

export RCUDA_DEVICE_0=node1:0

export RCUDA_DEVICE_1=node1:1

export RCUDA_DEVICE_2=node2:0

export RCUDA_DEVICE_3=node3:0

export RCUDA_DEVICE_4=node4:0

◦ Check configuration by running deviceQuery sample

Notes on the commands above:

◦ RCUDA_DEVICE_COUNT: number of remote GPUs

◦ RCUDA_DEVICE_0 to RCUDA_DEVICE_4: location of each GPU
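
With more than a handful of GPUs, writing the RCUDA_DEVICE_* lines by hand gets error-prone, since the count and the indices must stay consistent. The sketch below generates them from a list of `<server>:<gpu>` entries. The function name `rcuda_device_exports` is made up for illustration; only the variable names come from the slides.

```shell
#!/bin/sh
# Hypothetical helper: print the export lines for a list of
# <server>:<gpu> entries, numbering the devices from 0.
rcuda_device_exports() {
    echo "export RCUDA_DEVICE_COUNT=$#"
    i=0
    for dev in "$@"; do
        echo "export RCUDA_DEVICE_$i=$dev"
        i=$((i + 1))
    done
}

# Example: reproduce the five-GPU configuration shown in this section.
#   eval "$(rcuda_device_exports node1:0 node1:1 node2:0 node3:0 node4:0)"
```

Running the deviceQuery sample afterwards remains the way to confirm that every listed GPU is actually reachable.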

Multi-GPU TensorFlow

Live demonstration:

◦ deviceQuery sample with multiple GPUs

◦ Multi-GPU TensorFlow

Heterogeneous clusters:

◦ Access from low-power nodes (Atom, ARM, Intel D…) to x86 GPU-accelerated nodes

◦ Access from non-Power8 nodes to Power8 GPU-accelerated nodes

Get a free copy of rCUDA at

http://www.rcuda.net

@rcuda_