8/11/2019 Intro to Matlab GPU Programming
1/35
Matlab Optimization, parallelism,
and GPU computing.
Kai Mollerud
CEMS IT office
What I'll Cover
Basics: what parallel computing is, and why GPUs are so good at it.
When a GPU is better than a CPU.
What you'll need and how to use the GPU.
The development process:
    Learning to write fast, non-GPU programs
    Turning non-GPU programs into GPU programs
Parallelism
This is the key idea behind all high-performance computing,
especially GPU computing.
Parallelism can be difficult to fully understand, because
people don't often do things in parallel.
Here is an image of some real-world parallel problem solving:
Analogue Parallelism
How is This Parallelism?
The chalk holder is performing what's called a SIMD operation: single instruction, multiple data.
Each piece of data (chalk) must be of the same type to fit in the array, but they can have different values (color, length).
Likewise, a computer can perform the same operation on each element in an array simultaneously.
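As a minimal sketch of the same idea in Matlab (the variable names and values are made up for illustration), one instruction is applied to every element of an array at once:

```matlab
% Four values of the same type (double), each with a different value.
lengths_cm = [8.0, 6.5, 7.2, 5.9];

% One instruction applied to every element: shorten each piece by 0.5 cm.
shortened = lengths_cm - 0.5;

disp(shortened)
```

No loop is written; the subtraction is expressed once for the whole array, which is the form a SIMD machine can exploit.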
So, why GPU computing?
GPU vs. CPU
A modern CPU has between 2 and 16 processing cores.
CPUs are designed to handle a wide array of tasks, often
performing several heterogeneous operations at once.
A modern GPU, on the other hand, can have up to 2048 stream processors.
A GPU's usual job is to decide what color each pixel on
your monitor is. A 1080p monitor has 2,073,600 pixels that
can each change color ~60 times a second, or roughly 124 million pixel updates per second.
Parallel Problems
Not all problems are well suited to parallel computation.
There are 3 levels of parallelism, determined by how much
the operations involved depend on each other:
fine-grained, coarse-grained, and embarrassingly parallel.
Put simply, GPU computing is best suited to embarrassingly parallel
problems, and is sometimes usable for problems with coarse-grained
parallelism.
The technical reasoning here revolves around memory performance;
ask me later if you would like a more detailed explanation.
When to use GPU computing
Just because a problem is parallel doesn't mean GPU
computing is the right choice.
CPUs can also do multiple operations at once, and each CPU core runs much faster than
a GPU core.
Where GPUs really shine is on problems that are parallel and have very large amounts of data to process.
Whether a problem will really benefit from
GPU computing isn't always obvious until you have actually
written the program.
Luckily, Matlab makes it easy to write a program for the CPU first, then
adapt it to the GPU to see if it's worth it.
The Development Process
Step 1) Write a program
Step 2) Make the program fast
Step 3) Adapt the program to use the GPU
Step 1) Write a program
When you start writing a program,
performance is not important.
Try to focus on good organization of your
program; make it easy to read and modify.
Keeping things organized will make the next 2
steps much easier.
Personally, I start by writing comments to describe each block of code.
Example Code #1
first_draft.m
1. Populates an array with some floating point values
2. Calculates the mean value of the array
3. Performs an operation on each element
4. Repeats 1-3 1000 times
This obviously isn't a useful calculation, but it is computationally similar to some programs I have seen researchers using.
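The original first_draft.m is not reproduced here. As a rough, hypothetical sketch of what a naive program following steps 1-4 might look like (the array size, the sin() fill, and the per-element operation are all assumptions, not the presenter's code):

```matlab
% Hypothetical naive draft: readable, but written with no thought for speed.
n    = 1000;   % number of elements (assumed)
reps = 1000;   % step 4: repeat steps 1-3 this many times

result = zeros(1, n);
for r = 1:reps
    % Step 1: populate an array with floating point values, one at a time.
    data = zeros(1, n);
    for i = 1:n
        data(i) = sin(i) * r;
    end

    % Step 2: calculate the mean value of the array.
    m = mean(data);

    % Step 3: perform an operation on each element (here: keep only the
    % amount by which an element exceeds the mean, otherwise store 0).
    for i = 1:n
        if data(i) > m
            result(i) = data(i) - m;
        else
            result(i) = 0;
        end
    end
end
```

The nested loops, the conditional inside the loop, and the call to mean() are deliberately left in, since those are the kinds of things worth measuring before changing.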
Step 2) Make it fast
This is not a simple subject. Computers are complex, and making a program run quickly means understanding how the computer runs the
program.
An inefficient program won't get better just
because you run it on the GPU.
Rather than tell you every trick I know for
speeding up programs, I'll show you how to experiment and learn.
I'll also show you a few tricks.
Optimization tools
Code profiler
    Programs run a bit slower in the profiler.
    You can save the output of the profiler as an HTML file to look at later; this is useful when measuring performance changes.
Control your runtime
    You will need to run your code again and again.
    Scale down the simulation detail, comment out plotting functions, etc.
    If it's part of a larger program, find a way to isolate it from the rest.
tic + toc
    The code profiler does this for you, but sometimes you just want one
    number to look at, and these are easy to use.
Use a fast computer.
    If your group runs simulations, you should think about getting a dedicated computer to run them on.
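A minimal sketch of both measuring tools, assuming a stand-in workload (the `work` function and the output folder name `profile_before` are hypothetical):

```matlab
% Hypothetical workload standing in for the code you want to measure.
work = @(n) sum(sqrt(1:n));

% tic/toc: one number to look at.
t = tic;
total = work(1e6);
elapsed = toc(t);
fprintf('Elapsed: %.4f s\n', elapsed);

% Profiler: run the same workload, then save the report as HTML files
% in a folder so it can be compared against a later run.
profile on
work(1e6);
profile off
profsave(profile('info'), 'profile_before');
```

Saving one report before a change and one after gives a side-by-side record of where the time went.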
Optimization techniques
Avoid nesting loops if at all possible.
Use for loops instead of while loops.
    Not necessarily faster, but cleaner and easier to parallelize.
Avoid conditionals.
    Use the find() function.
    If you use an if/else, put the most common case first.
    Consider using a switch statement.
Avoid calling functions inside loops.
Think about MEX functions for very big calculations.
    MEX lets you use C programs from Matlab.
    C is a lot faster than Matlab.
Don't use the mean() function; it's slow. Use sum()/numel() instead.
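Two of these techniques can be sketched on a small made-up array: replacing a loop-plus-conditional with find(), and computing a mean with sum()/numel(). The data values are assumptions chosen only for illustration.

```matlab
x = [3 -1 4 -1 5 -9 2 -6];

% Loop with a conditional: zero out the negative entries one at a time.
y_loop = x;
for i = 1:numel(x)
    if x(i) < 0
        y_loop(i) = 0;
    end
end

% Same result with find(): locate every negative entry, then assign once.
y_vec = x;
y_vec(find(x < 0)) = 0;

% Mean computed as sum()/numel() rather than mean().
avg = sum(y_vec) / numel(y_vec);
```

Whether the second form is actually faster on a given machine and Matlab version is something to confirm with the profiler or tic/toc rather than assume.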
Example code #2
second_draft.m
About 92% faster than #1
Uses find() to avoid conditionals
Eliminates the nested loops by using vector
operations
Replaces the mean() function with sum()/numel()
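The original second_draft.m is not reproduced here. A hypothetical sketch of the three changes listed above, applied to a made-up task (fill an array, take its mean, keep the amount by which each element exceeds the mean, repeat 1000 times), might look like this:

```matlab
% Hypothetical vectorized draft; the task details are assumptions.
n    = 1000;
reps = 1000;

base = sin(1:n);                     % computed once, outside the loop
for r = 1:reps
    data = base * r;                 % whole array filled by one vector operation
    m = sum(data) / numel(data);     % replaces mean()

    result = zeros(1, n);
    above = find(data > m);          % replaces the per-element if/else
    result(above) = data(above) - m;
end
```

Only the outer repeat loop remains; every per-element loop has become a single vector operation.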
Step 3) Using the GPU
Matlab uses vectors for everything, and GPUs are
built for vector operations.
This makes the conversion really easy.
To do GPU computing in Matlab you will need:
    Parallel Computing Toolbox (the university has this
    licensed)
    An NVIDIA graphics card with compute capability version 1.3 or higher
        Entry cost of about $150 for a decent card
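A quick way to check what Matlab can see, sketched under the assumption that Parallel Computing Toolbox is installed. Newer Matlab releases require a higher compute capability than the 1.3 quoted here, so check the requirement for your release.

```matlab
% Ask the toolbox how many GPUs it can see.
if gpuDeviceCount > 0
    d = gpuDevice;   % selects and returns the current GPU device
    fprintf('%s, compute capability %s\n', d.Name, d.ComputeCapability);
else
    disp('No supported GPU found.');
end
```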
GPU functions
Performing a calculation on the GPU involves
2-3 steps:
1. Put the data you need into GPU memory.
2. Call a GPU-enabled function on that data.
3. Move the results from GPU memory to CPU
memory.
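The three steps can be sketched as follows, assuming a supported GPU is present (the input data and the calculation are made up for illustration):

```matlab
x = linspace(0, 1, 1e6);   % ordinary array in CPU memory

gx = gpuArray(x);          % step 1: put the data into GPU memory
gy = sqrt(gx) .* 2;        % step 2: GPU-enabled functions run on the GPU
y  = gather(gy);           % step 3: move the result back to CPU memory
```

The third step is only needed when the result has to be used by code that does not accept gpuArray input.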
Putting data on the GPU
Matlab's Parallel Computing Toolbox
provides the gpuArray data type.
Any gpuArray variable is stored in GPU memory.
gpuArray supports most data types, and behaves
more or less the same as a normal array.
Any operation on a gpuArray variable will return a
gpuArray variable.
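A small sketch of that behavior, assuming a supported GPU (the values are arbitrary):

```matlab
g = gpuArray(single([1 2 3 4]));   % single-precision data stored on the GPU
h = g .^ 2 + 1;                    % ordinary array syntax works unchanged

disp(class(h))          % the result is still a gpuArray
disp(class(gather(h)))  % the element type survives the round trip
```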
Putting data on the GPU
You can create gpuArrays in 2 ways:
Copy a variable from CPU memory to GPU
memory
Create a variable directly on the GPU
Copying a variable to the GPU
a (the original CPU variable) and b (its gpuArray copy) are independent; subsequent operations on one do not affect the other.
a must be nonsparse, and must be of type
single, double, int/uint 8/16/32/64, or logical,
i.e. no custom data types.
b has a 108 byte placeholder in CPU memory,
and uses 1600 bytes of GPU memory.
Transferring takes time; don't do it inside a
loop.
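The code from the original slide is not reproduced here. A minimal sketch consistent with the notes above, assuming a is a 200-element double vector (200 x 8 bytes = 1600 bytes) and that a supported GPU is present:

```matlab
a = ones(1, 200);    % 200 doubles in CPU memory (size is an assumption)
b = gpuArray(a);     % one transfer, done once, outside any loop

% Work on the GPU copy; a is not affected.
for k = 1:10
    b = b * 2;       % b stays in GPU memory across iterations
end

whos a b             % lists both variables and their reported sizes
```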
Creating data on a GPU directly
You can use: ones, zeros, inf, nan, true, false,
eye, colon, rand, randi, randn, linspace,
logspace
This avoids the time cost of transferring from
CPU memory to GPU memory.
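A short sketch of direct creation, assuming a supported GPU and a Matlab release that accepts the 'gpuArray' option on these constructors (older releases used static methods such as gpuArray.zeros instead):

```matlab
z = zeros(1, 200, 'gpuArray');    % 200 zeros, allocated on the GPU
r = rand(1, 200, 'gpuArray');     % random numbers generated on the GPU
e = eye(4, 'gpuArray');           % 4x4 identity matrix on the GPU
s = gpuArray.linspace(0, 1, 5);   % static-method form

total = z + r;                    % still on the GPU; nothing was transferred
```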
Example code #3
third_draft.m
Almost identical to #2
Turns the array into a gpuArray so the operations
are run on the GPU
Actually a bit slower than #2
That is, slower when using the same parameters.
More on this shortly.
Bringing GPU data back
The gather() function takes in a gpuArray and
copies it to CPU memory.
Again, this takes time. Try to leave data on
the GPU as long as you can and transfer all of
it back at once.
I can go into detail about GPU vs CPU memory
behavior later if there's time/interest; otherwise ask me / email me.
Using the GPU in your code
Knowing how to use the GPU is half the battle;
the rest is knowing when.
There's a simple way to learn this: take some
code, change something to a gpuArray, and
see how the runtime changes.
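One way to run that experiment, sketched with a made-up calculation and array size. It assumes a supported GPU and uses timeit and gputimeit, which repeat the call several times and report a typical time.

```matlab
n  = 1e6;                          % array size (try several values)
x  = rand(1, n);                   % CPU copy
gx = gpuArray(x);                  % GPU copy, transferred once up front

f = @(v) sum(sqrt(v) .* v);        % hypothetical calculation under test

tCPU = timeit(@() f(x));           % time on the CPU
tGPU = gputimeit(@() f(gx));       % time on the GPU (waits for the GPU to finish)

fprintf('n = %d: CPU %.5f s, GPU %.5f s\n', n, tCPU, tGPU);
```

Repeating this for a range of n shows where, if anywhere, the GPU version overtakes the CPU version on a particular machine.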
Quantitative example
I wrote 3 programs to do the same task. The task exhibits
coarse-grained parallelism, and has a deterministic run-time.
    Naive.m is a simple, non-parallel implementation. It isn't exceptionally
    bad, but no effort has been made to make it run efficiently.
    CPU.m is a CPU-only, parallel implementation that is essentially as fast as it can be.
    GPU.m is very similar to CPU.m, but uses GPU operations wherever
    possible.
I recorded performance metrics from these 3 programs across
a range of inputs, increasing the size of the input data each
time.
Testing details
The tests were run on a Dell OptiPlex 990:
    Intel i5-2400, 4 cores @ 3.1 GHz (3.3 GHz with Turbo Boost)
    4 GB 1333 MHz RAM
    NVIDIA GeForce GTX 650 Ti
        1 GB GDDR5 memory @ 5400 MHz
        768 stream processors @ 941 MHz
    Windows 7 64-bit Enterprise
The numbers I gathered are unique to this computer. Your results will vary, but should follow similar trends.
Runtime Vs. array size
Elements per second
Coding for the GPU
Try not to move data between CPU and GPU very often.
Replace conditional logic with set theory (loops and if statements vs. vector ops and find()).
Try to isolate variables.
    Storing values in an array to look at later can replace
    random accesses to those values while calculating them.
Be clever.
    You may need to change your entire approach to a
    problem to get the most out of GPU computing.
Questions?