This deck has 1-, 2-, and 3-slide variants for C++ AMP. If your own deck uses 4:3, get with the 21st century and switch to 16:9 (Design tab, Page Setup button)

Page 1

This deck has 1-, 2-, and 3-slide variants for C++ AMP

If your own deck uses 4:3, get with the 21st century and switch to 16:9

(Design tab, Page Setup button)

Page 2

C++ AMP in 1 slide

(for notes, see the comments section of the slides in the 2- and 3-slide variants)

Page 3

C++ Accelerated Massive Parallelism

What
– Heterogeneous platform support
– Part of C++ & Visual Studio
– STL-like library for parallel patterns on large arrays
– Builds on DirectX

Why
– Performance
– Productivity
– Portability

How

    #include <amp.h>
    using namespace concurrency;

    void AddArrays(int n, int * pA, int * pB, int * pC)
    {
        array_view<int,1> a(n, pA);
        array_view<int,1> b(n, pB);
        array_view<int,1> sum(n, pC);
        parallel_for_each(
            sum.extent,
            [=](index<1> idx) restrict(amp)
            {
                sum[idx] = a[idx] + b[idx];
            }
        );
    }

Page 4

C++ AMP in 2 or 3 slides

(for 2 slides, just drop the 3rd one)

(see comments section of each slide for notes)

Page 5

C++ AMP
• Heterogeneous platform support
• Part of Visual C++
• Visual Studio integration
• STL-like library for multidimensional data
• Builds on DirectX
• Is open spec

performance

productivity

portability

Page 6

Basic Elements of C++ AMP coding

    void AddArrays(int n, int * pA, int * pB, int * pC)
    {
        array_view<int,1> a(n, pA);
        array_view<int,1> b(n, pB);
        array_view<int,1> sum(n, pC);
        parallel_for_each(
            sum.extent,
            [=](index<1> idx) restrict(amp)
            {
                sum[idx] = a[idx] + b[idx];
            }
        );
    }

array_view variables captured and associated data copied to accelerator (on demand)

restrict(amp): tells the compiler to check that this code can execute on Direct3D hardware (aka accelerator)

parallel_for_each: execute the lambda on the accelerator once per thread

extent: the number and shape of threads to execute the lambda

index: the thread ID that is running the lambda, used to index into data

array_view: wraps the data to operate on, on the accelerator
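To make the execution model in these callouts concrete, here is a minimal CPU-only sketch in standard C++ (no `<amp.h>`) of what the AddArrays kernel computes: the lambda body runs once for every index in the extent. The names `for_each_index` and `AddArraysReference` are hypothetical and are not part of C++ AMP. The real parallel_for_each runs the invocations on the accelerator, potentially concurrently, whereas this stand-in runs them serially and in order.

```cpp
#include <cassert>

// Hypothetical stand-in for parallel_for_each over a 1-D extent of size n:
// calls body(i) once for each index i in [0, n). This is NOT C++ AMP; it
// runs serially on the CPU, while the real call dispatches to an accelerator.
template <typename Body>
void for_each_index(int n, Body body) {
    for (int i = 0; i < n; ++i) {
        body(i);
    }
}

// CPU reference for AddArrays: performs the same per-index work as the
// restrict(amp) lambda, reading pA and pB and writing pC.
void AddArraysReference(int n, const int* pA, const int* pB, int* pC) {
    assert(n >= 0);  // the extent cannot be negative
    for_each_index(n, [=](int idx) {
        pC[idx] = pA[idx] + pB[idx];
    });
}
```

A reference like this can be handy for checking the results of the accelerated version on small inputs.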

Page 7

C++ AMP at a Glance
• restrict(amp, cpu)
• parallel_for_each
• class array<T,N>
• class array_view<T,N>
• class index<N>
• class extent<N>
• class accelerator
• class accelerator_view
• class tiled_extent< , , >
• class tiled_index< , , >
• class tile_barrier
• tile_static storage class
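The tiling entries in this list (tiled_extent, tiled_index, tile_barrier, tile_static) are not explained on the slide. As a rough illustration only, the following CPU-only sketch in standard C++ emulates the idea behind them: the index space is split into fixed-size tiles, each thread has a tile number and a local position within the tile, and threads in a tile share a small scratch buffer. The function `TiledPartialSums` and its variable names are hypothetical and are not C++ AMP APIs. The sketch assumes the data size divides evenly by the tile size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-only emulation of a 1-D tiled computation; NOT C++ AMP.
// Each tile of TileSize consecutive elements is reduced to one partial sum.
template <int TileSize>
std::vector<int> TiledPartialSums(const std::vector<int>& data) {
    static_assert(TileSize > 0, "tile size must be positive");
    // Assumption of this sketch: the extent divides evenly into tiles.
    assert(data.size() % static_cast<std::size_t>(TileSize) == 0);

    const int tileCount = static_cast<int>(data.size()) / TileSize;
    std::vector<int> partial(tileCount, 0);

    for (int tile = 0; tile < tileCount; ++tile) {
        // Stands in for tile_static storage shared by one tile's threads.
        int tileLocal[TileSize];

        // Phase 1: each "thread" loads the element at its global index,
        // which is derived from its tile number and local position.
        for (int local = 0; local < TileSize; ++local) {
            const int global = tile * TileSize + local;
            tileLocal[local] = data[global];
        }

        // In real tiled code, a tile_barrier wait would separate the
        // load phase from the phase that reads the shared buffer.

        // Phase 2: combine the tile's shared values into one result.
        int sum = 0;
        for (int local = 0; local < TileSize; ++local) {
            sum += tileLocal[local];
        }
        partial[tile] = sum;
    }
    return partial;
}
```

For example, with a tile size of 4 and the eight values 1 through 8, the sketch produces one partial sum per tile: 10 and 26.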

Page 8

C++ AMP resources
• Native parallelism blog (team blog)
  – http://blogs.msdn.com/b/nativeconcurrency/
• MSDN Forums to ask questions
  – http://social.msdn.microsoft.com/Forums/en/parallelcppnative/threads
• Daniel Moth's blog (PM of C++ AMP)
  – http://www.danielmoth.com/Blog/