Hank Childs, University of Oregon Jan. 21st, 2013 CIS 610: Many-core visualization libraries

CIS 610: Many-core visualization libraries

Page 1: CIS 610: Many-core visualization libraries

Hank Childs, University of Oregon
Jan. 21st, 2013

CIS 610: Many-core visualization libraries

Page 2: CIS 610: Many-core visualization libraries

Schedule for this class

• We have done 5 lectures in 2 weeks
  – We should have done 4 lectures over the last two weeks
• We will do 3 lectures this week
  – We will be one full week ahead of schedule.
  – We will cancel two lectures over the coming weeks.

Page 3: CIS 610: Many-core visualization libraries

Schedule this week

• Tuesday lecture: today
  – Review of data parallel operations, general discussion of packages so far
• Thursday lecture: Ken Moreland
• (Thursday colloquium @ 12: Ken Moreland)
• Friday lecture: Ken Moreland
  – 8:30-10 (I can’t make this time)
  – 11-12:30
  – 11:30-1:00

Page 4: CIS 610: Many-core visualization libraries

Upcoming schedule

• Tuesday, Jan 28th
  – 10 minute presentation by each student on the project they want to pursue
    • Non-binding
  – Discuss the problem, and some initial thoughts about how to do it in many-core libraries

Page 5: CIS 610: Many-core visualization libraries

Upcoming schedule

• Thursday, Jan 30th
  – Group session debugging problems.
  – Important that you have started your project by then.

Page 6: CIS 610: Many-core visualization libraries

Upcoming schedule

• Weeks following
  – Series of 20 minute presentations, 3 per lecture
  – Two flavors of presentation:
    • “Update on my project”
    • “Overview of a paper I read”

Page 7: CIS 610: Many-core visualization libraries

How this class will be graded

• You will all submit a report at the end of the quarter.
• The report will include:
  – A summary of what you have done
  – It will focus on your project
  – You should also include
    • Presentations made
    • Porting of libraries
    • Assistance to other students
    • Bugs debugged (or reported)
    • Etc…

Page 8: CIS 610: Many-core visualization libraries

How this class will be graded

• It is not curved
  – If you all decide not to present papers, you will all be penalized
• I expect you will all get very good grades
  – But it is important that you work hard and accomplish something in this class
• Play with the libraries, present papers in class, and really try to nail your “research project”

Page 9: CIS 610: Many-core visualization libraries

Lectures

• I expect you will all make about 3 presentations
  – 1 research update, 2 papers
  – 2 research updates, 1 paper
• Some lectures in the short term on CUDA, Thrust, data parallelism, etc., would probably be helpful.

Page 10: CIS 610: Many-core visualization libraries

EAVL

EXTREME-SCALE ANALYSIS AND VISUALIZATION LIBRARY

Jeremy Meredith

January, 2014

Page 11: CIS 610: Many-core visualization libraries

A Simple Data-Parallel Operation

Internal Library API Provides This:

    void CellToCellDivide(Field &a, Field &b, Field &c)
    {
        for_each(i)
            c[i] = a[i] / b[i];
    }

Algorithm Developer Writes This:

    void CalculateDensity(...)
    {
        //...
        CellToCellDivide(mass, volume, density);
    }

Page 12: CIS 610: Many-core visualization libraries

Functor + Iterator Approach

Algorithm Developer Writes This:

    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, Divide());
    }

Internal Library API Provides This:

    // for_each(i) is pseudocode for a parallel loop over all cells.
    // The functor is taken by value so a temporary such as Divide() can be passed.
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }

    struct Divide
    {
        void operator()(float &a, float &b, float &c) { c = a / b; }
    };

Page 13: CIS 610: Many-core visualization libraries

Custom Functor

Algorithm Developer Writes These:

    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, MyFunctor());
    }

    struct MyFunctor
    {
        void operator()(float &a, float &b, float &c) { c = a + 2*log(b); }
    };

Internal Library API Provides This:

    // for_each(i) is pseudocode for a parallel loop over all cells.
    // The functor is taken by value so a temporary such as MyFunctor() can be passed.
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }
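The functor pattern on the last two slides can be made compilable under a few assumptions. In this minimal sketch, `Field` is assumed to be a plain `std::vector<float>`, the slides' `for_each(i)` is replaced by a serial loop, and the functors take their inputs by value. None of this is claimed to be EAVL's actual API.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: a Field is modeled as a plain vector of floats.
using Field = std::vector<float>;

// "Library" side: applies f to every triple (a[i], b[i], c[i]).
// The serial loop stands in for the slides' parallel for_each(i).
// Assumes b has at least as many entries as a.
template <class T>
void CellToCellBinaryOp(const Field &a, const Field &b, Field &c, T f)
{
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        f(a[i], b[i], c[i]);
}

// Functor equivalent to the slides' Divide.
struct Divide
{
    void operator()(float a, float b, float &c) const { c = a / b; }
};

// Functor equivalent to the slides' MyFunctor.
struct MyFunctor
{
    void operator()(float a, float b, float &c) const { c = a + 2 * std::log(b); }
};
```

Calling `CellToCellBinaryOp(mass, volume, density, Divide())` fills `density` with the per-cell quotient. Swapping in `MyFunctor()` changes the per-cell math without touching the loop, which is the point of the pattern.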

Page 14: CIS 610: Many-core visualization libraries

DATA PARALLELISM BASICS

Page 15: CIS 610: Many-core visualization libraries

Map with 1 input, 1 output

Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.

    struct f { float operator()(float x) { return x*2; } };

    index    0  1  2  3  4  5  6  7  8  9 10 11
    x        3  7  0  1  4  0  0  4  5  3  1  0
    result   6 14  0  2  8  0  0  8 10  6  2  0
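A one-input map corresponds closely to `std::transform` in the C++ standard library. The sketch below is a serial stand-in for the parallel version; the names `map1` and `Twice` are made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Same per-element operation as the slide's functor f.
struct Twice
{
    float operator()(float x) const { return x * 2; }
};

// Serial map: result[i] depends only on x[i], so every element
// could be computed independently (and therefore in parallel).
template <class F>
std::vector<float> map1(const std::vector<float> &x, F f)
{
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), result.begin(), f);
    return result;
}
```

With the slide's `x` array, `map1(x, Twice())` reproduces the `result` row shown above.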

Page 16: CIS 610: Many-core visualization libraries

Map with 2 inputs, 1 output

With two input arrays, the functor takes two inputs. You can also have multiple outputs.

    struct f { float operator()(float a, float b) { return a+b; } };

    index    0  1  2  3  4  5  6  7  8  9 10 11
    x        3  7  0  1  4  0  0  4  5  3  1  0
    y        2  4  2  1  8  3  9  5  5  1  2  1
    result   5 11  2  2 12  3  9  9 10  4  3  1
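The two-input case maps onto the binary overload of `std::transform`. This is a serial sketch; `map2` and `Plus` are illustrative names, and `y` is assumed to be at least as long as `x`.

```cpp
#include <algorithm>
#include <vector>

// Same per-element operation as the slide's two-input functor f.
struct Plus
{
    float operator()(float a, float b) const { return a + b; }
};

// Serial two-input map: result[i] = f(x[i], y[i]).
template <class F>
std::vector<float> map2(const std::vector<float> &x,
                        const std::vector<float> &y, F f)
{
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), y.begin(), result.begin(), f);
    return result;
}
```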

Page 17: CIS 610: Many-core visualization libraries

Scatter with 1 input (and thus 1 output)

Possibly inefficient, risks of race conditions and uninitialized results. (Can also scatter to a larger array if desired.) Often used in a scatter_if–type construct.

No functor.

    index     0  1  2  3  4  5  6  7  8  9 10 11
    x         3  7  0  1  4  0  0  4  5  3  1  0
    indices   2  4  1  5  5  0  4  2  1  2  1  4
    result    0  1  3  ?  0  4

(? = slot 3 is never written. Slots 1, 2, 4, and 5 are each written by more than one input element.)
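A serial sketch of scatter makes both hazards visible. The function name, the explicit output size, and the `fill` value for never-written slots are assumptions for illustration. In a serial loop the last writer wins, whereas in a parallel scatter the winner among colliding writes is not defined.

```cpp
#include <cstddef>
#include <vector>

// Serial scatter: result[indices[i]] = x[i].
// Slots that no index targets keep the value `fill`.
// When several i share the same target, the last one in loop order wins here;
// a parallel implementation gives no such guarantee.
std::vector<int> scatterSerial(const std::vector<int> &x,
                               const std::vector<std::size_t> &indices,
                               std::size_t outSize, int fill)
{
    std::vector<int> result(outSize, fill);
    for (std::size_t i = 0; i < x.size(); ++i)
        result[indices[i]] = x[i];
    return result;
}
```

With the slide's arrays, an output size of 6, and a fill of -1, this yields 0 1 3 -1 0 4, matching the result row above with the unwritten slot made explicit.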

Page 18: CIS 610: Many-core visualization libraries

Gather with 1 input (and thus 1 output)

Unlike scatter, no risk of uninitialized data or race condition. Plus, parallelization is over a shorter indices array, and caching helps more, so can be more efficient.

No functor.

    index     0  1  2  3  4  5  6  7  8  9 10 11
    x         3  7  0  1  4  0  0  4  5  3  1  0

    indices   1  9  6  9  3
    result    7  3  0  3  1
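Gather is the mirror image of scatter: each output slot reads from exactly one input location, so no two iterations write the same element. A minimal serial sketch, with an illustrative function name:

```cpp
#include <cstddef>
#include <vector>

// Serial gather: result[i] = x[indices[i]].
// Every output element is written exactly once, so there is no write conflict
// and no uninitialized slot, provided all indices are valid positions in x.
std::vector<int> gatherSerial(const std::vector<int> &x,
                              const std::vector<std::size_t> &indices)
{
    std::vector<int> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i)
        result[i] = x[indices[i]];
    return result;
}
```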

Page 19: CIS 610: Many-core visualization libraries

Reduction with 1 input (and thus 1 output)

Example: max-reduction. Sum is also common. Often a fat-tree-based implementation.

    struct f { float operator()(float a, float b) { return a>b ? a : b; } };

    index    0  1  2  3  4  5  6  7  8  9 10 11
    x        3  7  0  1  4  0  0  4  5  3  1  0
    result   7
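The tree structure mentioned above can be sketched as a pairwise reduction. This version runs serially, but within each pass the combine steps touch disjoint elements, which is what a parallel implementation would exploit. The names are illustrative, and the operator is assumed to be associative.

```cpp
#include <cstddef>
#include <vector>

struct MaxOp
{
    float operator()(float a, float b) const { return a > b ? a : b; }
};

struct SumOp
{
    float operator()(float a, float b) const { return a + b; }
};

// Pairwise (tree) reduction over a non-empty array.
// Pass k combines elements that are `stride` apart; after about log2(n)
// passes, v[0] holds the reduction of the whole array.
template <class F>
float treeReduce(std::vector<float> v, F f)
{
    for (std::size_t stride = 1; stride < v.size(); stride *= 2)
        for (std::size_t i = 0; i + stride < v.size(); i += 2 * stride)
            v[i] = f(v[i], v[i + stride]);
    return v[0];
}
```

On the slide's `x` array, `treeReduce(x, MaxOp())` gives 7 and `treeReduce(x, SumOp())` gives 28.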

Page 20: CIS 610: Many-core visualization libraries

Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output

Value at result[i] is sum of values x[0]..x[i]. Surprisingly efficient parallel implementation. Basis for many more complex algorithms.

No functor.

    index    0  1  2  3  4  5  6  7  8  9 10 11
    x        3  7  0  1  4  0  0  4  5  3  1  0
    result   3 10 10 11 15 15 15 19 24 27 28 28
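For reference, the serial definition of an inclusive scan is exactly what `std::partial_sum` computes. The wrapper name below is illustrative, and this sketch says nothing about how a parallel scan is implemented.

```cpp
#include <numeric>
#include <vector>

// Serial inclusive scan: result[i] = x[0] + ... + x[i].
std::vector<int> inclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    std::partial_sum(x.begin(), x.end(), result.begin());
    return result;
}
```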

Page 21: CIS 610: Many-core visualization libraries

Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output

Initialize with zero, value is sum of only up to x[i-1]. May be more commonly used than inclusive scan.

No functor.

    index    0  1  2  3  4  5  6  7  8  9 10 11
    x        3  7  0  1  4  0  0  4  5  3  1  0
    result   0  3 10 10 11 15 15 15 19 24 27 28
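The exclusive variant differs only in writing the running total before adding the current element. A serial sketch with an illustrative function name:

```cpp
#include <cstddef>
#include <vector>

// Serial exclusive scan: result[0] = 0, result[i] = x[0] + ... + x[i-1].
std::vector<int> exclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    int running = 0;  // sum of all elements before position i
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        result[i] = running;
        running += x[i];
    }
    return result;
}
```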

Page 22: CIS 610: Many-core visualization libraries

WRITING ALGORITHMS IN EAVL

Page 23: CIS 610: Many-core visualization libraries

EXAMPLE: THRESHOLD

Page 24: CIS 610: Many-core visualization libraries

Threshold

• Keep cell if it meets some criteria, else discard

• Criteria:
  – Pressure > 2
  – 10 < temperature < 20

[Figure: cells that meet criteria]

Page 25: CIS 610: Many-core visualization libraries

How to implement threshold

• Iterate over cells
• If a cell meets the criteria, then place that cell in the output
• Output is an unstructured mesh

Page 26: CIS 610: Many-core visualization libraries

Example: Thresholding an RGrid (a)

• Explicit cells can be combined with structured coordinates.

Diagram 1 (structured cell set):

    eavlCoordinates         FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0            Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1            Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2            Name: “z”   Association: LogicalDim2   Values[nk]
    eavlStructuredCellSet   RegularStructure: 30 40 30

Diagram 2 (explicit cell set, same coordinates and fields):

    eavlCoordinates         FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0            Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1            Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2            Name: “z”   Association: LogicalDim2   Values[nk]
    eavlExplicitCellSet     Connectivity: (a bunch of cells)

Page 27: CIS 610: Many-core visualization libraries

Example: Thresholding an RGrid (b)

• A second Cell Set can be added which refers to the first one

Diagram 1 (structured cell set only):

    eavlCoordinates         FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0            Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1            Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2            Name: “z”   Association: LogicalDim2   Values[nk]
    eavlStructuredCellSet   RegularStructure: 30 40 30

Diagram 2 (structured cell set plus a subset that refers to it):

    eavlCoordinates         FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0            Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1            Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2            Name: “z”   Association: LogicalDim2   Values[nk]
    eavlStructuredCellSet   RegularStructure: 30 40 30
    eavlSubset              Cells: (…)   Parent: ( )

Page 28: CIS 610: Many-core visualization libraries

Starting Mesh

We want to threshold a mesh based on its density values (shown here).

    43 47 52 63
    32 38 42 49
    31 37 41 38

    index     0  1  2  3  4  5  6  7  8  9 10 11
    density  43 47 52 63 32 38 42 49 31 37 41 38

If we threshold 35 < density < 45, we want this result:

    43  .  .  .
     . 38 42  .
     . 37 41 38

Page 29: CIS 610: Many-core visualization libraries

Which Cells to Include?

Evaluate a Map operation with this functor:

    struct InRange
    {
        float lo, hi;
        InRange(float l, float h) : lo(l), hi(h) { }
        int operator()(float x) { return x>lo && x<hi; }
    };

Applying InRange() to each cell of the mesh:

    1 0 0 0
    0 1 1 0
    0 1 1 1

    index     0  1  2  3  4  5  6  7  8  9 10 11
    density  43 47 52 63 32 38 42 49 31 37 41 38
    inrange   1  0  0  0  0  1  1  0  0  1  1  1
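As a compilable sketch of this step, the functor can be applied with `std::transform`. The wrapper `markInRange` is a made-up name, and the loop is serial rather than an EAVL map.

```cpp
#include <algorithm>
#include <vector>

// Functor from the slide: 1 if lo < x < hi, else 0.
struct InRange
{
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) { }
    int operator()(float x) const { return x > lo && x < hi; }
};

// Map step: one 0/1 flag per input cell.
std::vector<int> markInRange(const std::vector<float> &density,
                             float lo, float hi)
{
    std::vector<int> inrange(density.size());
    std::transform(density.begin(), density.end(), inrange.begin(),
                   InRange(lo, hi));
    return inrange;
}
```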

Page 30: CIS 610: Many-core visualization libraries

How Many Cells in Output?

Evaluate a Reduce operation using the Add<> functor. We can use this to create output cell length arrays.

    1 0 0 0
    0 1 1 0
    0 1 1 1

    index     0  1  2  3  4  5  6  7  8  9 10 11
    inrange   1  0  0  0  0  1  1  0  0  1  1  1
    result    6   (plus-reduction of inrange)
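Because the flags are 0 or 1, a sum-reduction over them is the output cell count. The sketch below uses `std::accumulate` as a serial stand-in for the Reduce with Add<>; `countKept` is an illustrative name.

```cpp
#include <numeric>
#include <vector>

// Sum-reduction of the 0/1 flags gives the number of cells that survive.
int countKept(const std::vector<int> &inrange)
{
    return std::accumulate(inrange.begin(), inrange.end(), 0);
}
```

The count can then size the output arrays, for example `std::vector<float> output_density(countKept(inrange));`.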

Page 31: CIS 610: Many-core visualization libraries

Where Do the Output Cells Go?

Input indices:

    0  1  2  3
    4  5  6  7
    8  9 10 11

Output indices:

    0  .  .  .
    .  1  2  .
    .  3  4  5

    input cell    0  1  2  3  4  5  6  7  8  9 10 11
    output cell   0  .  .  .  .  1  2  .  .  3  4  5

How do we create this mapping?

Page 32: CIS 610: Many-core visualization libraries

Create Input-to-Output Indexing?

Exclusive Scan (exclusive prefix sum) gives us the output index positions.

    0  .  .  .
    .  1  2  .
    .  3  4  5

    index      0  1  2  3  4  5  6  7  8  9 10 11
    inrange    1  0  0  0  0  1  1  0  0  1  1  1
    startidx   0  1  1  1  1  1  2  3  3  3  4  5

Page 33: CIS 610: Many-core visualization libraries

Scatter Input Arrays to Output?

NO. We can do this, but scatters can be risky/inefficient. Assuming we have multiple arrays to process, we can do something better...

    index            0  1  2  3  4  5  6  7  8  9 10 11
    density         43 47 52 63 32 38 42 49 31 37 41 38
    startidx         0  1  1  1  1  1  2  3  3  3  4  5
    output_density  43 38 42 37 41 38

Kept input cells (mesh layout):

    0  .  .  .
    .  5  6  .
    .  9 10 11

Race condition unless we add a mask array!
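To see why the mask matters: several input cells share the same `startidx` value (cells 1 through 5 all map to position 1), so an unmasked scatter would have them compete for one slot. A serial sketch of the masked version, with a made-up function name, looks like this:

```cpp
#include <cstddef>
#include <vector>

// Masked scatter: only cells with mask[i] != 0 write to out[startidx[i]].
// Among cells sharing a startidx value, exactly one has its mask set,
// so each output slot is written once.
std::vector<float> maskedScatter(const std::vector<float> &in,
                                 const std::vector<int> &mask,
                                 const std::vector<std::size_t> &startidx,
                                 std::size_t outCount)
{
    std::vector<float> out(outCount);
    for (std::size_t i = 0; i < in.size(); ++i)
        if (mask[i])
            out[startidx[i]] = in[i];
    return out;
}
```

Even with the mask, this loop runs over the full input length once per array, which is the inefficiency the slide points to.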

Page 34: CIS 610: Many-core visualization libraries

Create Output-to-Input Indexing?

We want to work in the shorter output-length arrays and use gathers. A specialized scatter in EAVL creates this reverse index.

    index      0  1  2  3  4  5  6  7  8  9 10 11
    startidx   0  1  1  1  1  1  2  3  3  3  4  5

    revindex   0  5  6  9 10 11

Kept input cells (mesh layout):

    0  .  .  .
    .  5  6  .
    .  9 10 11
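One way to picture the reverse index is as a masked scatter of the cell ids themselves. The sketch below is a serial illustration under that assumption; it is not EAVL's specialized scatter, whose interface is not shown in these slides, and the function name is made up.

```cpp
#include <cstddef>
#include <vector>

// Builds revindex so that revindex[j] is the input cell id of output cell j.
// Each kept cell i writes its own id to position startidx[i].
std::vector<std::size_t> buildReverseIndex(const std::vector<int> &inrange,
                                           const std::vector<std::size_t> &startidx,
                                           std::size_t outCount)
{
    std::vector<std::size_t> revindex(outCount);
    for (std::size_t i = 0; i < inrange.size(); ++i)
        if (inrange[i])
            revindex[startidx[i]] = i;
    return revindex;
}
```

The scatter is done once; every field array can then be moved with a gather over the shorter output length.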

Page 35: CIS 610: Many-core visualization libraries

Gather Input Mesh Arrays to Output?

We can now use simple gathers to pull input arrays (density, pressure) into the output mesh.

    index            0  1  2  3  4  5  6  7  8  9 10 11
    density         43 47 52 63 32 38 42 49 31 37 41 38

    revindex         0  5  6  9 10 11
    output_density  43 38 42 37 41 38

Output mesh:

    43  .  .  .
     . 38 42  .
     . 37 41 38
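Putting the steps together, the whole threshold can be sketched in one serial function: map, exclusive scan, reverse index, gather. The function name `thresholdField` and its signature are hypothetical, and each loop stands in for the corresponding data-parallel primitive rather than for any EAVL call.

```cpp
#include <cstddef>
#include <vector>

// Returns the values of `field` for the cells whose `key` value lies
// strictly between lo and hi. `key` and `field` must have the same length.
std::vector<float> thresholdField(const std::vector<float> &key,
                                  const std::vector<float> &field,
                                  float lo, float hi)
{
    const std::size_t n = key.size();

    // Map: flag each cell as kept (1) or discarded (0).
    std::vector<int> inrange(n);
    for (std::size_t i = 0; i < n; ++i)
        inrange[i] = (key[i] > lo && key[i] < hi) ? 1 : 0;

    // Exclusive scan: output position of each cell.
    // The final running total is the output cell count (the reduce step).
    std::vector<std::size_t> startidx(n);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        startidx[i] = count;
        count += inrange[i];
    }

    // Masked scatter of cell ids: the output-to-input (reverse) index.
    std::vector<std::size_t> revindex(count);
    for (std::size_t i = 0; i < n; ++i)
        if (inrange[i])
            revindex[startidx[i]] = i;

    // Gather: pull the field values into the shorter output array.
    std::vector<float> out(count);
    for (std::size_t j = 0; j < count; ++j)
        out[j] = field[revindex[j]];
    return out;
}
```

Passing `density` as both `key` and `field` with bounds 35 and 45 reproduces the `output_density` row above. A real implementation would compute `revindex` once and reuse it for every field rather than recomputing it per call.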