CIS 610: Many-core visualization libraries


Hank Childs, University of Oregon
Jan. 21st, 2013

Schedule for this class

• We have done 5 lectures in 2 weeks
  – We should have done 4 lectures over the last two weeks
• We will do 3 lectures this week

We will be one full week ahead of schedule.
We will cancel two lectures over the coming weeks.

Schedule this week

• Tuesday lecture: today
  – Review of data parallel operations, general discussion of packages so far
• Thursday lecture: Ken Moreland
• (Thursday colloquium @ 12: Ken Moreland)
• Friday lecture: Ken Moreland
  – 8:30-10 (I can’t make this time)
  – 11-12:30
  – 11:30-1:00

Upcoming schedule

• Tuesday, Jan 28th

– 10 minute presentation by each student on the project they want to pursue
  • Non-binding

– Discuss the problem, and some initial thoughts about how to do it in many-core libraries

Upcoming schedule

• Thursday, Jan 30th

– Group session debugging problems.
– Important that you have started your project by then.

Upcoming schedule

• Weeks following
  – Series of 20 minute presentations, 3 per lecture
  – Two flavors of presentation:
    • “Update on my project”
    • “Overview of a paper I read”

How this class will be graded

• You will all submit a report at the end of the quarter.
• The report will include:
  – A summary of what you have done
  – It will focus on your project
  – You should also include
    • Presentations made
    • Porting of libraries
    • Assistance to other students
    • Bugs debugged (or reported)
    • Etc…

How this class will be graded

• It is not curved
  – If you all decide not to present papers, you will all be penalized
• I expect you will all get very good grades
  – But it is important that you work hard and accomplish something in this class
• Play with the libraries, present papers in class, and really try to nail your “research project”

Lectures

• I expect you will all make about 3 presentations
  – 1 research update, 2 papers
  – 2 research updates, 1 paper

• Some lectures in the short term on CUDA, Thrust, data parallelism, etc, would probably be helpful.

EAVL

EXTREME-SCALE ANALYSIS AND VISUALIZATION LIBRARY

Jeremy Meredith

January, 2014

A Simple Data-Parallel Operation

void CellToCellDivide(Field &a, Field &b, Field &c)
{
    for_each(i)
        c[i] = a[i] / b[i];
}

void CalculateDensity(...)
{
    //...
    CellToCellDivide(mass, volume, density);
}

Internal Library API Provides This
Algorithm Developer Writes This

Functor + Iterator Approach

void CalculateDensity(...)
{
    //...
    CellToCellBinaryOp(mass, volume, density, Divide());
}

template <class T>
void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
{
    for_each(i)
        f(a[i], b[i], c[i]);
}

struct Divide
{
    void operator()(float &a, float &b, float &c) { c = a / b; }
};

Internal Library API Provides This
Algorithm Developer Writes This

Custom Functor

void CalculateDensity(...)
{
    //...
    CellToCellBinaryOp(mass, volume, density, MyFunctor());
}

template <class T>
void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
{
    for_each(i)
        f(a[i], b[i], c[i]);
}

struct MyFunctor
{
    void operator()(float &a, float &b, float &c) { c = a + 2*log(b); }
};

Algorithm Developer Writes These
Internal Library API Provides This
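The functor approach above can be tried out as ordinary C++. This is a minimal serial sketch, not EAVL's actual API: `Field` is assumed here to be a plain `std::vector<float>`, the `for_each(i)` pseudocode is replaced by an ordinary loop, and `CalculateDensity` is given a concrete signature purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: a plain vector stands in for the slides' Field type.
using Field = std::vector<float>;

// "Library" side: applies any three-argument functor cell by cell.
template <class Functor>
void CellToCellBinaryOp(const Field &a, const Field &b, Field &c, Functor f)
{
    assert(a.size() == b.size());
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)  // serial stand-in for for_each(i)
        f(a[i], b[i], c[i]);
}

// Functors: each one describes the work for a single cell.
struct Divide
{
    void operator()(float a, float b, float &c) const { c = a / b; }
};

struct MyFunctor
{
    void operator()(float a, float b, float &c) const { c = a + 2 * std::log(b); }
};

// "Developer" side: picks a functor and hands it to the library routine.
Field CalculateDensity(const Field &mass, const Field &volume)
{
    Field density;
    CellToCellBinaryOp(mass, volume, density, Divide());
    return density;
}
```

Because the loop lives in one place, swapping `Divide` for `MyFunctor` changes the per-cell work without touching the iteration code, which is the point of the design.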

DATA PARALLELISM BASICS

Map with 1 input, 1 output

Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0
result     6 14  0  2  8  0  0  8 10  6  2  0

struct f { float operator()(float x) { return x*2; } };

Map with 2 inputs, 1 output

With two input arrays, the functor takes two inputs. You can also have multiple outputs.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0
y          2  4  2  1  8  3  9  5  5  1  2  1
result     5 11  2  2 12  3  9  9 10  4  3  1

struct f { float operator()(float a, float b) { return a+b; } };
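On the CPU, both map variants correspond to `std::transform` from the C++ standard library. The sketch below is a serial illustration only; the function names `Map1` and `Map2` and the functor name `Twice` are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// One-input functor from the first map example: doubles its argument.
struct Twice
{
    float operator()(float x) const { return x * 2; }
};

// Map with 1 input, 1 output: result[i] depends only on x[i].
std::vector<float> Map1(const std::vector<float> &x)
{
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), result.begin(), Twice());
    return result;
}

// Map with 2 inputs, 1 output: result[i] depends only on x[i] and y[i].
std::vector<float> Map2(const std::vector<float> &x, const std::vector<float> &y)
{
    assert(x.size() == y.size());
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), y.begin(), result.begin(),
                   std::plus<float>());
    return result;
}
```

Since no element reads or writes any other element's slot, the iterations can run in any order, which is what makes a map trivially parallel.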

Scatter with 1 input (and thus 1 output)

Possibly inefficient, with risks of race conditions and uninitialized results. (Can also scatter to a larger array if desired.) Often used in a scatter_if-type construct.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0
indices    2  4  1  5  5  0  4  2  1  2  1  4
result     0  1  3  -  0  4

(- marks a slot that no index targets, so it is never written.)

No functor
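A serial sketch makes both hazards visible. The function name `Scatter` and the `fill` parameter are assumptions for illustration: `fill` marks slots that no index targets, and the comment notes where a parallel version would race.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scatter: result[indices[i]] = x[i].
// Slots that no index targets keep the fill value (the "uninitialized" case).
std::vector<float> Scatter(const std::vector<float> &x,
                           const std::vector<std::size_t> &indices,
                           std::size_t outSize, float fill)
{
    assert(x.size() == indices.size());
    std::vector<float> result(outSize, fill);
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        assert(indices[i] < outSize);
        // Several i may share one target slot. Serially the last writer
        // wins; in parallel the winner is unspecified (a race condition).
        result[indices[i]] = x[i];
    }
    return result;
}
```

With the arrays from this slide and a fill value of -1, the serial loop leaves -1 in slot 3, which no index targets, and every other slot holds whichever duplicate was written last.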

Gather with 1 input (and thus 1 output)

Unlike scatter, there is no risk of uninitialized data or race conditions. Plus, parallelization is over a shorter indices array, and caching helps more, so it can be more efficient.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0

indices    1  9  6  9  3
result     7  3  0  3  1

No functor
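For comparison with scatter, here is a serial gather sketch (the function name `Gather` is an assumption for this example). Each output slot is written exactly once, by the iteration that owns it, so no two iterations can collide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gather: result[i] = x[indices[i]].
// The loop runs over the (possibly shorter) indices array, and each
// iteration writes only its own output slot.
std::vector<float> Gather(const std::vector<float> &x,
                          const std::vector<std::size_t> &indices)
{
    std::vector<float> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i)
    {
        assert(indices[i] < x.size());
        result[i] = x[indices[i]];  // repeated indices are harmless: reads only
    }
    return result;
}
```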

Reduction with 1 input (and thus 1 output)

Example: max-reduction. Sum is also common. Often a fat-tree-based implementation.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0
result     7

struct f { float operator()(float a, float b) { return a>b ? a : b; } };
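The tree idea can be sketched serially as a pairwise reduction: at each level, elements a fixed stride apart are combined, and all pairs within a level are independent of each other. This is only an illustration of the access pattern, not the implementation any particular library uses; `TreeReduce`, `Max`, and `Sum` are names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Max
{
    float operator()(float a, float b) const { return a > b ? a : b; }
};

struct Sum
{
    float operator()(float a, float b) const { return a + b; }
};

// Pairwise (tree-style) reduction. The vector is taken by value so the
// function can combine partial results in place on its own copy.
template <class Op>
float TreeReduce(std::vector<float> v, Op op)
{
    assert(!v.empty());
    for (std::size_t stride = 1; stride < v.size(); stride *= 2)
        // Every pair in this inner loop is independent, so one level
        // of the tree could be processed in parallel.
        for (std::size_t i = 0; i + stride < v.size(); i += 2 * stride)
            v[i] = op(v[i], v[i + stride]);
    return v[0];
}
```

The number of levels grows with the logarithm of the input length, which is why a reduction parallelizes well even though every input contributes to a single output.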

Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output

Value at result[i] is sum of values x[0]..x[i]. Surprisingly efficient parallel implementation. Basis for many more complex algorithms.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0
result     3 10 10 11 15 15 15 19 24 27 28 28

No functor.

Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output

Initialize with zero; value at result[i] is the sum of only x[0]..x[i-1]. May be more commonly used than inclusive scan.

index      0  1  2  3  4  5  6  7  8  9 10 11
x          3  7  0  1  4  0  0  4  5  3  1  0
result     0  3 10 10 11 15 15 15 19 24 27 28

No functor.
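The two scans differ only in whether x[i] itself is included in result[i]. A serial reference sketch follows, using `std::partial_sum` for the inclusive case and a running total for the exclusive case; the function names are assumptions for this example, and the efficient parallel formulations are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Inclusive scan: result[i] = x[0] + ... + x[i].
std::vector<int> InclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    std::partial_sum(x.begin(), x.end(), result.begin());
    return result;
}

// Exclusive scan: result[i] = x[0] + ... + x[i-1], with result[0] = 0.
std::vector<int> ExclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    int running = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        result[i] = running;  // sum of everything before position i
        running += x[i];
    }
    assert(result.empty() || result.front() == 0);
    return result;
}
```

C++17 also provides `std::inclusive_scan` and `std::exclusive_scan` in `<numeric>`, which express the same operations directly.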

WRITING ALGORITHMS IN EAVL

EXAMPLE: THRESHOLD

Threshold

• Keep cell if it meets some criteria, else discard

• Criteria:
  – Pressure > 2
  – 10 < temperature < 20

Cells that meet criteria

How to implement threshold

• Iterate over cells
• If a cell meets the criteria, then place that cell in the output
• Output is an unstructured mesh
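As a baseline, the steps above can be written as a plain serial loop. This is a sketch under stated assumptions: the cell data is reduced to two hypothetical per-cell arrays (`pressure`, `temperature`), the criteria are passed in as a predicate, and the function returns the indices of kept cells rather than building a mesh.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial threshold: returns the indices of the cells the predicate keeps.
// Pred is any callable taking (pressure, temperature) and returning bool.
template <class Pred>
std::vector<std::size_t> ThresholdSerial(const std::vector<float> &pressure,
                                         const std::vector<float> &temperature,
                                         Pred keep)
{
    assert(pressure.size() == temperature.size());
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < pressure.size(); ++i)
        if (keep(pressure[i], temperature[i]))
            // The output position depends on how many earlier cells were
            // kept, which is what makes this loop hard to parallelize as is.
            kept.push_back(i);
    return kept;
}
```

A caller might pass a predicate such as `[](float p, float t) { return p > 2 && 10 < t && t < 20; }`; combining the two criteria with a logical AND is an assumption made for this example.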

Example: Thresholding an RGrid (a)

• Explicit cells can be combined with structured coordinates.

[Diagram: two data sets that share the same coordinates and fields but have different cell sets]

Shared by both data sets:
  eavlCoordinates: FieldName: “x” “y” “z”, Component: 0 0 0
  eavlField#0: Name: “x”, Association: LogicalDim0, Values[ni]
  eavlField#1: Name: “y”, Association: LogicalDim1, Values[nj]
  eavlField#2: Name: “z”, Association: LogicalDim2, Values[nk]

First data set:
  eavlStructuredCellSet: RegularStructure: 30 40 30

Second data set:
  eavlExplicitCellSet: Connectivity: (a bunch of cells)

Example: Thresholding an RGrid (b)

• A second Cell Set can be added which refers to the first one

[Diagram: two data sets with the same eavlCoordinates and eavlField#0 to eavlField#2 entries as in (a)]

First data set:
  eavlStructuredCellSet: RegularStructure: 30 40 30

Second data set:
  eavlStructuredCellSet: RegularStructure: 30 40 30
  eavlSubset: Cells: (…), Parent: ( )

Starting Mesh

We want to threshold a mesh based on its density values (shown here).

  43 47 52 63
  32 38 42 49
  31 37 41 38

index      0  1  2  3  4  5  6  7  8  9 10 11
density   43 47 52 63 32 38 42 49 31 37 41 38

If we threshold 35 < density < 45, we want this result:

  43  .  .  .
   . 38 42  .
   . 37 41 38

Which Cells to Include?

Evaluate a Map operation with this functor:

struct InRange
{
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) { }
    int operator()(float x) { return x>lo && x<hi; }
};

index      0  1  2  3  4  5  6  7  8  9 10 11
density   43 47 52 63 32 38 42 49 31 37 41 38
inrange    1  0  0  0  0  1  1  0  0  1  1  1    (InRange() applied to density)

  1 0 0 0
  0 1 1 0
  0 1 1 1

How Many Cells in Output?

Evaluate a Reduce operation using the Add<> functor. We can use this to create output cell length arrays.

  1 0 0 0
  0 1 1 0
  0 1 1 1

index      0  1  2  3  4  5  6  7  8  9 10 11
inrange    1  0  0  0  0  1  1  0  0  1  1  1
result     6    (plus)
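The flagging and counting steps can be sketched serially with the standard library: a `std::transform` with the `InRange` functor produces the flags, and a `std::accumulate` adds them up. The helper names `FlagCells` and `CountCells` are assumptions for this example, and `std::accumulate` stands in for the reduce with an add functor.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Stateful functor: the range bounds travel with the functor object.
struct InRange
{
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) { assert(l < h); }
    int operator()(float x) const { return x > lo && x < hi; }
};

// Map step: one 0/1 flag per input cell.
std::vector<int> FlagCells(const std::vector<float> &density, float lo, float hi)
{
    std::vector<int> inrange(density.size());
    std::transform(density.begin(), density.end(), inrange.begin(),
                   InRange(lo, hi));
    return inrange;
}

// Reduce step: the sum of the flags is the number of output cells.
int CountCells(const std::vector<int> &inrange)
{
    return std::accumulate(inrange.begin(), inrange.end(), 0);
}
```

Knowing the count up front lets the output arrays be allocated at their final length before any cell data is moved.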

Where Do the Output Cells Go?

Input indices:

  0  1  2  3
  4  5  6  7
  8  9 10 11

Output indices:

  0  .  .  .
  .  1  2  .
  .  3  4  5

input cell     0  1  2  3  4  5  6  7  8  9 10 11
output cell    0  1  2  3  4  5

How do we create this mapping?

Create Input-to-Output Indexing?

Exclusive Scan (exclusive prefix sum) gives us the output index positions.

index      0  1  2  3  4  5  6  7  8  9 10 11
inrange    1  0  0  0  0  1  1  0  0  1  1  1
startidx   0  1  1  1  1  1  2  3  3  3  4  5

Output indices:

  0  .  .  .
  .  1  2  .
  .  3  4  5

Scatter Input Arrays to Output?

NO. We can do this, but scatters can be risky/inefficient. Assuming we have multiple arrays to process, we can do something better...

index            0  1  2  3  4  5  6  7  8  9 10 11
density         43 47 52 63 32 38 42 49 31 37 41 38
startidx         0  1  1  1  1  1  2  3  3  3  4  5

output_density  43 38 42 37 41 38

Race condition unless we add a mask array!

Input indices of the kept cells:

  0  .  .  .
  .  5  6  .
  .  9 10 11
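To see why the mask matters, note that input cells 1 through 5 all carry startidx 1, yet only cell 5 is in range. An unmasked scatter would have all five writing to output slot 1. A serial sketch of the masked version, in the spirit of a scatter_if, follows; the function name `MaskedScatter` and its signature are assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Masked scatter: only cells whose flag is set write to the output,
// so each output slot has exactly one writer.
std::vector<float> MaskedScatter(const std::vector<float> &in,
                                 const std::vector<int> &inrange,
                                 const std::vector<std::size_t> &startidx,
                                 std::size_t outSize)
{
    assert(in.size() == inrange.size() && in.size() == startidx.size());
    std::vector<float> out(outSize);
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (!inrange[i])
            continue;  // the mask: rejected cells must not write
        assert(startidx[i] < outSize);
        out[startidx[i]] = in[i];
    }
    return out;
}
```

Even with the mask, the loop still runs over the full input length once per array, which is part of the motivation for moving to gathers.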

Create Output-to-Input Indexing?

We want to work in the shorter output-length arrays and use gathers. A specialized scatter in EAVL creates this reverse index.

index      0  1  2  3  4  5  6  7  8  9 10 11
startidx   0  1  1  1  1  1  2  3  3  3  4  5

revindex   0  5  6  9 10 11

Input indices of the kept cells:

  0  .  .  .
  .  5  6  .
  .  9 10 11

Gather Input Mesh Arrays to Output?

We can now use simple gathers to pull input arrays (density, pressure) into the output mesh.

index            0  1  2  3  4  5  6  7  8  9 10 11
density         43 47 52 63 32 38 42 49 31 37 41 38

revindex         0  5  6  9 10 11
output_density  43 38 42 37 41 38

  43  .  .  .
   . 38 42  .
   . 37 41 38
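Putting the threshold steps together, a serial end-to-end sketch might look like the following. It is not EAVL code: `BuildRevIndex` and `GatherField` are hypothetical names, the map and exclusive scan are fused into one loop for brevity, and the reverse index is built with a masked scatter of cell indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flag, exclusive scan, and reverse index for the range lo < density < hi.
std::vector<std::size_t> BuildRevIndex(const std::vector<float> &density,
                                       float lo, float hi)
{
    const std::size_t n = density.size();
    std::vector<std::size_t> inrange(n), startidx(n);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        inrange[i] = (density[i] > lo && density[i] < hi) ? 1u : 0u;  // map
        startidx[i] = count;   // exclusive scan of the flags
        count += inrange[i];   // final value equals the reduce result
    }
    std::vector<std::size_t> revindex(count);
    for (std::size_t i = 0; i < n; ++i)
        if (inrange[i])
            revindex[startidx[i]] = i;  // masked scatter of input indices
    return revindex;
}

// One gather per field; the same revindex is reused for every array.
std::vector<float> GatherField(const std::vector<float> &field,
                               const std::vector<std::size_t> &revindex)
{
    std::vector<float> out(revindex.size());
    for (std::size_t i = 0; i < revindex.size(); ++i)
    {
        assert(revindex[i] < field.size());
        out[i] = field[revindex[i]];
    }
    return out;
}
```

The reverse index is computed once and then reused for density, pressure, and any other cell array, so the scatter cost is paid a single time while every field is moved with a gather.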
