CIS 610: Many-core visualization libraries
Hank Childs, University of Oregon
Jan. 21st, 2013
Schedule for this class
• We have done 5 lectures in 2 weeks
  – We should have done 4 lectures over the last two weeks
• We will do 3 lectures this week
• We will be one full week ahead of schedule.
• We will cancel two lectures over the coming weeks.
Schedule this week
• Tuesday lecture: today
  – Review of data parallel operations, general discussion of packages so far
• Thursday lecture: Ken Moreland
• (Thursday colloquium @ 12: Ken Moreland)
• Friday lecture: Ken Moreland
  – 8:30-10 (I can’t make this time)
  – 11-12:30
  – 11:30-1:00
Upcoming schedule
• Tuesday, Jan 28th
  – 10 minute presentation by each student on the project they want to pursue
    • Non-binding
  – Discuss the problem, and some initial thoughts about how to do it in many-core libraries
Upcoming schedule
• Thursday, Jan 30th
  – Group session debugging problems.
  – Important that you have started your project by then.
Upcoming schedule
• Weeks following
  – Series of 20 minute presentations, 3 per lecture
  – Two flavors of presentation:
    • “Update on my project”
    • “Overview of a paper I read”
How this class will be graded
• You will all submit a report at the end of the quarter.
• The report will include:
  – A summary of what you have done
  – It will focus on your project
  – You should also include
    • Presentations made
    • Porting of libraries
    • Assistance to other students
    • Bugs debugged (or reported)
    • Etc…
How this class will be graded
• It is not curved
  – If you all decide not to present papers, you will all be penalized
• I expect you will all get very good grades
  – But it is important that you work hard and accomplish something in this class
• Play with the libraries, present papers in class, and really try to nail your “research project”
Lectures
• I expect you will all make about 3 presentations
  – 1 research update, 2 papers
  – 2 research updates, 1 paper
• Some lectures in the short term on CUDA, Thrust, data parallelism, etc., would probably be helpful.
EAVL
EXTREME-SCALE ANALYSIS AND VISUALIZATION LIBRARY
Jeremy Meredith
January, 2014
A Simple Data-Parallel Operation
Internal Library API Provides This:
    void CellToCellDivide(Field &a, Field &b, Field &c)
    {
        for_each(i)
            c[i] = a[i] / b[i];
    }
Algorithm Developer Writes This:
    void CalculateDensity(...)
    {
        //...
        CellToCellDivide(mass, volume, density);
    }
Functor + Iterator Approach
Internal Library API Provides This:
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }
    struct Divide
    {
        void operator()(float &a, float &b, float &c)
        {
            c = a / b;
        }
    };
Algorithm Developer Writes This:
    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, Divide());
    }
Custom Functor
Algorithm Developer Writes These:
    struct MyFunctor
    {
        void operator()(float &a, float &b, float &c)
        {
            c = a + 2*log(b);
        }
    };
    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, MyFunctor());
    }
Internal Library API Provides This:
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }
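The functor slides use for_each(i) as pseudocode. A minimal compilable sketch of the same split is shown here, assuming Field is just a std::vector<float> (a hypothetical stand-in, not EAVL's field type) and using a serial loop where a real library would launch parallel work.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the slides' Field type: a flat array of floats.
using Field = std::vector<float>;

// Library side: applies any binary functor element by element.
// The serial loop stands in for the slides' parallel for_each(i).
template <class Op>
void CellToCellBinaryOp(const Field &a, const Field &b, Field &c, Op op)
{
    assert(a.size() == b.size());
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        op(a[i], b[i], c[i]);
}

// A stock functor of the kind a library might ship.
struct Divide
{
    void operator()(float a, float b, float &c) const { c = a / b; }
};

// A custom functor written by the algorithm developer.
struct MyFunctor
{
    void operator()(float a, float b, float &c) const { c = a + 2 * std::log(b); }
};
```

The loop body never changes; only the functor passed as the last argument does, which is what lets one library routine serve many algorithms.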
DATA PARALLELISM BASICS
Map with 1 input, 1 output
Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    result      6 14  0  2  8  0  0  8 10  6  2  0
Functor:
    struct f { float operator()(float x) { return x*2; } };
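As a rough illustration of the one-input map, the sketch below applies a doubling functor to every item. Map1 and TimesTwo are hypothetical names chosen for this example, and the serial loop stands in for a parallel launch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same operation as the slide's functor f: doubles one input item.
struct TimesTwo
{
    float operator()(float x) const { return x * 2; }
};

// Hypothetical helper (not a library call): result[i] = op(x[i]).
// No iteration reads another iteration's output, so all of them
// could run at the same time.
template <class Functor>
void Map1(const std::vector<float> &x, std::vector<float> &result, Functor op)
{
    assert(result.size() == x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        result[i] = op(x[i]);
}
```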
Map with 2 inputs, 1 output
With two input arrays, the functor takes two inputs. You can also have multiple outputs.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    y           2  4  2  1  8  3  9  5  5  1  2  1
    result      5 11  2  2 12  3  9  9 10  4  3  1
Functor:
    struct f { float operator()(float a, float b) { return a+b; } };
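The two-input case differs only in the functor's arity. A minimal sketch, with Map2 and AddPair as hypothetical names and a serial loop in place of parallel execution:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same operation as the slide's functor f: adds the paired items.
struct AddPair
{
    float operator()(float a, float b) const { return a + b; }
};

// Hypothetical helper: result[i] = op(a[i], b[i]) for every i.
template <class Functor>
void Map2(const std::vector<float> &a, const std::vector<float> &b,
          std::vector<float> &result, Functor op)
{
    assert(a.size() == b.size() && result.size() == a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        result[i] = op(a[i], b[i]);
}
```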
Scatter with 1 input (and thus 1 output)
Possibly inefficient, risks of race conditions and uninitialized results. (Can also scatter to larger array if desired.) Often used in a scatter_if-type construct.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    indices     2  4  1  5  5  0  4  2  1  2  1  4
    result      0  1  3  -  0  4     (slots 0 to 5; "-" is never written)
No functor.
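A serial sketch of scatter makes both hazards visible. Scatter is a hypothetical helper name; in this serial loop the last writer to a slot wins, whereas a parallel version gives no such ordering guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: result[indices[i]] = x[i].
// Several inputs may target the same slot (a race when run in parallel),
// and slots that no index names keep whatever value they had before.
void Scatter(const std::vector<float> &x, const std::vector<int> &indices,
             std::vector<float> &result)
{
    assert(x.size() == indices.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < result.size());
        result[indices[i]] = x[i];
    }
}
```

With the slide's arrays and a result pre-filled with a sentinel, slot 3 keeps the sentinel because no entry of indices equals 3.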
Gather with 1 input (and thus 1 output)
Unlike scatter, no risk of uninitialized data or race condition. Plus, parallelization is over a shorter indices array, and caching helps more, so can be more efficient.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    indices     1  9  6  9  3
    result      7  3  0  3  1
No functor.
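Gather reverses the direction of the indexing: each output item reads from a named input position. A minimal sketch, with Gather as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: result[i] = x[indices[i]].
// Every output slot is written exactly once, by its own iteration,
// so there is nothing to race on and nothing left uninitialized.
std::vector<float> Gather(const std::vector<float> &x,
                          const std::vector<int> &indices)
{
    std::vector<float> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i)
    {
        assert(indices[i] >= 0 &&
               static_cast<std::size_t>(indices[i]) < x.size());
        result[i] = x[indices[i]];
    }
    return result;
}
```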
Reduction with 1 input (and thus 1 output)
Example: max-reduction. Sum is also common. Often a fat-tree-based implementation.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    result      7
Functor:
    struct f { float operator()(float a, float b) { return a>b ? a : b; } };
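One way to picture a tree-shaped reduction is as rounds of pairwise combines. The sketch below uses a plain binary combining tree run serially; it is an illustration under that assumption, not the specific fat-tree scheme the slide mentions, and Reduce, Max, and Sum are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Max
{
    float operator()(float a, float b) const { return a > b ? a : b; }
};

struct Sum
{
    float operator()(float a, float b) const { return a + b; }
};

// Hypothetical helper: combines items pairwise in rounds of doubling stride.
// The combines inside one round touch disjoint items, so a parallel version
// could run each round concurrently. 'values' is taken by value because the
// rounds overwrite it.
template <class Functor>
float Reduce(std::vector<float> values, Functor op)
{
    assert(!values.empty());
    for (std::size_t stride = 1; stride < values.size(); stride *= 2)
        for (std::size_t i = 0; i + stride < values.size(); i += 2 * stride)
            values[i] = op(values[i], values[i + stride]);
    return values[0];
}
```

This ordering relies on the functor being associative, which holds for max and for sum of the small integers used here.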
Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output
Value at result[i] is sum of values x[0]..x[i]. Surprisingly efficient parallel implementation. Basis for many more complex algorithms.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    result      3 10 10 11 15 15 15 19 24 27 28 28
No functor.
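For reference, the inclusive scan's definition can be checked with the serial std::partial_sum from the C++ standard library. This is only a serial reference for the result, not the efficient parallel algorithm; InclusiveScan is a hypothetical wrapper name.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical wrapper: result[i] = x[0] + x[1] + ... + x[i].
void InclusiveScan(const std::vector<float> &x, std::vector<float> &result)
{
    assert(result.size() == x.size());
    std::partial_sum(x.begin(), x.end(), result.begin());
}
```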
Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output
Initialize with zero, value is sum of only up to x[i-1]. May be more commonly used than inclusive scan.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    x           3  7  0  1  4  0  0  4  5  3  1  0
    result      0  3 10 10 11 15 15 15 19 24 27 28
No functor.
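The exclusive variant writes the running total before adding the current item, so the output starts at zero and lags the inclusive scan by one position. A serial reference sketch, with ExclusiveScan as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: result[i] = x[0] + ... + x[i-1], with result[0] = 0.
void ExclusiveScan(const std::vector<float> &x, std::vector<float> &result)
{
    assert(result.size() == x.size());
    float running = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        result[i] = running; // total of everything before item i
        running += x[i];
    }
}
```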
WRITING ALGORITHMS IN EAVL
EXAMPLE: THRESHOLD
Threshold
• Keep cell if it meets some criteria, else discard
• Criteria:
  – Pressure > 2
  – 10 < temperature < 20
Figure: cells that meet criteria
How to implement threshold
• Iterate over cells
• If a cell meets the criteria, then place that cell in the output
• Output is an unstructured mesh
Example: Thresholding an RGrid (a)
• Explicit cells can be combined with structured coordinates.
Starting data set:
    eavlCoordinates        FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0           Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1           Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2           Name: “z”   Association: LogicalDim2   Values[nk]
    eavlStructuredCellSet  RegularStructure: 30 40 30
Thresholded data set (same coordinates and fields):
    eavlCoordinates        FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0           Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1           Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2           Name: “z”   Association: LogicalDim2   Values[nk]
    eavlExplicitCellSet    Connectivity: (a bunch of cells)
Example: Thresholding an RGrid (b)
• A second Cell Set can be added which refers to the first one
Starting data set:
    eavlCoordinates        FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0           Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1           Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2           Name: “z”   Association: LogicalDim2   Values[nk]
    eavlStructuredCellSet  RegularStructure: 30 40 30
Thresholded data set (same coordinates and fields):
    eavlCoordinates        FieldName: “x” “y” “z”   Component: 0 0 0
    eavlField #0           Name: “x”   Association: LogicalDim0   Values[ni]
    eavlField #1           Name: “y”   Association: LogicalDim1   Values[nj]
    eavlField #2           Name: “z”   Association: LogicalDim2   Values[nk]
    eavlStructuredCellSet  RegularStructure: 30 40 30
    eavlSubset             Cells: (…)   Parent: ( )
Starting Mesh
We want to threshold a mesh based on its density values (shown here).
    43 47 52 63
    32 38 42 49
    31 37 41 38
    index       0  1  2  3  4  5  6  7  8  9 10 11
    density    43 47 52 63 32 38 42 49 31 37 41 38
If we threshold 35 < density < 45, we want this result:
    43  .  .  .
     . 38 42  .
     . 37 41 38
Which Cells to Include?
Evaluate a Map operation with this functor:
    struct InRange
    {
        float lo, hi;
        InRange(float l, float h) : lo(l), hi(h) { }
        int operator()(float x) { return x>lo && x<hi; }
    };
Applying InRange() to density gives the inrange flags:
    1 0 0 0
    0 1 1 0
    0 1 1 1
    index       0  1  2  3  4  5  6  7  8  9 10 11
    density    43 47 52 63 32 38 42 49 31 37 41 38
    inrange     1  0  0  0  0  1  1  0  0  1  1  1
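The flagging step is an ordinary one-input map whose output type is int rather than float. A minimal sketch follows; FlagCells is a hypothetical helper name, and the serial loop stands in for the parallel map.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The slide's functor, with operator() marked const so it can be
// called through a const reference.
struct InRange
{
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) {}
    int operator()(float x) const { return x > lo && x < hi; }
};

// Hypothetical helper: inrange[i] = 1 if density[i] passes the test, else 0.
void FlagCells(const std::vector<float> &density, const InRange &test,
               std::vector<int> &inrange)
{
    assert(inrange.size() == density.size());
    for (std::size_t i = 0; i < density.size(); ++i)
        inrange[i] = test(density[i]);
}
```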
How Many Cells in Output?
Evaluate a Reduce operation using the Add<> functor. We can use this to create output cell length arrays.
    1 0 0 0
    0 1 1 0
    0 1 1 1
    index       0  1  2  3  4  5  6  7  8  9 10 11
    inrange     1  0  0  0  0  1  1  0  0  1  1  1
    result      6     (plus-reduction of inrange)
Where Do the Output Cells Go?
Input indices:
     0  1  2  3
     4  5  6  7
     8  9 10 11
Output indices:
     0  .  .  .
     .  1  2  .
     .  3  4  5
    input cell    0  1  2  3  4  5  6  7  8  9 10 11
    output cell   0  .  .  .  .  1  2  .  .  3  4  5
How do we create this mapping?
Create Input-to-Output Indexing?
Exclusive Scan (exclusive prefix sum) gives us the output index positions.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    inrange     1  0  0  0  0  1  1  0  0  1  1  1
    startidx    0  1  1  1  1  1  2  3  3  3  4  5
Output indices on the mesh:
     0  .  .  .
     .  1  2  .
     .  3  4  5
Scatter Input Arrays to Output?
NO. We can do this, but scatters can be risky/inefficient. Assuming we have multiple arrays to process, we can do something better....
    index             0  1  2  3  4  5  6  7  8  9 10 11
    density          43 47 52 63 32 38 42 49 31 37 41 38
    startidx          0  1  1  1  1  1  2  3  3  3  4  5
    output_density   43 38 42 37 41 38
Race condition unless we add a mask array!
Input indices of the kept cells on the mesh:
     0  .  .  .
     .  5  6  .
     .  9 10 11
Create Output-to-Input Indexing?
We want to work in the shorter output-length arrays and use gathers. A specialized scatter in EAVL creates this reverse index.
    index       0  1  2  3  4  5  6  7  8  9 10 11
    startidx    0  1  1  1  1  1  2  3  3  3  4  5
    revindex    0  5  6  9 10 11
Input indices of the kept cells on the mesh:
     0  .  .  .
     .  5  6  .
     .  9 10 11
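The reverse index can be pictured as a masked scatter of each input cell's own index into its output slot. The sketch below is an assumption about the idea, not EAVL's actual routine; ReverseIndex is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for every kept cell i, revindex[startidx[i]] = i.
// The inrange mask keeps discarded cells from writing at all. Kept cells
// receive distinct startidx values from the exclusive scan, so no two
// writes target the same slot.
std::vector<int> ReverseIndex(const std::vector<int> &inrange,
                              const std::vector<int> &startidx,
                              std::size_t outputCount)
{
    assert(inrange.size() == startidx.size());
    std::vector<int> revindex(outputCount);
    for (std::size_t i = 0; i < inrange.size(); ++i)
    {
        if (!inrange[i])
            continue;
        assert(startidx[i] >= 0 &&
               static_cast<std::size_t>(startidx[i]) < outputCount);
        revindex[startidx[i]] = static_cast<int>(i);
    }
    return revindex;
}
```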
Gather Input Mesh Arrays to Output?
We can now use simple gathers to pull input arrays (density, pressure) into the output mesh.
    index             0  1  2  3  4  5  6  7  8  9 10 11
    density          43 47 52 63 32 38 42 49 31 37 41 38
    revindex          0  5  6  9 10 11
    output_density   43 38 42 37 41 38
Output mesh:
    43  .  .  .
     . 38 42  .
     . 37 41 38
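Putting the steps together, the whole threshold of one field can be sketched as map, scan, reverse index, and gather. This is a serial illustration of the sequence on these slides, with ThresholdField as a hypothetical name; it is not EAVL code and it handles only the field values, not the cell connectivity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical end-to-end sketch: keeps the values with lo < value < hi.
std::vector<float> ThresholdField(const std::vector<float> &density,
                                  float lo, float hi)
{
    assert(lo < hi);
    const std::size_t n = density.size();

    // Map: flag each cell that passes the range test.
    std::vector<int> inrange(n);
    for (std::size_t i = 0; i < n; ++i)
        inrange[i] = (density[i] > lo && density[i] < hi) ? 1 : 0;

    // Exclusive scan: output slot for each kept cell. The final running
    // total equals the reduce result, the number of output cells.
    std::vector<std::size_t> startidx(n);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        startidx[i] = count;
        count += static_cast<std::size_t>(inrange[i]);
    }

    // Masked scatter: build the output-to-input reverse index.
    std::vector<std::size_t> revindex(count);
    for (std::size_t i = 0; i < n; ++i)
        if (inrange[i])
            revindex[startidx[i]] = i;

    // Gather: pull the kept values into the shorter output array.
    std::vector<float> output(count);
    for (std::size_t o = 0; o < count; ++o)
        output[o] = density[revindex[o]];
    return output;
}
```

Once revindex exists, any other per-cell array (pressure, for example) can be pulled into the output with the same gather loop.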