Is 1.7 × 10^10 Unknowns the Largest Finite Element System that Can Be Solved Today?

B. Bergen (1), F. Hülsemann (2), Ulrich Rüde (3)

(1) Continuum Dynamics (CCS-2), Los Alamos National Laboratory
(2) Parallel Algorithms Project, CERFACS
(3) Lehrstuhl für Systemsimulation, Friedrich–Alexander Universität Erlangen-Nürnberg

23 September 2005


Partially funded by KONWIHR, the High Performance Computing Competence Network in Bavaria


Outline

1 Standard Sparse Matrix Data Structures
    Performance Problems
    Memory Usage
2 Hierarchical Hybrid Grids (HHG)
    Basic Concepts
    Regular Refinement
    Grid Decomposition
3 Numerical Experiments
    Serial Efficiency and Scalability
    Parallel Results
4 Summary
    Conclusions
    Acknowledgments


Performance Problems

Possible reasons for the poor performance of standard sparse matrix data structures:

- Cache effects: it is difficult to find an ordering of the unknowns that maximizes cache reuse
- Indirect indexing: precludes aggressive compiler optimizations that exploit instruction-level parallelism (ILP)
- Variable coefficients: overkill for certain classes of problems
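
The difference between indirect indexing and a fixed stencil can be illustrated with a small sketch. It is not taken from the talk: the 1D three-point operator and the names crs_matvec, stencil_matvec_1d and laplacian_crs are assumptions chosen to keep the example short. Both routines compute the same product, but only the CRS version reads the vector through an index array.

```python
def crs_matvec(values, col_idx, row_ptr, x):
    """y = A x with A stored in compressed row storage (CRS)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is addressed through col_idx: this is the indirect access
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def stencil_matvec_1d(stencil, x):
    """Same product for a constant-coefficient 3-point stencil (interior rows only)."""
    left, centre, right = stencil
    y = [0.0] * len(x)
    for i in range(1, len(x) - 1):
        # neighbours are found by fixed offsets, no index array is read
        y[i] = left * x[i - 1] + centre * x[i] + right * x[i + 1]
    return y


def laplacian_crs(n):
    """Hypothetical helper: the (-1, 2, -1) operator on n points in CRS form.

    Boundary rows are left empty so that the result matches stencil_matvec_1d.
    """
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        if 0 < i < n - 1:
            values += [-1.0, 2.0, -1.0]
            col_idx += [i - 1, i, i + 1]
        row_ptr.append(len(values))
    return values, col_idx, row_ptr
```

The CRS version stores three values and three column indices per interior row, whereas the stencil version stores three numbers in total; that is the sense in which variable coefficients are overkill when the operator is the same at every point.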


Predicting Performance

Metrics that expose bottlenecks are needed in order to analyze the performance of various data structures.

Balance Metric
- Assumes that a datum must be fetched from main memory every time it is accessed
- Lower bound for algorithms that are not memory bandwidth limited

Loads per Cache Miss Metric
- Assumes that each datum is used in a maximal way each time it is loaded into a particular cache level
- Measures the temporal locality of an algorithm
- Consistency with this metric implies that an algorithm is not memory bandwidth limited


Counting the Balance Metric

Predicts the percentage of Rpeak that an algorithm can obtain on a particular architecture.

Machine Properties (B_M)

Architecture   Rpeak (MFLOP/s)   Burst Rate (MDW/s)   Ratio
Itanium 2      6400              800                  0.125
Nocona         6800              666                  0.098

Algorithm Properties (B_A)

Algorithm   Loads (DW)   Operations   Ratio
CRS         69.5N        53N          1.31
VCGS        56N          53N          1.06
CCGS        29N          53N          0.55

Balance Metric: B_M / B_A
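
As a worked check, the following sketch evaluates the balance metric as the quotient B_M / B_A and scales it by Rpeak. It assumes that the burst rates are given in millions of double words per second, so that they are commensurate with Rpeak in MFLOP/s; the function name predicted_mflops is hypothetical. Under that reading it reproduces the predicted MFLOP/s values reported in the talk (for example 610 for CRS on Itanium 2).

```python
# Machine properties from the slide: (Rpeak in MFLOP/s, burst rate in MDW/s, assumed unit).
MACHINES = {"Itanium 2": (6400.0, 800.0), "Nocona": (6800.0, 666.0)}

# Algorithm properties from the slide: (loads in DW, operations), per unknown.
# The common factor N cancels in the ratio.
ALGORITHMS = {"CRS": (69.5, 53.0), "VCGS": (56.0, 53.0), "CCGS": (29.0, 53.0)}


def predicted_mflops(machine, algorithm):
    """Predicted rate = (B_M / B_A) * Rpeak."""
    r_peak, burst = MACHINES[machine]
    loads, ops = ALGORITHMS[algorithm]
    b_m = burst / r_peak  # machine balance: double words delivered per FLOP
    b_a = loads / ops     # algorithm balance: double words needed per FLOP
    return b_m / b_a * r_peak


for machine in MACHINES:
    for algorithm in ALGORITHMS:
        print(machine, algorithm, round(predicted_mflops(machine, algorithm)))
```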


Flexibility vs. Performance – Balance Metric

            Itanium 2 (MFLOP/s)             Nocona (MFLOP/s)
Algorithm   predicted  measured  ratio      predicted  measured  ratio
CRS         610        296       49%        508        496       98%
VCGS        757        1143      >100%      630        580       92%
CCGS        1462       2810      >100%      1217       1496      >100%

- Itanium 2 uses EPIC (Explicitly Parallel Instruction Computing) and static scheduling, which makes it more sensitive to the indirection inherent in the CRS data structure
- Nocona is relatively insensitive to indirection


Counting the Loads per Miss Metric

Predicts the temporal locality of an algorithm on a particular architecture.

Example: Itanium 2

Algorithm   Loads   L3 Misses   Ratio
CRS         82N     (85/32)N    ∼ 31
VCGS        55N     (29/16)N    ∼ 30
CCGS        28N     (1/8)N      ∼ 224
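
The ratios can be checked with exact arithmetic. The sketch below reads the L3 miss counts as the fractions (85/32)N, (29/16)N and (1/8)N, which is the reading consistent with the listed ratios; the name loads_per_miss is hypothetical.

```python
from fractions import Fraction

# (loads, L3 misses) per unknown; the common factor N cancels in the ratio.
COUNTS = {
    "CRS": (Fraction(82), Fraction(85, 32)),
    "VCGS": (Fraction(55), Fraction(29, 16)),
    "CCGS": (Fraction(28), Fraction(1, 8)),
}


def loads_per_miss(algorithm):
    """Predicted number of loads served per L3 cache miss."""
    loads, misses = COUNTS[algorithm]
    return loads / misses


for name in COUNTS:
    print(name, float(loads_per_miss(name)))
```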


Flexibility vs. Performance – Loads per Miss

Itanium 2: loads per L3 cache miss

Algorithm   predicted   measured
CRS         31          34
VCGS        30          28
CCGS        224         260

- Exceptional performance of CCGS is due to the temporal locality of its data access
- For CCGS, each datum is used ∼ 16 times every time it is loaded into cache


Memory Usage

Common Practice
- Add resolution by applying regular refinement to an unstructured input grid
- Refinement does not add new information about the domain ⇒ CRS is overkill

Missed Opportunity
- Regularity of structured patches is not exploited

Hierarchical Hybrid Grids (HHG)
- Develop new data structures that exploit regularity for enhanced performance
- Employ stencil-based discretization techniques on structured patches to reduce memory usage


Basic Concepts

- Purely unstructured input grid
- Input grid resolves large-scale structural features of the problem domain
- Apply patch-wise regular refinement to each element of the input grid
- Patch interiors are structured and constant coefficient
- Generates a nested grid hierarchy suitable for geometric multigrid


Regular Refinement

[Figure sequence over ten slides: regular refinement; images not preserved]


Grid Decomposition

- Grid decomposition is necessary to isolate grid primitives
- Requires local communication to update halos
- Variant of the block-structured approach
- Resolution of block interfaces is generated automatically
- Certain dependencies are ignored to avoid programming complexity and excessive latency in parallel


HHG Update Algorithm

for each vertex do
    apply operation to vertex
end for
update halo dependencies

for each edge do
    apply operation to edge
end for
update halo dependencies

for each element do
    apply operation to element
end for
update halo dependencies
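
The three sweeps can be sketched in a few lines. This is only an illustration of the control flow, not the HHG code: the Primitive class, its fields and the functions update_halos and hhg_update are all hypothetical, each primitive carries a single value, and the halo exchange is a plain local copy rather than a parallel message.

```python
from dataclasses import dataclass, field


@dataclass
class Primitive:
    """Hypothetical grid primitive (vertex, edge or element) holding one value."""
    name: str
    value: float = 0.0
    neighbours: list = field(default_factory=list)
    halo: dict = field(default_factory=dict)  # local copies of neighbour values


def update_halos(primitives):
    """Copy each neighbour's current value into the local halo."""
    for p in primitives:
        for q in p.neighbours:
            p.halo[q.name] = q.value


def hhg_update(vertices, edges, elements, operation):
    """Sweep vertices, then edges, then elements, refreshing halos after each sweep."""
    everything = vertices + edges + elements
    for group in (vertices, edges, elements):
        for p in group:
            p.value = operation(p)
        update_halos(everything)


# Usage: one vertex, one edge and one element connected in a chain.
v, e, t = Primitive("v"), Primitive("e"), Primitive("t")
v.neighbours, e.neighbours, t.neighbours = [e], [v, t], [e]
hhg_update([v], [e], [t], lambda p: 1 + sum(p.halo.values()))
print(v.value, e.value, t.value)
```

In the usage example each primitive type sees the values written by the previous sweep, because the halos are refreshed between sweeps.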


Computation vs. Communication

Volume (≃ computation) to boundary (≃ communication) ratio in hexahedra

Level l   B(l)      V(l)         r = B(l)/V(l)
1         26        1            26.0
2         98        27           3.63
3         386       343          1.13
4         1,538     3,375        0.46
5         6,146     29,791       0.21
6         24,578    250,047      0.10
7         98,306    2,048,383    0.05
8         393,218   16,581,375   0.02

Minimum: six (!) levels of refinement!
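
The tabulated counts follow a simple pattern. The closed forms below are inferred from the numbers and are not stated in the talk: V(l) = (2^l - 1)^3 interior points and B(l) = (2^l + 1)^3 - (2^l - 1)^3 boundary points. The function names are hypothetical.

```python
def interior_points(level):
    """V(l): points strictly inside a hexahedron after `level` refinements (inferred formula)."""
    return (2 ** level - 1) ** 3


def boundary_points(level):
    """B(l): points in the one-point-wide shell around the interior (inferred formula)."""
    return (2 ** level + 1) ** 3 - interior_points(level)


def boundary_to_volume(level):
    """r = B(l) / V(l), the communication-to-computation ratio."""
    return boundary_points(level) / interior_points(level)


for level in range(1, 9):
    print(level, boundary_points(level), interior_points(level),
          round(boundary_to_volume(level), 2))
```

Under these formulas r first drops to about 0.10 at level 6, which matches the stated minimum of six refinement levels.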


Serial Efficiency

[Figure: MFLOP/s (0 to 4000) versus refinement level (3 to 11) for Itanium 2, 1.6 GHz, and Nocona, 3.4 GHz]


Serial Scalability – Itanium 2

[Figure: MFLOP/s (0 to 3500) versus number of input grid elements (10 to 10000, log scale) for refinement levels 5, 6 and 7]


Parallel Scalability – Itanium 2

[Figure: MFLOP/s per CPU (0 to 3000) versus number of SGI Altix CPUs (1 to 1000, log scale) for smoothing and for the V(3,3) cycle]


Parallel Scalability – Itanium 2

#CPU   #Dofs ×10^6   #Els ×10^6   #Input Els   GFLOP/s     Time [s]
64     2,144         12,884       6144         100/75      68
128    4,288         25,769       12288        200/147     69
256    8,577         51,539       24576        409/270     76
512    17,167        103,079      49152        762/545     75
1024   17,167        103,079      49152        1,456/964   43

Parallel scalability of Poisson problem
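
A few derived quantities make the table easier to read. The sketch below is an added calculation, not part of the talk; the function names are hypothetical and the efficiency definitions (time of the 64-CPU run divided by the time of a larger run at equal work per CPU, and the time ratio for the fixed-size pair) are the usual textbook ones.

```python
# (CPUs, unknowns in millions, time in seconds), copied from the table.
RUNS = [(64, 2144, 68), (128, 4288, 69), (256, 8577, 76),
        (512, 17167, 75), (1024, 17167, 43)]


def dofs_per_cpu(run):
    """Millions of unknowns handled by each CPU."""
    cpus, mdofs, _ = run
    return mdofs / cpus


def weak_efficiency(run, base=RUNS[0]):
    """Weak-scaling efficiency relative to the 64-CPU run.

    Only meaningful for the first four rows, where the work per CPU is constant.
    """
    return base[2] / run[2]


def strong_speedup(small, large):
    """Speed-up when the same problem is run on more CPUs."""
    return small[2] / large[2]


for run in RUNS[:4]:
    print(run[0], round(dofs_per_cpu(run), 1), round(weak_efficiency(run), 2))
print(round(strong_speedup(RUNS[3], RUNS[4]), 2))
```

Up to 512 CPUs each CPU handles about 33.5 million unknowns and the run time stays between 68 s and 76 s. The 1024-CPU row solves the same problem as the 512-CPU row, with a speed-up of about 1.74 from doubling the CPU count.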


Conclusions

- Purely unstructured grid data structures generally achieve poor performance
- Sensitivity to CRS indirection is platform dependent
- It is possible to obtain a high degree of efficiency on logically unstructured grids
- HHG data structures lead to a scalable, parallel solver that achieves extremely good results: 1.7 × 10^10 dof in 43 s on 1024 CPUs


Acknowledgments

- Georg Hager and Gerhard Wellein, Regionales Rechenzentrum Erlangen, Erlangen, Germany
- Ralf Ebner, Leibniz Rechenzentrum, Munich, Germany
- Computer Services for Academic Research (CSAR), Manchester, UK
- Rüdiger Wolff, SGI München, Munich, Germany


Finally

Thank you for your attention
